How much overhead does the recording have?

I want to debug custom kernel filesystem issues on a 96-CPU machine. My benchmark is building the Linux kernel defconfig with make -j96. I have tried ftrace before but it's making everything 100x slower...

> How much overhead does the recording have?

That depends very much on what you are trying to record! See below.

> I have tried ftrace before but it's making everything 100x slower...

If ftrace is making things 100x slower, I'm not sure that Perfetto is going to help you very much: fundamentally, for kernel stuff, it uses ftrace under the hood!
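
Since the kernel side goes through ftrace either way, the main knob you have is how much you ask it to record. Here is a minimal sketch of generating a Perfetto text config that enables only a short, explicit list of ftrace events. The helper name `build_ftrace_config`, the buffer size, the duration, and the `sched/*` events are placeholders I picked for illustration, not a recommendation for your filesystem; I'm also assuming the slowdown comes from recording too broadly, which your post doesn't confirm.

```python
def build_ftrace_config(events, buffer_kb=65536, duration_ms=10000):
    """Return a Perfetto trace config (text proto) enabling only `events`.

    `events` is a list of "group/name" ftrace event strings. The defaults
    for buffer size and duration are arbitrary placeholder values.
    """
    # One `ftrace_events` entry per requested event.
    event_lines = "\n".join(f'      ftrace_events: "{e}"' for e in events)
    return (
        "buffers {\n"
        f"  size_kb: {buffer_kb}\n"
        "  fill_policy: RING_BUFFER\n"
        "}\n"
        "data_sources {\n"
        "  config {\n"
        '    name: "linux.ftrace"\n'
        "    ftrace_config {\n"
        f"{event_lines}\n"
        "    }\n"
        "  }\n"
        "}\n"
        f"duration_ms: {duration_ms}\n"
    )


if __name__ == "__main__":
    # Placeholder events; substitute the tracepoints relevant to your code.
    print(build_ftrace_config(["sched/sched_switch", "sched/sched_wakeup"]))
```

You could write the output to a file and pass it to the `perfetto` command-line client as a text config. Whether that brings the overhead down to something tolerable on a 96-way build is something you would have to measure.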