I've been trying to tell people to use perf for years now. I think the early versions did it a real disservice by being buggy and prone to crashes and freezes. Still, it was better than oprofile before it.
It only got better once they added tracepoints to perf events. Now I can simply trace every futex syscall entry to look for lock/synchronization contention.
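A minimal sketch of that futex tracing, wrapped in Python so the command line is easy to tweak. `syscalls:sys_enter_futex` is the usual tracepoint name for futex syscall entry; the helper name `futex_trace_cmd`, the 10-second window, and the choice of `perf record` with call graphs are assumptions made for illustration, not something from the original setup. Actually running the command generally needs root or a relaxed `perf_event_paranoid` setting.

```python
import shlex


def futex_trace_cmd(seconds=10, pid=None):
    """Build a `perf record` command that records futex syscall entries.

    Hypothetical helper: it only assembles the argument list, it does not
    run perf. With pid=None the trace is system-wide (-a).
    """
    cmd = ["perf", "record", "-e", "syscalls:sys_enter_futex", "-g"]
    if pid is None:
        cmd.append("-a")            # all CPUs, all processes
    else:
        cmd += ["-p", str(pid)]     # attach to one existing process
    # perf records for as long as the child command runs
    cmd += ["--", "sleep", str(seconds)]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be pasted into a root shell.
    print(shlex.join(futex_trace_cmd()))
```

After the recording finishes, `perf report` on the resulting `perf.data` shows which call chains hit the futex path most often, which is usually a decent first pointer to the contended lock.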
One extra tool I recommend is dynamic logging, which you can enable at kernel build time. Create any log mask (file/line/module), set a level, and you're off to the races. It's invaluable in certain scenarios (like debugging bugs in the failure path of the fscache module).
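A rough sketch of what such a mask can look like, assuming the dynamic logging above refers to the kernel's dynamic debug facility (`CONFIG_DYNAMIC_DEBUG`) and that debugfs is mounted at `/sys/kernel/debug`. The helper names `dyndbg_query` and `apply_query` are hypothetical; the query keywords (`module`, `file`, `line`) and the `+p`/`-p` flags follow the dynamic debug control-file syntax. Writing the control file requires root.

```python
# Assumed location; depends on where debugfs is mounted on the system.
CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def dyndbg_query(module=None, file=None, line=None, enable=True):
    """Format a dynamic debug query selecting call sites by module/file/line."""
    parts = []
    if module:
        parts += ["module", module]
    if file:
        parts += ["file", file]
    if line is not None:
        parts += ["line", str(line)]
    if not parts:
        raise ValueError("need at least one of module, file, line")
    # +p turns the matching pr_debug()/dev_dbg() call sites on, -p turns them off
    parts.append("+p" if enable else "-p")
    return " ".join(parts)


def apply_query(query, control=CONTROL):
    """Write the query to the control file (needs root and CONFIG_DYNAMIC_DEBUG)."""
    with open(control, "w") as f:
        f.write(query + "\n")


if __name__ == "__main__":
    # Example: enable all debug messages in the fscache module.
    print(dyndbg_query(module="fscache"))
```

Reading the same control file back lists every known call site with its current flags, which is handy for checking that the mask matched what you expected.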