trishume hits the nail on the head here. The mutation strategy is designed around how this system works.
Each core gets a completely unique fuzz case, and each lane of the vector gets a small mutation. In _many_ cases this mutation doesn't even affect control flow (e.g., the mutated parts are skipped over or never parsed due to errors), meaning all 16 lanes run to completion. What's really important is that when the small modification you made to an individual lane actually does cause it to diverge, you now know where and when that part of the input is used in the program. This information is huge and can be used to tweak weights and other parameters of mutators/generators, making them learn which fields to use, when, and how often.
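To make that concrete, here's a rough sketch of what that feedback loop could look like. This is not code from the actual fuzzer; the struct, field names, and weighting scheme are all made up for illustration. The idea is just: when a lane whose only difference is a mutation at some byte offset diverges, bump that offset's weight so future mutations land there more often.

```rust
use std::collections::HashMap;

/// Hypothetical per-offset mutation weights (illustrative only).
#[derive(Default)]
struct MutatorWeights {
    weight: HashMap<usize, f64>,
}

impl MutatorWeights {
    /// Record that the mutation at `offset` caused a lane to diverge
    /// from its siblings.
    fn record_divergence(&mut self, offset: usize) {
        *self.weight.entry(offset).or_insert(1.0) += 1.0;
    }

    /// Weighted pick of an offset to mutate next. `u` is a uniform
    /// random value in [0, 1). Every offset starts at weight 1.0;
    /// offsets that previously influenced control flow get picked
    /// more often.
    fn pick_offset(&self, input_len: usize, u: f64) -> usize {
        let total: f64 = (0..input_len)
            .map(|o| self.weight.get(&o).copied().unwrap_or(1.0))
            .sum();
        let mut target = u * total;
        for o in 0..input_len {
            let w = self.weight.get(&o).copied().unwrap_or(1.0);
            if target < w {
                return o;
            }
            target -= w;
        }
        input_len - 1
    }
}

fn main() {
    let mut weights = MutatorWeights::default();
    // Pretend a lane's mutation at byte 12 caused its control flow to diverge, twice.
    weights.record_divergence(12);
    weights.record_divergence(12);
    // Offset 12 now carries weight 3.0 vs. 1.0 for everything else,
    // so it gets mutated roughly three times as often.
    println!("picked offset: {}", weights.pick_offset(32, 0.40));
}
```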
In a later blog I'll talk more about handling fully divergent cases with some better logic: using graph analysis to find post-dominators in functions and running VMs until they can sync back up. Rather than the current "sync if you can" model, this will be a smart, forward-looking sync that ensures that by the end of every function all VMs are running again (even if that means I have to insert artificial post-dominator nodes into the graphs).
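For the curious, post-dominators can be computed with the textbook iterative dataflow fixpoint: a node post-dominates itself plus the intersection of the post-dominator sets of all its successors. The sketch below is a minimal, self-contained illustration of that analysis on a toy CFG, not the actual VM code; node IDs and the example graph are assumptions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Post-dominator sets via the classic iterative dataflow fixpoint.
fn post_dominators(
    succs: &BTreeMap<u32, Vec<u32>>,
    exit: u32,
) -> BTreeMap<u32, BTreeSet<u32>> {
    let nodes: BTreeSet<u32> = succs.keys().copied().collect();

    // Initialize: the exit post-dominates only itself; everything else
    // starts at "all nodes" and shrinks as the fixpoint iterates.
    let mut pdom: BTreeMap<u32, BTreeSet<u32>> = nodes
        .iter()
        .map(|&n| {
            let init = if n == exit {
                std::iter::once(exit).collect()
            } else {
                nodes.clone()
            };
            (n, init)
        })
        .collect();

    let mut changed = true;
    while changed {
        changed = false;
        for &n in &nodes {
            if n == exit {
                continue;
            }
            // Intersect the post-dominator sets of all successors of `n`.
            let mut new: Option<BTreeSet<u32>> = None;
            for &s in &succs[&n] {
                new = Some(match new {
                    None => pdom[&s].clone(),
                    Some(acc) => acc.intersection(&pdom[&s]).copied().collect(),
                });
            }
            let mut new = new.unwrap_or_default();
            new.insert(n);
            if new != pdom[&n] {
                pdom.insert(n, new);
                changed = true;
            }
        }
    }
    pdom
}

fn main() {
    // Tiny diamond CFG: 0 branches to 1 and 2, which both rejoin at 3 (the exit).
    let mut cfg = BTreeMap::new();
    cfg.insert(0, vec![1, 2]);
    cfg.insert(1, vec![3]);
    cfg.insert(2, vec![3]);
    cfg.insert(3, vec![]);

    let pdom = post_dominators(&cfg, 3);
    // Node 3 post-dominates the branch at node 0, so lanes that diverge at 0
    // can be forced to re-sync once they all reach node 3.
    println!("post-dominators of 0: {:?}", pdom[&0]);
}
```

In the fuzzer's terms, the immediate post-dominator of a divergent branch is exactly the forward-looking sync point: every lane must eventually pass through it, so the scheduler can park faster lanes there until the rest catch up.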
What kind of application are you running such that you don't get large path differences based on the input?
In all the fuzzing I have done (with AFL and the like), the paths vary wildly and end up in completely different parts of the stack.
Surely the advantage you're gaining by parallelising is completely wiped out when you lose sync (which to me must be most of the time).
I just can't see how you can possibly keep sync between parallel runs in anything but the most trivial application-under-test.
To me, it seems that this will necessarily degrade to a single path being active. Have you done any analysis on how many paths are active simultaneously over a non-trivial run?