Yes, it does, although it's almost like horizontal microcode; it can do several things in a clock cycle other than the instruction itself. I didn't mean to imply that you could bitbang 100BaseT with a Padauk PFS150 or a PY32.
The Padauk FPPA chips are probably a bit better at bitbanging strange protocols than any ARM, but not in the same class as the Pi's PIO.
After consulting the datasheet, I think the most things you can do in a single pioasm instruction would be a conditional decrementing JMP with side-set, wrap, and delay:
• decrement the X or Y register;
• conditionally jump to a specified target address if it was nonzero;
• otherwise, jump (by "wrapping") to some other specified target address, usually an outer infinite loop;
• change the state of four or fewer GPIO pins to immediate constants;
• delay 1–15 clock cycles.
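To make that list concrete, here is a rough Python sketch of how such a `jmp x--` instruction packs into one 16-bit word, going by my reading of the RP2040 datasheet's instruction encoding. The function name and its parameters are made up for illustration. It assumes side-set without the optional enable bit, so the side-set value takes the top bits of the shared 5-bit delay/side-set field and the delay gets whatever is left. The wrap doesn't show up in the encoding at all, because it's state-machine configuration rather than part of the instruction.

```python
JMP_OPCODE = 0b000   # bits 15:13
COND_X_DEC = 0b010   # bits 7:5, "X--": jump if X is nonzero, then decrement X


def encode_jmp_x_dec(target, side, delay, side_set_bits=1):
    """Encode `jmp x-- target side <side> [<delay>]` as a 16-bit word.

    Hypothetical helper, not part of any SDK. Assumes `.side_set N`
    without `opt`, so side-set and delay share one 5-bit field.
    """
    if not 0 <= side_set_bits <= 5:
        raise ValueError("side_set_bits must be 0..5")
    delay_bits = 5 - side_set_bits
    if not 0 <= target < 32:
        raise ValueError("target must fit in 5 bits")
    if not 0 <= side < (1 << side_set_bits):
        raise ValueError("side value does not fit in side-set bits")
    if not 0 <= delay < (1 << delay_bits):
        raise ValueError("delay does not fit in remaining bits")
    # Side-set occupies the most significant bits of the shared field.
    field = (side << delay_bits) | delay
    return (JMP_OPCODE << 13) | (field << 8) | (COND_X_DEC << 5) | target


# One side-set pin leaves four delay bits, so a 15-cycle delay fits.
word = encode_jmp_x_dec(target=1, side=1, delay=15)
print(hex(word))  # 0x1f41
```

Because the field is shared, the more side-set pins you use, the shorter the maximum delay gets: with four side-set bits this sketch rejects any delay above 1.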
Arguably OUT can compete here, replacing the first two items with:
• set one or more GPIO pins from bits shifted out of a shift register;
• conditionally "autopull" a 32-bit word from the state machine's input (TX) FIFO if the shift register is empty;
• conditionally stall the pioasm program if the FIFO is empty;
• conditionally initiate a DMA request to refill the FIFO if it's not full.
The IN instruction has similar "autopush" capabilities.
Most of these are somewhat independent choices, but if you don't pull from the FIFO you won't have the other effects, and several of the options are state-machine-wide.
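Here is a toy Python model of how shifting bits out to pins interacts with autopull, the FIFO, and the DMA request. It's only a sketch and nowhere near cycle-accurate: the class and method names are invented, it assumes right shifts (the direction is really a per-state-machine setting), a 4-entry FIFO, and a DREQ that is simply "FIFO not full".

```python
from collections import deque

FIFO_DEPTH = 4  # assumed: unjoined TX FIFO


class OutModel:
    """Toy model of shifting bits to pins with autopull enabled."""

    def __init__(self, threshold=32):
        self.fifo = deque()
        self.osr = 0
        self.threshold = threshold
        self.shifted = threshold  # start with the shift register "empty"

    @property
    def dreq(self):
        # DMA is asked for more data whenever there is room in the FIFO.
        return len(self.fifo) < FIFO_DEPTH

    def push(self, word):
        """What the CPU or DMA does to feed the state machine."""
        if not self.dreq:
            raise OverflowError("FIFO full")
        self.fifo.append(word & 0xFFFFFFFF)

    def out_pins(self, nbits):
        """Return the bits driven onto the pins, or None on a stall."""
        if self.shifted >= self.threshold:   # shift register is empty
            if not self.fifo:
                return None                  # nothing to pull: stall
            self.osr = self.fifo.popleft()   # autopull
            self.shifted = 0
        bits = self.osr & ((1 << nbits) - 1)  # low bits leave first
        self.osr >>= nbits
        self.shifted += nbits
        return bits


m = OutModel(threshold=8)
stalled = m.out_pins(4)   # None: FIFO is empty, so the program stalls
m.push(0xA5)
low = m.out_pins(4)       # 0x5, after an autopull
high = m.out_pins(4)      # 0xA
```

Everything in the model hangs off the pull: if `out_pins` never finds the shift register empty, there is no stall and the FIFO never drains, which is the dependency described above.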