There actually are PDFs out there for the various GPU IPs (Adreno, PowerVR, etc.) on how best to write for them. Sometimes they even disagree: depending on each GPU's optimizations, triangle strips with degenerate triangles connecting the separate portions can be better on one, while all separate triangles are better on another. Apple also has recommendations:
http://developer.apple.com/library/ios/#documentation/3DDraw...
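For anyone who hasn't seen the degenerate-triangle trick, here is a rough sketch of the index-stitching step, in plain Python with no GL calls. The names `stitch_strips` and `triangles` are made up for illustration, and whether the stitched strip actually beats separate triangles depends on the GPU, as noted above.

```python
def stitch_strips(strips):
    """Join several triangle strips (lists of vertex indices) into one
    strip by inserting degenerate (zero-area) triangles between them."""
    out = []
    for strip in strips:
        if out:
            out.append(out[-1])    # repeat the last index of the previous strip
            out.append(strip[0])   # repeat the first index of the next strip
            # Winding alternates per triangle in a strip, so the next strip
            # must start at an even position to keep its original facing.
            if len(out) % 2 == 1:
                out.append(strip[0])
        out.extend(strip)
    return out


def triangles(strip):
    """Expand a strip into its non-degenerate triangles, flipping the
    odd-positioned ones so every triangle has a consistent winding."""
    tris = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i:i + 3]
        if a == b or b == c or a == c:
            continue               # zero-area triangle, draws nothing
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris


joined = stitch_strips([[0, 1, 2, 3], [4, 5, 6, 7]])
print(joined)  # [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]
```

The stitched strip costs a few extra indices but can go out in a single draw call; the degenerate triangles have zero area and produce no pixels.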
I don't recall offhand, though, whether any of them mention sorting commands by state and deduping redundant state changes, which I suppose is one of the most basic optimizations for OpenGL * APIs.
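To make "sorting by state and deduping" concrete, here is a minimal sketch in plain Python. `DrawCommand`, `sort_and_dedupe`, and the `"use_shader"` / `"bind_texture"` / `"draw"` tuples are hypothetical stand-ins for a real command queue (in actual GL the state changes would be calls like `glUseProgram` and `glBindTexture`). It also assumes the draws can be reordered freely, which is generally only true for opaque geometry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCommand:
    shader: int    # hypothetical shader program id
    texture: int   # hypothetical texture id
    mesh: str      # whatever identifies the geometry to draw


def sort_and_dedupe(commands):
    """Sort draws by (shader, texture) and emit a state change only
    when the state differs from what is already bound."""
    ordered = sorted(commands, key=lambda c: (c.shader, c.texture))
    ops = []
    bound_shader = None
    bound_texture = None
    for cmd in ordered:
        if cmd.shader != bound_shader:
            ops.append(("use_shader", cmd.shader))
            bound_shader = cmd.shader
        if cmd.texture != bound_texture:
            ops.append(("bind_texture", cmd.texture))
            bound_texture = cmd.texture
        ops.append(("draw", cmd.mesh))
    return ops


queue = [
    DrawCommand(1, 10, "a"),
    DrawCommand(2, 10, "b"),
    DrawCommand(1, 10, "c"),
    DrawCommand(1, 20, "d"),
]
for op in sort_and_dedupe(queue):
    print(op)
```

Putting the shader first in the sort key reflects the common assumption that switching programs costs more than rebinding a texture; the right order is something to check against the vendor guides.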