Almost nobody has SHA-3 capable hardware though, and "getting it" is a lot of work (either pay money or work on an FPGA I guess). I'd argue SHA-3 being efficient in hardware is almost completely irrelevant for 99% of all users of cryptographic software. For the vast majority of people, I think fast software implementations are way more important -- especially as systems like NFV and SDN come into play at large scale (people want bog-standard x86 boxes for this stuff).
I almost wish SHA-3 had been a dual pick between a fast software hash and a fast hardware hash. Keccak being so slow in software is a major limitation IMO. The more interesting aspect of SHA-3 is the sponge construction: if you know what you're doing, you can turn Keccak into an entire Swiss Army knife of crypto tools (hash, XOF, MAC, and so on).
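To make the sponge point a bit more concrete, here's a rough sketch using Python's standard `hashlib`, which exposes both the fixed-length SHA-3 hashes and the SHAKE extendable-output functions built on the same Keccak permutation. The function name `sponge_demo` and the sample message are just made up for illustration; this only shows the "squeeze as much output as you want" property, not a vetted construction.

```python
import hashlib

def sponge_demo(message: bytes):
    """Return a fixed SHA3-256 digest plus two SHAKE128 outputs of different lengths."""
    # Fixed-length hash: absorb the message, squeeze exactly 32 bytes.
    fixed = hashlib.sha3_256(message).digest()
    # Extendable output: same absorb phase, but the caller picks the output length.
    short = hashlib.shake_128(message).digest(16)
    long_out = hashlib.shake_128(message).digest(64)
    return fixed, short, long_out

fixed, short, long_out = sponge_demo(b"hello sponge")
print(len(fixed), len(short), len(long_out))
# The longer output simply continues the shorter one, since both come
# from squeezing the same sponge state.
print(long_out[:16] == short)
```

Because the output is a stream, the same primitive can stand in for a KDF or a keystream generator, which is where the "swiss-army knife" angle comes from. For anything real you'd want the standardized variants (cSHAKE, KMAC) rather than rolling your own.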
But as it stands, if I have to pick a modern hash, I almost always pick BLAKE2 over SHA-3, primarily because BLAKE2 is dramatically faster in software and I rarely need the sponge design. Stuff like this is really important on my Cortex-M4...
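If you want to check the software-speed claim on your own box, something like the following works as a quick and dirty comparison. It's only a sketch: `time_hash` is a hypothetical helper, the 1 MiB buffer and repeat counts are arbitrary, and the numbers depend entirely on your CPU and on how your Python's hashlib was built, so I'm not quoting any results here.

```python
import hashlib
import timeit

def time_hash(name: str, data: bytes, number: int = 20) -> float:
    """Return the best observed seconds per call for hashlib.<name> over data."""
    constructor = getattr(hashlib, name)  # e.g. hashlib.blake2b, hashlib.sha3_256
    runs = timeit.repeat(lambda: constructor(data).digest(), number=number, repeat=3)
    return min(runs) / number

data = bytes(1024 * 1024)  # 1 MiB of zeros, arbitrary test input
for name in ("blake2b", "blake2s", "sha3_256"):
    print(f"{name}: {time_hash(name, data) * 1e3:.3f} ms per MiB")

# BLAKE2 also has keyed hashing built in, so a MAC needs no extra construction.
# blake2s is the variant aimed at 8- to 32-bit platforms.
tag = hashlib.blake2s(b"message", key=b"demo key, not a real secret", digest_size=16).digest()
print(tag.hex())
```

blake2s is the one to look at for small 32-bit targets like a Cortex-M4, though obviously you'd be benchmarking a C implementation on the device itself, not Python on a desktop.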