I would be surprised if Qualcomm only did reference designs after the 820. They were forced to switch to ARM's reference 64-bit design after Apple leapfrogged the other mobile chipmakers, and it really hurt them. They had a lot of trouble with the big.LITTLE architecture and power management, which led to the 810's power issues, which in turn contributed to Samsung dropping them for the S6.
With the 820, Qualcomm was able to make their own custom cores (I think they went with 2 or 4 instead of 8) and generally overcome the 810 trainwreck. Qualcomm's ability to integrate the radio, GPU, DSP, crypto chip and some other things with a custom CPU into a true SoC is a massive advantage in mobile, and I expect them to build on that with more custom designs.
One last note about big.LITTLE. It seems like a panacea - just move all the big jobs to the big processors and the little ones to the little processors, yay! But this is much, much harder in practice. Doing it wrong leads to disastrous UX; screen tearing because rendering gets shifted off the big cores is one prime example. Process scheduling is already very tough, and current schedulers are the result of many years of experimentation and heuristic best practices. Of course Apple controls the whole stack (i.e. they don't need to worry about where kernel maintainers want to go with the schedulers), and I'm sure they have some amazing engineers working on Darwin, so maybe they were able to overcome the problems here. If you can schedule correctly it does seem like a big power savings win. But (and I may be behind the times here) that is a big if...
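To make the failure mode concrete, here is a toy sketch, not any real kernel's scheduler. Everything in it is an assumption for illustration: the core speeds, the 16 ms frame budget, the thresholds, and the rule that the scheduler migrates a thread based only on the previous frame's utilization.

```python
# Toy model of a reactive big.LITTLE scheduler (all numbers are assumptions).
BIG_SPEED = 1.0         # work units per ms on a big core
LITTLE_SPEED = 0.4      # work units per ms on a little core
FRAME_BUDGET_MS = 16.0  # roughly one frame at 60 Hz

def missed_frames(frame_work, down_threshold=0.5, up_threshold=0.8):
    """Return indices of frames whose render time exceeded the budget."""
    on_big = True
    missed = []
    for i, work in enumerate(frame_work):
        speed = BIG_SPEED if on_big else LITTLE_SPEED
        elapsed = work / speed
        if elapsed > FRAME_BUDGET_MS:
            missed.append(i)
        # Reactive heuristic: decide placement for the NEXT frame
        # from how busy the thread looked during THIS frame.
        utilization = min(elapsed / FRAME_BUDGET_MS, 1.0)
        if on_big and utilization < down_threshold:
            on_big = False   # looked idle, demote to a little core
        elif not on_big and utilization > up_threshold:
            on_big = True    # looked busy, promote back to a big core
    return missed

# Light frames followed by a heavy one: the thread gets demoted just
# before the heavy frame arrives, so the heavy frame blows its deadline.
print(missed_frames([4, 4, 12, 4, 4, 12]))  # [2, 5]
```

Every one of those frames would have fit in the budget on a big core; the misses come purely from the scheduler reacting one frame too late.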
Well, at least for screen renders you can make sure render threads stay on the big cores, and with GCD you can use thread priorities to determine what to put on which core. iOS has concepts of active/inactive and is mostly a one-app-at-a-time OS. It also has well-defined background processing modes and an extension API. With all of this structure, I don't think it will be as hard as we think it will be.
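Roughly what I have in mind, as a hypothetical sketch. The class names are loosely modeled on a subset of GCD's QoS classes, but `pick_cluster` and its rule are made up for illustration; this is not a claim about how iOS actually places work.

```python
from enum import IntEnum

class QoS(IntEnum):
    # Subset of priority levels, loosely modeled on GCD's QoS classes.
    BACKGROUND = 0
    UTILITY = 1
    USER_INITIATED = 2
    USER_INTERACTIVE = 3

def pick_cluster(qos, app_is_active):
    """Hypothetical rule: only latency-sensitive work belonging to the
    foreground (active) app is allowed onto the big cores."""
    if app_is_active and qos >= QoS.USER_INITIATED:
        return "big"
    return "little"

# A render thread in the active app stays on the big cores ...
print(pick_cluster(QoS.USER_INTERACTIVE, app_is_active=True))   # big
# ... while the same priority in an inactive app does not.
print(pick_cluster(QoS.USER_INTERACTIVE, app_is_active=False))  # little
```

The point is that the OS already knows the priority and the app state up front, so it doesn't have to guess from load history alone.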
I also feel that if they stop doing custom designs they will be eaten up by lower-priced competition like Mediatek. This piece of info came from someone who claimed to work (or have worked) at Qualcomm, so I didn't dismiss it outright. We will probably know by the end of the year.