That's my bias as well. To me, it seems like every day someone releases a new AI toy, but the thing you would actually want is for a real software engineer to take the LLM or whatever, put it inside a black box, and then write actually useful software around it. Like, off the top of my head, LLM + Google Calendar = useful product for managing schedules and emailing people. You could make it in a day of tinkering as a LangChain demo, but actually making a real product that is useful and doesn't suck will require good old-fashioned software engineering.
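For concreteness, here's roughly what that day-of-tinkering demo looks like. This is a sketch, not a product: it assumes the classic (2023-era) LangChain agent interface, and `create_event` is a stub; everything it glosses over (auth, parsing, timezones, error handling) is exactly the software engineering I'm talking about.

```python
# Minimal sketch of the "day of tinkering" LangChain demo.
# Assumes the classic 2023-era LangChain agent API; create_event is a stub,
# where a real product would call the Google Calendar API with proper auth,
# retries, timezone handling, etc.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.llms import OpenAI

def create_event(description: str) -> str:
    # Stub: a real version would parse the description and hit the Calendar API.
    return f"Created event: {description}"

tools = [
    Tool(
        name="create_calendar_event",
        func=create_event,
        description="Create a calendar event from a plain-English description.",
    ),
]

agent = initialize_agent(
    tools,
    OpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
)

agent.run("Set up lunch with Sam next Friday at noon and remind me Thursday.")
```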
Based on the multitask generalisation capabilities LLMs have shown so far, I'm kinda in the opposite camp - if we can figure out more data-efficient and reliable architectures, base language models will likely be enough to do just about anything and take general instructions. Like, you can just tell the language model to directly operate on Google Calendar with suitably supplied permissions and it can do it, no integration needed.
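To illustrate the "no integration" idea (a sketch under assumptions, not a working product): give the model nothing but an OAuth token and a single generic HTTP tool, and let it compose the Google Calendar REST calls itself. This uses the 2023-era openai "functions" API; the token is assumed to come from Google's OAuth flow with calendar scope, and the loop is simplified.

```python
# Sketch: the model's only tool is a generic HTTP request function; it decides
# which Google Calendar v3 REST endpoints to hit on its own. Assumes a valid
# OAuth token with calendar scope and the 2023-era openai "functions" API.
import json
import openai
import requests

OAUTH_TOKEN = "<token from Google's OAuth consent flow>"  # assumed, not shown

def http_request(method: str, url: str, body: dict | None = None) -> str:
    """Generic authenticated HTTP tool; the model picks method, URL, and body."""
    r = requests.request(
        method, url,
        headers={"Authorization": f"Bearer {OAUTH_TOKEN}"},
        json=body,
    )
    return r.text

functions = [{
    "name": "http_request",
    "description": "Make an authenticated HTTP request to the Google Calendar v3 API.",
    "parameters": {
        "type": "object",
        "properties": {
            "method": {"type": "string", "enum": ["GET", "POST", "PATCH", "DELETE"]},
            "url": {"type": "string"},
            "body": {"type": "object"},
        },
        "required": ["method", "url"],
    },
}]

messages = [
    {"role": "system",
     "content": "You manage the user's Google Calendar via its v3 REST API "
                "(base URL https://www.googleapis.com/calendar/v3)."},
    {"role": "user", "content": "Move my 3pm meeting tomorrow to 4pm."},
]

for _ in range(10):  # cap the tool-use loop for safety
    resp = openai.ChatCompletion.create(
        model="gpt-4", messages=messages, functions=functions)
    msg = resp.choices[0].message
    if not msg.get("function_call"):
        print(msg["content"])  # final natural-language answer
        break
    args = json.loads(msg["function_call"]["arguments"])
    result = http_request(args["method"], args["url"], args.get("body"))
    messages.append(msg)
    messages.append({"role": "function", "name": "http_request", "content": result})
```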
Exactly this. There is a reasonable chance the GUI goes the way of the dodo and some large (75% or something) percentage of tasks are done just by typing (or speaking) in natural language and the response is words and very simple visual elements.
People are building toy demos in a day that are not actual usable products. It's cool, but it's the difference between "I made a Twitter clone in a weekend" and the real Twitter.
1 - companies are deploying real products internally for productivity, especially for technical and customer support, and in data science, letting internal people query their data warehouse in natural language (roughly the pattern sketched after this comment). I know of 2 very large companies with the first in production and 1 with the second, and those are just the ones I'm aware of.
2 - you are conflating the problem of engineering a system to do a thing for billions of users (an incredibly rare situation requiring herculean effort regardless of the underlying product) with the ability of a technology to do a thing. The above-mentioned systems couldn't handle billions of users. So what? The vast majority of useful enterprise SaaS could not handle a billion users.
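The natural-language warehouse querying in point 1 boils down to a simple pattern. A rough sketch, with an illustrative schema and the 2023-era openai API (the real deployments obviously add access control, validation, and a real warehouse instead of SQLite):

```python
# Rough sketch of natural-language-to-SQL over a warehouse: hand the model the
# schema, ask for one SQL statement, run it read-only. Schema and table names
# are made up for illustration.
import sqlite3
import openai

SCHEMA = "orders(id, customer_id, total, created_at); customers(id, name, region)"

def ask_warehouse(question: str) -> list:
    prompt = (
        f"Schema: {SCHEMA}\n"
        f"Write one SQLite SELECT statement that answers: {question}\n"
        "Return only the SQL, no explanation."
    )
    resp = openai.ChatCompletion.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}])
    sql = resp.choices[0].message["content"].strip().strip("`")
    # Open the database read-only so a bad generation can't modify anything.
    conn = sqlite3.connect("file:warehouse.db?mode=ro", uri=True)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

print(ask_warehouse("Total order value by region last quarter?"))
```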
From a research point of view, OpenAI hasn't really had any "big innovations". At least, I struggle to think of any published research of theirs that would qualify. Probably they keep the good stuff for themselves.
But Ilya definitely had some big papers before and he is widely acknowledged as a top researcher in the field.
I think the fact that there are no other publicly available systems comparable to GPT-4 (and I don't think Bard is as good) points to innovation they haven't released.