
I'm quite surprised at how little progress FAANG companies have made in recent years, as I believe much of what's happening now with ChatGPT was predictable. Here's a slide from a deck for a product I was developing in 2017: https://twitter.com/LechMazur/status/1644093407357202434/pho.... Its main function was to serve as a writing assistant.


Scaling up an LM from 2017 would not achieve what GPT-4 does. It's nowhere near that simple. Of course companies saw the potential of natural language interfaces; billions have been spent on them over the years, and a lot of progress was made before ChatGPT came along.


You're making incorrect assumptions. This project wasn't about scaling any published approaches. It was original neural net research that produced excellent results with a new architecture without self-attention, a new optimizer, new regularization and augmentation ideas, sparsity, plus some NLP feature engineering, etc. Scaling it up to GPT-2 size matched GPT-2's performance for English (my project was English-only and bidirectional, unlike GPT, so it's not a perfect comparison), and very likely scaling it up to GPT-3 size would have matched that as well, since GPT-3 wasn't much of an improvement over GPT-2 besides scale. It's unclear for GPT-4, since very little is known about it. Of course, in the meantime, most of these ideas are no longer SOTA, and there has been a ton of progress in GPU hardware and frameworks like PyTorch/TF.

You can check out my melodies project from a year ago as a current example. There is nothing matching it yet: https://www.youtube.com/playlist?list=PLoCzMRqh5SkFPG0-RIAR8.... And that's just my personal project.

What you're saying about companies recognizing the commercial potential is clearly wrong. It's six years later, and Siri, Alexa, and Google Home are still nearly as dumb as they were back then. Microsoft is only now working on adding a writing assistant to Word, and that's thanks to OpenAI. Why do you think Google had to declare a "code red" if they saw the potential? Low-budget startups are also very slow - they should've had their products out when the GPT-3 API was published, not now.

One thing I didn't expect is how well this same approach would work for code. I haven't even tried to do it.


Do you have any publications to back up your claims about your work? They seem more than a bit grandiose. If your ideas are as novel and useful as you say, then you should publish them.

And I'm sorry, but you're completely wrong about companies recognizing commercial potential. I worked on Alexa for five years; it is a far harder problem than you think. It is nowhere near as simple as "we just weren't looking at the right NN architecture or optimizer!" You're acting like it was a novel idea to think LMs would be extremely useful if the performance were better (in 2017). I'm just trying to tell you that isn't the case.


No, I have no plans to openly publish any of it. Some of my researcher employees have published their own work. I've previously written about how it was a huge commercial mistake for Google and others to openly publish their research, and that they should stop. Indeed, OpenAI has now declined to publish a meaningful GPT-4 paper, and DeepMind has also become more cautious. This mistake has cost them billions, and for what? Recruiting? Now they've lost many people to OpenAI and wasted time and effort on publishing. Publishing is fine for university researchers or those looking to get hired. I did record some research ideas in an encrypted format on Twitter for posterity: https://twitter.com/LechMazur/status/1534481734598811650.

If any of the FAANG companies recognized the commercial potential and still accomplished so little, they must be entirely incompetent. When this 2017 deck was created, I had 50k LOC (fewer would be needed now using current frameworks and libraries) plus Word and Chrome plugins. The inference was still too slow and not quite feature-complete, and it was just a writing assistant with several other features in early testing, but that seems more than enough for me to know quite well how difficult the task is.


The fact that you think creating a writing assistant plugged into Word is equivalent to building a general purpose, always-on voice assistant tells me all I need to know.


What? We were talking about making a language model. I mentioned the plugins in relation to the question of commercializing. I'm very clear about what my project was and was not doing. I get that you're bitter because Alexa became a joke with how little progress was made and the struggles of Amazon in getting the top talent are well known. How much did it cost Amazon to fail like this? Is the whole team gone? Is that the billions you mentioned?


"It's six years later and Siri, Alexa, and Google Home are still nearly as dumb as they were back then". You can't even keep a coherent discussion, and you are delusional about the significance of your work. You shared a slide with nothing but generic pie-in-the-sky use cases, and you act like it gives you some credibility on the subject ("let's make an AI system that can do the work of your non-professional employees!"). And to top it off, you act like you've been successful here! Again, you shared nothing but a slide with generic use cases that a 12-year-old could think up. I don't know what you think you proved. Enjoy your imaginary pedestal.



