> I am yet to see any reasoned argument for why it is far more difficult and will take far longer.
Language models specifically are trained on data, and historically they have been improved by increasing the size of the model (number of parameters) and the amount and/or quality of the training data.
We are basically out of new, non-synthetic text to train models on, and it’s extremely hard to come up with a novel architecture that performs well against transformers.
Those are some simple reasons why it will be far more difficult to improve general language models.
There are also papers showing that training models on synthetic data can cause “model collapse”, which greatly reduces output quality by magnifying errors already present in the model, so it’s not a problem we can easily sidestep.
It’s an easy mistake to see something like ChatGPT not exist, then suddenly exist, and assume a major breakthrough happened. Behind the scenes there have been something like 50 years of R&D that led to it; it’s not as if there was a sudden breakthrough and now the gates are open.
A general intelligence for CS is like the elixir of life for medicine.
Even assuming there is a ton of data that companies are only now getting access to, the logarithmic curve of LLM improvements is clearly visible (granted, our LLM evaluation frameworks are not very good).
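
To make the "logarithmic curve" point concrete, here is a rough sketch of what that shape implies. The `score` function and the coefficients `a` and `b` are made up for illustration and are not fitted to any real benchmark; the only assumption is that quality grows with the log of training tokens.

```python
import math

# Hypothetical model of LLM quality: score grows with log10 of training tokens.
# The coefficients a and b are illustrative assumptions, not measured values.
def score(tokens, a=0.0, b=10.0):
    return a + b * math.log10(tokens)

# Inverse of score(): how many tokens are needed to reach a target score.
def tokens_needed(target, a=0.0, b=10.0):
    return 10 ** ((target - a) / b)

# Every 10x increase in data buys the same fixed gain of b points.
for tokens in (1e9, 1e10, 1e11, 1e12):
    gain = score(tokens * 10) - score(tokens)
    print(f"{tokens:.0e} -> {tokens * 10:.0e} tokens: gain {gain:.1f} points")
```

Under that assumption every extra `b` points costs ten times more data than the previous `b` points did, which is why running out of fresh text hurts so much.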