
I mean papers come with abstracts, but yeah:

> OpenAI

> We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. We apply our method to a version of the TL;DR dataset of Reddit posts and find that our models significantly outperform both human reference summaries and much larger models fine-tuned with supervised learning alone.

tl;dr^2: they did ChatGPT to summaries

https://openai.com/blog/chatgpt#methods
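
For the curious, the core of that pipeline is a pairwise comparison loss on the reward model. A minimal sketch in PyTorch; score_fn and the shapes here are my assumptions, not OpenAI's actual code:

    import torch
    import torch.nn.functional as F

    def reward_model_loss(score_fn, post, summary_a, summary_b, prefers_a):
        # score_fn(post, summary) -> scalar reward tensor
        # prefers_a: 1.0 if the human preferred summary_a, else 0.0
        r_a = score_fn(post, summary_a)
        r_b = score_fn(post, summary_b)
        target = torch.as_tensor(prefers_a, dtype=r_a.dtype)
        # Bradley-Terry style: P(a preferred) = sigmoid(r_a - r_b),
        # so the loss pushes the human-preferred summary's score higher.
        return F.binary_cross_entropy_with_logits(r_a - r_b, target)

The trained reward model then scores whole summaries during the RL fine-tuning step (PPO in the paper).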



Your tl;dr shows how text+reader_context can generate the best summaries. Those 5 words are perfect if you know what they did for ChatGPT.

This makes me think that to get high-quality summaries: 1) they have to be generated for each individual reader, and 2) the AI should know what the reader knows.

You achieved 2) by imagining what HN readers may already know, but the ultimate goal would be to know what the individual reader knows.
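
A toy version of 2), assuming some instruction-following model behind a hypothetical complete(prompt) function:

    def personalized_summary(text, reader_profile, complete):
        # reader_profile: a short description of what this reader
        # already knows; complete: any text-completion function.
        prompt = (
            f"Reader background: {reader_profile}\n\n"
            "Summarize the following for this specific reader, "
            "skipping anything they already know:\n\n"
            f"{text}"
        )
        return complete(prompt)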

And this chain of thought leads to:
- Perhaps all AI output should be generated on the fly for the end user, with full (relevant and compressed) context.
- Giving an AI what we know is extremely dangerous if someone wants to use it for something bad (so we really, really need local AI).

Sure, these are all things people in the industry already know, but there must be a lot of people like me who are just now thinking about it.

A note: the "^2" adds humor to keep the reading light at the cost of very few bytes. Perhaps the machine could do this too, if the reader wants it.

Few things make me as excited, and sometimes as sad, as some of these AI developments. But the excitement wins for now!


Yeah, I think context is really important for future summarizers. If you think of it in information-theory terms, the question is which information to convey _for the message to be successfully decoded_. So the amount of information needed is not universal; it depends on the decoder (here, the reader).
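
As a toy illustration of that framing (the probabilities and the per-fact independence are my assumptions, purely for intuition):

    import math

    def bits(p_reader_knows):
        # Surprisal of a fact for this reader: -log2(P(already known)).
        return -math.log2(p_reader_knows)

    # The same fact costs different amounts for different decoders:
    print(bits(0.9))   # ~0.15 bits for a reader who already knows RLHF
    print(bits(0.01))  # ~6.6 bits for a reader seeing it the first time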

Being able to generate summaries at various levels of depth would make consuming a lot of content far more efficient. No more skimming through articles and books written as if the author were paid by the character. But if you want more depth, it's there. Like an LoD slider for information.
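
A trivial sketch of that slider, again assuming a hypothetical complete(prompt) function:

    DEPTH = {
        0: "one sentence",
        1: "a short paragraph",
        2: "a detailed outline, keeping every key argument",
    }

    def summarize_at(text, level, complete):
        # Same source text, different target resolution: the LoD slider.
        return complete(f"Summarize the following in {DEPTH[level]}:\n\n{text}")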



