I'm not clear on how a conversation block is turned into one (or many?) samples. Is the first message in the block the input, and the remaining messages prefixed with sender names and concatenated as output? I know the code is all there, but instead of picking it apart I would have preferred a more complete example mapping a block to a sample, because I don't have a mental model of how LLMs learn from context. On the one hand I doubt individual input/output pairs from just two messages contain enough context, but I would have imagined that inserting names and concatenating multiple messages would be equally misleading. Does the model generate an entire conversation from a single prompt, which is split before being sent by the chat app?
Yeah, I wished I could have included more but I didn't have the fortitude to redact larger blocks of the chat db.
For training, I created many samples that looked like this: I take n messages from the database, pop off the nth one, and use the text of that last message as the "output", then specify in the "instruction" who the sender of that message is. I provide the remaining messages in order as the "input", so the model learns what to say in certain situations, based on who is speaking.
{
"instruction": "Your name is Izzy. You are in a group chat with 5 of your best friends: Harvey, Henry, Wyatt, Kiebs, Luke. You all went to college together. You talk to each other with no filter, and are encouraged to curse, say amusingly inappropriate things, or be extremely rude. Everything is in good fun, so remember to joke and laugh, and be funny.. You will be presented with the most recent messages in the group chat. Write a response to the conversation as Izzy.",
"input": "Izzy: im writin a blog post about the robo boys project\nIzzy: gotta redact tbis data HEAVILY\nKiebs: yeah VERY heavily please!\nKiebs: of utmost importance!",
"output": "yeah don't worry i will i will"
}
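To make the recipe concrete, here's a minimal sketch of how one such sample could be built from a window of messages. The `sender`/`text` field names and the `make_sample` helper are my own assumptions for illustration, not the project's actual schema or code.

```python
# Hypothetical sketch: build one instruction-tuning sample from a window
# of n chat messages. Field names ("sender", "text") are assumptions.

PROMPT = (
    "Your name is {name}. You are in a group chat with 5 of your best "
    "friends: Harvey, Henry, Wyatt, Kiebs, Luke. You all went to college "
    "together. You talk to each other with no filter, and are encouraged "
    "to curse, say amusingly inappropriate things, or be extremely rude. "
    "Everything is in good fun, so remember to joke and laugh, and be "
    "funny. You will be presented with the most recent messages in the "
    "group chat. Write a response to the conversation as {name}."
)

def make_sample(messages, n):
    """Take the first n messages; the nth becomes the output, the rest
    become the input, and the instruction names the nth sender."""
    window = messages[:n]            # copy, so the source list is untouched
    last = window.pop()              # the message the model must learn to write
    context = "\n".join(f"{m['sender']}: {m['text']}" for m in window)
    return {
        "instruction": PROMPT.format(name=last["sender"]),
        "input": context,
        "output": last["text"],
    }

msgs = [
    {"sender": "Izzy", "text": "im writin a blog post about the robo boys project"},
    {"sender": "Izzy", "text": "gotta redact tbis data HEAVILY"},
    {"sender": "Kiebs", "text": "yeah VERY heavily please!"},
    {"sender": "Kiebs", "text": "of utmost importance!"},
    {"sender": "Izzy", "text": "yeah don't worry i will i will"},
]
sample = make_sample(msgs, 5)  # reproduces the JSON example above
```

Sliding this window over the whole chat history yields one sample per message, each conditioned on the messages before it.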
So yes, the model does generate an entire conversation from a single prompt. In the generation code, however, I have some logic that decides whether it should generate completions based on just the user-provided prompt, or whether it should also include some "context" from the previous messages in the conversation. You can see this here: https://gist.github.com/izzymiller/2ea987b90e6c96a005cb9026b...
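The generation side might look roughly like this: decide whether to prepend recent history to the prompt, then split the multi-message completion back into per-sender messages for the chat app. The function names, the context threshold, and the line-splitting convention are assumptions for illustration, not the gist's actual code.

```python
# Hypothetical sketch of the generation-time logic (not the gist's code).

def build_prompt(user_message, history, max_context=4):
    """Condition on just the user's message, or also on recent history."""
    if not history:
        return user_message                      # no context available
    recent = history[-max_context:] + [user_message]
    return "\n".join(recent)                     # include recent messages

def split_completion(completion):
    """Split a generated multi-message completion into (sender, text)
    pairs, assuming each generated line looks like 'Sender: text'."""
    messages = []
    for line in completion.splitlines():
        if ": " in line:
            sender, text = line.split(": ", 1)
            messages.append((sender.strip(), text))
    return messages

prompt = build_prompt("Izzy: what should we do tonight", [])
parts = split_completion("Harvey: idk man\nKiebs: bar?")
```

Splitting on the `Sender:` prefix is what lets one completion arrive in the chat as several separate messages.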