I'm not clear on how a conversation block is turned into one (or many?) samples. Is the first message in the block the input, and the remaining messages prefixed with sender names and concatenated as output? I know the code is all there, but instead of picking it apart I would have preferred a more complete example mapping a block to a sample, because I don't have a mental model of how LLMs learn from context. On the one hand I doubt individual input/output pairs from just two messages contain enough context, but I would have imagined that inserting names and concatenating multiple messages would be equally misleading. Does the model generate an entire conversation from a single prompt, which is split before being sent by the chat app?
Yeah, I wished I could have included more but I didn't have the fortitude to redact larger blocks of the chat db.
For training, I created many samples that looked like this: I take n messages from the database, pop off the nth one, and use the text of that last message as the "output", then specify in the "instruction" who the sender of that message is. I provide the remaining messages in order as the "input", so the model learns what to say in certain situations, based on who is speaking.
{
"instruction": "Your name is Izzy. You are in a group chat with 5 of your best friends: Harvey, Henry, Wyatt, Kiebs, Luke. You all went to college together. You talk to each other with no filter, and are encouraged to curse, say amusingly inappropriate things, or be extremely rude. Everything is in good fun, so remember to joke and laugh, and be funny.. You will be presented with the most recent messages in the group chat. Write a response to the conversation as Izzy.",
"input": "Izzy: im writin a blog post about the robo boys project\nIzzy: gotta redact tbis data HEAVILY\nKiebs: yeah VERY heavily please!\nKiebs: of utmost importance!",
"output": "yeah don't worry i will i will"
}
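To make the recipe concrete, here's a minimal sketch of how one such sample could be built from a window of messages. The `sender`/`text` field names and the `make_sample` helper are my own assumptions for illustration, not the project's actual schema or code.

```python
# Hypothetical sketch: build one instruction-tuning sample from a window
# of n chat messages. Field names ("sender", "text") are assumptions.

PROMPT = (
    "Your name is {name}. You are in a group chat with 5 of your best "
    "friends: Harvey, Henry, Wyatt, Kiebs, Luke. You all went to college "
    "together. You talk to each other with no filter, and are encouraged "
    "to curse, say amusingly inappropriate things, or be extremely rude. "
    "Everything is in good fun, so remember to joke and laugh, and be "
    "funny. You will be presented with the most recent messages in the "
    "group chat. Write a response to the conversation as {name}."
)

def make_sample(messages, n):
    """Take the first n messages; the nth becomes the output, the rest
    become the input, and the instruction names the nth sender."""
    window = messages[:n]            # copy, so the source list is untouched
    last = window.pop()              # the message the model must learn to write
    context = "\n".join(f"{m['sender']}: {m['text']}" for m in window)
    return {
        "instruction": PROMPT.format(name=last["sender"]),
        "input": context,
        "output": last["text"],
    }

msgs = [
    {"sender": "Izzy", "text": "im writin a blog post about the robo boys project"},
    {"sender": "Izzy", "text": "gotta redact tbis data HEAVILY"},
    {"sender": "Kiebs", "text": "yeah VERY heavily please!"},
    {"sender": "Kiebs", "text": "of utmost importance!"},
    {"sender": "Izzy", "text": "yeah don't worry i will i will"},
]
sample = make_sample(msgs, 5)  # reproduces the JSON example above
```

Sliding this window over the whole chat history yields one sample per message, each conditioned on the messages before it.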
So yes, the model does generate an entire conversation from a single prompt. In the generation code, however, I have some logic that decides whether it should generate completions based on just the user-provided prompt, or whether it should also include some "context" from the previous messages in the conversation. You can see this here: https://gist.github.com/izzymiller/2ea987b90e6c96a005cb9026b...
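The generation side might look roughly like this: decide whether to prepend recent history to the prompt, then split the multi-message completion back into per-sender messages for the chat app. The function names, the context threshold, and the line-splitting convention are assumptions for illustration, not the gist's actual code.

```python
# Hypothetical sketch of the generation-time logic (not the gist's code).

def build_prompt(user_message, history, max_context=4):
    """Condition on just the user's message, or also on recent history."""
    if not history:
        return user_message                      # no context available
    recent = history[-max_context:] + [user_message]
    return "\n".join(recent)                     # include recent messages

def split_completion(completion):
    """Split a generated multi-message completion into (sender, text)
    pairs, assuming each generated line looks like 'Sender: text'."""
    messages = []
    for line in completion.splitlines():
        if ": " in line:
            sender, text = line.split(": ", 1)
            messages.append((sender.strip(), text))
    return messages

prompt = build_prompt("Izzy: what should we do tonight", [])
parts = split_completion("Harvey: idk man\nKiebs: bar?")
```

Splitting on the `Sender:` prefix is what lets one completion arrive in the chat as several separate messages.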