From what I understand, this seems useful if you have a model that accepts a large or unlimited number of tokens. I was looking into doing the same thing with ChatGPT and went with ada to find snippets related to the prompt, then included those snippets in the prompt to ChatGPT: https://bbarrows.com/posts/using-embeddings-ada-and-chatgpt-...
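For anyone curious what that looks like in practice: the retrieval step is basically cosine similarity between the prompt's embedding and each snippet's embedding, then you prepend the top matches to the prompt. Here's a minimal offline sketch — `embed` is a hypothetical stand-in for a real embeddings API call (e.g. ada); in real use you'd cache the snippet embeddings instead of recomputing them per query:

```python
import math

def embed(text):
    # Hypothetical stand-in for a real embeddings API (e.g. ada).
    # Uses a toy bag-of-letters vector so the example runs offline.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_snippets(query, snippets, k=2):
    # Rank stored snippets by similarity to the query embedding;
    # the top-k are what you'd prepend to the ChatGPT prompt.
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]

docs = [
    "how to reset a password",
    "recipe for banana bread",
    "password rules",
]
print(top_snippets("password reset", docs, k=2))
```

With a real embeddings model the ranking is semantic rather than letter-frequency-based, but the surrounding plumbing is the same.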
The help page says you have to select the 8k model to get the 8k context. If there's no UI for that, then I guess it's API-only. The 32k one is being rolled out separately; I think you have to sign up for access to that one.
Does ChatGPT with GPT-4 accept more tokens now, maybe?