Not directly related to what Ollama aims to achieve, but I'll ask nevertheless.

Local LLMs are great! But they'd be even more useful once we can _easily_ point them at our own data to use as a reference, or even as a source of truth. This is where local models open doors that a closed system like OpenAI cannot - I'm never going to upload my data to ChatGPT for them to train on.

Could Ollama make it easier and standardize the way to add documents to local LLMs?

I'm not talking about uploading one image or document and asking a question about it. I'm referring to pointing an LLM at a repository of 1,000 text files and asking it questions based on their contents.



For now, RAG is the best "hack" to achieve this at very low cost, since it doesn't require any fine-tuning.

I've implemented a RAG library, if you're ever interested, but they're a dime a dozen now :)

https://www.github.com/jerpint/buster


Sounds like Retrieval-Augmented Generation (RAG). This is the technique used by most customized chatbots.


I don’t know if Ollama can do this but https://gpt4all.io/ can.


Basically, I want to do what this product does, but locally with a model running on Ollama. https://www.zenfetch.com/


Hey - Akash from Zenfetch here. We’ve actually tested some of our features with local models and have found that they significantly underperform compared to hosted models. With that said, we are actively working on new approaches to offer a local version of Zenfetch.

In the meantime, we do have agreements in place with all of our AI providers to ensure none of our users' information is used for training or any other purpose. Hope that helps!


Hey. Congratulations on your product. I’m guessing it will be greatly useful for your target audience.

I don't have a serious need that I'd consider worth paying for, so I'm probably not in your target audience. I wanted to do this for a personal use case.

Throw all my personal documents at a local model and ask very personal questions like “the investment I made on that thing in 2010, how did I do against this other thing?” Or “from my online activity, when did I start focusing on this X tech?” Or even “find me that receipt/invoice from that ebike I purchased in 2021 and the insurance I took out on it”.

There is no way I'm taking a cloud product at its word and uploading all my personal documents to it. Hence my question about the ability to do this locally - slow is perfectly fine for my cheap need :-)


Makes a lot of sense. This might work for your use case: https://khoj.dev/. It's local, free, and open-source.


Interactive smart knowledge bases are such a massively cool direction for LLMs. I saw Chat with RTX at the NVIDIA preview at CES, and it's mind-blowingly simple and cool to use. I believe interactive search in limited domains is going to be massive for LLMs.


Ooh, I want this too.


LlamaIndex basically does that. There are even some tutorials using Streamlit that create a UI around it for you.


> I’m never going to upload some data to ChatGPT for them to train on.

If you use the API, they do not train on it.

(However, that doesn't mean they don't retain it for a while.)

As others have said, RAG is probably the way to go - although I don't know how well RAG performs on local LLMs.


Data is the new oil.

You can be 100% sure that OpenAI will do whatever they want whenever they want with any and every little bit of data that you upload to them.

With GPTs and their Embeddings endpoint, they encourage you to upload your own data en masse.


There are two main ways to "add documents to LLMs": using documents in retrieval-augmented generation (RAG), and training/fine-tuning models. I believe you can use RAG with Ollama; however, Ollama doesn't do the training of models.
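To make the RAG half concrete, here's a minimal sketch in Python. The bag-of-words "embedding" and the sample chunks are stand-ins for a real embedding model and document store; the assembled prompt would then be sent to a local model, e.g. through Ollama's completion API.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(question, chunks, k=2):
    """Retrieve the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, chunks):
    """Stuff the retrieved chunks into a completion prompt."""
    context = "\n---\n".join(top_chunks(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "The e-bike was purchased in 2021 with full insurance.",
    "Grocery receipts from March 2020.",
    "Investment notes on index funds, 2010.",
]
prompt = build_prompt("When did I buy the e-bike?", chunks)
# `prompt` is then sent to a local model, e.g. via Ollama's completion endpoint.
```

A real setup would swap `embed` for a proper embedding model and store the vectors in an index, but the retrieve-then-prompt shape stays the same.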


You can "use RAG" with Ollama, in the sense that you can put RAG chunks into a completion prompt.

To index documents for RAG, Ollama also offers an embeddings endpoint where you can use LLMs to generate embeddings; however, AFAIK that is very inefficient. You'd usually want a much smaller embedding model like Jina v2[0], which is currently not supported by Ollama[1].

[0]: https://huggingface.co/jinaai/jina-embeddings-v2-base-en

[1]: https://github.com/ollama/ollama/issues/327
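As a sketch of how that indexing step can look against a local Ollama server (the model name `nomic-embed-text` and the chunk sizes are assumptions; substitute whatever your Ollama install provides, and note the network call only works with `ollama serve` running):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama port

def chunk(text, size=500, overlap=50):
    """Split text into overlapping character windows before embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed_with_ollama(text, model="nomic-embed-text"):
    """Request an embedding for one chunk from a locally running Ollama server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps({"model": model, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

pieces = chunk("some long document text " * 100)
# vectors = [embed_with_ollama(p) for p in pieces]  # needs a running Ollama server
```

The resulting vectors would go into whatever vector store you use for retrieval; the inefficiency mentioned above is in using a full LLM for the embedding pass rather than a small dedicated embedding model.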


Maybe take a look at this? https://github.com/imartinez/privateGPT

It's meant to do exactly what you want. I've had mixed results.



