Not directly related to what Ollama aims to achieve, but I'll ask nevertheless.

Local LLMs are great! But they'd be even more useful once we can _easily_ point them at our own data to use as a reference, or even as a source of truth. This is where local models open doors that a closed system like OpenAI cannot - I'm never going to upload my data to ChatGPT for them to train on.

Could Ollama make it easier and standardize the way to add documents to local LLMs?

I'm not talking about uploading one image or document and asking a question about it. I'm referring to pointing an LLM at a repository of 1,000 text files and asking it questions based on their contents.



For now, RAG is the best "hack" to achieve this at very low cost, since it doesn't require any fine-tuning.

I've implemented a RAG library, if you're ever interested, but they're a dime a dozen now :)

https://www.github.com/jerpint/buster


Sounds like Retrieval-Augmented Generation (RAG). This is the technique used by most customized chatbots.


I don’t know if Ollama can do this but https://gpt4all.io/ can.


Basically, I want to do what this product does, but locally with a model running on Ollama. https://www.zenfetch.com/


Hey - Akash from Zenfetch here. We’ve actually tested some of our features with local models and have found that they significantly underperform compared to hosted models. With that said, we are actively working on new approaches to offer a local version of Zenfetch.

In the meantime, we do have agreements in place with all of our AI providers to ensure none of our users' information is used for training or any other purpose. Hope that helps!


Hey. Congratulations on your product. I’m guessing it will be greatly useful for your target audience.

I don't have a serious need that I'd consider worth paying for, so I'm probably not in your target audience. I wanted to do this for a personal use case.

Throw all my personal documents at a local model and ask very personal questions like “the investment I made on that thing in 2010, how did I do against this other thing?” Or “from my online activity, when did I start focusing on this X tech?” Or even “find me that receipt/invoice from that ebike I purchased in 2021 and the insurance I took out on it”.

There is no way I'm taking a cloud product at its word and uploading all my personal documents to it. Hence my question about the ability to do this locally - slow is perfectly fine for my cheap need :-)


Makes a lot of sense. This might work for your use case: https://khoj.dev/. It's local, free, and open-source.


Interactive smart knowledge bases are such a massively cool direction for LLMs. I saw Chat with RTX at the NVIDIA preview at CES, and it's mind-blowingly simple and cool to use. I believe interactive search in limited domains is going to be massive for LLMs.


Ooh, I want this too.


LlamaIndex basically does that. There are even some tutorials using Streamlit that create a UI around it for you.


> I’m never going to upload some data to ChatGPT for them to train on.

If you use the API, they do not train on it.

(However, that doesn't mean they don't retain it for a while.)

As others have said, RAG is probably the way to go - although I don't know how well RAG performs on local LLMs.


Data is the new oil.

You can be 100% sure that OpenAI will do whatever they want whenever they want with any and every little bit of data that you upload to them.

With GPTs and their Embeddings endpoint, they encourage you to upload your own data en masse.


There are two main ways to "add documents to LLMs": using documents in retrieval-augmented generation (RAG), and training/fine-tuning models. I believe you can use RAG with Ollama; however, Ollama doesn't do the training of models.
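To make the RAG half concrete, here's a minimal sketch in Python. The bag-of-words "embedding" and the sample chunks are stand-ins for a real embedding model and document store; the assembled prompt would then be sent to a local model, e.g. through Ollama's completion API.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(question, chunks, k=2):
    """Retrieve the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, chunks):
    """Stuff the retrieved chunks into a completion prompt."""
    context = "\n---\n".join(top_chunks(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "The e-bike was purchased in 2021 with full insurance.",
    "Grocery receipts from March 2020.",
    "Investment notes on index funds, 2010.",
]
prompt = build_prompt("When did I buy the e-bike?", chunks)
# `prompt` is then sent to a local model, e.g. via Ollama's completion endpoint.
```

A real setup would swap `embed` for a proper embedding model and store the vectors in an index, but the retrieve-then-prompt shape stays the same.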


You can "use RAG" with Ollama, in the sense that you can put RAG chunks into a completion prompt.

To index documents for RAG, Ollama also offers an embeddings endpoint where you can use LLMs to generate embeddings; however, AFAIK that is very inefficient. You'd usually want a much smaller embedding model like Jina v2[0], which is currently not supported by Ollama[1].

[0]: https://huggingface.co/jinaai/jina-embeddings-v2-base-en

[1]: https://github.com/ollama/ollama/issues/327
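As a sketch of how that indexing step can look against a local Ollama server (the model name `nomic-embed-text` and the chunk sizes are assumptions; substitute whatever your Ollama install provides, and note the network call only works with `ollama serve` running):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama port

def chunk(text, size=500, overlap=50):
    """Split text into overlapping character windows before embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed_with_ollama(text, model="nomic-embed-text"):
    """Request an embedding for one chunk from a locally running Ollama server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps({"model": model, "prompt": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

pieces = chunk("some long document text " * 100)
# vectors = [embed_with_ollama(p) for p in pieces]  # needs a running Ollama server
```

The resulting vectors would go into whatever vector store you use for retrieval; the inefficiency mentioned above is in using a full LLM for the embedding pass rather than a small dedicated embedding model.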


Maybe take a look at this? https://github.com/imartinez/privateGPT

It's meant to do exactly what you want. I've had mixed results.



