For coding use cases you may want a way to search for symbols themselves or do a plain text exact match for the name of a symbol to find the relevant documents to include. There is more to searching than building a basic similarity search.
Sorry, but who mentioned coding as a use case? My comment was general and not specific to coding, and I don't understand where you got the idea that I am arguing that a similarity search engine would be a substitute for a symbol search engine, or that symbol search is inferior to similarity search. Please don't put words into my mouth. My question was genuine and made no presumptions.
Even with the coding use-case you would still likely want to build a similarity search engine because searching through plain symbols isn't enough to build a contextual understanding of higher-level concepts in the code.
I mentioned coding as a use case in my comment you replied to. You were asking for an example for when one wouldn't use vector search and I provided one. I did not say similarity search would be a substitute. I said that for the coding case you do not need it.
>you would still likely want to build a similarity search engine
In practice, tools like Claude Code, Codex, Gemini, Kimi Code, etc. are getting away with searching for code with grep / find and understanding code by loading a sufficient amount of it into the context window. That is sufficient to understand higher-level concepts in the code. Maintaining a vector database on top of this is not free: it is extra complexity you have to carry.
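To make the grep-style approach concrete, here is a minimal sketch of an exact, whole-word symbol search over a source tree. It is not how any of the tools above are implemented; `find_symbol` and its parameters are hypothetical names for illustration, and it assumes the symbol is a plain identifier.

```python
import re
from pathlib import Path


def find_symbol(root, symbol, suffixes=(".py",)):
    """Return (path, line_number, line) for every whole-word match of `symbol`.

    Hypothetical helper: a plain-text search, no index and no embeddings.
    """
    # \b keeps "parse_config" from matching inside "parse_config_file".
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Something like `find_symbol("src", "parse_config")` gives the file and line of every definition and call site, which is often enough to decide what to load into the context next.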
In your point you said "There is more to searching than building a basic similarity search." which assumed and implied all kinds of things and which was completely unnecessary.
> In practice tools like Claude Code, Codex, Gemini, Kimi Code, etc are getting away with searching for code with grep / find and understanding code by loading a sufficient amount of code into the context window
"Getting away" is the formulation I would use as well. "Sufficient amount", OTOH, is arguable and subjective. What suffices in one case does not in another, so how sufficient it really is depends on the usage patterns, e.g. the type and size of the codebases and the actual queries asked.
The crux of the problem is how much of the codebase, and which parts, you load into the context without blowing it up, while still leaving the model able to reason about the codebase correctly.
And I find it hard to argue that building a vector database would not help with exactly that problem.
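As a rough sketch of that selection problem: given chunks that already carry a relevance score (from grep hits, vector similarity, or anything else), greedily keep the best ones that still fit a token budget. `pack_context` is a hypothetical name, and the four-characters-per-token estimate is an assumption, not a real tokenizer.

```python
def pack_context(scored_chunks, budget_tokens, estimate_tokens=None):
    """Greedily pick the highest-scored chunks that fit within `budget_tokens`.

    `scored_chunks` is an iterable of (score, text) pairs. How the score is
    produced is deliberately left open.
    """
    if estimate_tokens is None:
        # Assumption: roughly 4 characters per token, a crude stand-in.
        estimate_tokens = lambda text: max(1, len(text) // 4)

    chosen, used = [], 0
    ranked = sorted(scored_chunks, key=lambda item: item[0], reverse=True)
    for _score, text in ranked:
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # too big for the remaining budget, try smaller chunks
        chosen.append(text)
        used += cost
    return chosen, used
```

The packing step is the same either way. The disagreement here is only about where the scores come from and whether a vector index is worth maintaining to produce them.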
But we need contracts that go way further than what static typing provides. If they add dependent types plus the ability to enforce the types at runtime, so that you can use them on various inputs, then maybe it will be truly useful.
This looks nice! Wish they had a no-credit-card-required version for educational purposes. For the course I teach we use Spring Boot, and life was good with Heroku till they discontinued the no-credit-card version, and then the only choice we had (with support for Spring Boot) was to move over to Azure, which works but is a bit overkill and complicated for our purposes. I guess we could just use Docker and then many more platforms would become available, but I'd rather not add one more step to the pipeline if possible.
Don't know how much you have used Ember, but I disagree: it's quite sane as a programming model, and Ember Data is still ahead in terms of developer comfort for client apps.
100% this! I'm amazed at how most issues with React are non-issues with Ember, and still saddened by how often React devs are completely unaware of how these issues have been solved elsewhere.