
Harvard SEAS, ONSITE (Boston area, MA, US)

I have one opening in a new project around AI for precision medicine. One thrust is about combining biomedical knowledge graphs and large language models into a biomedical-fluent core platform. Another thrust is about UI for data-rich programmable systems. The opening is for either a research engineer or a postdoc, with different criteria in each case. The research engineer would be responsible for quarterly demo-driven deliverables. The postdoc would be expected to develop a research agenda with these applications in mind.

Contact: namin@seas.harvard.edu


This seems cool! Is there a way to try it locally with an open LLM? If you provide a way to set the OpenAI server URL and other parameters, that would be enough. Is the API_URL server documented, so that a mock or local one can be created?

Thanks.


Definitely, it is something we are super focused on, as it seems to be a use case that is important for folks. Opening up the proxy server and adding local LLM support is my main focus for today, and I will hopefully update this comment when it is done :)


I just added the ability to run the proxy locally: https://github.com/squaredtechnologies/thread/commit/7575b99...

Will update once I add Ollama support too!
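In case it helps anyone planning ahead: Ollama exposes an OpenAI-compatible chat endpoint, so local support mostly comes down to making the base URL configurable. A minimal sketch (function names here are illustrative, not the proxy's actual API):

```python
import json
from urllib import request

def build_chat_request(base_url, model, messages):
    """Build the URL and JSON payload for an OpenAI-style chat call."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    payload = {"model": model, "messages": messages}
    return url, json.dumps(payload).encode("utf-8")

def chat(base_url, model, messages):
    """POST the request and return the parsed JSON response."""
    url, body = build_chat_request(base_url, model, messages)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

# e.g. against a local Ollama server:
# chat("http://localhost:11434", "llama3",
#      [{"role": "user", "content": "hello"}])
```

The same `chat` call works against OpenAI itself by swapping the base URL (plus an Authorization header), which is what makes a single configurable proxy feasible.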


Ollama support would be amazing. There's a stack of people in organizations (data rich places) who would likely love something like this, but who cannot get to OpenAI due to organizational policies.


Awesome, thank you! I'll check it out.


From https://news.ycombinator.com/item?id=39363115 :

> https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; [and promptfoo, chainforge]

(Edit)

From "Show HN: IPython-GPT, a Jupyter/IPython Interface to Chat GPT" https://news.ycombinator.com/item?id=35580959#35584069


Not dumb! The verifier lets an error on the last line slide while that line is still in progress.


Yes. The verifier check already ensures syntactic correctness, but the search could go faster if the underlying LLM didn't generate bad syntax to begin with.


Interesting! Is the end goal similar to the outlines library: https://github.com/outlines-dev/outlines ?


Yes, basically to create some level of control for guided program generation.
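To make the idea concrete, here is a toy sketch of guided generation. Libraries like outlines do this far more generally by compiling regexes or grammars into per-token masks; this toy version just keeps candidate tokens that leave the partial output a prefix of some valid string. All names and the Dafny-ish strings are illustrative:

```python
# Valid target strings standing in for what a real grammar would accept.
VALID_OUTPUTS = [
    "method Max(a: int, b: int) returns (m: int)",
    "method Min(a: int, b: int) returns (m: int)",
]

def is_valid_prefix(s: str) -> bool:
    """True if s can still be extended into a valid output."""
    return any(v.startswith(s) for v in VALID_OUTPUTS)

def constrained_step(prefix: str, candidate_tokens: list[str]) -> list[str]:
    """Filter candidate tokens so the partial program stays well-formed."""
    return [t for t in candidate_tokens if is_valid_prefix(prefix + t)]
```

With masking like this, the model can never emit bad syntax in the first place, so the verifier-based pruning has less junk to throw away.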


Thanks. I adapted this old MCTS library: https://github.com/ImparaAI/monte-carlo-tree-search which is also very short and understandable.


Thanks for taking a look!

This is not the main loop. This just generates one completion that succeeds (or gives up), deferring to the Dafny checker to decide.

The main loop is the simulate function of the MCTS library. https://github.com/namin/llm-verified-with-monte-carlo-tree-...

The main advantage of MCTS is that it takes care of the exploitation/exploration trade-off based on the scores propagated by the child finder.

I hope this helps. Let me know if this addresses your question.
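For anyone unfamiliar with how that trade-off is handled: MCTS selection typically uses a UCT-style score. A minimal sketch (the ImparaAI library's internals may differ in the details):

```python
import math

def uct_score(child_value, child_visits, parent_visits, c=1.41):
    """UCT: exploit the average value, but explore rarely-visited children."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(children, parent_visits):
    """children: list of (total_value, visits) pairs; return index to expand."""
    return max(range(len(children)),
               key=lambda i: uct_score(*children[i], parent_visits))
```

Children with high average scores (here, completions the verifier liked) get revisited, while the exploration term keeps the search from tunneling on one branch.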


I mean... isn't the task that you're optimizing 'has a function and a set of lemmas that all pass'?

How does the MCTS distinguish between 'generated a stupid lemma that is true' and 'generated a valid lemma'?

Is there any reason to expect that a 'good' partial subtree will result in a 'good' output?

Why do you think this approach will generate valid lemmas?

(Not structurally valid; valid as in, they assert a solution to the given problem).

It seeeems a lot like going like this:

"Generate me a function for X and a bunch of tests that verify X".

"If the tests pass, the function is good"

...but, there's no specific reason to expect that to be true?

How is what you're doing different?

Clearly (from the examples) in a trivial case, it is true, but generally speaking as the task complexity increases, this type of 'auto validation' seems to struggle...?

Using grammars to generate a structured output seems like a similar (and successful) approach, used by many people, but because it doesn't have the associated auto-validation, it's robust against LLM randomness.

I guess I'm struggling to see how this is superior / novel to existing approaches in that space.


> How does the MCTS distinguish between 'generated a stupid lemma that is true' and 'generated a valid lemma'?

It does not, though I would like to add validation as part of the check as future work.

> Is there any reason to expect that a 'good' partial subtree will result in a 'good' output?

Yes. I am using Phind Code Llama and it generates a lot of invalid Dafny out of the box. Now, all these false starts get pruned.

I do suspect that if we try to scale this to Coq proofs, we will run into local-optima rabbit holes.

> Why do you think this approach will generate valid lemmas?

> (Not structurally valid; valid as in, they assert a solution to the given problem).

Because the LLM tries to follow instructions.

> "Generate me a function for X and a bunch of tests that verify X".

> "If the tests pass, the function is good"

> ...but, there's no specific reason to expect that to be true?

Indeed, there are complementary approaches like Speculyzer that help validate specs and code at once.

> How is what you're doing different?

That's not the problem I am solving at the moment. What I solve is that I am able to generate verified code (that still needs to be inspected), which the LLM struggles to do in one go.

> I guess I'm struggling to see how this is superior / novel to existing approaches in that space.

It's not that novel; see the planning-in-LLMs work that I credit in the README.md. This transfers from a setting of checks by tests to a setting of checks by a verifier. The code is very simple, and it actually brings open models to reliably solve problems that even ChatGPT only sometimes solves. I think it's a good foundation to build more sophisticated verification with LLMs.
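The shape of the verifier-guided expansion is roughly this: the LLM proposes continuations of the partial program, and only those the verifier accepts become children, while failures push a negative score back up the tree. A simplified sketch, where `llm` and `verifier` are hypothetical stand-ins rather than the repo's actual API:

```python
class Node:
    """Minimal stand-in for an MCTS tree node."""
    def __init__(self, state):
        self.state = state      # the partial program text
        self.children = []
        self.value = 0.0        # accumulated score for backpropagation

    def add_child(self, child):
        self.children.append(child)

    def update_win_value(self, v):
        self.value += v

def child_finder(node, llm, verifier):
    """Expand a node: keep verified continuations, penalize false starts."""
    for completion in llm(node.state):
        candidate = node.state + completion
        if verifier(candidate):          # e.g. run the Dafny checker
            node.add_child(Node(candidate))
            node.update_win_value(1)
        else:
            node.update_win_value(-1)    # prune this false start
```

With scoring like this, branches full of invalid Dafny accumulate negative value and get starved of visits, which is exactly the pruning effect described above.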


You can play with the software online here: http://io.livecode.ch/learn/namin/scheme-mechanics

The notation chapter with live snippets is reproduced here: http://io.livecode.ch/learn/namin/scheme-mechanics/chapter9


What browsers are you using? Can you file an issue?

Thanks.


Tried in latest Edge, Firefox (50) and Chrome (54) on Windows.

Clicking Scala gives me Scala. Clicking Java again gives me Java. But none of the variants underneath make any difference to the code.



Latest Chrome on Linux. Clicking on source code links doesn't change the example. Clicking on Java / Scala does change to the corresponding language.


Chrome on macOS and Chrome on iOS.


Apart from The Character of Physical Law and the Easy Pieces, what's the gateway book to understand Feynman's more advanced work?


I don't really know, because I don't understand the more advanced work like QED. But http://www.feynmanlectures.caltech.edu/I_toc.html is one of my favorite books ever.

