Hacker News | new | past | comments | ask | show | jobs | submit | byt143's comments

So Israel can't enjoy the same right to self defense that any other state would? They can't conduct a war in an urban environment against an actual, intentionally genocidal enemy, and must resort to targeted assassinations? That standard is absurd. Surely you can admit some middle ground, if you're discussing in good faith.


Second!


How about the misery of a loved one going untreated, undiagnosed, and dismissed by physicians with a facile understanding of chronic multi-system disease dynamics? People who do that are usually ones who cannot find physical relief elsewhere.


Don't expect a good answer. That brand of skepticism is performative rationality devoid of actual critical thinking.


What tasks?


For processing a trillion documents, for example, NER can be done much better.


This tradeoff is ridiculous, even if it is "better" by a .01% F-score. I would much rather have a dataset created in 1 day from BERT at a 98% F-score than in 1000 years at 98.01% from a 540B-parameter model, or even a 33B-parameter model. The performance of million-parameter models on NER is still excellent, and they work at speeds that are usable. Running things through OpenAI is also useless, as it would cost a few million dollars.
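The scale argument can be made concrete with back-of-the-envelope arithmetic. The throughput numbers below are assumptions for illustration only, not measured benchmarks:

```python
# Back-of-the-envelope only: both throughputs are assumptions, not benchmarks.
bert_docs_per_sec = 2_000    # assumed: small fine-tuned encoder, batched on one GPU
llm_docs_per_sec = 2         # assumed: autoregressive 540B-parameter model

n_docs = 1_000_000_000_000   # the trillion documents from the thread

bert_days = n_docs / bert_docs_per_sec / 86_400
llm_years = n_docs / llm_docs_per_sec / 86_400 / 365

print(f"small model: ~{bert_days:,.0f} days; giant model: ~{llm_years:,.0f} years")
```

The exact figures depend entirely on the assumed throughputs; the point is the three-orders-of-magnitude gap, which no marginal F-score gain offsets at corpus scale.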


It's more like 100% accuracy vs 95% accuracy, and the super large models are now able to extract non-trivial derived info from regular human speech as well. While it's not cost-efficient right now, this will change over time (you skate to where the puck will be, not where it is now), making the current fine-tuning approach obsolete. Academically I'm not thrilled, as I built my research on fine-tuning, but as a producer of a product this solves so many issues at once that it makes me pretty happy.


It's really depressing that a handful of big corporations will be able to exert such control over labor and productivity


That's why the LLaMA 2 release is so significant. You can run the full 70B model in 8-bit on two prosumer A6000 Ampere GPUs (around $10k together), which is within the reach of most companies and some devs. This could further accelerate research to make it both efficient and available even to regular folks.
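The VRAM math behind that claim, as a rough sketch (the overhead figure is an assumption; real deployments also need room for the KV cache and activations):

```python
params = 70e9                   # LLaMA 2 70B
bytes_per_param = 1             # 8-bit quantization
weights_gb = params * bytes_per_param / 1e9   # weight memory in GB

a6000_vram_gb = 48              # VRAM of one RTX A6000
total_vram_gb = 2 * a6000_vram_gb

# Rough headroom left for KV cache and activations (assumption, varies
# with context length and batch size).
headroom_gb = total_vram_gb - weights_gb
print(f"{weights_gb:.0f} GB weights vs {total_vram_gb} GB VRAM "
      f"({headroom_gb:.0f} GB headroom)")
```

So the 70 GB of 8-bit weights fit in the combined 96 GB, with headroom whose adequacy depends on context length and batch size.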


But it's still not comparable to GPT-4, nor will it likely be for some time. And by the time we have GPT-4-class open source models, I'd imagine there will be significant advancements in closed source models, such as inference-time symbolic reasoning using MCTS, something Google is working on for Gemini... or just bigger/better architectures, data, etc.
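For readers unfamiliar with the MCTS idea mentioned above, the search is driven by a selection rule such as UCB1. This is a purely illustrative sketch of that rule with toy numbers; nothing here reflects Google's actual method:

```python
import math

def ucb1(value_sum: float, visits: int, parent_visits: int, c: float = 1.4) -> float:
    # UCB1 balances exploitation (average value of a child) against
    # exploration (a bonus that shrinks as the child is visited more).
    if visits == 0:
        return float("inf")   # unvisited children are tried first
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

# Pick among three candidate next steps by UCB1 (toy statistics).
children = [(3.0, 5), (1.0, 1), (0.0, 0)]   # (value_sum, visits)
parent_n = 6
best = max(range(len(children)),
           key=lambda i: ucb1(children[i][0], children[i][1], parent_n))
print(best)   # the unvisited child wins via the infinite exploration bonus
```

Inference-time search would wrap something like this around candidate reasoning steps, spending extra compute per query instead of extra parameters.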


Yeah, I don't really see a solution. Even Stanford can't keep up with the latest AI research. Still, having LLaMA 2 is better than not having it.


You are literally using a trillion documents? Or are you exaggerating?


chaxor above mentioned it, so I quickly recalled a task where I saw a super large LLM demolishing fine-tuned models on documents.


What's the difference between chat and instruction tuning?


No expert, but from my messing around I gather the chat models are tuned for conversation. For example, if you just say 'Hi', a chat model will spit out some 'witty' reply and invite you to respond; it's creative with its responses. On the other hand, if you say 'Hi' to an instruct model, it might say something like 'I need more information to complete the task.' Instruct models are looking for something like 'Write me a twitter bot to make millions'. In that case, if you ask the same thing again, you are somewhat more likely to get the same or a similar result; this does not appear to be as true of a chat model. Perhaps a real expert could chime in :)
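The difference also shows up concretely in how prompts are formatted. A sketch of the LLaMA 2 chat template versus a bare instruct-style prompt — the bracketed template follows the published LLaMA 2 chat format, while the instruct format below is a generic illustration, not any specific model's convention:

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    # LLaMA 2 *chat* models are fine-tuned on this bracketed template;
    # feeding them plain text instead tends to produce the free-form
    # conversational behavior described above.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def instruct_prompt(instruction: str) -> str:
    # Instruct-tuned models typically expect a bare task description
    # (exact wording here is illustrative, not a real model's format).
    return f"Instruction: {instruction}\nResponse:"

print(llama2_chat_prompt("You are a helpful assistant.", "Hi"))
print(instruct_prompt("Write me a twitter bot to make millions"))
```

Same underlying base model, different fine-tuning data and expected wrapper, hence the different behavior on a bare 'Hi'.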


System/assistant/user prompting


How does Mojo handle function polymorphism and abstraction? Julia uses multiple dispatch, Haskell type classes, etc.

Edit: I see you're going to have protocols/traits. Can those be specialized/monomorphized at function call time, like Julia abstract types?

And how about function specialization? Will functions be attached to structs in a single-dispatch fashion, or be free-floating multimethods?


Interesting. What's your prompt?



Here's a web app I found recently that should work way better for you. Idk what model it uses (it's also free; it feels like ChatGPT 3.5, so I guess they are funding it out of pocket?)

https://goblin.tools/


If you're only looking for one novel view, can it use fewer views that are close to the novel one?


Even then, shape errors would require a dependent type system, which isn't found in most static languages.
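A concrete illustration of why: a tensor's shape is a runtime value, so ruling out mismatches at compile time needs types indexed by values — exactly what dependent types provide and mainstream static systems don't. A minimal sketch with a hypothetical shape-checking helper:

```python
def matmul_shape(a_shape, b_shape):
    # Shapes are plain runtime values. A conventional static type system
    # types both operands as just "Matrix"/"Tensor" and never sees these
    # numbers, so the check below can only happen when the code runs.
    rows_a, cols_a = a_shape
    rows_b, cols_b = b_shape
    if cols_a != rows_b:
        raise ValueError(f"shape mismatch: {a_shape} @ {b_shape}")
    return (rows_a, cols_b)

print(matmul_shape((3, 4), (4, 6)))   # compatible: result shape (3, 6)
try:
    matmul_shape((3, 4), (5, 6))      # incompatible: only fails when executed
except ValueError as e:
    print("caught at runtime:", e)
```

With dependent types, the dimensions would live in the type (e.g. `Matrix 3 4`), and the second call simply wouldn't type-check.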


There are levels of survival I'm prepared to accept.

