Hacker News | zerop's comments

The explanation of "hallucination" is quite simplified, I am sure there is more there.

If there is one problem I had to pick to trace in LLMs, I would pick hallucination. More tracing of "how much" or "why" the model hallucinated could help correct this problem. Given the explanation in this post about hallucination, I think the degree of hallucination could be given as part of the response to the user?

I face this quite a lot in my RAG use case: how do I know whether the model is giving the right answer or hallucinating beyond my RAG sources?


I incredibly regret the term "hallucination" when the confusion matrix exists. There's much more nuance when discussing false positives or false negatives. It also opens discussions on how neural networks are trained, with this concept being crucial in loss functions like categorical cross entropy. In addition, the confusion matrix is how professionals like doctors assess their own performance, a context in which "hallucination" would be a silly term to use. I would go as far as to say that it's misleading, or a false positive, to call them hallucinations.

If your AI recalls the RAG sources incorrectly, it's a false positive. If your AI doesn't find the data in the RAG sources, or believes it doesn't exist, it's a false negative. Using a term like "hallucination" has no scientific merit.
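
To make that framing concrete, here is a minimal Python sketch of tallying a confusion matrix for RAG answers. The labels and the toy data are made up for illustration: each record is assumed to be a hand-labelled pair saying whether the sources contain the answer and whether the model asserted one. Treating "asserted an answer the sources don't contain" as the false positive is one possible convention, not an established standard.

```python
from collections import Counter

def confusion_counts(records):
    """Tally TP/FP/FN/TN from (answer_in_sources, model_asserted_answer) pairs."""
    counts = Counter()
    for in_sources, asserted in records:
        if asserted and in_sources:
            counts["TP"] += 1      # answered, and the sources back it
        elif asserted and not in_sources:
            counts["FP"] += 1      # answered, but the sources do not contain it
        elif not asserted and in_sources:
            counts["FN"] += 1      # said "not found" although the data was there
        else:
            counts["TN"] += 1      # correctly declined to answer
    return counts

def precision_recall(counts):
    """Compute precision and recall, returning 0.0 when a denominator is zero."""
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical hand-labelled evaluation set (assumption, not real data).
toy = [(True, True), (True, True), (False, True), (True, False), (False, False)]
counts = confusion_counts(toy)
precision, recall = precision_recall(counts)
```

Reporting precision and recall separately is what gives the extra nuance: a system that rarely answers can have high precision and poor recall, which a single "hallucination" label hides.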


So you never report or pay heed to the overall accuracy?


"Hallucination" is just the term we use to say "this result is not what it should be". The model always uses the very same process; it does not do one thing for "hallucinations" and something else for "correct" results.

In a nutshell it is always predicting the next token from a joint probability distribution. That's it.

All other interpretations are speculative.
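
As a rough illustration of that single process, here is a toy Python sketch of one next-token step: scores (logits) are turned into a probability distribution and a token is sampled from it. The vocabulary and logits are invented for the example; in a real model the logits come from the network and are conditioned on the preceding tokens. Note that the same code path produces "4" and "5".

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, rng):
    """Sample one token according to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical three-token vocabulary and scores for the prompt "2+2=".
vocab = ["4", "5", "fish"]
logits = [3.0, 1.0, -2.0]

probs = softmax(logits)
token = sample_next_token(vocab, logits, random.Random(0))
```

The "wrong" token is not produced by a separate mechanism; it is simply a lower-probability draw from the same distribution.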


The use of the term "hallucination" for LLMs is very deceptive, as it implies that there is a "mind".

In ordinary terms, "hallucinations" by a machine would simply be described as the machine being useless, or not fit for purpose.

For example, if a simple calculator (or even a person) returned the value "5" for 2+2= , you wouldn't describe it as "hallucinating" the answer....


"Hallucination" happened because we got AI images before AI text, but "confabulation" is a better term.


Currently exploring cube for a "natural language to SQL" solution.

My schema is - 90+ Tables, 2500+ Columns, well documented

From your experience, does Cube look like a fit? My use cases will definitely have JOINs.


yes, that shouldn't be a problem.

with that many tables, you might want to use Views: https://cube.dev/docs/reference/data-model/view


Thanks. Sorry, one more question: do we need a human in the loop with Cube to define the views for all kinds of queries?

In my use case, it's going to be exposed to various kinds of stakeholders, and the user queries will vary widely. I can't pre-create views/aggregations for all scenarios.


How people are unearthing knowledge from ancient libraries using Claude

https://x.com/PrintedPathways/status/1865637416637231406


This looks great, thanks for building this.

Something along similar lines that many may like: Research Rabbit - https://www.researchrabbit.ai/


I am glad you liked it!

I wanted PaperMatch to be open-source so that the users can understand the workflow behind it and hack it to their advantage instead of grumbling away when the results aren't to their liking.


Could we give these articles to LLMs as references and have them write similar educational content, including similar WebGL graphics code to render the illustrations? I mean, just use this style and produce educational content using AI. That might make studying more interesting.


I'm guessing it will get nowhere close to as well considered, written, and structured as what Bartosz makes himself.

I don't know how people don't see how poor quality so much AI writing is, even when referencing good quality work.

Also making effective visualizations that do a good job of illustrating a concept is not just a matter of being able to write the code.


I fear that in going from "manual coding" to "fully automated coding", we might end up stuck in the middle, doing "semi-manual coding" assisted by AI, which would need different software engineering skills.


Is it possible to build something similar to Anthropic's computer use feature with a Qwen vision model?

Someone open sourced it with langchain

https://x.com/1littlecoder/status/1856397375704576399


Browser use is very easy. Can even do that headless. That way, you can also do bulk processing. For a client, I did some 16k websites with a simple LLM agent. With “computer use” how long would that take, and what would it cost? For me, it was ~$20 (I used Gemini for this task).


Great, congrats on your launch.

1. Does it take care of bot detection? Most sites will have it.

2. Is this something similar to Firecrawl - https://www.firecrawl.dev/


Yes, it has an extensive proxy IP and retry system in place to bypass bot detection.

I’m also trying to gather more feedback to identify the killer feature:

- Adding vectorization to Pinecone out of the box?
- Adding multiple integrations like n8n, etc.?

Any crucial pain points to avoid?


Are you concerned about making a product that does this? The legal aspect of accessing a computer system that is intending to block your use seems worrisome.


It is the responsibility of the user. Everyone should be responsible for their own actions. We still allow knives to be sold, and most people use them for good.


Now imagine that knife stabbings became so common that almost everyone started wearing body armor and you start selling body armor defeating knives explicitly. I can honestly see why most people would be upset about that.


I don't see that as a good analogy. There's very limited space for this functionality to be used legitimately / legally - anyone permitted to scrape content is likely able to access the data without the protection measures in the way.

I'm fairly sure circumvention is a (prosecuted!) crime in several countries - curious if you're across that angle, and/or have legal advice/direction you can share?


Honest question - Why use interfaces like this rather than a regular HTML client?


There are many choices for email client interfaces. HTML for email does not have a good reputation among hackers. After all, email can be considered an ancient technology and is historically based on plain text - HTML breaks not only the philosophy but also many of the tools developed around email.

I have found a sweet spot for an email client between a pure CLI and a full-featured (HTML) GUI client - I use Emacs Gnus, which takes full advantage of Emacs' text-based interface. As always with Emacs, the learning curve is a bit steep at first, but the rewards can be reaped afterwards.


> As always with Emacs, the learning curve is a bit steep at first

For any Emacs users who are interested in using Emacs for mail but don't want to deal with the learning curve of Gnus, check out mu4e, which is easier.

https://www.djcbsoftware.nl/code/mu/mu4e/

https://www.emacswiki.org/emacs/mu4e


The main reason I chose Gnus instead of mu4e or notmuch is that I did not want to sync all my mailboxes to local disk. What is perhaps not so well known is that IMAP provides its own server-side search engine. Searching mail with Gnus search queries [1] works really well, and I do not have to manage any overhead to get my mail synchronized, indexed, etc. In other words, everything I need for email is built into Emacs (or outsourced to the IMAP server) - no extra packages/software required.

[1] https://www.gnu.org/software/emacs/manual/html_node/gnus/Sea...
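
For readers who have not seen IMAP's server-side search, here is a minimal Python sketch using the standard-library `imaplib`. This is not how Gnus does it internally; it only illustrates the same idea, that the SEARCH command runs on the server and no mail is synced or indexed locally. The host, credentials and helper names are placeholders for the example.

```python
import imaplib
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def build_criteria(sender, since):
    """Build an IMAP SEARCH criteria string (sender is not escaped here)."""
    # IMAP dates look like 05-Jan-2024; built by hand to avoid locale issues.
    day = f"{since.day:02d}-{MONTHS[since.month - 1]}-{since.year}"
    return f'FROM "{sender}" SINCE {day}'

def server_side_search(host, user, password, criteria, mailbox="INBOX"):
    """Run SEARCH on the server and return the matching message numbers."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(mailbox, readonly=True)   # read-only: do not change flags
        status, data = conn.search(None, criteria)
        if status != "OK":
            raise RuntimeError(f"SEARCH failed: {status}")
        return data[0].split()

criteria = build_criteria("alice@example.org", date(2024, 1, 5))
# Example call (needs a real server and credentials, so it is not run here):
# ids = server_side_search("imap.example.org", "me", "secret", criteria)
```

Only the matching message numbers come back over the wire; message bodies are fetched separately if and when they are needed.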


This. I kinda hate, but still understand, the general offlineimap/notmuch philosophy in this space. I am not in a bunker, I am not optimizing for a situation where I only have internet intermittently. I just don't want to leave Emacs if I don't have to and want to be able to be quick and seamless between my code, mailing lists, RSS feeds, org mode, and email in general. It was hard won, but I do get this with Gnus now. And yes, love how you can hijack almost all the IMAP/Gmail niceties this way with a little bit of work, especially search.

One thing I have done is export the mbox archives of my old gmail accounts and keep them around in Gnus if I happen to need to search through old emails.


mu4e paired with mbsync is really amazing. All your email in Emacs, with super fast search, and the ability to integrate into things like org agenda.

I found this guide particularly useful for setting things up and even dealing with annoying outlook/office365 servers:

https://brettpresnell.com/post/email/

Does take a bit of doing, but so worth it.


> HTML breaks not only the philosophy but also many of the tools developed around email

I was one of these die-hard-text-only people, back in the mid to late 90s. It was true. People were sending HTML/rich text emails, and it broke everything, and they were awful to read. Not to mention the kilobytes of bandwidth wasted!

But it's 2024 now. There are vastly more tools that can deal with HTML email than those that can't. Like, I wouldn't be surprised if it's 4 orders of magnitude.

Sorry, folks, we lost. Email is not plain text any more. We can't pretend that it is or should be.


> Email is not plain text any more. We can't pretend that it is or should be.

I send plain text emails and this is a hill I will die on. :-)

Do you not contribute to the development of any open-source projects that only accept patches via plain text emails sent to mailing lists (e.g., many GNU projects)?

Here's a tip for anyone who sends plain text emails, or wants to, and has to deal with annoying normies who complain about undesirable wrapping[1] when viewing plain text emails on mobile devices with small screens: configure your mail client to allow lines in emails to be up to 998 characters[2], which is longer than any paragraph you will likely write. I did this for my work email years ago.

[1] https://www.arp242.net/email-wrapping.html

[2] https://datatracker.ietf.org/doc/html/rfc5322#section-2.1.1
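
As a small illustration of the limits in [2], here is a Python sketch that checks a message body against the 998-character hard limit and the 78-character recommended limit from RFC 5322 section 2.1.1 (both exclude the trailing CRLF). The function name and the sample text are made up for the example.

```python
MAX_LINE = 998    # hard limit per RFC 5322 section 2.1.1, excluding CRLF
SOFT_LINE = 78    # recommended limit from the same section

def check_line_lengths(body):
    """Return (too_long, over_recommended) lists of 1-based line numbers."""
    too_long, over_recommended = [], []
    for number, line in enumerate(body.splitlines(), start=1):
        if len(line) > MAX_LINE:
            too_long.append(number)          # violates the hard limit
        elif len(line) > SOFT_LINE:
            over_recommended.append(number)  # legal, but longer than recommended
    return too_long, over_recommended

# Line 1 is short, line 2 is a 200-character paragraph, line 3 is 1200 characters.
sample = "short line\n" + "x" * 200 + "\n" + "y" * 1200
too_long, over_recommended = check_line_lengths(sample)
```

A one-line paragraph of a few hundred characters lands in the second list: allowed by the RFC, and left for the receiving client to wrap to the screen width.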


> I send plain text emails and this is a hill I will die on. :-)

I don't want to be mean, but yes, it is likely this hill will die with you :-)))

I doubt you can find many 18 year olds these days that would willingly use plain text emails.


As someone who started using plaintext emails recently, HTML emails are still awful in 2024. Besides being an ugly hack on top of an originally text-only protocol, they encourage bad practices like top-posting, bad alignment etc. What's really intolerable, though, is the external and dynamic content in email. I expect email to be a long-term record, not something that changes after I receive it. They should find another tool for that. Besides, most GUI clients just block external content due to security risks anyway.


Perhaps LLMs can solve this somewhat? Not for email summarization - but to intelligently strip away all the HTML fluff and return a plain text version of the contents.


It is a solved problem. Here is a solution that requires something on the order of one millionth of the resources of your proposed idea, no subscription, and runs so fast that you would not even notice it on a machine from 20 years ago:

    > grep text/html ~/.mailcap
    text/html; lynx -width 72 -assume_charset=%{charset} -display_charset=utf-8 -dump %s | sed 's|^   ||'; nametemplate=%s.html; copiousoutput

If you want something more modern:

    text/html; webdump -dli < %s | sed 's/^  //g'; needsterminal; copiousoutput


What's webdump?



FWIW, it's pretty straightforward to extract text from an HTML snippet without LLMs, I'm not actually sure if there's anything they'd do better than a simple HTML parser.
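
For instance, a minimal sketch with Python's standard-library `html.parser` might look like the following. It only collects text nodes and skips `<script>` and `<style>` contents; the class and function names are invented for the example, and real-world HTML email would need more care (links, tables, quoted replies).

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style tags."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities like &amp;
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    """Return the text content of an HTML snippet, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)

snippet = ("<html><style>p{color:red}</style><body>"
           "<p>Hello &amp; welcome</p><p>Bye</p></body></html>")
text = html_to_text(snippet)
```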


Apple Intelligence already does this in the line summary.


> But it's 2024 now. There are vastly more tools that can deal with HTML email than those that can't. Like, I wouldn't be surprised if it's 4 orders of magnitude.

Is it? WhatsApp, Signal, Slack, Notion, and ChatGPT are among the apps I use daily, as do many non-hackers, and they are pretty much "text only". All support some (subset of) Markdown, which is closer to "plain text" than to "HTML" in editing and displaying.

What I am trying to say is not that email should use markdown, or that HTML-email is bad or good. What I am trying to say is that there's clear and obvious proof that, in 2024, there's a need and use for "plain text". Even in tools that overlap with what email does.


Slack is far from plain text: https://api.slack.com/block-kit/building


The Slack blocks format is horrendous, and not very powerful.


I disagree that email should be plain text, but honestly I don't think that's really relevant to the question. I read the question as "why CLI instead of GUI", which I think is totally fair. Using a CLI email client instead of a GUI strikes me like using your feet to open jars - maybe you can do it, but it's so much harder for no benefit.


I don't think that's the question. One can continue using a GUI and still value a CLI for its flexibility. E.g., if I'd like to script some routine task, the availability of CLI tools will make it a breeze. With the average GUI it's either impossible altogether or requires some ugly user input simulation, which is like using your feet to open jars, to borrow your comparison.


As another commenter pointed out, CLI/TUI isn't that hard. In many cases it's easier than GUI ones. But I have a different purpose. I can configure different pieces (imap for incoming, smtp for outgoing, notmuch & afew for tagging and search, etc) and use it uniformly from a variety of different programs including git and emacs. Not very simple, I must admit. But it's a personal choice. It works very well for my use cases, including realtime full mail backup and offline use.


It just depends on the user. You probably also think cd & ls is so much harder than Finder or whatever.


> Using a CLI email client instead of a GUI strikes me like using your feet to open jars - maybe you can do it, but it's so much harder for no benefit.

Eh? I used to use mutt and now use notmuch. Much simpler to use than, say, Outlook. Not sure what you're talking about being "harder".


Absolutely right. Every GUI email client that I’ve tried is clumsy and slow. Mutt is elegant, powerful, and fast.

https://lwn.net/Articles/837960/


This question shows how far we’ve travelled from the original concepts of sending/receiving/viewing email. I just found it funny that you said “regular HTML client”, as if that was the default interface for email. Originally, it was all text, so this post is in many ways closer to how many thought of a “regular” client. But ever since Hotmail, it’s been a gradual shift away from command line email towards web applications. Desktop GUIs are still (kind of) holding on, but even they are more likely than not to be an Electron app.

To answer your question, these days, I’m not sure. There are so many extra features that email providers (Gmail/Office365) include in their web interfaces, it’s hard to not make the argument that the web interfaces are the better way to use email.


There are times it's really useful to access email from a terminal, and terminals are widely available (shell on your primary system, Termux on Android, SSH to your email host, whatevs).

It's also often convenient to either script interactions, or to have full access to shell tools when interacting with email. I practice this more often with mutt, but I can filter either messages or metadata (headers) and send those to an awk or sed pipeline to extract specific information of interest (this is especially useful with notifications / alert emails). This might be tens, hundreds, thousands, or more messages that are of interest.

Full-blown GUI or Web client email tools are pretty, but lack this flexibility.
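
The commenter does this with mutt plus awk or sed; as a rough equivalent, here is a Python sketch using the standard-library `email` package to pull headers out of a batch of raw messages and keep only the alerts. The messages, addresses and the "ALERT" keyword are invented for the example.

```python
from email import message_from_string

def alert_subjects(raw_messages, keyword="ALERT"):
    """Return (sender, subject) for every message whose Subject has the keyword."""
    matches = []
    for raw in raw_messages:
        msg = message_from_string(raw)       # parse headers and body
        subject = msg.get("Subject", "")
        if keyword in subject:
            matches.append((msg.get("From", "unknown"), subject))
    return matches

# Hypothetical raw messages, as they might be piped out of a mail client.
raw_messages = [
    "From: monitor@example.org\nSubject: ALERT disk full on db1\n\nbody\n",
    "From: monitor@example.org\nSubject: ALERT cpu high on web2\n\nbody\n",
    "From: friend@example.org\nSubject: lunch?\n\nbody\n",
]
alerts = alert_subjects(raw_messages)
```

The same loop works unchanged whether the input is three messages or several thousand, which is the point about scripting that GUI clients make hard.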


I don't have much knowledge regarding mail but I can think of two reasons.

The first is the use of local mailboxes: if your mail provider does not give you an IMAP server to connect to, you'll use a client like mutt to manage your mail.

The second is that accessibility through the terminal could be reduced with HTML sites. If I want to access my email from a headless server using lynx or similar, having to refresh the website to check for new mail, or even composing messages, might be difficult.

