progval's comments

from the article:

> Steam is huge and requires 32-bit to work properly for the client and for Proton / Wine



This is not at all comparable with the big companies.

I'm 31 and I don't know mine.

Can't they be split into lines? OTR was designed for IRC, which limits protocol lines (i.e. payload + command + extra fluff) to 512 bytes, so that ought to work on Discord too.

I have not tried it yet; it may work, since it does work for IRC (which also has a per-message limit). It was more of a proof of concept, tbh: it works, just not as usable as it could be.
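
A naive sketch of the splitting idea, in case it helps (this is not OTR's own fragmentation format, just the general approach; the 400-character budget is a guess at the headroom left under IRC's 512-byte line limit, and send_message is a stand-in for whatever the client exposes):

    def split_payload(payload: str, max_len: int = 400):
        """Split an OTR payload (base64, so plain ASCII) into chunks that each
        fit under IRC's 512-byte line limit once the command, target and CRLF
        are added. The receiver reassembles by concatenating chunks in order."""
        return [payload[i:i + max_len] for i in range(0, len(payload), max_len)]

    # e.g. send each chunk as its own message and concatenate on the other end:
    # for chunk in split_payload(encrypted):
    #     send_message(channel, chunk)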

There are no occurrences of "cell" or "phone" in GDPR, and the only relevant occurrences of "number" are about "national identification numbers", which phone numbers are not.


Hi, author of the model here. It is an open-weight model, you can download it from here: https://huggingface.co/nanonets/Nanonets-OCR-s

Interestingly, another OCR model based on Qwen2.5-VL-3B just dropped which is also published as Apache 2.0. It's right next to Nanonets-OCR-s on the HF "Trending" list.

https://huggingface.co/echo840/MonkeyOCR/blob/main/Recogniti...


IMO weights being downloadable doesn't mean it's open weight.

My understanding:

    - Weight available: You can download the weights.
    - Open weight: You can download the weights, and it is licensed freely (e.g. public domain, CC BY-SA, MIT).
    - Open source: (Debated) You can download the weights, it is licensed freely, and the training dataset is also available and licensed freely.
For context:

> You're right. The Apache-2.0 license was mistakenly listed, and I apologize for the confusion. Since it's a derivative of Qwen-2.5-VL-3B, it will have the same license as the base model (Qwen RESEARCH LICENSE AGREEMENT). Thanks for pointing this out.


At Software Heritage, we have listed 380M public repositories, 280M of which are on GitHub: https://archive.softwareheritage.org/

Repository search is pretty limited so far: only full-text search on URLs or in a small list of metadata files like package.json.
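
For anyone who wants to poke at it from a script, a minimal sketch of the URL search (the /api/1/origin/search/ endpoint and its `limit` parameter are from memory, so check the API docs on the archive for the exact contract and response shape):

    import requests

    # Full-text search on origin URLs at Software Heritage.  Endpoint name,
    # `limit` parameter and response shape are assumptions from memory.
    resp = requests.get(
        "https://archive.softwareheritage.org/api/1/origin/search/python-ldap/",
        params={"limit": 5},
    )
    resp.raise_for_status()
    for origin in resp.json():
        print(origin["url"])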


> but github may have a feed of new repos anyway?

Yes: https://docs.github.com/en/rest/repos/repos?apiVersion=2022-... (you can filter to only show repositories created since a given date).
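
A hedged sketch of one way to do that from a script, using the search API's `created:` qualifier rather than the listing endpoint (unauthenticated here, so rate limits are low and results are capped; a starting point rather than a full crawl):

    import requests

    # List public repositories created after a given date via repository search.
    resp = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": "created:>2025-06-01", "per_page": 10},
        headers={"Accept": "application/vnd.github+json"},
    )
    resp.raise_for_status()
    for repo in resp.json()["items"]:
        print(repo["full_name"], repo["created_at"])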


And using their obscure GraphQL API, you can do the same for *new commits* across any repos.
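
For a single repository, a rough sketch of what that GraphQL query can look like (field names are from memory, so double-check them against the schema explorer; GITHUB_TOKEN is just a placeholder env var, and you would loop or poll over repositories to get genuinely new commits):

    import os
    import requests

    # Recent commits on the default branch of one repository, via GraphQL.
    query = """
    query($owner: String!, $name: String!, $since: GitTimestamp!) {
      repository(owner: $owner, name: $name) {
        defaultBranchRef {
          target {
            ... on Commit {
              history(since: $since, first: 20) {
                nodes { oid messageHeadline committedDate }
              }
            }
          }
        }
      }
    }
    """
    resp = requests.post(
        "https://api.github.com/graphql",
        json={"query": query,
              "variables": {"owner": "python", "name": "cpython",
                            "since": "2025-06-01T00:00:00Z"}},
        headers={"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"},
    )
    resp.raise_for_status()
    target = resp.json()["data"]["repository"]["defaultBranchRef"]["target"]
    for commit in target["history"]["nodes"]:
        print(commit["committedDate"], commit["messageHeadline"])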

They have some secret-leaking infra for enterprise.


> So, I tell the server to drop traffic from the IPs that were scraping. Problem solved! Then immediately I start seeing a large number of attempts from different IPs. Residential IPs in the US: they're buying residential proxies.


The other side of the coin is that if you give it a precise input, it will fuzzily interpret it as something else that is easier to solve.


Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.

But thankfully we do have feedback/interactiveness to get around the downsides.


When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, even though I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.

Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.


It will, or it might? Because if every time you use an LLM it misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool.

(I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)


Might, of course. And in my experience it's what happens most times I ask an LLM to do something I can't trivially do myself.


Well everyone's experience is different, but that's been a pretty atypical failure mode in my experience.

That being said, I don't primarily lean on LLMs for things I have no clue how to do, and I don't think I'd recommend that as the primary use case either at this point. As the article points out, LLMs are pretty useful for doing tedious things you know how to do.

Add up enough "trivial" tasks and they can take up a non-trivial amount of energy. An LLM can help reduce some of the energy sapped so you can get to the harder, more important parts of the code.

I also do my best to communicate clearly with LLMs: like I use words that mean what I intend to convey, not words that mean the opposite.


I use words that convey very clearly what I mean, such as "don't invent a function that doesn't exist in your next response" when asking what function a value is coming from. It says it understands, then proceeds to do what I specifically asked it not to do anyway.

The fact that you're responding to someone who found AI non-useful with "you must be using words that are the opposite of what you really mean" makes your rebuttal come off as a little biased. Do you really think the chances of "they're playing opposite day" are higher than the chances of the tool not working well?


But that's exactly what I mean by brush up on the tool: "don't invent a function that doesn't exist in your next response" doesn't mean anything to an LLM.

It implies you're continuing with a context window where it already hallucinated function calls, yet your fix is to give it an instruction that relies on a kind of introspection it can't really demonstrate.

My fix in that situation would be to start a fresh context and provide as much relevant documentation as feasible. If that's not enough, then the LLM probably won't succeed for the API in question no matter how many iterations you try and it's best to move on.

> ... makes your rebuttal come off as a little biased.

Biased how? I don't personally benefit from them using AI. They used wording that was contrary to what they meant in the comment I'm responding to, that's why I brought up the possibility.


> Biased how?

Biased as in I'm pretty sure he didn't write an AI prompt that was the "opposite" of what he wanted.

And generalizing something that "might" happen as something that "will" happen is not actually an "opposite," so calling it that (and then basing your assumption of that person's prompt-writing on that characterization) was a stretch.


This honestly feels like a diversion from the actual point which you proved: for some class of issues with LLMs, the underlying problem is learning how to use the tool effectively.

If you really need me to educate you on the meaning of opposite...

"contrary to one another or to a thing specified"

or

"diametrically different (as in nature or character)"

Are two relevant definitions here.

Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other. A concept can (and often will) have multiple opposites.

-

But again, I'm not even holding them to that literal of a meaning.

If you told me even half the time you use an LLM the result is that it solves a completely different but simpler version of what you asked, my advice would still be to brush up on how to work with LLMs before diving in.

I'm really not sure why that's such a point of contention.


> Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other.

No. Saying something will 100% happen and saying something will 100% not happen are diametrically opposed. You can't just call every non-equal statement "diametrically opposed" on the basis that they aren't equal. That ignores the "diametrically" part.

If you wanted to say "I use words that mean what I intend to convey, not words that mean something similar," that would've been fair. Instead, you brought the word "opposite" in, misrepresenting what had been said and suggesting you'll stretch the truth to make your point. That's where the sense of bias came from. (You also pointlessly left "what I intend to convey" in to try and make your argument appear softer, when the entire point you're making is that "what you intend" isn't good enough and one apparently needs to be exact instead.)


This word soup doesn't get to redefine the word opposite, but you're free to keep trying.

Cute that you've now written at least 200 words trying to divert the conversation though, and not a single word to actually address your demonstration of the opposite of understanding how the tools you use work.


The entire premise of my first reply to you was that your hyperbole invalidated your position. If either of us diverted the conversation, it was you.

One of your replies to me included the statement "the LLM probably won't succeed for the API in question no matter how many iterations you try and it's best to move on" (i.e. don't do the work or don't use AI to do it). Yet you continue to repeat that it's my (and everyone else's) lack of understanding that's somehow the problem, not conceding that AI being unable to perform certain tasks is a valid point of skepticism.

> This word soup doesn't get to redefine the word opposite,

You're the one trying to redefine the word "opposite" to mean "any two things that aren't identical."


Well said about the fact that they can't introspect, and I agree with your tip about starting with fresh context, and about when to give up.

I feel like this thread is full of strawmen from people who want to come up with reasons they shouldn't try to use this tool for what it's good at, and figure out ways to deal with the failure cases.


I find this very much depends on the model and the instructions you give the LLM. Also, you can use other instructions to check the output and have it try again. It definitely struggles with larger codebases, but the power is there.

My favorite instruction is: "Using component A as an example, make component B."

