Hacker News | finnborge's comments

It already is "just" information retrieval, just with stochastic threads refining the geometry of the information.


Haha u mean it isn't AGI? /s


I think this is well illustrated in a lot of science fiction. Irregular or abstract tasks are fairly efficiently articulated in speech, just like the ones you provided. Simpler, repetitive ones are not. Imagine having to ask your shower to turn itself on, or your doors to open.

Contextualized to "web apps," as you have it: navigating a list probably requires an interface. It would be fairly tedious to differentiate between, for example, the 30 pairs of pants your computer has shown you after you asked "help me buy some pants" without using a UI (ok, maybe eye-tracking?).


They actually aren’t done well via voice UI either - if you care about the output.

We just gloss over the details in these hypothetical irregular or abstract tasks because we assume they would be carried out exactly as we imagine them. We don't have experience trying to tell the damn AI not to delete that cloud (which one exactly?) but the other one via a voice UI. Which would suck and be super irritating, btw.

We know how irritating it would be to turn the shower off/on, because we do that all the time.


On a tangent but I still don't know why we don't have showers where you just press a button and it delivers water at the correct temperature. It seems like the simplest thing that everyone wants. A company that manufactures and installs this (a la SolarCity) should be an instant unicorn.


For what it's worth, Northern European showers typically have two independent controls: temperature and flow. Leave the temperature at what you think is good, and either wait a moment for hot water to reach the end of the pipe or install a recirculating loop.


What's "correct" for you might not be "correct" for others. Furthermore, your own definition of "correct" changes depending on circumstances; sometimes you want it hotter, sometimes you want it colder. Sometimes you want to change it partway through.

How do you calculate for that?

Back in the 90s, fuzzy logic was thought to be the solution. In a way it was, but only for niche/specialized purposes, and even then the variables being evaluated have to be limited.


How about using temp sensors to read your skin temp (if that's even the variable that matters, idk), plus some mechanism for "feedback" after an initial guess? It's all implementation details.
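A minimal sketch of that guess-then-feedback idea, as a toy proportional control loop. The setpoint, gain, and both hardware functions (`read_skin_temp`, `set_valve`) are hypothetical stand-ins, not a real API:

```python
def adjust_temp(setpoint_c, read_skin_temp, set_valve, gain=0.5, steps=10):
    """Toy proportional controller: nudge a mixing valve toward a
    hypothetical skin-temperature setpoint, re-reading after each guess."""
    valve = 0.5  # mixing-valve position: 0 = all cold, 1 = all hot
    for _ in range(steps):
        error = setpoint_c - read_skin_temp()  # feedback after the initial guess
        valve = min(1.0, max(0.0, valve + gain * error * 0.01))
        set_valve(valve)
    return valve
```

Real showers would need hysteresis, safety limits, and much more robust sensing, which is exactly where the "implementation details" get hard.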


Water + electronics/power typically isn’t very durable, or reliable. Most people want their shower valves to work at least 20 years, ideally 50-100.


Can be mitigated to a degree by separating the (cheaper) sensors and the (pricier) logic.

But then it will become a tradeoff of complexity vs longevity.


Nah, because it would still need servicing.

And why? There are reasonably well-made, low-maintenance, temperature-balancing valves out there.

And they do typically last 20 or more years.


Maybe you don't even need a list if you can describe what you want, or are able to explain why the article you are currently viewing is not a match.

As for repetitive tasks, can't you just explain a "common procedure" to your computer?


My most charitable interpretation of the perceived misunderstanding is that the intent was to frame developers as "the user."

This project would be the developer tool used to produce interactive tools for end users.

More practically, it just redefines the developer's position; the developer and end-user are both "users". So the developer doesn't need to think AND the user doesn't need to think.


I interpreted it like "why don't we simply eat the orphans?" It kind of works, but it's absurd, so it's funny. I didn't think about it too hard though, because I'm on a computer.


At this extreme, I think we'd end up relying on backup snapshots. Faulty outcomes are not debugged. They, and the ecosystem that produced them, are just erased. The ecosystem is then returned to its previous state.

Kind of like saving a game before taking on a boss. If things go haywire, just reload. Or maybe like cooking? If something went catastrophically wrong, just throw it out and start from the beginning (with the same tools!)

And I think the only way to even halfway mitigate the vulnerability concern is to identify that this hypothetical system can only serve a single user. Exactly 1 intent. Totally partitioned/sharded/isolated.


Backup snapshots of what though? The defects aren’t being introduced through code changes, they are inherent in the model and its tooling. If you’re using general models, there’s very little you can do beyond prompt engineering (which won’t be able to fix all the bugs).

If you were using your own model you could maybe try to retrain/finetune the issues away given a new dataset and different techniques? But at that point you’re just transmuting a difficult problem into a damn near impossible one?

LLMs can be miraculous and inappropriate at the same time. They are not the terminal technology for all computation.


In N years, the idea of requiring a rigid API contract between systems may be as ridiculous as a panda being unable to understand that bamboo is food unless it is planted in the ground.

Abstractly, who cares what format the information is shared in? If it is complete, the rigidity of the schema *could* be irrelevant (in a future paradigm). Determinism is extremely helpful (and maybe vitally necessary) but, as I think this intends to demonstrate, *could* just be articulated as a form of optimization.

Fluid interpretation of API results would already be useful, but it's also enormously problematic. How many of us already spend meaningful amounts of time "cleaning" data?
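A narrow, deterministic slice of that "fluid interpretation" already exists in every data-cleaning script: tolerating variation in how a field is named. Everything here (the field names, the aliases) is a made-up illustration:

```python
def extract(payload, aliases):
    """Pull a value out of a loosely-structured dict by trying a list of
    hypothetical field-name aliases, case-insensitively."""
    lowered = {str(k).lower(): v for k, v in payload.items()}
    for name in aliases:
        if name.lower() in lowered:
            return lowered[name.lower()]
    return None
```

An LLM generalizes this to arbitrary schemas, which is the appeal and also the problem: the alias list above is auditable, the model's interpretation isn't.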


If you haven't already seen the DeepSeek OCR paper [1], images can be profoundly more token-efficient encodings of information than even CSVs!

[1]: https://github.com/deepseek-ai/DeepSeek-OCR/blob/main/DeepSe...


I'm not sure I follow this entirely, but if the assertion is that "everything is math" then yeah, I totally agree. Where I think language operates here is as the medium best situated to assign objects to locations in vector space. We get to borrow hundreds of millions of encodings/relationships. How can you plot MAN against FATHER against GRAPEFRUIT using math without circumnavigating the human experience?
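To make the "locations in vector space" point concrete: once language has done the work of assigning positions, comparing them is plain math. The 3-d vectors below are entirely made up for illustration; real embeddings have hundreds of dimensions learned from text:

```python
import math

def cosine(a, b):
    """Cosine similarity between two toy embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up "embeddings", purely to show the geometry of the comparison:
man = [1.0, 0.2, 0.0]
father = [0.9, 0.4, 0.1]
grapefruit = [0.0, 0.1, 1.0]
```

The math is trivial; the hard part, as the comment says, is that the coordinates themselves are borrowed from human language use.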


When I write to an unknown audience, unable to know in advance what terms they rely on, I tend to circumlocute to build emotional subtext. They might only get some percentage of it, but familiar enough terms can act as middleware to the rest.

The words man, father, and grapefruit aren't essential to the existence of men, fathers, or grapefruit. All existed before language.

What you mean by "human experience" is "bird song my culture uses to describe shared space". Leave meaning to be debated in meat space and include the current geometry of the language in the model. Just make it mutable.

The machine can just focus on rendering geometry to the pixel limit of the machine using electrical theory; it doesn't need to care internally if it's text with meaning. It's only represented like that on the screen anyway. Compress the information required to just geometric representation and don't anthropomorphize machine state manipulation.


This is amazing. It very creatively emphasizes how our definition of "boilerplate code" will shift over time. Another layer of abstraction would be running N of these, sandboxed, responding to each request, and then serving whichever instance is internally evaluated to have done the best. Then you're kind of performing meta reinforcement learning with each whole system as a head.
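The N-instances-pick-the-best idea sketched above is just best-of-N selection; `generate` and `score` below are hypothetical stand-ins for the sandboxed system and its internal evaluator:

```python
def serve_best(request, generate, score, n=4):
    """Run the same request through n independent (ideally sandboxed)
    instances and serve whichever output the evaluator scores highest."""
    candidates = [generate(request) for _ in range(n)]
    return max(candidates, key=score)
```

The whole scheme lives or dies on `score`, which is the objection raised further down the thread.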

The hard part (coming from this direction) is enshrining the translation of specific user intentions into deterministic outputs, as others here have already mentioned. The hard part when coming from the other direction (traditional web apps) is responding fluidly/flexibly, or resolving the variance in each user's ability to express their intent.

Stability/consistency could be introduced through traditional mechanisms (encoded instructions, systematically evaluated) or, via the LLM's language interface, through intent-focusing mechanisms: increasing the prompt length, hydrating the user request with additional context/intent: "use this UI, don't drop the db."
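That hydration step is mechanically simple; it's just wrapping the raw request before it reaches the model. The constraint strings here are the hypothetical examples from above:

```python
def hydrate(user_request, constraints):
    """Prepend fixed, system-level intent constraints to a raw user
    request before it reaches the model. Constraints are just examples."""
    guardrails = "\n".join(f"- {c}" for c in constraints)
    return f"System constraints:\n{guardrails}\n\nUser request: {user_request}"
```

The fluid/rigid spectrum mentioned next is then a question of how much of the final behavior lives in these fixed constraints versus the model's interpretation.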

From where I'm sitting, LLMs provide a new modality for evaluating intent. How we act on that intent can be totally fluid, totally rigid, or, perhaps obviously, somewhere in between.

Very provocative to see this near-maximum example of non-deterministic, fluid intent interpretation → execution. Thanks, I hate how much I love it!


> serving whichever instance is internally evaluated to have done the best. Then you're kind of performing meta reinforcement learning

I thought this didn't work? You basically end up fitting your AI models to whatever the internal evaluation method rewards, and creating a good evaluation method most often ends up being of similar complexity to creating the initial AI model you wanted to train.

