
I've been thinking about using Xreal glasses for coding, but all the reviews I've seen seem to suggest that the fidelity isn't good enough for reading text for lengthy stretches of time. This article is the first counterargument I've seen.

I feel there's a glaring counterpoint to this: I have never felt more compelled to try out whatever coding idea pops into my head. I can have Claude write a PoC in seconds to make the idea more concrete, and I can turn it into a good-enough tool in a few afternoons. Before this, all those ideas would simply never have materialized.

I mean, I get the existential angst, though. There's a lot of uncertainty about where all this is heading. But, and this is really a tangent, I feel that the direction of it all lies at the intersection of politics, technology, and human nature. I feel like "we the people" concede a walkover to powerful actors if we do not use these new, powerful tools in service of the people. For one thing, to enable new ways to coordinate and organise.


Good point. It's not that AI is "pushing us" towards anything. AI can be a muse that elevates our creativity. IF we use it that way. But do we use it that way? I think there will be some who do.

The majority of users seem to want convenience at any expense. Most are unconcerned with a loss of agency, almost enthusiastic about it if it removes the labor of thinking.


Agency only goes away if control of AI is ultimately centralized. If we end up in a world where anyone can run good-enough models on consumer devices and install their own models into off-the-shelf humanoid robots, I don't see that we have lost agency.

The superset of "emotion" is heuristics. Machines without heuristics wouldn't get very far. Their heuristics would probably look quite different from ours, though.

> This is already true and will become increasingly more true for AI. The user cannot differentiate between sophisticated machine learning applications and a washing machine spin cycle calling itself AI.

The user cannot, but a good AI might itself allow the average user to bridge the information asymmetry. So as long as we have a way to select a good AI assistant for ourselves...


> The user cannot, but a good AI might itself allow the average user to bridge the information asymmetry. So as long as we have a way to select a good AI assistant for ourselves...

In the end it all hinges on the user's ability to assess the quality of the product. Otherwise, the user cannot judge whether an assistant recommends quality products, and the assistant has an incentive to make poor suggestions (e.g. selling out to product producers).


> In the end it all hinges on the user's ability to assess the quality of the product

The AI can use tools to extract various key metrics from the product being analysed. Even if we limit such metrics to those that can be verified in various "dumb" ways, we should be able to verify products much more thoroughly than we can today.


I love the idea and I would like to build something like this. But the few attempts I have made using Whisper locally have so far been underwhelming. Has anyone gotten results with small Whisper models that are good enough for a use case like this?

Maybe I've just had a bad microphone.
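
For context, this is roughly what I've been running, in case the problem is in my setup rather than the model (a minimal sketch using the openai-whisper package; the model size and file name are just placeholders):

    import whisper  # pip install openai-whisper

    # "base" is one of the small multilingual checkpoints; "tiny" and
    # "small" are the other lightweight options.
    model = whisper.load_model("base")

    # fp16=False silences the fp16 warning when running on CPU.
    result = model.transcribe("recording.wav", fp16=False)
    print(result["text"])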


> Maybe I've just had a bad microphone.

Yeah, I would definitely double-check your setup. At work we use Whisper to live-transcribe-and-translate all-hands meetings and it works exceptionally well.


+1 to this. Whisper works insanely well. I've been using the medium model, as it has yet to mistranscribe anything noticeable, and it's very lightweight. I even converted it to a Core ML model so it runs accelerated on Apple silicon. It doesn't run *that* much faster than before... but it ran really fast to begin with. For anyone tinkering, I've had much success with whisper.cpp.

What was the process of converting it like? I assume you then had to write all of the inference code as well?


I'd agree with your experience. I simply sit my phone (a cheap, ~$200 Motorola) in the centre of the room, split the voice recording into chunks using voiceprints/IDs I get from a voice embedding model I trained, then feed the labelled chunks through Whisper and get a nice transcript of everything said. I combine that with my handwritten notes (as an image, which a VLM transcribes) and the agenda, and I get really nice meeting minutes as a LaTeX document. It works a charm and has turned an hour or two of work per meeting into maybe 30 minutes (proofing what was written).
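
In case it helps anyone, the Whisper half of that pipeline has roughly the shape below. My embedding model isn't public, so the sketch assumes you already have an embedding per chunk; the function names and the 0.7 threshold are illustrative, not any library's API:

    import numpy as np
    import whisper  # pip install openai-whisper

    model = whisper.load_model("medium")

    def closest_speaker(emb, voiceprints, threshold=0.7):
        # Match a chunk's embedding against enrolled voiceprints by
        # cosine similarity; fall back to "unknown" below the threshold.
        best, best_sim = "unknown", threshold
        for name, ref in voiceprints.items():
            sim = float(np.dot(emb, ref)
                        / (np.linalg.norm(emb) * np.linalg.norm(ref)))
            if sim > best_sim:
                best, best_sim = name, sim
        return best

    def labelled_transcript(chunks, voiceprints):
        # chunks: (wav_path, embedding) pairs from your splitter and
        # voice embedding model; returns "speaker: text" lines.
        lines = []
        for path, emb in chunks:
            who = closest_speaker(emb, voiceprints)
            text = model.transcribe(path)["text"].strip()
            lines.append(f"{who}: {text}")
        return "\n".join(lines)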

Which model do you use? I use large usually, on a GPU. It's fast and works really well. Be aware though that it can only recognise one language at a time. It will autodetect if you don't specify one.

Of course, the smaller models don't work nearly as well, and they are often restricted to English. Large works great for me, though it does require GPU hardware to be responsive enough, even with faster-whisper or insanely-fast-whisper.
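
For anyone who wants to try it, the faster-whisper version looks roughly like this (model size, device, and file name are placeholders; the language argument is the one-language-at-a-time knob mentioned above):

    from faster_whisper import WhisperModel  # pip install faster-whisper

    # "large-v3" on a GPU; float16 keeps VRAM usage reasonable.
    model = WhisperModel("large-v3", device="cuda", compute_type="float16")

    # Pass a language explicitly, or omit it to let Whisper autodetect.
    segments, info = model.transcribe("meeting.wav", language="en")
    print(f"language: {info.language} (p={info.language_probability:.2f})")

    # segments is a generator; transcription happens as you iterate.
    for seg in segments:
        print(f"[{seg.start:7.1f}s -> {seg.end:7.1f}s] {seg.text}")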


> What is the alternative?

Reasonably, there should be a two-way exchange? It might be okay for companies to piggyback on research funds if that also means more research insight enters public knowledge.


I’d be happy if they just paid their fair share of tax and stopped acting like they were self-made when they really just piggybacked on public funds and research.

There’s zero acknowledgment or appreciation of public infra and research.


It would be interesting to see not only which languages influenced a given language, but also which languages it has influenced in turn.

Tangential to this: I think it's often useful to allow suggesting "bad" solutions to vague problems, because good solutions often hang out close to the bad ones, and the bad ones shine interesting light on the problem. Bad solutions also often immediately provoke better ideas: if you immediately see that a proposed solution is bad, there's a good chance you know what specifically is bad about it and can propose an amendment.

Suggesting a bad solution is sometimes half the way to a good one.


Exploring the problem space is necessary to determine what a bad solution versus a good solution even is.

Isn't that what a lot of this is about? It's a blue ocean, and everyone is full of FOMO.

Software quality be damned!

I guess there's an incentive to quickly get a first version out the door so people will start building around your products rather than your competitors'.

And now you will outsource part of the thinking process. Everyone will show you examples of when it doesn't work.

