
I am able to run Gemma 3 12B on my M1 MBP 16GB. It is pretty good at logic and reasoning!


This is consistent with my own experiments with local LLMs. I've also been experimenting with using LangGraph to chain the local model to a flagship LLM when it can't achieve the outcome on its own.

It's good as a learning tool, or for applications where latency is not a concern but request volume can be high. It's not yet fast enough for IDE auto-complete, for instance.
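For what it's worth, the fallback-chaining idea can be sketched without LangGraph itself; `local_model`, `flagship_model`, and `is_good_enough` below are hypothetical callables (in LangGraph terms this would be a conditional edge between a local-model node and a flagship-model node):

```python
from typing import Callable

def fallback_chain(
    local_model: Callable[[str], str],
    flagship_model: Callable[[str], str],
    is_good_enough: Callable[[str], bool],
) -> Callable[[str], str]:
    # Try the cheap local model first; escalate only when its answer
    # fails a caller-supplied check.
    def run(prompt: str) -> str:
        answer = local_model(prompt)
        if is_good_enough(answer):
            return answer                 # local model was sufficient
        return flagship_model(prompt)     # escalate to the flagship model
    return run
```

The check can be anything cheap: a length heuristic, a regex over the output, or a second small-model judge.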


To a casual observer, this seems like a big deal. Can knowledgeable folks comment on this work?


I am still reading the paper, but it is worth noting that this is not an LLM! It is closer to something like AlphaGo, trained only on ARC, Sudoku and mazes. I am skeptical that you could add a bunch of science facts and programming examples without degrading the performance on ARC / etc - frankly it’s completely unclear to me how you would make this architecture into a chatbot, period, but I haven’t thought about it very much.

Comparing the maze/Sudoku results to LLMs rather than maze/Sudoku-specific AIs strikes me as blatantly dishonest. "1k Sudoku training examples" is also dishonest: they generate about a million of them with permutations: https://news.ycombinator.com/item?id=44701264 (see also https://github.com/sapientinc/HRM/blob/main/dataset/build_su... ). They also seem to have deleted the Sudoku training data, or maybe made it private. It used to be here: https://github.com/imone and, according to the Git history[1], they moved it here: https://github.com/sapientinc but I cannot find it. It might be an innocent mistake, but I suspect they got called out for lying about "1000 samples" and are covering their tracks.

[1] https://github.com/sapientinc/HRM/commit/171e2fcde636bcb7e6c...


> not an LLM! closer to something like AlphaGo, trained only on ARC, Sudoku and mazes.

Ah, this explains the performance.

What is the conventional wisdom on improving codegen in LLMs? Sample n solutions and verify, or run a more expensive tree search?

I have thoughts on a very elaborate add-a-function, verify, and rollback testing harness, and I wonder if this has been tried.
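Sample-and-verify is the simpler of the two options to sketch; `generate` and `verify` here are hypothetical stand-ins for a sampling call and a test runner:

```python
# Sketch of "sample n and verify": draw n candidate solutions and keep the
# first one that passes verification (e.g. the unit tests). Names are
# illustrative, not a real API.
def best_of_n(generate, verify, prompt, n=8):
    for _ in range(n):
        candidate = generate(prompt)   # one sampled completion
        if verify(candidate):          # e.g. run the test suite
            return candidate
    return None                        # no candidate passed within budget
```

Tree search spends the same budget more selectively (expanding only promising partial programs), but needs a scoring signal at intermediate states, which is exactly what makes it more expensive to build.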


Is it feasible to create a mesh network using Bluetooth on participating smart phones?

In crowded places like college campuses, we could run campus IM on it for instance


I think any wireless mesh like this runs into issues with scaling because you cannot efficiently route messages between a bunch of moving nodes as the network topology is always changing. You have to use a flood network, which is also what Meshtastic does [0]. Flood networking wastes bandwidth with every node repeating itself, and it gets even worse with wireless that's a shared spectrum.

All these mesh networks have a max hop limit to prevent messages from bouncing around the network repeatedly, but that also means messages aren't guaranteed to reach their destination. Meshtastic defaults to 3. goTenna, I believe, is also LoRa and also uses 3. Bridgefy is Bluetooth and has a 250 max hop limit, but also a 7-day TTL, so it's basically nowhere near real-time.

It could be made better by having statically positioned nodes that keep track of the nodes they can reach, and then having all these statically positioned nodes communicate with each other on a different wireless spectrum so they don't interfere with regular nodes. Since that topology isn't changing, you can efficiently route messages between them. But at that point it's basically just a regular Wi-Fi mesh.

[0] https://meshtastic.org/docs/overview/mesh-algo
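The flooding behaviour, and why the hop limit matters, can be shown with a toy relay; this is a drastic simplification of what Meshtastic-style meshes actually do:

```python
# Toy flood-network relay: every node rebroadcasts each new message to all
# neighbours with a decremented hop count, and drops duplicates by message
# id. With a hop limit of 3, a message reaches at most 3 hops from origin.
class Node:
    def __init__(self, name):
        self.name = name
        self.seen = set()      # message ids already relayed
        self.neighbors = []    # directly reachable nodes

    def receive(self, msg_id, payload, hops_left):
        if msg_id in self.seen or hops_left < 0:
            return             # duplicate or TTL expired: drop silently
        self.seen.add(msg_id)  # mark before forwarding to stop loops
        for n in self.neighbors:
            n.receive(msg_id, payload, hops_left - 1)
```

On a 5-node chain with hop limit 3, the message reaches nodes 0 through 3 and dies before node 4, and every intermediate node has transmitted it once; that repetition is the bandwidth waste described above.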


There are more state-of-the-art routing protocols working to solve this problem for mesh networks. A couple examples of projects I have been involved in:

https://yggdrasil-network.github.io/ https://github.com/matrix-org/pinecone


I don't know if I consider those the same thing as they're not fully wireless meshes. They'll use the internet when possible, so it wouldn't be an off-grid network. And without internet, it won't scale.

So if you're using the internet anyway, at high-density locations like a college campus, just deploy more wireless APs in the area instead of building an inefficient wireless mesh network. The wireless mesh part of those protocols is only useful for areas with no internet but somehow enough people to build a chain to an internet-connected device.

Reading Pinecone's documentation: "The only requirements for a peering today are that it is stream-oriented and reliable" [0]. I don't think a phone that's constantly moving around and battery operated (so you want to power-save by transmitting less) is considered reliable.

Pinecone's offline protocol also will not route to devices that haven't been seen in the last 10 seconds [1]. That basically prevents phones from sleeping or going into a low-power state. It's also the kind of protocol that only works for small wireless mesh networks: a huge wireless mesh would quickly be filled with "I'm here" broadcasts if every device is expected to send one every 10 seconds and it has to be repeated for everyone else on the mesh.

[0] https://matrix-org.github.io/pinecone/introduction

[1] https://matrix-org.github.io/pinecone/virtual_snake/maintena...
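Back-of-envelope arithmetic for those keep-alive broadcasts: if each of n nodes floods an "I'm here" every 10 seconds and every other node repeats it once, the per-interval transmission count grows quadratically:

```python
# Rough load estimate for flooded keep-alive announcements: each of n
# announcements per interval is relayed once by roughly every other node.
def transmissions_per_interval(n):
    return n * (n - 1)

# A 1000-node mesh already needs ~1M transmissions per 10-second window.
print(transmissions_per_interval(1000))  # -> 999000
```

On a shared wireless spectrum those transmissions all contend with each other, which is why this scales so badly.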


To be clear, neither Yggdrasil nor Pinecone have any concept of “the Internet”. A peering over the Internet is fundamentally the same as a local peering taking place over something like Wi-Fi or Bluetooth and they are not treated preferentially or handled differently.

Both protocols are also designed with mobility events in mind and measure far better than many other routing protocols on route convergence in highly mobile environments.

Also interpret “stream-oriented and reliable” as link-layer characteristics, i.e. a peering over TCP even if it is link-local satisfies these requirements. Not “reliable” as in “never goes away”.


Do you have any examples of their mobility event handling? I'm reading the documentation for Pinecone and don't see much. Even Pinecone says Yggdrasil's spanning tree isn't good enough: "However, the spanning tree topology alone is not a suitable routing scheme for highly dynamic networks." [0]

I'm reading that as the reason Pinecone has the virtual snake topology. But they define that as public-key-based routing, which doesn't take into account optimal routing in the network; nodes are ordered by public key [1]. That's good for a P2P mesh, not a wireless off-grid mesh.

And their SNEK routing does prefer the internet over Bluetooth [2]:

> we can further refine the path to use either the faster or lower latency link type to route to that peer:

> If the Best candidate has a slower peer connection type (Multicast > Remote > Bluetooth) than the connected peer

[0] https://github.com/matrix-org/pinecone#does-pinecone-work-on...

[1] https://matrix-org.github.io/pinecone/snake

[2] https://matrix-org.github.io/pinecone/virtual_snake/nexthop
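A toy version of that tiebreak, using the link-type ordering quoted above (illustrative only, not Pinecone's actual code):

```python
# Tiebreak between multiple peerings to the same next-hop node, using the
# quoted ordering Multicast > Remote > Bluetooth. Lower rank = preferred.
PREFERENCE = {"multicast": 0, "remote": 1, "bluetooth": 2}

def pick_peering(candidates):
    """candidates: list of (peer_id, link_type) pairs to the same node."""
    return min(candidates, key=lambda c: PREFERENCE[c[1]])
```

So given both a Bluetooth and an internet ("remote") peering to the same node, the remote peering wins the tiebreak, which is the behaviour quoted above.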


Most of the mobility testing has been performed either in the meshnet-lab[1] or the pineconesim[2].

As the original author of that documentation, it's quite entertaining to have it quoted back to me. :-) In any case the routing "prefers" links labelled as the internet when there is a tiebreak between two peerings between the same pair of nodes, i.e. you are connected to some other device via Wi-Fi and Bluetooth simultaneously.

And while it is true that Pinecone cannot necessarily always make the best routing decision based on public keys alone, aggressive queue management attempts to provide the best QoS for all flows, and it scales very well because nodes maintain only a small amount of state about their position in the spanning tree and their position in the SNEK. Importantly, shortcuts can be, and often are, taken when Pinecone switches to tree-based routing, as the geometric distance to the destination on the tree is evaluated at each hop. Routing "by the SNEK" is used primarily to find the remote node, and as a fallback in case the tree routing fails.

[1] https://github.com/mwarning/meshnet-lab [2] https://pinecone.matrix.org or https://github.com/matrix-org/pinecone/tree/main/cmd/pinecon...


Thanks, looks interesting for sure, I hope to be proven wrong!


iOS limits what you can do in the background with Bluetooth. At least one of the iPhones needs to have your mesh app running in the foreground to communicate with another iPhone that has your app. Two locked phones with your mesh app will not be able to communicate over Bluetooth.


Yes it is possible. The P2P Matrix demos worked in this way but they were alpha-quality at best, shoehorning today’s federation protocol on top of Pinecone mesh routing over Bluetooth/Wi-Fi. It still needed a lot of work to adapt the Matrix federation protocol to be properly usable in real-time without full-mesh connectivity.


The "find my shit" networks deployed within the Google and Apple ecosystems can be piggybacked on for low-bandwidth off-grid communication.


Interesting! Any further reading you would recommend on this?


I now use Firefox Focus as my default phone browser. It's great! It lacks a lot of the gestures that make Chrome good, but it deletes tab history by default on closing the app, which makes it an adequate sandbox for opening one-off links.


Same here. Firefox Focus has worked excellently as my default phone browser for years. I feel confident clicking randomly on cookie popups, knowing all the cookies will be gone by the time I close the page so they don't matter.

When set as the default browser, it also provides the in-app web view to other apps (e.g. inside LinkedIn). There it works fine too (also with that cookie-deleting confidence), and it has a drop-down to "Open in Firefox Focus", which is rarely needed but which I found useful for the full experience outside the in-app web view. That opens near-instantly without reloading, as it just transfers the already-open page state.


I already migrated from Keep for just this reason. It's OK for making temporary notes, but for long-lasting organisation I prefer a different solution.


What did you pick?


I'm trying to get familiar with Obsidian using a folder synced to a private GitHub repo. The flow and type of notes are kind of different from Keep though, so I may actually just detach temporary knowledge types like grocery lists or activity reminders back into physical notes again.


I moved to Logseq (https://logseq.com/).

It also better follows my note taking habits. It is usually a single sentence I want to be timestamped and searchable (which is also why I wrote https://github.com/wsw70/dly to simplify the quick note taking)


Different poster here, but I use a file-based notes system (mine is on iCloud, but Dropbox, Drive, etc. would work fine); right now Obsidian and iA Writer share a folder of my notes in iCloud.

Keep is good for grocery lists and ephemeral data, but it's too easy to delete a note and there's no way to back up. Well, maybe Takeout works for backup.


Same here - after Keep clobbered a meticulous but long note, I imported everything to markdown files that are synced between my devices.



Apologies for the naive questions in advance.

Is it not possible to go up there in a hot air balloon to inspect it from a closer distance?

Also, what about base jumping onto it?

Are there any methods of getting it back to the ground intact?


It's theoretically possible to get up there in a manned balloon, but this was up at FL600, around 60,000 ft. Most aircraft can't even go that high. It's not something that's done casually.

Going up to those altitudes is the domain of either the military or someone trying to set a record. It's not a forgiving environment - you basically need a space suit up there. I'm fairly confident that nobody has the equipment for such a mission sitting around ready for use on short notice, especially if you want to stay up there for any length of time before you run out of air.

As for base jumping... uh, gravity points in the wrong direction for that. This is twice as high as Mt. Everest.


Your comment reminded me of something I saw in grade-school that was probably the biggest inspiration in sticking with USASA (Red Bull is a long-time sponsor) for so long.

Red Bull Stratos: 'A high altitude skydiving project'

- Baumgartner flew approximately 39 kilometres (24 mi) into the stratosphere over New Mexico, United States, in a helium balloon before free falling in a pressure suit and then parachuting to Earth.

- Baumgartner broke the unofficial record for the highest manned balloon flight of 37,640 m (123,491 ft)

Video: https://www.youtube.com/watch?v=FHtvDA0W34I&t=3s

Interestingly enough: Alan Eustace, a former SVP of Engineering at Google, surpassed that record two years later (https://en.wikipedia.org/wiki/Alan_Eustace)

- On October 24, 2014, he made a free-fall jump from the stratosphere, breaking Felix Baumgartner's world record.

- The jump was from 135,890 feet (41.42 km) and lasted 15 minutes, an altitude record that stands as of 2023

(1) https://en.wikipedia.org/wiki/Red_Bull_Stratos (2) https://en.wikipedia.org/wiki/Alan_Eustace


BASE jumping? Are there structures 60,000 ft tall?


Oops. I meant to say skydiving.


Are you writing a spy novel


At some point, AI will replace stuntmen with completely synthesized action scenes. Contrast that with the money that was poured into one scene in MI7 where Tom Cruise rides a bike off a cliff.

If AI could make us feel the same way as Tom Cruise himself jumping off a cliff, isn't it a no-brainer to use it?

If we stretch this scenario to the ultimate, generating an entire movie with AI, it doesn't feel as fun or as real. But I wonder if it is the future.


I don't know. Part of the appeal of Tom Cruise doing it for real is that you know it's real. They emphasize that in the behind the scenes materials. The experience is vastly different to mostly-CGI movies like Marvel.


Agreed; watching Marvel movies these days feels like watching someone else play a video game. Less interesting, actually.


But the Marvel movies make more money and sell more merchandise, which is really all that matters.


Don’t be so sure! Top Gun: Maverick made more money than any Marvel movie last year.


Yes, but I am sure Marvel made it up in volume. There are only so many Tom Cruises on bikes to go around, while Marvel is pumping out something every other week.


They also didn't field a marquee Marvel title, and people are suffering a bit from Marvel fatigue at any rate.


TIL they made a sequel to Top Gun...


Today? Go watch it, it might still be in cinemas, it's a fantastic movie.

(Disclaimer: I'm a big fan of Tom Cruise's movies, especially the Mission:Impossible franchise)


Yep. They absolutely could have done that stunt for much, much cheaper by using a stand-in. Tom Cruise does his own stunts because it's marketing for the movie, and he enjoys it.


Yeah. It feels overdone because it can all be CGI so just go crazy with it. Sure, it's a fantasy movie, but at a certain point it feels too much

(they're not the only ones guilty on that, obviously)


This generative AI stuff always makes me feel like my world is just a dynamically generated RPG or MMORPG.


Same. I don't actually believe this to be true, but I have had the thought that, as I get older, there's been a clear progression of tech that could convincingly create a fake reality. The thought being that the culmination of my life will be the perfection of this tech, followed by the universe revealing that my existence has been an elaborate generative hallucination all along.


> my existence has been an elaborate generative hallucination all along.

Consider the fact that there is no objective 'blue'. There is light with a particular wavelength, or equivalently a frequency measured in hertz, but that wavelength is not the same thing as what we know to be color...

See, what happens is: a photon (oscillating at a frequency) hits a cone in your eye tuned to respond to that band of frequencies. That becomes an electro-chemical signal travelling down the optic nerve (no photons involved at all at this point), which eventually turns into a chemical reaction in your brain. That chemical reaction is 'blue' to us: not the photon, which never reaches this point, nor even the original frequency; just our brain's translated chemical signal, its interpretation of 'blue'.

Blue is your brain hallucinating what it thinks that frequency actually is... blue is a "guess" at reality...

So yes, your brain IS generating an elaborate hallucination after all.
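Since the point leans on "a frequency measured in hertz", the actual number is easy to check; taking ~450 nm as a typical blue wavelength:

```python
# The "frequency measured in hertz" claim, made concrete for blue light:
# frequency = speed of light / wavelength.
c = 299_792_458        # speed of light in m/s
wavelength = 450e-9    # ~450 nm, a typical "blue" wavelength
frequency = c / wavelength
print(f"{frequency:.3e} Hz")  # -> 6.662e+14 Hz
```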


Even more so than that, “blue” is not even attempting to “guess” at reality. A more accurate statement is that the concept of “blue” is a useful abstraction for the goals of the vision system of your brain.


What makes you believe those photons and chemicals and the brain exist when you have no access to any of that and all of the information about it comes from your mind (the only thing guaranteed to exist)?


I mean... wow. That's a bit mind-blowing.


That's just simulation theory with extra steps.


" Contrast that with the money that was poured into one scene in MI7 where Tom Cruise rides a bike off a cliff."

I don't know much about stunt work, but I don't really understand the fuss about this. Taking a bike down a ramp and then pulling a parachute doesn't seem that crazy compared to a lot of other stuff people are doing.


But not today.

I still prefer the practical effects of films like Mad Max: Fury Road over most full-CGI effects movies.


Or even further back in time: I watched "The Thing" (1982) the other day and the practical effects were far more frightening than any CGI monsters I've ever seen.


This is the new "you will own nothing and be happy."

Your world is synthetic and you will like it.


Maybe once it can also make Tom Cruise feel like he rode that bike off a cliff.


I had a fun game. Was disappointed that the AI blundered a piece fairly early in the game. Overall, I felt I was not being pushed at all.


True. In most cases, backgrounds with a lot of elements (e.g. a house in the background) were part of real pictures.


Typically in AI generated images another giveaway is that when the head covers the entire height of the picture, the background may randomly change between the left and right side in implausible ways.

