audunw's comments

Foxconn is a Taiwanese company btw. I think they’ve been setting up several factories outside of China recently

These anecdotes come from the very peak of China's demographic dividend. In a decade or two their demographic dividend will be in a steep decline.

China also needs to change something drastic to avoid brain drain. The migration of competent people is still one-way. There's no path to becoming a Chinese citizen. China has come a long way, but Europe is still ahead on building liveable communities and work/life balance, while the US is still attractive to those seeking freedom and prosperity. China has avoided these issues thanks to its huge population and that demographic dividend. But eventually it'll become an issue.


>China also needs to change something drastic to avoid brain drain.

Why does this matter? I hear this a lot but at the same time I look at what's coming out of China, especially in the AI space, and it's clear that brain drain isn't really hampering them.


It's almost as if you don't need the absolute best and brightest. Heck we used to get by retraining people from other industries to be programmers. I know companies absolutely can't do that now nor be expected to help grow their workers and can only work with exact match H1Bs, but it used to be a societal expectation of companies.

>very peak of China's demographic dividend

No, the 2000s-2020s were the peak blue-collar dividend: roughly the rest of the world combined in scale, but a low-value dividend. That was when the PRC had lots of hands but few brains, i.e. a fraction of the STEM workforce of the US/west.

The 2040s-2080s will be the PRC's peak tertiary-skilled dividend. They'll have about 2-4x the US in STEM workers alone, who'll be in the workforce for most of our and our children's lifetimes, even while TFR math starts eating away at future cohorts. The TL;DR is they've just started cooking: their high-end human-capital pool will be exploiting the greatest high-skill demographic dividend in recorded history. Their final form is the OECD combined in talent and the world combined in blue collar, backstopped by robots/automation (currently on trend to be more than the world combined).

Brain drain is barely a problem now. This isn't the 2000s, when domestic opportunities were poor and the PRC lost a large percentage of the few best people it produced. They now mint so much talent that brain drain is a rounding error, top talent increasingly stays in the PRC, and many of the best who went abroad are returning. The future trend is that many of the best who do leave will recirculate back to the PRC eventually. Frankly, most of those going abroad now are B/C-tier PRC talent, i.e. most international students are the ones too mid to do well on the gaokao, and even then they turn into A students in the west. There are still some sectors where the west can draw talent because it can afford to pay magnitudes more (which is a matter of FX/geopolitics), but the PRC is now also in a position to attract foreign talent with money, so much so that some countries have had to ban their nationals from working in PRC strategic sectors. China's expat draw is that it's the PRC: if you're high-end talent and you want a lab built in a few months, bottomless access to resources including human capital, and the dynamism of Asian tier-1 cities, the EU+US can't offer that.

But the immigration point is really secondary to the fact that when the PRC produces a plurality of high-performing global talent and retains most of it, they don't need to worry about immigration of competent people; they just need to hold on to the people they have, which by and large is happening. E.g. Tsinghua's brain-drain rate went from 30% to single digits in the last few years, and the returnee rate is higher than ever. Meanwhile the west depends on the PRC talent surplus (because the western talent pipeline is thin by comparison), trains them, and now that the PRC is rich with opportunities, many recirculate / reverse brain drain back to the PRC anyway once they're at a high level.

At the end of the day _most_ people are economic migrants; they move for $$$, not muh freedom/community. Ultimately the US/EU strength is money and the FX multiplier, the US far more than the EU. When that goes away or declines, people start making different choices. And again, PRC demographics will be lingering in the background... but that just means cheaper housing and less crowded cities, i.e. less drag on PRC living. In terms of the active demographic dividend, most of us will be dead before the PRC declines; imo it's not really worthwhile extrapolating on that timescale.


I just wanted to thank you for sharing your opinions on this site -- the sole poster here who I actually bookmarked to read like a blog.

Cheers.

The one big thing missing from LLMs is the ability to express how confident they are in the truth of what they're saying.

Perhaps this could be a step in that direction, if we can associate the attribution with a likelihood of being true. E.g., arXiv would be better than science fiction in that context. But what is the attribution if it hallucinates a citation? I'm guessing it would still be attributed to scientific sources. So it does nothing to fix the most damaging instances of hallucination?
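As a rough illustration of what that attribution-to-likelihood mapping could look like, here's a minimal sketch; all the source categories, weights, and function names are hypothetical assumptions, not any real attribution API:

  # Hypothetical sketch: weight a claim's confidence by the trustworthiness of
  # the sources the model attributes it to. Categories and weights are made up.
  SOURCE_TRUST = {
      "peer_reviewed": 0.9,   # e.g. arXiv / journal articles
      "reference_work": 0.8,  # encyclopedias, textbooks
      "news": 0.6,
      "forum": 0.4,
      "fiction": 0.1,         # novels, science fiction
  }

  def attribution_confidence(attributions):
      """attributions: list of (source_category, attribution_weight) pairs
      from a hypothetical attribution step. Returns a rough 0-1 score."""
      total = sum(w for _, w in attributions)
      if total == 0:
          return 0.0  # nothing attributed: treat the claim as unsupported
      return sum(SOURCE_TRUST.get(cat, 0.3) * w for cat, w in attributions) / total

  print(attribution_confidence([("peer_reviewed", 0.7), ("news", 0.3)]))  # ~0.81
  # The catch from the comment above: a hallucinated citation would still be
  # labelled "peer_reviewed", so the score stays high exactly when it shouldn't.
  print(attribution_confidence([("peer_reviewed", 1.0)]))                 # 0.9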


How is he taking it at face value? He’s saying it doesn’t matter. Which it really doesn’t.

The reason solid state is exciting is the promised high energy density, and in some cases better safety. We shouldn’t care if it’s really “solid state” or not. That’s just marketing fluff. It doesn’t even really have a good definition as some chemistries are somewhere in between (sometimes described as semi-solid state).

This test confirms the charging speed and basically confirms the energy density (estimates people have done based on the video/report put it in the ballpark of what's claimed).

You and I should really not demand a test that it’s actually solid state. That just doesn’t matter. We need energy density tests, cycle life tests, puncture tests, etc. If all those specifications are confirmed, whether it’s solid state or not becomes completely moot.

And in the end what truly matters is if it can be mass manufactured at low cost, which can’t be tested anyway. All these social media demands for tests are kind of ridiculous, since the only thing publishing the tests does is give Donut more PR. They’re basically laughing all the way to the bank considering how easy it has been to manipulate YouTube, Reddit and HackerNews into giving them free press. We will have another round in a week when the next test is published. I’m honestly impressed.

Personally I reserve all judgement until the promised bikes are on the road and torn down by third parties.


It should be said that in the LinkedIn post announcing this, the claims are much more moderate. Like “10% along what we consider to be AGI” and “lots of work left to do”, if I remember correctly. How is that different from any other R&D company working on AI? They all claim to be on the path to AGI in some form.

Norway is also finding that millions can be saved in new tunnel construction

https://www-tu-no.translate.goog/artikler/med-flere-elbiler-...


The convenience of filling is only there if you have the fuel stations. Considering how expensive that is, I'd argue it's far better to spend the money on EV charging infrastructure; you get a lot more bang for your buck. And EVs are arguably significantly more convenient when you have the infrastructure. Would you buy a phone that lasted a week or two, but you had to go to a phone filling station to refill it?

And yes, EVs can be more convenient also for street parking. It’s just an infrastructure problem and by now there are dozens of different solutions for every parking situation imaginable.

It's frankly absurd reading debates about this online from Norway. It's over. Yes, Norway has money and cheap electricity, which is what makes it possible to “speed run” the technology transition. But other than that it's a worst-case scenario for EVs. Lots of people with only street parking in Oslo. Winters that are brutal on range. People who love to drive hours and hours to their cabin every weekend, with skis on the roof. Part of Schengen, so people drive all the way down to Croatia in summer. We gave EVs and hydrogen cars the same chance. Same benefits. EVs won. End of story. Though a hydrogen station near me blew up in a spectacularly loud explosion, so maybe that makes me a bit biased.


Yeah, thank god. As long as it’s easy to remove and replacement batteries can easily be purchased by individuals, I want my phone and battery glued, thank you very much.

I like Apple's approach to removable battery glue, though it needs an extra tool. These days it should be easy to make a cheap USB-C PD-powered device that supplies a suitable DC voltage.


The electricity-controlled glue in Apple's iPhone is made by Tesa, a German glue company.

Models don’t get old as fast as they used to. A lot of the improvements seem to go into making the models more efficient, or the infrastructure around the models. If newer models mainly compete on efficiency it means you can run older models for longer on more efficient hardware while staying competitive.

If power costs are significantly lower, they can pay for themselves by the time they are outdated. It also means you can run more instances of a model in one datacenter, and that seems to be a big challenge these days: simply building enough data centres and getting power to them. (See the ridiculous plans for building data centres in space.)
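A back-of-envelope on the "pays for itself through power savings" point; every number below is an illustrative assumption, not a vendor figure:

  # Rough payback estimate for running an older model on more efficient hardware.
  # All figures are assumptions for illustration only.
  gpu_power_w = 700                 # assumed draw of a GPU serving the model
  efficient_chip_power_w = 100      # assumed draw of a more efficient inference chip
  electricity_usd_per_kwh = 0.08
  hours_per_year = 24 * 365

  saved_kwh = (gpu_power_w - efficient_chip_power_w) / 1000 * hours_per_year
  saved_usd = saved_kwh * electricity_usd_per_kwh
  print(f"~${saved_usd:.0f} saved per chip per year")          # ~$420/year

  chip_cost_usd = 2000              # assumed unit cost of the efficient chip
  print(f"payback in ~{chip_cost_usd / saved_usd:.1f} years")  # ~4.8 years, before
  # counting cooling/PUE overhead or the freed-up datacenter capacity.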

A huge part of the cost of making chips is the masks. The transistor masks are expensive; metal masks less so.

I figure they will eventually freeze the transistor layer and use metal masks to reconfigure the chips when the new models come out. That should further lower costs.
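And a similarly rough sketch of why a metal-only respin would help; the mask-set prices and volume are loose ballpark assumptions, not actual N6 or Taalas numbers:

  # Amortized NRE per chip: full mask set vs. reusing the transistor layers and
  # redoing only the metal masks for a new model. Prices and volume are assumed.
  full_mask_set_usd = 10_000_000      # assumed full mask set at an N6-class node
  metal_only_respin_usd = 2_000_000   # assumed cost of redoing only metal layers
  chips_per_model_generation = 100_000

  print(full_mask_set_usd / chips_per_model_generation)       # $100 NRE per chip
  print(metal_only_respin_usd / chips_per_model_generation)   # $20 NRE per chip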

I don't really know if this makes sense. It depends on whether we get new breakthroughs in LLM architecture or not. It's a gamble, essentially. But honestly, so is buying Nvidia Blackwell chips for inference. I could see them becoming uneconomical very quickly if any of the alternative inference-optimised hardware pans out.


From my own experience, models are at the tipping point of being useful for prototyping software, and those are very large frontier models that aren't feasible to get down onto wafers unless someone does something smart.

I really don't like the hallucination rate for most models but it is improving, so that is still far in the future.

What I could see, though, is the whole unit they made being power-efficient enough to run on a robotics platform for human-computer interaction.

It makes sense that they would try to repurpose their tech as much as they can, since making changes is fraught with long time frames and risk.

But if we look long term and pretend that they get it to work, they just need to stay afloat until better smaller models can be made with their technology, so it becomes a waiting game for investors and a risk assessment.


> From my own experience, models are at the tipping point of being useful for prototyping software

You must not have much experience using the new frontier models then. A lot of large tech companies are replacing their SDLC with agentic workflows. The tooling and frameworks are still ramping up, but the models have no problem producing production-ready software given proper specifications.


“Models don't get old as fast as they used to”

^^^ I think the opposite is true

Anthropic and OpenAI are releasing new versions every 60-90 days it seems now, and you could argue they’re going to start releasing even faster


Are they becoming better at the same rate as before though?

In my unscientific experience, yes, but being better at a certain rate is hard to really quantify, unless you just pull some random benchmark numbers.

Per release, I’d say no.

Per period of time, I’d say yes.



yes, pretty much

It doesn’t make any sense to think you need the whole server to run one model. It’s much more likely that each server runs 10 instances of the model

1. It doesn't make sense in terms of architecture. It's one chip. You can't split one model over 10 identical hardwired chips.

2. It doesn’t add up with their claims of better power efficiency. 2.4kW for one model would be really bad.
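For scale: the 2.4 kW per server is from this thread, but the 10-instance split and the GPU comparison figure are assumptions:

  # If one server draws 2.4 kW and hosts 10 hardwired model instances, each
  # instance costs ~240 W, which is the only way the efficiency claim adds up.
  server_power_w = 2400
  instances_per_server = 10                     # assumed split
  print(server_power_w / instances_per_server)  # 240 W per model instance
  # For comparison, a single modern datacenter GPU serving one model can draw
  # roughly 700 W, so ~240 W would be a win; 2.4 kW per instance would not be.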


We are both wrong.

First, it is likely one chip for llama 8B q3 with 1k context size. This could fit into around 3GB of SRAM which is about the theoretical maximum for TSMC N6 reticle limit.
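A quick sanity check of that figure; the parameter count and 3-bit quantisation are from the thread, the rest is plain arithmetic:

  # 8B parameters at 3 bits per weight is ~3 GB of weight storage.
  params = 8e9
  bits_per_weight = 3
  weight_bytes = params * bits_per_weight / 8
  print(f"{weight_bytes / 1e9:.1f} GB")   # 3.0 GB, right at the claimed limit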

Second, their plan is to etch larger models across multiple connected chips. It’s physically impossible to run bigger models otherwise since 3GB SRAM is about the max you can have on an 850mm2 chip.

  followed by a frontier-class large language model running inference across a collection of HC cards by year-end under its HC2 architecture
https://mlq.ai/news/taalas-secures-169m-funding-to-develop-a...

Aren't they only using the SRAM for the KV cache? They mention that the hardwired weights have a very high density. They say about the ROM part:

> We have got this scheme for the mask ROM recall fabric – the hard-wired part – where we can store four bits away and do the multiply related to it – everything – with a single transistor. So the density is basically insane.

I'm not a hardware guy but they seem to be making a strong distinction between the techniques they're using for the weights vs KV cache

> In the current generation, our density is 8 billion parameters on the hard wired part of the chip, plus the SRAM to allow us to do KV caches, adaptations like fine tuning, and etc.
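For a sense of how small the SRAM side can stay at that context length, here's a back-of-envelope KV-cache sizing; the ~1k context comes from the thread, while the Llama-3-8B-style layout (32 layers, 8 KV heads via GQA, head dim 128) and fp16 cache entries are assumptions:

  # KV cache = keys and values for every layer, KV head, and cached token.
  layers, kv_heads, head_dim, ctx_len, bytes_per_val = 32, 8, 128, 1024, 2
  kv_cache_bytes = 2 * layers * kv_heads * head_dim * ctx_len * bytes_per_val
  print(f"{kv_cache_bytes / 2**20:.0f} MiB")   # ~128 MiB, tiny next to the weights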


Thanks for having a brain.

Not sure who started that "split into 10 chips" claim, it's just dumb.

This is Llama 8B hardcoded (literally) on one chip. That's what the startup is about; they emphasize this multiple times.


It’s just dumb to think that one chip per model is their plan. They stated that their plan is to chain multiple chips together.

I was indeed wrong about 10 chips. I thought they would use Llama 8B at 16 bit with a few thousand tokens of context, which made me assume they must have chained multiple chips together, since the max SRAM on a TSMC N6 reticle-sized chip is only around 3GB. It turns out they used Llama 8B at 3 bit with around 1k context size.

