
Do publishers really have fact-checkers? My understanding was that support for authors is now relatively minimal, even for established authors, and no one really has the time or resources to second-guess everything an author has claimed. I take as a key example Naomi Wolf learning after her book was "done" that a significant chunk of it was based on a misunderstanding of an admittedly confusing 19th century British legal phrase. https://nymag.com/intelligencer/2019/05/naomi-wolfs-book-cor...

I think maybe the idea of a single author spending months or years on their research, which they then publish as a single bound and polished work, is misguided -- an academic trying to do similar work across multiple articles would have gotten peer review on each article, and hopefully would not have spent so much time working under a correctable misunderstanding.


Fact checking as a separate job is more for journalism than books. But editors have fact checking as part of their jobs. (It is not copy-editing, which is a different job.)

Many nonfiction authors will hire a fact checker separately. They don't want to look like they missed something. Errors still happen, of course.


This paper describes finding security related concepts and using them to steer at generation time. While this is an interesting contribution on its own, the approach could also be applied to a range of other concerns -- e.g. can we use this to steer away from performance problems? can we make llm code generation anticipate maintainability or readability issues?
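As a purely hypothetical sketch of the general activation-steering idea (not the paper's actual method): you learn a direction in hidden-state space that corresponds to a concept, then nudge hidden states along it during generation. With stand-in numpy vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def steer(hidden, direction, strength=2.0):
    # Shift a hidden-state vector along a unit-normalized concept direction.
    d = direction / np.linalg.norm(direction)
    return hidden + strength * d

hidden = rng.normal(size=8)        # stand-in for a transformer hidden state
security_dir = rng.normal(size=8)  # stand-in for a learned "security" concept vector

steered = steer(hidden, security_dir, strength=2.0)
```

Swapping in a "performance" or "readability" direction would be the same mechanism, which is why the approach seems to generalize.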

If people want to try untested peptides, I think society should use that as the engine to _test those peptides_. Instead of buying something that's supposed to be, but may not be, the peptide you want, you should pay money plus data: you get something that has a 50% chance of being the peptide and a 50% chance of being a placebo, and you're _required_ to submit a report on effects and side effects before you can get a refill.

Rather than complain about how these things have not yet gone through real experiments and are marketed as having been "studied" rather than "effective", I would love to see society use the obvious demand for some of these to actually test them.


> you're _required_ to submit a report about effects and side effects before you can get a refill.

Enjoy your bogus data.


ha ok that's fair maybe it needs to be sold with a package of associated tests

Getting warmer! =]

So I'm actually confused: in the little image of his run in the article, it seems he's often making absolute progress in the direction opposite to the ship's motion for part of each lap. Like, was the ship going unusually slowly?

In their little algorithm box on Chain Distillation, they have at step 2b some expression that involves multiplying and dividing by `T`, and then they say "where α = 0.5, T = 1.0".

I think someone during the copy-editing process told them this needed to look more complicated?


tl;dr it makes sense once you see there are hidden softmaxes in there; it's just the explicit formula written out and then applied with the common param value

Bloody hell, I am so unfamiliar with ML notation:

    L = (1 - α) · CE(M_k(x), y) + α · T² · KL(M_k(x)/T ‖ M_{k-1}(x)/T)
So CE is cross-entropy and KL is Kullback-Leibler, but then division by T is kind of silly there since it falls out of the KL formula. So considering the subject, this is probably the conversion from logits to probabilities as in Hinton's paper https://arxiv.org/pdf/1503.02531

But that means there's a hidden softmax there not specified. Very terse, if so. And then the multiplication makes sense because he says:

> Since the magnitudes of the gradients produced by the soft targets scale as 1/T² it is important to multiply them by T² when using both hard and soft targets.

I guess to someone familiar with the field they obviously insert the softmax there and the division by T goes inside it but boy is it confusing if you're not familiar (and I am not familiar). Particularly because they're being so explicit about writing out the full loss formula just to set T to 1 in the end. That's all consistent. In writing out the formula for probabilities q_i from logits M_k(x)_i:

    q_i = exp(M_k(x)_i / T) / sum_j exp(M_k(x)_j / T)
Hinton says

> where T is a temperature that is normally set to 1. Using a higher value for T produces a softer probability distribution over classes.

So the real formula is

    L = (1 - α) · CE(softmax(M_k(x)), y) + α · T² · KL(softmax(M_k(x)/T) ‖ softmax(M_{k-1}(x)/T))
And then they're using the usual form of setting T to 1. The reason they specify the full thing is just because that's the standard loss function, and it must be the case that people in this field frequently assume softmaxes where necessary to turn logits into probabilities. In this field this must be such a common operation that writing it out just hurts readability. I would guess one of them reading this would be like "yeah, obviously you softmax, you can't KL a vector of logits".
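Written out as a minimal numpy sketch (following the comment's formula, with KL(M_k ‖ M_{k-1}) in that argument order), the loss looks like this:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: divide logits by T before normalizing.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, y, alpha=0.5, T=1.0):
    # Hard-label term: cross-entropy of the student's T=1 probabilities against label y.
    p_student = softmax(student_logits)
    ce = -np.log(p_student[y])
    # Soft-label term: KL between the temperature-softened distributions,
    # scaled by T^2 per Hinton's gradient-magnitude argument.
    p_k = softmax(student_logits, T)
    p_prev = softmax(teacher_logits, T)
    kl = np.sum(p_k * (np.log(p_k) - np.log(p_prev)))
    return (1 - alpha) * ce + alpha * T**2 * kl
```

With T = 1.0 the T² factor and the divisions are all no-ops, which is exactly why the paper's step 2b looks needlessly complicated once the parameter values are plugged in.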

Good question. I just sort of skipped over that when reading but what you said made me think about it.


the T stands for tea :)

Ah, so it's a source of randomness! Presumably 1.0 corresponds to a really hot cup of fresh tea.

I get that the norms lean conservative and that's a good thing. But if someone says you should do a recall and the actual lab tests saying whether your product actually has toxin-producing bacteria haven't finished running yet, I can understand the desire to wait until the evidence is in.

They've got some evidence: 7 known cases across three states, all linked to the same product. The history of problems from this producer makes it seem more likely to be true. A lot of companies would rather have their customers throw away the product and buy it again from a different batch than risk customers getting violently sick or dying from their food, because the people who get sick and survive can end up with a very strong aversion to the brand and/or product going forward. And voluntarily recalling the product just to be safe is good from a PR stance, since it looks like you actually care about your customers.

I think some of it was just a belief that work you can see being done by a floor of people talking with their mouths and looking at screens in the same room is more real than the slightly less visible conversations in slack while looking at screens in their own rooms.

Open plan offices continue to be designed more for seeing the work happen than for doing the work. I spend a lot of mental energy on ignoring the distractions around me. No job has ever offered me a private office with a door that closes in exchange for being in the office 5 days a week.


> Meanwhile, you're a week behind on that Jira ticket for issuing JSON Web Tokens because there's no actively maintained JWT libraries for Gooby.

> I've seen buggy pre-1.0 libraries used for critical production software simply because it was a wrapper library written in a functional language, as opposed to a more stable library written in an imperative language.

(Not "critical" as in "if it goes down it's really annoying", but "critical" as in "checks and stores passwords" or "moves money from one bank account to another".)

I do think if you're using a niche language in an industrial capacity, you need to know how to work with libraries outside the language. Admittedly this is easier for some languages than others. But I've run into candidates who wanted to interview in clojure but who didn't know how to call into pre-provided java libraries.


This. Almost all languages can at least call into c. Which should allow you to do whatever you want.
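As one illustration in Python (the same idea applies to Clojure/Java interop, Lua's C API, etc.), here's a minimal ctypes sketch calling C's `strlen` on a POSIX system, where `CDLL(None)` exposes symbols already linked into the process, including libc:

```python
import ctypes

# On POSIX, loading "None" gives access to symbols in the running process,
# which includes the C standard library.
libc = ctypes.CDLL(None)
libc.strlen.restype = ctypes.c_size_t
libc.strlen.argtypes = [ctypes.c_char_p]

n = libc.strlen(b"hello, world")
```

Declaring `restype`/`argtypes` is the FFI equivalent of a type signature; skipping it is a common source of silent corruption when calling into C.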

I don't have any deep background in econ, but do we not need to switch from talking about GDP to talking about a version of Net Domestic Product where "net" includes:

- changes to the value of natural and ecosystem resources (e.g. if I clear a forest to sell timber, we must acknowledge some lost value for the forest)

- amount of economic transactions in service of mitigating problems created by externalities from other activity (e.g. if my pollution gets into your groundwater, you paying to remediate the pollution isn't "value created")

I.e. growth of _actual net value_ still sounds like a good thing to pursue but we let our politicians run around doing anything to maximize GDP without talking about what the "gross" is hiding.


Also, this isn't only a gap around environmental issues. If you pay X for child daycare and work but only make X plus taxes at a dumb, pointless job, GDP says the economy is at least 2X plus tax larger than if you took care of your own child during the day (because your employer paid you and you paid your daycare). This seems dumb at an accounting level, even before we consider that you probably get a greater emotional benefit from being with your child than the daycare worker does.
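To make the accounting point concrete with made-up numbers:

```python
# Toy illustration (hypothetical figures): measured GDP vs. what the
# household actually keeps, when a parent works a job and pays for daycare.
wage = 50_000     # hypothetical salary paid by the employer
daycare = 48_000  # hypothetical daycare fees

# Both transactions count toward GDP:
gdp_working = wage + daycare  # both payments are "economic activity"
gdp_home = 0                  # caring for your own child: no transaction, no GDP

# Net cash the household keeps (ignoring taxes for simplicity):
net_to_household = wage - daycare
```

Nearly the same childcare happens either way, but one arrangement registers as ~98k of economic activity and the other as zero.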

But they’re not measuring nor optimizing for the contentedness of you, or your kids.

They’re measuring money generated for shareholders, they’re measuring tax base.


I think the point they were trying to make is that they are measuring and optimizing for the wrong thing

I am agreeing with that point, and providing speculation about why it's unlikely to change.


It's just very hard to measure.

On the corporate scale, see the whole carbon / ESG / impact measurement industry: Lifecycle Analysis, supply chain extrapolation, Bill of Materials analysis.

You only get some relatively crude estimate and a lot of missing data points, whereas economic growth can conveniently assign a dollar value on everything.

I think it only gets worse as you scale up.


As an example, a forest managed for productivity won't really lose value from a harvest.

You'd have to price the conversion of it to that management strategy.


But we have a lot of sources of information already available that do not seem to be incorporated into any kind of top-level number that we grade ourselves on.

- when we have an estimate of how many hundreds of billions it costs to rebuild after a hurricane that would not have happened but for climate change, existing economic processes generate that number

- when insurers raise rates throughout a region, this reflects an expectation on the cost of damage, and the change over time reflects the increase in risk we've created

- when a heatwave kills a bunch of people, we already have a range of ways of estimating a monetary value for those lives from insurance, healthcare and liability litigation.

Further ... suppose your elderly relative left you a bunch of jewelry. You don't know how much it's worth and getting it appraised can actually be a bit complicated and doesn't give you complete certainty over value. But it would be _bonkers_ to continually take unappraised jewelry out into the marketplace, liquidate it, and pretend that the whole sales price was _earnings_. After the transaction, you don't have a thing you had before. You didn't know what it was worth initially but that doesn't mean that it was worthless, and you probably got scammed. Yes, measuring the full environmental impact of all our industries is hard, but pretending it's 0 is silly.


It's kind of an accounting problem. What you really want is human happiness and abundant nature, but doing some gardening and playing with the kids may produce happiness and no GDP, whereas enlarging a chemicals plant to make even larger SUVs generates a lot of GDP.

Trouble is it's hard to account for that kind of stuff but maybe we could make a flawed but functional accounting thing with AI?


A huge part of such stuff is deliberately hidden to avoid getting the government too involved in day to day lives.

Case in point: for a while we had an arrangement with our neighbour that we'll pick up their child from preschool and stay with her until her parents get home and in exchange they would prepare dinner for us.

No money exchanged hands, so no GDP generated, yet everyone's quality of life improved.


I guess a lot of the 'free market' stuff is also about avoiding too much government involvement. It tends to be a pain in the neck when you have to file tax returns and apply for permits.

Mark Carney's book "Values" pitches a system such as this.

In better times, perhaps we have the collective will to try.


You should also include who is profiting. Is it the wealthiest 1%, or is it the entire population?

> Moreover is the fact that they're 100% automated a material fact to the consumer?

I do think that for a meaningful fraction of first time customers, the choice to try it is about the novelty of it being automated. In SF I do often see people explaining waymo to out of town visitors, and the uniqueness of "driverless" vs "remote controlled" is part of the appeal.


But that's not what they're paying for. You're hoping to get the automated experience but you aren't paying for the automated experience. This is like going to Hooters to buy a meal and then suing because the girl you wanted to see didn't serve you.


https://x.com/Waymo/status/1890083513531084973

Here's a waymo ad from a year ago. In like 10 seconds they repeat "it's driving itself" 3 times.

https://www.youtube.com/watch?v=0kJPDg207oc

Here's another one. The closing screen says "Autonomous rides 24/7". They talk about the robot

Here's a blogpost from 2021 in which they insist that their messaging from there forward will talk about "fully autonomous driving", and not merely self-driving. https://waymo.com/blog/2021/01/why-youll-hear-us-say-autonom...

Here's a post from this year where as part of their expansion to new cities they say " we continue our accelerated growth and welcome the first public riders into our _fully autonomous_ ride-hailing service in four new cities" (emphasis mine). https://waymo.com/blog/#:~:text=Waymo%20will%20begin%20fully...

I haven't read the TOS in the app and I'm sure they didn't legally commit that no human will ever be involved even in unusual circumstances (which would probably be irresponsible). But they have been advertising on the basis of being autonomous, they're presenting that as part of their value prop to new users. Maybe it's up to lawyers to decide whether that's "material". But they are repeatedly, loudly, proudly advertising and marketing on the basis of it being fully autonomous.

