This is (no offense) intellectually dishonest. Nobody is asking for the risk to be zero. What we want is that, if there is a specific KNOWN flaw in a life-critical system, the flaw is addressed and the shuttle re-tested before humans are placed in it.
There's no good reason not to do this, except as a lazy cost-cutting measure (and, presumably, under time pressure to perform the eventual moon landing mission within the timeframe of Trump's presidency).
Parent commenter is making a joke about the fact that, in the title "SpaceX Files to Go Public", the word "Files" could be read as either a noun or a verb.
You can have user-defined operators with plain old recursive descent.
Consider if you had functions called parse_user_ops_precedence_1, parse_user_ops_precedence_2, etc. These would simply take a table of user-defined operators as an argument (or reference some shared/global state), and participate in the same recursive callstack as all your other parsing functions.
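As a minimal sketch of the idea (hypothetical names throughout): instead of literally writing one `parse_user_ops_precedence_N` function per level, the same table-driven approach can be collapsed into a single precedence-climbing loop, where the operator table plays the role of the hard-coded call chain.

```python
# Sketch (hypothetical names): recursive-descent expression parsing where
# binary operators and their precedences live in a runtime table instead of
# being baked into one function per precedence level.

OPS = {"+": 1, "-": 1, "*": 2, "/": 2}  # symbol -> precedence

def register_op(symbol, precedence):
    """User-defined operators just extend the table; no new parse functions."""
    OPS[symbol] = precedence

def parse_expr(tokens, pos=0, min_prec=1):
    # primary: for brevity, just a numeric literal
    lhs = float(tokens[pos])
    pos += 1
    # this loop stands in for the parse_user_ops_precedence_N call chain,
    # consulting the shared table to decide whether to absorb the next operator
    while pos < len(tokens) and tokens[pos] in OPS and OPS[tokens[pos]] >= min_prec:
        op = tokens[pos]
        rhs, pos = parse_expr(tokens, pos + 1, OPS[op] + 1)  # left-associative
        lhs = {"+": lhs + rhs, "-": lhs - rhs,
               "*": lhs * rhs, "/": lhs / rhs,
               "**": lhs ** rhs}[op]
    return lhs, pos

register_op("**", 3)  # a user-defined operator added at runtime
value, _ = parse_expr("2 + 3 * 4 ** 2".split())  # -> 50.0
```

The point is the same as in the per-function version: user-defined operators participate in the same recursive callstack as everything else, because precedence is data (the table) rather than code structure.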
Well, regardless of technology, the space of things you can accomplish without risking your own troops' lives is very small. (Unless you're willing to go nuclear, which has the pesky downside of ending the world.)
To put it in perspective - in Vietnam, opposition forces lost over a million troops and continued to fight viciously. The US lost around 50,000 and gave up and left.
Democratic countries simply lack the stomach for this kind of thing (which is a good thing, really).
As opposed to democratic countries like the US or UK which would just lay down their arms after a few tens of thousands of their soldiers were killed in the event of a foreign military invasion on their territory?
That’s obvious, but you seemed to be putting down foreigners for being able to stomach a million or more of them dying to protect their country from invasion, unlike the enlightened democratic countries who couldn’t tolerate so many of their own dying for any reason. I think if tens or hundreds of thousands of soldiers from, say, China attacked the US, Americans would be very willing to fight to the last man to prevent becoming a vassal state of the CCP.
Perhaps the disconnect exists because some Americans have become so used to thinking from the perspective of invaders that they cannot possibly think from the perspective of the invaded?
You're reading something into my comment which isn't there. Hard to say what it is, but it's causing me to not really understand what you're talking about, at this point.
Maybe you thought I was disparaging Vietnam for defending their land? But in your own comment you indicate that you know I'm not talking about defense, that I'm talking about not having the stomach for loss of life as the invading force. So, IDK
I think being the "home team" makes swallowing those casualties easier (as easy as they can be, anyways); it's easy to perceive the situation as a fight for your life.
Obviously, there were other things going on in Vietnam (and Afghanistan and the larger War on Terror) to keep them fighting but it's much easier to muster up the manpower when a war seems existential because it's happening in your neighborhood.
tbh I don't think this use case is going to be as big as people seem to think
there are a lot of reasons, but in brief - I think AI desktop use is a product that the average person isn't going to get much value out of. to make an analogy - the creators of Segway thought people would buy them in large numbers, but it turned out most people don't mind walking manually (or at least, don't mind it enough to spend money on a scooter). I think makers of AI Desktop Use products are going to find out the same thing as it relates to everyday tasks like checking email and shopping.
I was thinking more remotely managing the computer in a warehouse, replacing the mouse of an architect, or some physical object engineer. That your grandma can finally find Discord by speaking to such a bot is just a nice side effect.
well yeah I wasn't even talking about professional use, since I think in professional use cases it will turn out to make a lot more sense to set up APIs for AIs to use than to set up screen scraping and mouse+keyboard use.
in fact even in the rare cases where it's not possible to get an API or CLI to interface with some piece of software, I think people will find that their best bet is to first create a deterministic screen-scraping program for that specific software, then have that program serve an API for the AI to use. it would be so much cheaper to run (inference-wise) and so much more reliable than having the AI itself perform the image interpretation and clicking.
I see AI desktop use as mainly a consumer product for that reason, since that's the situation where you have to react "on the fly" to whatever the user asks you to do and whatever program happens to be on their computer (versus professional cases which are more large-scale and repetitive, and where you can have a software developer on hand).
people used to say this about search engines and web browsers, as well
regardless, eventually Google became the universal default for both. When it comes to software, the average person doesn't shop around for the technologically optimal choice, they just use what everyone else is using.
AI (that is, plain chat) is always going to be free to use as well. Google and Microsoft are going to keep it that way. And make the money back via ads.
That's why ChatGPT still has a free option. If they didn't, they would lose a billion users overnight to Gemini.
my point is that today there is no clear winner. opus, gpt 5.4, and gemini have different strengths; google search ran circles around the competition in basically all use cases.
Even if there were, it wouldn't be very useful. In a "common law" system like the US, legal questions are rarely answered purely by the plain text of a law. You also need case law - that is, how the law is typically applied in practice by the courts.
CourtListener (from the non-profit Free Law Project) provides bulk access to its case law collection.[1][2] I expect they are ingesting the GPO uscourts collection so it should have near complete (99%) coverage.[3][4]
OpenAI, Anthropic, and Google are way ahead of you. Ask away to your heart's content. I am unsure about open source models; it'd be interesting to know if they're ingesting the law.
The reality is that the official United States Code gives plenty of history for statutes, while the Code of Federal Regulations gives less but still basic history. Both are also provided in XML in bulk, though the former has a modern USLM format and the latter has an archaic schema. Case law is less amenable to git histories.
I don't see this as a "modern" vs "back in the day" thing.
The real reason many software orgs nowadays don't have QA is simple: it's slow. Everything in the consumer tech space is about rapid growth and moving as fast as possible. Nobody cares very much whether the software has bugs; what matters is whether it has users.
But outside of consumer tech, QA is a lot more common, since it matters a lot more that the software's logic is correct. (Speaking personally - I used to work for a genetics lab, and we had QA.) There are just different economic incentives involved.
I mentioned in another comment that I make sure the cost/time is within 1.25x of the next best single-model run. So it's not perfect, but I think that aspect will only get better with time.
Of course I'm biased, but using Sup has been great for me personally. Even disregarding the HLE score, having many different perspectives in the answers, and most importantly the combined answer, has been very helpful in feedback for architectural decisions I make for Sup, and many other questions I would normally ask ChatGPT/Gemini/Claude/Grok individually.