Why do you think there was a mass surveillance of American domestic communicatio...

malux85 · on April 12, 2023

Yeah but have you seen the leaked slides? It's clear that they have only the ability to analyse 10% or less of the data they are storing.

GPT-like systems will close that gap, and then come all of the problems of automated law enforcement - extrapolation from incomplete data, false positives from coincidences, interpretation errors, all that annoying stuff

justinjlynn · on April 12, 2023

Just wait until they go back and have the bandwidth to finally analyse all the historical data they've accumulated...

nr2x · on April 12, 2023

I’m guessing they are, and have been, for a while.

mztwo · on April 12, 2023

To add to that, the leaked documents from Snowden described a query language not unlike doing Boolean searches. Nothing close to GPT’s ability to comprehend human asks.
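For a sense of what "Boolean searches" means in practice, here is a minimal sketch in Python. To be clear, this is not the query language from the leaked documents; the tuple-based query format and the names `tokens`, `matches`, and `search` are made up purely for illustration.

```python
import re

def tokens(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def matches(query, text_tokens):
    """Evaluate a Boolean query against a set of tokens.

    A query is either a plain term (str) or a tuple:
    ("and", q1, q2, ...), ("or", q1, q2, ...), or ("not", q).
    This format is hypothetical, chosen only to keep the sketch short.
    """
    if isinstance(query, str):
        return query.lower() in text_tokens
    op, *args = query
    if op == "and":
        return all(matches(a, text_tokens) for a in args)
    if op == "or":
        return any(matches(a, text_tokens) for a in args)
    if op == "not":
        return not matches(args[0], text_tokens)
    raise ValueError(f"unknown operator: {op}")

def search(query, documents):
    """Return the documents that satisfy the query, in their original order."""
    return [d for d in documents if matches(query, tokens(d))]

docs = ["wire the money tomorrow", "money for lunch", "see you tomorrow"]
print(search(("and", "money", ("not", "lunch")), docs))
```

The point is that the analyst has to know exactly which terms to ask for; nothing here understands what the messages mean.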

nr2x · on April 12, 2023

Yes but you can get GPT to write structured queries based on natural language. It works very, very well at turning normal phrases into SQL.

hex4def6 · on April 12, 2023

Collecting an ocean of data from everyone isn't the same as actually painstakingly tying all the pieces together for everyone.

They've created a huge library of unorganized data. The difference here is they now can spawn a million untiring AI private investigators / librarians to organize this information into coherent "case files".

At least for me, until this point I've had a feeling of anonymity in the idea that, while my data is being slurped up, I'm just one data point in a sea of other 'normal' people. There would be little value in spending government time and effort tying all of the web detritus together for me. The juice would definitely not be worth the squeeze.

However, when the cost of this effort is nearly zero, that now becomes a different story. The balance of power between government and the people it rules is going to radically shift.

two_in_one · on April 12, 2023

Not exactly. Gov had to be selective because its surveillance required a lot of resources per person/call. New technology makes it cheap and possible en masse. Voice calls can be recorded, then converted to text, then filtered, and humans will only analyze something of interest. It's like how we had the alphabet, books, and newspapers for hundreds of years, but only with the internet did we get the ability to process them easily.

pixl97 · on April 12, 2023

Not only converted to text; it seems likely that we can document sentiment around the person's speech. For example, if you're a low-priority target that's still on the radar, but not high enough on the list to get a human handler, I could see something like: not only what you said, but were you laughing, angry, crying? Does the tone of your voice indicate the likelihood of action in a short time frame?

dwigthsomething · on April 12, 2023

> Gov had to be selective because its surveillance required a lot of resources

Not exactly. Where do you think all those budget trillions that don't have to be accounted for go? The FBI+NSA (=CIA but for citizens) have infinite resources.

All the overhead they have is to make sure a small subset of the citizens are not impacted. Snowden goes into this in some detail when talking about day to day operations. The norm is to extend the net as wide as possible, until you reach some politician or government agency.

stevenhuang · on April 12, 2023

> This technology has been available since then

No, it hasn't lol.

nr2x · on April 12, 2023

How sure are you about that? The basic theories have been around since the 70s, have been proven at scale in the last decade, and the NSA has more data and compute than anybody else. I’d be shocked if they aren’t very far along in solving many problems.

RC_ITR · on April 12, 2023

This isn’t really a “throw money at it” problem like the government is good at.

Take drones for example. The government got really good at those because they made them jet-powered (lol) and blew a bunch of money on server-grade FPGAs in each one of them.

You can’t really just buy a lot of GPUs to make an LLM work, you need iterative development of architecture and training methods.

Like maybe the government invented self-attention before 2017, but if they didn’t, then the constraint is training time, and the government has the same number of seconds as the rest of us.

rozal · on April 12, 2023

Did you, like, forget that the military invented the nuclear bomb, semiconductors, coding, AI, NLP, the INTERNET?

Everything you use is from the military.

The government is good at lying, and making themselves ‘appear’ incompetent.

They secretly probably have a much further advanced quantum computer. Your viewpoint is limited to mainstream technology and mainstream science.

RC_ITR · on April 12, 2023

This is such an interesting take.

The military invented the nuclear bomb yes. But Fermi did most of his thinking work in Italy before the Manhattan project. He got money thrown at him once he got here.

As for semiconductor devices, it was Bell Labs and TI.

Coding is an ambiguous concept that wasn’t really invented, but if it were, it would have first appeared in programmable looms.

The military likes to take credit for things, but really all they do is throw money at existing inventions.

I’m sure they’re throwing a bunch of money at Transformers now, but who are all these uncredited super geniuses who invent things and then let randos at Google take the credit/earn the money?

rozal · on April 12, 2023

I used to agree with you. I used to think the military was kinda dumb. But after doing a deeper dive into past military technology, and present - I've come to realize this is just an intelligence ruse.

They made some mistakes in the 40's and 50's, when they had nuclear secrets stolen by the Russians. Ever since, everything has been hyper-compartmentalized.

It would not surprise me if in the 90's or 00's they had an internal working LLM, considering all the puzzle pieces. You will never hear about classified tech unless it's a bomb or gets leaked. (See code-breaking machines declassified after 70+ years.)

In a hypothetical scenario, a military organization might want to conceal its use of a large language model (LLM) for intelligence gathering and analysis.

Another scenario is the military's current interest in everything quantum. Take quantum computers, for example (you wouldn't want another nation being first and prying out our secrets, would you?): there is an extreme national security importance in being first.

And to be first, you need to have the smart people, which the military has. There is a reason China struggles with jet engines 80 years after their invention, and still can't make nuclear carriers. While the US navy works on things like this: https://www.navair.navy.mil/foia/sites/g/files/jejdrs566/fil...

RC_ITR · on April 12, 2023

I mean, I think you’re again missing the point here.

Jet engines can be solved with money.

The actual steps of making an LLM require complex math and you can’t just pay people to make better math.

And if you could, wouldn’t those people decamp for industry and become literal trillionaires?

rozal · on April 12, 2023

That's not how military intellectual property works. I don't doubt your technical credentials, but I think you lack deeper insight into what the military does, and has done.

RC_ITR · on April 13, 2023

>That's not how military intellectual property works.

So your implication is that the military is full of unnamed linear algebra, systems engineering, and linguistics super geniuses and these people never leave, never talk about their work publicly, never publish anything ever, and they're all cool with their huge innovations being kept away from the public forever? All because military IP regulation?

And none of these effective state prisoners ever defect to China (where they could live like royalty) because...

rozal · on April 14, 2023

Yes. That’s exactly how it works. Here is a brilliant one admitting to JUST that: https://youtu.be/8rHTff55fq4 (at the end)

stevenhuang · on April 16, 2023

Ah the reverse engineering of crashed ET craft.

If that is your angle, I very much agree it's possible some deep black project exists that (once) looked into this. In the UFO lore, there are many stories about these advances (tr3b etc).

But all of it is orthogonal to LLMs though. Picking one exceptional area that the military is great at (aerospace) does not suddenly make the military exceptional in other areas like AI.

rozal · on April 16, 2023

Who said anything about aliens? If we invented the nuclear bomb 36 years after the discovery of the atom, we did the same with virtual particles.

https://www.youtube.com/watch?v=-wU7nPDcTuY&t

See above video, even code-breaking machines are kept a dark secret. LLMs that could make analytical decisions about war and strategy must have started with the military first. It explains its massive data gathering operations in the 2000's. And it explains why some countries are separating to make their own internet, away from what really is the US-owned World Wide Web.

The U.S. military has engaged in the commercialization of top-secret technologies (after they're considered obsolete by military standards), often by collaborating with private companies or research institutions.

nr2x · on April 12, 2023

I wouldn’t rule out NSA being ahead of the curve, but you have a good point re: GPUs. Likely another factor in the CHIPS act.

stevenhuang · on April 12, 2023

Pretty sure.

Even the Manhattan project had nuclear research going on in public universities at the time.

Nothing of the sort here for the attention mechanism which underpins LLMs we know today.

Fundamental research isn't something you just throw money at and acquire. All we had back then were cleverbot and other expert systems.

nr2x · on April 12, 2023

More my point is they have as big a research budget as a corporate lab.

rozal · on April 12, 2023

The military invented AI and NLP which underpins LLMs.

The military is responsible for most of the technology we use and talk about today. The government may appear incompetent, but we’re living off military hand-me-downs; the entire world is.

411111111111111 · on April 12, 2023

Your argument reminds me of the discussions about the moon landing.

If today's hardware was available 20 yrs ago, this would've been possible just like the moon landing could've been faked if it took place 20+ yrs later. The technology wasn't available at the time (GPUs in this case, and generally no experience in doing such advanced trick techniques for movies back then)

These models are having such a strong effect now because we've finally got the hardware to run them

111111101101 · on April 12, 2023

I'm not so sure about that. I mean, maybe not since the Snowden leaks but how do we know that governments haven't been running their own LLMs for the last five years or so? We know that they're using sockpuppets[1]. We know that they're astroturfing[2]. Integrating LLMs into their toolkits seems like an obvious move, so obvious that they would be stupid not to do it.

[1] https://www.theguardian.com/technology/2011/mar/17/us-spy-op...

[2] https://boingboing.net/2015/06/22/gchqs-psy-ops-squad-target...

pixl97 · on April 12, 2023

>haven't been running their own LLMs for the last five years

Because the hardware has not existed.

That said, by accident I've seen hardware that was brought to a testing company by federal marshals: massively parallel custom hardware that was likely for signal processing a lot of channels at once. So there is plenty of custom hardware out there, but these items have not been produced at the scale needed (from what anyone can tell) and, again from what we can tell, they don't have the general processing capability that GPU/TPU-driven LLMs have.

lightedman · on April 12, 2023

Yes, it has. Consumer-wise, we've had Dragon NaturallySpeaking since the late 90s. It's pretty simple to have a script read its text output and look for keywords. No AI is even needed to do this.
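Something along these lines would do it. This is only a rough sketch: it assumes the transcription software has already produced plain text, and the function name `find_keywords` and the sample watch list are invented for the example.

```python
import re

def find_keywords(transcript, keywords):
    """Return the watch-list words that appear in the transcript, sorted."""
    # Split the transcript into lowercase words so matching ignores case
    # and punctuation.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    return sorted(k.lower() for k in keywords if k.lower() in words)

transcript = "We will meet at the harbor on Tuesday."
print(find_keywords(transcript, ["harbor", "airport", "tuesday"]))
# prints ['harbor', 'tuesday']
```

It matches whole words only, so it knows nothing about context, misspellings, or transcription errors, which is exactly the limitation of the keyword approach.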

RC_ITR · on April 12, 2023

Gaussian transcription models are old, but they also are AI.

They are not deep learning/neural nets.

Also fun fact as a pedant tax: Symantec is so named because they started out as transcription software, hit a wall, and pivoted to security SW.

wefarrell · on April 12, 2023

I think that was metadata and not actual audio of conversations.