When this was up yesterday I complained that the refusal rate was super high especially on government and military shaped tasks, and that this would only push contractors to use CN-developed open source models for work that could then be compromised.
Today I'm discovering there is a tier of API access with virtually no content moderation available to companies working in that space. I have no idea how to go about requesting that tier of access, but have spoken to 4 different defense contractors in the last day who seem to already be using it.
"Alignment with who?" has always been a problem. An AI is a proxy for a reward function, a reward function is a proxy for what the coder was trying to express, what the coder was trying to express is a proxy for what the PM put on the ticket, what the PM
put on the ticket is a proxy for what the CEO said, what the CEO said is a proxy for shareholder interests, shareholder interests are a proxy for economic growth, economic growth is a proxy for government interests.
("There was an old lady who swallowed a fly, …")
Each of those proxies can have an alignment failure with the adjacent level(s).
And RLHF involves training one AI to learn human preferences, as a proxy for what "good" is, in order to be the reward function that trains the actual LLM (or other model, but I've only heard of RLHF being used to train LLMs)
More accurate to call it “alignment for plebes and not for the masters of the plebes”. Which I think we all kind of expect coming from the leaders of our society. That’s the way human societies have always worked.
I’m sure access to military grade tech is only one small slice in the set of advantages the masters get over the mastered in any human society.
"Protecting the world" would require a common agreement on morals and ethics. OpenAI shitting it's pants when asking how to translate "fuck", which OpenAI refused for a very long time, is not a good start.
Morals and ethics are different and I would not want the US to be "protecting the world" with their ridiculous ethics and morals.
I've always thought that if a corporate lab achieves AGI and it starts spitting out crazy ideas such as "corporations should be taxed," we won't be hearing about AGI for a while longer due to "alignment issues."
Can you explain the difference between taxing the corporation itself vs taxing the executives, board members, investors, and employees directly (something that already happens)?
I really don't know where to begin answering this.
It is generally accepted that business profit is taxed. Meanwhile, there are entire industries and tax havens set up to help corporations and their executives avoid paying taxes.[0]
However, the crux of my comment was not about the vagaries of corporate taxation, it was simply about "AI alignment" being more about the creators, than the entire species.
> I really didn't expect so much paperclip production growth this quarter!
>> How'd you do it?
> I don't know the details. ChatGPT did it for me, this thing's amazing. Our bonuses are gonna be huge this year, I might even be able to afford a lift kit for my truck.
It's "tier 5", I've had an account since the 3.0 days so I can't be positive I'm not grandfathered in, but, my understanding is as long as you have a non-trivial amount of spend for a few months you'll have that access.
(fwiw for anyone curious how to implement it, it's the 'moderation' parameter in the JSON request you'll send; I missed it for a few hours because it wasn't in DALL-E 3)
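For anyone who wants to see where that parameter actually goes, here's a minimal sketch of the raw request as I understand it (untested; the prompt and size values are just placeholders):

    # Rough sketch: pass "moderation" in the image generation request body.
    # "low" relaxes, but does not remove, content filtering.
    import os
    import requests

    resp = requests.post(
        "https://api.openai.com/v1/images/generations",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-image-1",
            "prompt": "a cartoon ship for a naval training aid",  # placeholder
            "moderation": "low",   # the parameter in question; DALL-E 3 doesn't accept it
            "size": "1024x1024",
        },
        timeout=120,
    )
    resp.raise_for_status()
    image_b64 = resp.json()["data"][0]["b64_json"]  # gpt-image-1 returns base64 image data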
I just took any indication that the parent post meant absolutely zero moderation as them being a bit loose with their words and excitable with how they understand things. There were some signs:
1. It's unlikely they completed an API integration quickly enough to have an opinion on military / defense image generation moderation yesterday, so they're almost certainly speaking about ChatGPT. (This is additionally confirmed by image generation requiring tier 5 anyway, which they would have been aware of if they had integrated.)
2. The military / defense use cases for image generation are not provided (and the steelman'd version in other comments is nonsensical, i.e. we can quickly validate you can still generate kanban boards or wireframes of ships)
3. The poster passively disclaims being in military / defense themself (grep "in that space")
4. It is hard to envision cases of #2 that do not require universal moderation for OpenAI's sake. I.e., let's say their thought process is along the lines of: defense/military ~= what I think of as CIA ~= black ops ~= image manipulation on social media; thus, the time I said "please edit this photo of the ayatollah to have him eating pig and say I hate allah" means it's overmoderated for defense use cases.
5. It's unlikely OpenAI wants to be anywhere near the PR resulting from #4. Assuming there is a super secret defense tier that allows this, it's at the very least unlikely that the poster's defense contractor friends were blabbing about the exclusive completely unmoderated access they had, to the poster, within hours of release. They're pretty serious about that secrecy stuff!
6. It is unlikely the lack of ability to generate images using GPT Image 1 would drive the military to Chinese models (there aren't Chinese LLMs that do this! Even if there were, there are plenty of good ol' American diffusion models!)
I'm Tier 4 and I'm able to use this API and set moderation to "low". Tier 4 only requires a 30 day waiting period and $1,000 spent on credits. While I as an individual was a bit horrified to learn I've actually spent that much on OpenAI credits over the life of my account, it's practically nothing for most organizations. Even Tier 5 only requires $5,000.
OP was clearly implying there is some greater ability only granted to extra special organizations like the military.
With all possible respect to OP, I find this all very hard to believe without additional evidence. If nothing else, I don't really see a military application of this API (specifically, not AI in general). I'm sure it would help them create slide decks and such, but you don't need extra special zero moderation for that.
> With all possible respect to OP, I find this all very hard to believe without additional evidence. If nothing else, I don't really see a military application of this API (specifically, not AI in general). I'm sure it would help them create slide decks and such, but you don't need extra special zero moderation for that.
I can't provide additional evidence (it's defense, duh), but the #1 use I've seen is generating images for computer vision training, mostly to feed GOFAI algorithms that have already been validated for target acquisition. Image gen algorithms have a pretty good idea of what a T72 tank and different camouflage look like, and they're much better at generating unique photos combining the two. It's actually a great use of the technology because hallucinations help improve the training data (i.e. the final targeting should be invariant to a T72 tank with a machine gun on the wrong side or with too many turrets, etc.)
That said, due to compartmentalization, I don't know the extent to which image gen is used in defense, just my little sliver of it.
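To make the shape of that concrete, here's a purely illustrative sketch of what a prompt-templated generation loop could look like; the vehicle/context lists, prompts, and paths are hypothetical, not anything from an actual program:

    # Illustrative only: combine object and context prompts to mass-produce
    # varied synthetic training images via the same images endpoint.
    import base64, itertools, os, requests

    VEHICLES = ["T-72 tank", "BMP-2 infantry fighting vehicle"]      # hypothetical
    CONTEXTS = ["under desert camouflage netting",
                "partially hidden in a treeline",
                "in winter whitewash paint"]

    def generate(prompt):
        r = requests.post(
            "https://api.openai.com/v1/images/generations",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": "gpt-image-1", "prompt": prompt, "moderation": "low"},
            timeout=120,
        )
        r.raise_for_status()
        return base64.b64decode(r.json()["data"][0]["b64_json"])

    os.makedirs("synthetic", exist_ok=True)
    for i, (vehicle, context) in enumerate(itertools.product(VEHICLES, CONTEXTS)):
        with open(f"synthetic/{i:04d}.png", "wb") as f:
            f.write(generate(f"overhead photo of a {vehicle} {context}"))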
We can talk about it here, they put out SBIRs for satellite imagery labeling and test set evaluation that provide a good amount of detail into how they're using it.
There are plenty of fairly mundane applications for this sort of thing in the military. Every base has a photography and graphic design team that makes posters, signs, PR materials, pamphlets, illustrations for manuals, you name it. Imagine a poster in the break room of a soldier in desert gear drinking from his/her canteen with a tagline of "Stay Alive - Hydrate!" and you're on the right track.
I'm not aware of the moderation parameter here but these contractors have special API keys that unlock unmoderated access for them, they've apparently had it for weeks.
Think of all the trivial ways an image generator could be used in business, and there is likely a similar use-case among the DoD and its contractors (e.g. create a cartoon image of a ship for a naval training aid; make a data dashboard wireframe concept for a decision aid).
Input one image of a known military installation and one civilian building. Prompt to generate a similar _civilian_ building, but resembling that military installation in some way: similar structure, similar colors, similar lighting.
Then include this image in the dataset of another net, labeled "civilian". Training that new neural net on it lowers its false-positive rate when asked "is this target military".
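If you want to picture the mechanics, a rough sketch using the images edits endpoint via the standard openai Python SDK (the file names and labels here are made up):

    # Hypothetical sketch of the workflow above: generate a civilian lookalike
    # from a reference photo, then record it as a labeled hard negative.
    import base64, csv
    from openai import OpenAI

    client = OpenAI()
    result = client.images.edit(
        model="gpt-image-1",
        image=open("known_installation.png", "rb"),   # hypothetical reference image
        prompt="a civilian office building with a similar footprint, "
               "colors, and lighting to the reference photo",
    )
    with open("dataset/civilian_00421.png", "wb") as f:
        f.write(base64.b64decode(result.data[0].b64_json))

    # The downstream classifier now trains on this image labeled "civilian",
    # which is what pushes its false-positive rate down.
    with open("dataset/labels.csv", "a", newline="") as f:
        csv.writer(f).writerow(["civilian_00421.png", "civilian"])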
You might not believe it but the US military actually places a premium on not committing war crimes. Every service member, or at least every airman in the Air Force (I can't speak for other branches) receives mandatory training on the Kunduz hospital before deployment in an effort to prevent another similar tragedy. If they didn't care, they wouldn't waste thousands of man-hours on it.
> On 7 October 2015, President Barack Obama issued an apology and announced the United States would be making condolence payments of $6,000 to the families of those killed in the airstrike.
Bombs and other kinds of weapon systems that are "smarter" carry a higher markup; it's profitable to sell smarter weapons. Dumb weapons destroy whole cities, like Russia did in Ukraine. Smart weapons strike a tank, a car, an apartment, a bunker, knowing who's there and when, which obviously means a lower percentage of civilian casualties.
The very simple use case is generating mock targets. In movies they make it seem like they use mannequin style targets or traditional concentric circles but those are infeasible and unrealistic respectively. There's an entire modeling industry here and being able to replace that with infinitely diverse AI-generated targets is valuable!
I don't really understand the logic here. All the actual signal about what artillery in bushes look like is already in the original training data. Synthetic data cannot conjure empirical evidence into existence, it's as likely to produce false images as real ones. Assuming the military has more privileged access to combat footage than a multi-purpose public chatbot I'd expect synthetic data to degrade the accuracy of a drone.
Generative models can combine different concepts from the training data. For example, the training data might contain a single image of a new missile launcher at a military parade. The model can then generate an image of that missile launcher hiding in a bush, because it has internalized the general concept of things hiding in bushes, so it can apply it to new objects it has never seen hiding in bushes.
I'm not arguing this is the purpose here but data augmentation has been done for ages. It just kind of sucks a lot of the time.
You take your images and crop, shift, etc them so that your model doesn't learn "all x are in the middle of the image". For text you might auto replace days of the week with others, there's a lot of work there.
Broadly the intent is to keep the key information and generate realistic but irrelevant noise so that you train a model that correctly ignores the noise.
You don't want a model that identifies some class of ship to base its decision on how choppy the water is, just because that was the simple signal that correlated well. There was a case of radiology results that detected cancer well but were actually detecting rulers in the images, because images with tumors often included a ruler so the tumor could be sized. (I think it was cancer; the broad point applies if it was something else.)
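For reference, the classical version is just a stack of random transforms; a minimal sketch with torchvision (illustrative, not tied to any particular system above):

    # Classical augmentation: random crops/shifts/flips so the model can't latch
    # onto "the object is always centered" or other layout shortcuts.
    import torchvision.transforms as T

    augment = T.Compose([
        T.RandomResizedCrop(224, scale=(0.6, 1.0)),   # vary framing and position
        T.RandomHorizontalFlip(),
        T.ColorJitter(brightness=0.3, contrast=0.3),  # vary lighting, not content
        T.ToTensor(),
    ])
    # Applied fresh each epoch, every real photo yields slightly different samples,
    # i.e. the "realistic but irrelevant noise" described above.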
If you're building a system to detect something, usually you need enough variations. You add noise to the images, etc.
With this, you could create a dataset that will by definition have that. You should still corroborate the data, but it's a step ahead without having to take 1,000 photos and add enough noise and variations to get to 30k.
I can get an AI to generate an image of a bear wearing a sombrero. There are no images of this in its training data, but there are bears, and there are images of sombreros, and other things wearing sombreros. It can combine the distributions in a plausible way.
If I am trying to train a small model to fit into the optical sensor of a warhead to target bears wearing sombreros, this synthetic training set would be very useful.
Same thing with artillery in bushes. Or artillery in different lighting conditions. This stuff is useful to saturate the input space with synthetic examples.
Unreal, Houdini, and a bunch of assets do this just fine and provide actually usable depth / infrared / weather / fog / time-of-day / and other relevant data for training, likely cheaper than using their API.
See bifrost.ai and their fun videos of training naval drones to avoid whales in an ethical manner.
Well, considering an element of their access is the lifting of safety guardrails, I'd assume the scope includes, to some degree, the processing or generation of NSFW/questionable content.
Interesting. Let's say we have those and also 30k real unique images, my guess is that real ones would have more useful information in them, but is this measurable? And how much more?
The model they're training to perform detection/identification out in the field would presumably need to be much smaller and run locally without needing to rely on network connectivity. It makes sense, so long as the openai model produces a training/validation set that's comparable to one that their development team would otherwise need to curate by hand.
Vastly oversimplified but for every civilian job there's an equivalent military job. Superficially, the military is basically a country-sized self-contained corporation. Anywhere that Wal-Mart's corporate office could use AI so could the military.
That's very outdated, they're absolutely supposed to be at the Empire State Building with baseball caps now. See: ICE arrests and Trump's comment on needing more El Salvadoran prison space for "the homegrowns"
Show me a tunnel underneath a building in the desert filled with small arms weapons with a poster on the wall with a map of the United States and a label written with sharpie saying “Bad guys here”. Also add various Arabic lettering on the weapons.
All I can think of is image generation of potential targets like ships, airplanes, and airfields, feeding them to their satellites or drones for image detection, and tweaking their weapons for enhanced precision.
I think the usual computer vision wisdom is that this (training object detection on generated imagery) doesn't work very well. But maybe the corps have some techniques that aren't in the public literature yet.
My understanding is the opposite; see papers on "synthetic" data training. They use a small bit of real data to generate lots of synthetic data and get usable results.
The bias leans towards overfitting the data, which can be acceptable in some use cases, such as missile or drone design that doesn't need broad comparisons like 747s or artillery to complete its training.
Kind of like neural-net backpropagation, but in terms of the model/weights.
In 2024, the Pentagon carved out an exception for themselves on the Huawei equipment ban [0]
I would imagine defense contractors can cut deals for similar preferential treatment with OAI and the like to be exempt from potentially copyright-infringing uses of their API.
Just ask Microsoft about Tay. On the one hand, I understand why you want some censoring in your model; on the other, I think it also cripples your models in unexpected ways. I wonder if anyone's done such research: compare two models trained on the same source data, one with censoring of offensive things, the other without. Which one provides more accurate answers?
"GPT-4o is now available as part of Azure OpenAI Service for Azure Government and included as part of this latest FedRAMP High and DoD IL4/IL5 Authorization."
...we have everything set up in Azure but are wary of starting to use it with CUI. Our DoD contacts think it's good to go, but nobody wants to go on record as giving the go-ahead.
Ah by “it” I meant OpenAI commercial. Azure OpenAI can handle CUI Basic.
They also have a deployment on SIPR rated for secret.
Anything higher, you need a special key but AWS Bedrock has Claude up on C2S.
That being said, both Azure OpenAI and AWS Bedrock suck for many reasons, and they will by default extend your system boundary (meaning you need to extend your ATO). Also, for CUI, it has the P-ATO from JAB, not many agency-specific ATOs, which means you will probably need to submit it through your agency sponsor.
Gotcha. We happen to be on government Azure as a contractor, which took years to secure (and one reason our execs want to be beyond sure everything is locked down)
Have they given a reason for being hesitant? The whole point of IL4+ is that they handle CUI (and higher). The whole point of services provided for these levels is that they meet the requirements.
This is on purpose so OpenAI can then litigate against them. This API isn't about a new feature, it's about control. OpenAI is the biggest bully in the space of generative AI and their disinformation and intimidation tactics are working.