
The judge is clearly not caring about this issue so arguing before her seems pointless. What is the recourse for OpenAI and users?


You don't have any recourse, at least not under American law. This is a textbook third-party doctrine case: American law and precedent are unambiguous that once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it. Probably people are going to respond to this with a bunch of exceptions, but those exceptions all have to be enumerated and granted specifically with new laws; they don't exist by default, and don't exist for OpenAI.

Like it or not, the judge's ruling sits comfortably within the framework of US law as it exists at present: since there's no reasonable expectation of privacy for chat logs sent to OpenAI, there's nothing to weigh against the competing interest of the active NYT case.


> once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it.

The 3rd party doctrine is worse than that - the data you gave is not only not yours anymore, it is not theirs either, but the government's. They're forced to act as a government informant, without any warrant requirements. They can say "we will do our very best to keep your data confidential", and contractually bind themselves to do so, but hilariously, in the Supreme Court's wise and knowledgeable legal view, this does not create an "expectation of privacy", despite whatever vaults and encryption and careful employee vetting and armed guards standing between your data and unauthorized parties.


I don't think it is accurate to say that the data becomes the government's or they have to act as an informant (I think that implies a bit more of an active requirement than responding to a subpoena), but I agree with the gist.


This clearly seems counter to the spirit of the 4th amendment.


> You don't have any recourse, at least not under American law.

Implying that the recourse is to change the law.

Those precedents are also fairly insane and not even consistent with one another. For example, the government needs a warrant to read your mail in the possession of the Post Office -- not only a third party but actually part of the government -- but not the digital equivalent of this when you transfer some of your documents via Google or Microsoft?

This case is also not the traditional third party doctrine case. Typically you would have e.g. your private project files on Github or something which Github is retaining for reasons independent of any court order and then the court orders them to provide them to the court. In this case the judge is ordering them to retain third party data they wouldn't have otherwise kept. It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine? Could they order your home landlord to make copies of the files in your apartment without a warrant because they have a key to the door?


> It's not clear what the limiting principle there would be -- could they order Microsoft to retain any of the data on everyone's PC that isn't in the cloud, because their system updater gives them arbitrary code execution on every Windows machine?

My understanding is it's closer to something like: they cannot order a company to create new tools, but they can tell it not to destroy the data it already has. So MS merely having the ability to create a tool that extracts your data is not the same as MS already having that tool running and collecting all of the data it stores, and then being told simply not to destroy it. Similarly, VPNs that are not set up to create logs can't keep or hand over what they don't have.

Laws can be made to require the collection and storage of all user data by every online company, but we're not there -- yet. Many companies already do it on their own, and the user then decides if that's acceptable or not to continue using that service.

If the company created their service to not have the data in the first place, this probably never would have found its way to a judge. Their service would cost more, be slower, and probably be difficult to iterate on, as it's easier to hack things together in a fast-moving space than to build privacy/security-first solutions.


The issue is, what does "the data they already have" mean? Does your landlord "have" all the files in your apartment because they have a key to the door?


Real property, tenants' rights, and landlord law are an entirely separate domain. However, I believe in some places if you stop paying rent long enough, then yes all your stuff now belongs to the landlord because they have the "key".

"the data they already have" means the data the user gave the company (no one is "giving" their files to their landlord) and that the company is in full possession of and now owns. Users in this case are not in possession or ownership of the data they gave away at this point.

If you hand out photocopies of the files in your apartment, the files in your apartment are still yours, but the copies you gave away to a bunch of companies are not. Those now belong to the company you gave them to and they can do whatever they want with it. So if they keep it and a judge tells them the documents are not to be destroyed (because laws things), they would probably get into trouble if they went against the order.

Which is what I was trying to bring attention to; the fact that the company has a choice in what data (if any) they decided to collect, possess, and own. If they never collected/stored it then no one's privacy would be threatened.


> However, I believe in some places if you stop paying rent long enough, then yes all your stuff now belongs to the landlord because they have the "key".

There is nothing analogous to this happening in this case though. The users aren't in default on any financial obligations.

> "the data they already have" means the data the user gave the company (no one is "giving" their files to their landlord) and that the company is in full possession of and now owns

You put your files in the landlord's building because you're leasing the apartment from them. You put your data on the provider's servers because you're leasing the service from them. How do they own this data? Did you assign the copyright to them? Does the judge's ability to order them to keep it depend on what the contract between you and the service says?


> You put your files in the landlord's building because you're leasing the apartment from them. You put your data on the provider's servers

The comparison between the physical and the digital is not one-to-one. You are putting copies of your files (data) onto a digital server that you do not have possession of.

You have possession of your apartment and there are laws that apply to that world that in no way apply to copies of files you gave (gave being the key word) to the server that you do not have possession of.

Unless the company (the company being the problem) specifically sets up terms -- usually in a very expensive enterprise contract, with legal carve-outs for things judges may order -- promising that what you give to their servers will function more like the rental apartment you are trying to tie together, you are simply giving them data. Point blank. Even then they could renege, and then you either deal with it or go through a long legal process to sue. The ship for data privacy has long since sailed by the time you get to a judge who told them not to destroy it.

People agree to it, so companies get away with being allowed to use the copies of data you give them however they want. That's why it's different. People (the society that made the laws) don't accept that behaviour from landlords. They shouldn't from the online world either, but here we are. I have no other way of helping this click for you, and at this point it's just moving the same words around to try and find the right sequence that will. Good luck, take care.


The third-party doctrine has been weakened by the Supreme Court recently, in United States v. Jones and Carpenter v. United States. Those are court decisions, not new laws passed by Congress. See also this quote:

https://en.wikipedia.org/wiki/Third-party_doctrine#:~:text=w...

If OpenAI doesn't succeed at oral argument, then in theory they could try for an appeal either under the collateral order doctrine or seeking a writ of mandamus, but apparently these rarely succeed, especially in discovery disputes.


Justice Sotomayor's concurrence in U.S. v. Jones is not binding precedent, so I wouldn't characterize it as weakening the third-party doctrine yet.


Yep. This is why we need constitutional amendments or more foundational laws around privacy that changes this default. Which should be a bipartisan issue, if money had less influence in politics.


This is the perverse incentives one rather than the money one. The judges want to order people to do things and the judges are the ones who decide if the judges ordering people to do things is constitutional.

To prevent that you need Congress to tell them no, but that creates a sort of priority inversion: The machinery designed to stop the government from doing something bad unless there is consensus is then enabling government overreach unless there is consensus to stop it. It's kind of a design flaw. You want checks and balances to stop the government from doing bad things, not enable them.


> once you voluntarily give your data to a third party-- e.g. when you sent it to OpenAI-- it's not yours anymore and you have no reasonable expectation of privacy about it.

Sorry for the layperson question, but does this then apply to my company's storage of confidential info on, say, Google Drive, even with an enterprise agreement?


OpenAI is the actual counterparty here though and not a third party. Presumably their contracts with their users are still enforceable.

Furthermore, if the third party doctrine is upheld in its most naïve form, then this would breach the EU-US Data Privacy Framework. The US must ensure equivalent privacy protections to those under the GDPR in order for the agreement to be valid. The agreement also explicitly forbids transferring information to third parties without informing those whose information is transferred.


Well, I don't think anyone is expecting the framework to work this time either, after earlier tries have been invalidated. It is just panicked politicians trying to kick the can to avoid the fallout that happens when it can't be kicked anymore.


Yes, and I suppose the courts can't care that much about executive orders. Even so, one would think that they had some sense and wouldn't stress things that the politicians have built.


3rd party doctrine in the US is actual law... so I'm not sure what's confusing about that. The president has no power to change discovery law. That's Congress. Why would a judge abrogate US law like that?


You're confused. This is not about the FBI's right to data, it's about the New York Times' right to the same. The doctrine you're referencing is irrelevant.

The magistrate is suggesting that there is no reasonable expectation of privacy in chats OpenAI agreed to delete, at the request of users. This is bizarre, because there's no way for OpenAI to use data that is deleted. It's gone. It doesn't require abrogation of US law, it requires a sensible judge to sit up and recognize they just infringed on the privacy expectations of millions of people.


It's a routine discovery hearing regarding documents that OpenAI creates and keeps for a period of time in the normal practice of its business.


They probably do already, but won't this ruling force OpenAI to operate separate services for the US and EU? US users must accept that their logs are stored indefinitely, while an EU user is entitled to have theirs deleted.


Stop giving your information to third parties with the expectation that they keep it private when they won't and cannot. Your banking information is also subject to subpoena... I don't see anyone here complaining about that. It's just the hot legal issue of the day that programmers are intent on misunderstanding.


Not a real answer, but I think a local LLM is going to be the way to go. I've been playing with them for some time now -- and, yeah, they're still not there: saved context, the hardware requirements needed for a really good model... But I suspect, like anything else in tech, that a year or two from now a decent local LLM will not be such a stretch.

I can't wait, actually. It's less about privacy to me than about being offline.


> a local LLM is going to be the way to go

Non-technicals don't know how LLMs work, and, more importantly, don't care about their privacy.

For a technology to be widely used, by definition, you need to make it appealing to the masses, and there is almost zero demand for private LLM right now.

That's why I don't think that local LLMs will win. There are narrow use cases where regulations can force local LLM usage (like for medical stuff), but overall I think that services will win (as they always do).


> there is almost zero demand for private LLM right now.

You need some really expensive hardware to run a local LLM, most of which is unavailable to the average user. The demand might simply be hidden, as these users do not know about, nor want to expend the resources on, local models.

But I have hope that the hardware costs will come down eventually, enough to reveal the demand for local LLMs.

After all, I prefer that my private questions to an LLM never be revealed.


I still happen to trust Apple with my cloud data, with their secure enclave. To that end, an Apple solution where my history/context is kept in my cloud account, perhaps even a future custom Apple chip that could run some measure of a local LLM.... This "aiPhone" might be the mainstream solution that non-technicals will enjoy.


> I think that services will win (as they always do)

We can have services but also private history/contexts. Those can be "local" (and encrypted).


Most companies would kill for private LLM capabilities were it possible. I think Mistral's even making it part of their enterprise strategy.


>"What is the recourse for OpenAI and users?"

Start using the services of countries that are unlikely to submit data to the US.


appealing whatever ruling this judge makes?


You can't appeal a case you're not a party to.


It's a direct answer to the question what recourse OpenAI has.

Users should stop sending information that shouldn't be public to US cloud giants like OpenAI.


Do you really think a European court wouldn't similarly force a provider to preserve records in response to being accused of destroying records pertinent to a legal dispute?


Fundamentally, on-prem or just forgoing these services is the safest way, yes. If one still uses these remote services, it's also important to be prudent about exactly what data you share with them when doing so[0]. Note I did not say "Send your sensitive data to these countries instead".

The laws still look completely different in US and EU though. EU has stronger protections and directives on privacy and weaker supremacy of IP owners. I do not believe lawyers in any copyright case would get access to user data in a case like this. There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court (though parts of the EU seem to be working hard to close that gap recently...).

[0]: Using it to derive baking recipes is not the same as using it to directly draft personal letters. Using it over VPN with pseudonym account info is not the same as using it from your home IP registered to your personal email with all your personals filled out and your credit card linked. Running a coding agent straight on your workstation is different to sandboxing it yourself to ensure it can only access what it needs.


> I do not believe lawyers in any copyright case would get access to user data in a case like this.

Based on what? Keep in mind that the data is to be used for litigation purposes only and cannot be disclosed except to the extent necessary to address the dispute. It can't be given to third parties who aren't working on the issue.

> There is also a gap in the capabilities and prevalence of govt to force individual companies or even employees to insert and maintain secret backdoors with gag orders outside of court

There's no secret backdoor here. OpenAI isn't being asked to write new code--and in fact their zero-data-retention (ZDR) API hasn't changed to record data that it never recorded in the first place. They were simply ordered to disable deletion functionality in their main API, and they were not forbidden from disclosing that change to their customers.


"OpenAI user" is not an inherent trait. Just use another product, make it OpenAI's problem.



