>They should include the part where the order is a result of them deleting things they shouldn’t have then. You know, if this isn’t spin.
From what I can tell from the court filings, prior to the judge's order to retain everything, the request to retain everything was coming from the plaintiff, with openai objecting to the request and refusing to comply in the meantime. If so, it's a bit misleading to characterize this as "deleting things they shouldn’t have", because what they "should have" done wasn't even settled. That's a bit rich coming from someone accusing openai of "spin".
Your linked article talks about openai deleting training data. I don't see how that's related to the current incident, which is about user queries. The ruling from the judge for openai to retain all user queries also didn't reference this incident.
Without this devolving into a tit for tat: the article explains, for those following this conversation, why it's been elevated to a court order rather than just an expectation to preserve.
No worries. I can’t force understanding on anyone.
Here. I had an LLM summarize it for you.
A court order now requires OpenAI to retain all user data, including deleted ChatGPT chats, as part of the ongoing copyright lawsuit brought by The New York Times (NYT) and other publishers[1][2][6][7]. This order was issued because the NYT argued that evidence of copyright infringement—such as AI outputs closely matching NYT articles—could be lost if OpenAI continued its standard practice of deleting user data after 30 days[2][6][7].
This new requirement is directly related to a 2024 incident where OpenAI accidentally deleted critical data that NYT lawyers had gathered during the discovery process. In that incident, OpenAI engineers erased programs and search result data stored by NYT's legal team on dedicated virtual machines provided for examining OpenAI's training data[3][4][5]. Although OpenAI recovered some of the data, the loss of file structure and names rendered it largely unusable for the lawyers’ purposes[3][5]. The court and NYT lawyers did not believe the deletion was intentional, but it highlighted the risks of relying on OpenAI’s internal data retention and deletion practices during litigation[3][4][5].
The court order to retain all user data is a direct response to concerns that important evidence could be lost—just as it was in the accidental deletion incident[2][6][7]. The order aims to prevent any further loss of potentially relevant information as the case proceeds. OpenAI is appealing the order, arguing it conflicts with user privacy and their established data deletion policies[1][2][6][7].
Gruez said that it is talking about an incident in this case, but one unrelated to the judge's order in question.
You said the article "explains for those following this conversation why it’s been elevated to a court order" but it doesn't actually explain that. It is talking about separate data being deleted in a different context. It is not user chats and access logs. It is the data that was used to train the models.
I pointed that out a second time since it seemed to be misunderstood.
Then you posted an LLM summary of something unrelated to the point being made.
Now we're here.
As you say, one cannot force understanding on another; we all have to do our part. ;)
Edit:
> The court order to retain all user data is a direct response to concerns that important evidence could be lost—just as it was in the accidental deletion incident[2][6][7].
What did you prompt the LLM with for it to reach this conclusion? The [2][6][7] citations similarly don't seem to explain how that incident from months ago informed the judge's recent decision. Anyway, I'm not saying the conclusion is wrong, I'm saying the article you linked does not support the conclusion.
I think in your rush to reply you may have not read the summarization.
Calm down, cool off, and read it again.
The point is that the circumstances of the incident in 2024 are directly related to the how and why of the NYT lawyers' request and the judge's order.
The article I linked was to the incident in 2024.
Not everything has to be about pedantry and snark, even on HN.
Edit: I see you edited your response after re-reading the summarization. I’m glad cooler heads have prevailed.
The prompt was simply “What is the relation, if any, between OpenAI being ordered to retain user data and the incident from 2024 where OpenAI accidentally deleted the NYT lawyers data while they were investigating whether OpenAI had used their data to train their models?”
> I see you edited your response after re-reading the summarization.
Just to be clear, the summary is not convincing. I do understand the idea but none of the evidence presented so far suggests that was the reason. The court expected that the data would be retained, the court learned that it was not, the court gave an order for it to be retained. That is the seeming reason for the order.
Put another way: if the incident last year had not happened, the court would still have issued the order currently under discussion.