X users are now officially just training data for Grok.

spiderice · 2025-03-28T22:45:12 1743201912

The comment you just wrote is training data for every major LLM out there.

numpad0 · 2025-03-29T05:06:03 1743224763

But the difference is, Twitter users leave way too much to situational context for an LLM to comprehend, and so...

saltysalt · 2025-03-28T22:46:06 1743201966

I'm aware, thanks.

SV_BubbleTime · 2025-03-28T22:39:14 1743201554

As opposed to Reddit? Or your Gmail? Or everything else on the internet?

saltysalt · 2025-03-28T22:43:23 1743201803

I take your point; however, an AI company owning a social media platform is new.

swat535 · 2025-03-28T23:01:54 1743202914

Sam owns Reddit and OpenAI... how is that new? Google also owns Gmail and all Google services, as well as its AI.

saltysalt · 2025-03-28T23:05:56 1743203156

OpenAI does not own a social network. Google doesn't even own a social network anymore.

Neither is a valid comparison to xAI owning X.

miohtama · 2025-03-28T23:16:56 1743203816

Google owns half of the emails in the world which may be more valuable.

saltysalt · 2025-03-28T23:24:15 1743204255

Email is not a realtime social network, not even close.

nothrowaways · 2025-03-28T23:59:03 1743206343

Does Google train AI on emails?

whats_a_quasar · 2025-03-29T00:00:12 1743206412

And also Sam Altman doesn't own either OpenAI or Reddit, lol.

rvba · 2025-03-29T00:17:04 1743207424

My understanding is that Google "owns" Reddit in the sense that they paid to use it as a source of training data. And Google paid Reddit so much that they have exclusive rights to that.

Probably this is the reason why all the reddit free public APIs are gone - to block scraping.

bag_boy · 2025-03-29T04:29:05 1743222545

Can you cite your sources on Altman not owning Reddit?

whats_a_quasar · 2025-03-29T20:04:10 1743278650

https://en.wikipedia.org/wiki/Reddit

Owners: Advance Publications (30%), Tencent (11%), Sam Altman (9%)

kristjansson · 2025-03-29T06:38:42 1743230322

"large-ish minority shareholder" != owner

fzzzy · 2025-03-28T22:57:42 1743202662

Not really, Facebook (Meta, whatever) has been an AI company for a long time.

saltysalt · 2025-03-28T23:02:43 1743202963

Their social network is not owned by a private AI company.

pests · 2025-03-29T17:01:08 1743267668

xAI is not a private company.

The AI company is public, but the social network was private.

saltysalt · 2025-03-30T08:24:13 1743323053

"xAI is a privately held company and is not publicly traded, therefore investing in xAI pre-IPO is only available to accredited investors."

Source: https://forgeglobal.com/xai_ipo/

pests · 2025-03-30T23:27:27 1743377247

Oops, thanks for the correction.

maxlin · 2025-03-28T23:45:10 1743205510

By that logic, GitHub, Stack Overflow, and the rest of the internet are also "only" training data.

X just produces extra valuable training data as a byproduct, like power plants create certain byproducts that can be sold. Good to see it going to Grok primarily, as other LLMs are far from being truth seeking with their built-in, documented, extreme bias.

saltysalt · 2025-03-28T23:50:32 1743205832

None of the companies you mentioned are owned by a private AI company, except X.

I can't think of any other example of an AI company owning its own social network; it's a fresh precedent.

maxlin · 2025-03-28T23:58:51 1743206331

That is irrelevant to the invalidity of your original statement. LLMs clearly don't have problems having their training data scraped from all those mentioned, regardless of their ownership.

saltysalt · 2025-03-29T00:03:33 1743206613

My original statement was from the perspective of the users, not the LLMs. Perfectly valid to empathize with them.

maxlin · 2025-04-03T08:44:53 1743669893

No, it isn't OK to patronize X users with a false precedent. X and Grok work very well together: one can ask questions and get relevant, and RECENT, posts by X users answering that query, something other LLMs can't really do.

Content created by X users is for X users to find either through their feed, basic search, or Grok. There's no foul play here, and how Grok uses data on X is not hard to defend even from a basic "better search" angle. Your "empathize" comment sounds like a "will someone think of the African children" kind of detached waste of breath, something the Chinese call "Baizuo".

saltysalt · 2025-04-07T09:50:35 1744019435

It's not patronizing, it's a statement of fact: X is the only social network owned by an AI company (xAI), one that has only one product (Grok), which is trained on data from X, which is user-generated data.

Now, you may not like that, but it's still real.

kristjansson · 2025-03-29T06:36:25 1743230185

Aligning with the distribution you happened to be able to sample from is not 'truth seeking'.

madeofpalk · 2025-03-28T22:56:38 1743202598

I presume they officially were before. And just unofficially for every other model, as all our posts online are.

saltysalt · 2025-03-28T23:06:31 1743203191

Indeed!