ChatGPT and the Enshittening of Knowledge (castlebridge.ie)
277 points by bootyfarm on Jan 27, 2023 | 285 comments


If you think of the knowledge base of the internet as a living thing, ChatGPT is like a virus that now threatens its life.

This is the same process SEO spam inflicted on search - it degrades the signal the system depends on, and the river has to reroute (PageRank, then usage metadata) to replace what was lost.

ChatGPT is more of an existential threat because it will propagate to infect other knowledge bases. Wikipedia, for example, relies on "published" facts as an authority, but ChatGPT output is going to wind up as a source one way or another. And worse, ChatGPT will then digest its own excrement, worsening its own results further.

All signs point to this strengthening the value of curation and authenticated sources.


> ChatGPT is more of an existential threat because it will propagate to infect other knowledge bases. Wikipedia, for example, relies on "published" facts as an authority, but ChatGPT output is going to wind up as a source one way or another. And worse, ChatGPT will then digest its own excrement, worsening its own results further.

This is what people have done collectively since long before any GPTs were in sight. Lots of the strong convictions people hold today and publish all over the place are the re-processed excrement of long-gone mental viruses of past civilizations.


Automation and scale create their own threats.

Security cameras have existed for a long time, but storage cheap enough to keep years of footage and algorithms capable of processing thousands of streams in real time create massive privacy problems that didn't exist even with the richest companies paying humans to watch.


>Automation and scale create their own threats.

I don't know why such a simple fact needs to be repeated over and over again. It's either naivete or malice that makes people ignore that fact.

A change in scale can easily lead to a change in kind. A party popper and a flashbang are functionally the same thing, but their scale makes them have wildly different implications.


Another example is the police. Most people agree the existence of a police force to enforce laws is a good thing (society would function very differently otherwise). But if there were a policeman for every person on the planet, following them 24x7 and enforcing every possible law on them, not so much anymore.

Quantity is a quality all of its own.


On the other hand, why have a law if it’s not meant to be enforced universally and consistently?

When laws are applied selectively, it creates unequal experiences across the population.

No one wants the tyranny of oppressive applications of overbearing laws. So, in those instances, change the law to be fair enough and compassionate enough that it can be applied in all instances where the letter of the law is broken.

And obviously privacy is important and ubiquitous surveillance would undermine our ability to enjoy life. But in public spaces, consistently applying fairly written compassionate laws wouldn’t necessarily be a bad thing.


Because the real world has nuance and is not black and white. Humanity relies on people using their judgment; trying to make absolute laws with zero tolerance has been a failure everywhere it's been tried. It is impossible to enumerate all reasonable exceptions, and impossible to specify exceptions precisely enough that bad actors can't exploit them.

If you make the rules overly strict and enforce them universally, you end up with people in jail for offenses no one cares about.

If you make the rules at all loose, bad actors instantly seize on any loopholes and ruin the commons for everyone.


Yeah, but that’s why you have a “human in the loop”, to handle the infinite number of edge cases. You’d never want end-to-end AI for anything mission critical like justice.


You have a human in the loop explicitly to, in your words, not "enforce universally and consistently".

* Most people agree that stealing from a store is wrong.

* Most people agree that opening food/medicine and consuming it in the store before paying is stealing

* Most people believe that helping those in a medical emergency is important

If I was in a store and saw someone going into hypoglycemia and grabbed a candy bar and handed it to them, or if they were having a heart attack and I grabbed a bottle of aspirin and opened it to give them one, I am committing a crime. Most reasonable people would say that even if a police officer was standing in front of me watching me do it, I should not be charged.

That's why we don't want universal enforcement.


> Most people agree that opening food/medicine and consuming it in the store before paying is stealing

In my jurisdiction, that is only stealing if you do it with intention of not paying for it.

Sometimes I go to the supermarket, pick a drink off the shelf, start drinking it, take the partially drunk (or sometimes completely empty) bottle to the checkout to pay. Never got in trouble, staff have never complained - I know the law is on my side, and pretty confident the staff training tells them the same thing.


If you’re depending on sussing out people’s intent then you’re accepting that we can’t be clear/zero-tolerance about it. If you catch me stealing and I just go “oh no dude, I was totally going to pay” but you don’t believe me, what then? You can’t possibly know what my actual intention was.


The physical design of the store makes it clear in most cases. The checkouts form a physical barrier between the “haven’t paid yet” area and the “have paid” area. It is difficult to assume an attempt to steal in the former, much easier once one passes to the latter with unpaid goods.

The legal definition of theft - at least where I live - is all about intention. It involves an intention to deprive another of their property. No intention, no theft. If you absent-mindedly walk out of a store without paying for something, no theft has occurred. When our kids were babies, we used to put the shopping in the pram. One day I left the supermarket and, down the street, discovered in a different section of the pram a loaf of bread that I’d forgotten to pay for. I went back and explained myself to the security guard; did he call the police? No, he commended me for my honesty and let me pay for it at the self-serve checkouts.

For a supermarket, their biggest concern with theft is the repeat offenders. If it is an unclear situation, it is in their best interest to give the customer the benefit of the doubt. But, if the same unclear situation happens again and again, that’s when the intent (which is legally required to constitute stealing) becomes obvious. Ultimately though, it is up to the store staff, police, prosecutors and magistrates to apply a bit of common sense in deciding what is likely to be intentional and what likely isn’t. But yes, given theft is defined in terms of inferring people’s intentions, “zero tolerance” is a concept of questionable meaningfulness in that context.


“I forgot it was in my pocket.”

And yes, I do realize that intention is part of the law. That wasn’t what I was saying really. I am saying that because we have that, we are implicitly accepting that a lot of this stuff cannot be ironclad. There has to be room for interpretation and enforcement.


This is where the law ends up discriminating in practice. The law professor who claims “I forgot it was in my pocket” is far more likely to be believed than the homeless person who makes the same claim. If it makes it as far as the prosecutors - and it probably won’t - they’ll see the homeless person as an easy win (gotta make that quota, keep up those KPIs), the law professor’s case will be put in the “too hard” basket.

Unless they have the law professor on video “forgetting it was in their pocket” again and again and again. With enough repetition, claims that it was an accident cease to be believable. Although then the law professor will probably have three esteemed psychiatrists willing to testify to kleptomania, and the case will go back in the too-hard basket again.


>This is where the law ends up discriminating in practice. The law professor who claims “I forgot it was in my pocket” is far more likely to be believed than the homeless person who makes the same claim.

Totally agree.


> That's why we don't want universal enforcement.

We’re on the same page there.

> if they were having a heart attack and I grabbed a bottle of aspirin and opened it to give them one, I am committing a crime

Only if the store insists you pay for it and you refuse. And maybe the law needs to be rewritten to include some type of “good Samaritan eminent domain” clause.

But let’s say you misdiagnose the incident and the stranger refuses the medicine and you refuse to pay. Even then, the punishment for tampering with a product should be a small fine.

Laws could have linear or compounding penalties to account for folks that tamper with greater numbers of products or over multiple instances in a given time period.

But if there’s an automated system that catches people opening products and alerts the property owner or police then they could decide if it’s a high enough concern to investigate further.

But the alert would be the end of the AI involvement.


I think the main problem is not in universal law enforcement but in constant surveillance which is a bit orthogonal.

Why should people be under constant surveillance even at times when they are not breaking any laws? Why should someone else have access to every moment of your life?


Good point, but I think my main point still stands: being occasionally surveilled by the police is OK (I don't mind them looking at me in public places if I'm near them), but if you scale this up to constant surveillance it's a very different story.


This is a fantastic question for the NSA!


>A change in scale can easily lead to a change in kind. A party popper and a flashbang are functionally the same thing, but their scale makes them have wildly different implications.

What a fantastic example. Borrowing this for sure.


Yes, quantity is a quality all its own.


Sure, but humans can’t do it at nearly the rate that GPT can, and GPT will never be applying critical thought to the memes it digests and forwards on, while humans sometimes do.


> and GPT will never be applying critical thought

Just wait. This party has just begun. 90% of humans will be like school kids in comparison to the critical thought of GPT-5.


We are talking about a model that, at its core, predicts the next word in a sentence from statistics over an existing corpus. That gives it the ability to find and summarize existing content in relation to a prompt beyond what humans could do, but I still see no critical thinking there.
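
For anyone curious what "statistics of the next word" looks like mechanically, here is a minimal sketch of autoregressive sampling, assuming the Hugging Face transformers library and the public gpt2 checkpoint (the prompt is just an example; ChatGPT's actual serving stack is of course far more elaborate):

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    # Start from a prompt and repeatedly sample one token from the model's
    # predicted distribution over what comes next.
    ids = tokenizer.encode("The knowledge base of the internet is", return_tensors="pt")
    for _ in range(20):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]      # scores for the next token
        probs = torch.softmax(logits, dim=-1)      # distribution over the vocabulary
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))

Every generated token is sampled conditioned on everything before it; whether you want to call that loop "critical thinking" is exactly the disagreement here.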


This isn't exactly accurate. It's not creating one word at a time; that's the illusion given by the way it renders the text on the screen. If it did, it would be impossible for it to create code that compiles, for example.


It's not the same. This is something I've observed many times but have never quite been able to put a name to it.

When you lower the friction of an action sufficiently, it causes a qualitative change in the emergent behavior of the whole system. It's like how a little damping means the difference between a bridge you can safely drive over versus a galloping Gertie that resonates until it collapses.
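
To make the damping point concrete, this is textbook driven-oscillator physics rather than anything specific to information systems: for displacement x, damping ratio \zeta, and natural frequency \omega_0,

    \ddot{x} + 2\zeta\omega_0\,\dot{x} + \omega_0^2\,x = F_0 \cos(\omega_0 t)

the steady-state amplitude at resonance scales like 1/(2\zeta). Any positive \zeta keeps it bounded; as \zeta \to 0 it grows without limit. The fate of the bridge hangs on a small parameter, and that is the shape of the claim about friction here.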

When a human has to choose and put some effort into regurgitating a piece of information, there is a natural decay factor in the system: people will sometimes not bother to repeat something if it doesn't seem valuable enough to them. Sure, things like urban legends and old wives' tales exploit bugs in our information prioritization. But, overall, it has the effect of slowly winnowing out nonsense, misinformation, and other low-value stuff. Meanwhile, information that continues to be useful continues to be worth the effort of repeating.

Compared to the print and in-person worlds before, things got much worse just with social media where a human was still in the loop but the effort to rebroadcast was nil. This is exactly why we saw a massive rise in misinformation in the past couple of decades.

With ChatGPT, and humans completely out of the loop, we will turn our information systems into galloping Gertie, and they will resonate with nonsense and lies until the whole system falls apart.

We are witnessing the first cracks now. Look at George Santos, a candidate who absolutely should have never won a single election but managed to because information pipelines about candidates are so polluted with junk and nonsense that voters didn't even realize he was a con man. Not even a sophisticated one, just a huckster able to hide within the sea of information noise.


Humans still have to choose to transcribe ChatGPT noise into Wikipedia, because automated attempts to do so will be too easy to identify and squash.

Wikipedia already does organization- and source-level IP blocking for input sources that have proven sufficiently malicious.


The question is, then, is the human-borne friction enough to slow the diffusion of GPT-derived "knowledge" back onto Wikipedia through human inputs? It is very easy to imagine that GPT-likes could apply misinformation to a population and change social/cultural/economic understandings of how reality works. That would then slowly seep back into "knowledge bases" as the new modes of reasoning become "common sense".


Wikipedia content requires citation.

I think the worst-case scenario is that some citable sources get fooled by ChatGPT and Wikipedians will have to update their priors on what a "reliable source" looks like.


sure, we need dampening in our information systems and our social trust systems. it's clearly not there now. if the problem gets out of hand to the point we're forced to address it, i think that's a good thing overall.


> But, overall, it has the effect of slowly winnowing out nonsense, misinformation, and other low-value stuff. Meanwhile, information that continues to be useful continues to be worth the effort of repeating.

Unfortunately, in some (many?) cases the very fact that some "information" exists is the "usefulness", independent of the usefulness/accuracy of the information itself. The unsubstantiated "claim" of crime being up can result in more funding for police, even if the claim is false. There are people profiting from the increase in police spending; they don't care whether the means used to obtain it are true or not.

Over the long term, the least-expended-energy state, accepting the truth, will win out, but people have some incentive/motivation to avoid that in the shorter term.


Quantity has a quality all its own.

But also, this is an "AI", not human thought. Why conflate the two as if they are equivalent? We are not at the point where machine learning is smarter or produces better quality content than humans.


Assuming that is bad, it is then very bad that we are now accelerating that process by many orders of magnitude.


>This is what people do collectively, long before any GPTs were in sight.

Very insightful of you to say that. (Though I must say that it is not a recipe for happiness, AKA ignorance is bliss and all that... )


This is so on point. While everyone was arguing that LaMDA couldn't possibly be conscious a few months back, I was asking: what if we're not conscious?


And it's a bad thing, so automating that process can only be a worse thing?


Yep, not sure what the panic here is, ChatGPT is probably churning out better quality stuff than the average SEO spammer. The internet has been mostly garbage for a very long time at this point.


I think this is an interesting take actually. Content on the internet is in a steep downward spiral.

Masses of spammers and SEO hackers are filling the tubes with garbage. There are still some safe-ish havens, but those bastions can only survive the onslaught for so long.

We need a new internet at some point relatively soon. Maybe ChatGPT will accelerate the demise of this one to force the creation of some new paradigm of communication and dissemination of knowledge.


> All signs point to this strengthening the value of curation and authenticated sources.

This is what they said about Wikipedia vis-à-vis Britannica… alas, it’s a brave new world out there… nowhere to run to, nowhere to hide. See that Weizenbaum post also on the homepage now, as another commenter quotes[0]:

> Writing of the enthusiastic embrace of a fully computerized world, Weizenbaum grumbled, “These people see the technical apparatus underlying Orwell’s 1984 and, like children on seeing the beach, they run for it”

> a point to which Weizenbaum added “I wish it were their private excursion, but they demand that we all come along.”

[0]: https://news.ycombinator.com/item?id=34546689


>> All signs point to this strengthening the value of curation and authenticated sources.

> This is what they said about Wikipedia vis-à-vis Britannica...

And they were right. If "they" were wrong about anything, it was the assumption that the masses would prioritize quality over cost, but it turns out that cheap wins every time. When it comes to information, it's like most people's taste buds don't work, so they'll pick free crap over nutritious food.

Edit: Another thought came to mind: stuff like ChatGPT may contribute to killing off Wikipedia: Wikipedia is currently the cheapest and fastest way to find information (that's often crap). However, if something like ChatGPT can get information to people faster (even if it's crappier, just as long as it's minimally acceptable), Wikipedia will become much less popular and could end up just like Britannica.


Were they right, though? I am pretty sure I've seen research that compared the scope and accuracy of both, and Wikipedia was miles ahead.


In most areas you're correct, but the political slant/bias of Wikipedia is fairly blatant and is getting worse.

See for example: https://dash.harvard.edu/handle/1/41946110


The political slant is anything but blatant, because all of the complaints focus on Republican/Democrat wedge issues where both sides are conjuring alternate realities to sell to their bases like soap operas.

The real political slant is put on Wikipedia by governments and companies that are willing to consistently employ people to cultivate and maintain bias (and to provide the technical support to prevent them being easily caught). They make sure certain things aren't mentioned in particular articles, i.e. when they are added, they delete them; and when those deletions are controversial, they take advantage of the pseudonymous voting system. They reinsert untruths (and in the worst cases can even commission references for them). They pay attention to articles that are rarely visited by experts, or rarely visited at all, to make sure that when the article subject is in the news, the first facts that people find are friendly facts (which then shape the news coverage, and provide an instant POV for lazy pundits).

The public really has no chance against this; the only time that bad actors run into serious difficulties is when they encounter their own counterparts, working for their enemies.

Wikipedia's failure mode is the same as Reddit's, or any other forum that allows anonymous control over content or moderation. It's cheap for hugely resourced governments, companies, and individuals to take it over. The price of one tank would keep a thousand distortions on Wikipedia indefinitely.


What about political biases of those who publish traditional encyclopedias?


So it's six of one and half a dozen of the other? Not much evidence of that.

Mediabiasfactcheck.com says 'These sources (Britannica) consist of legitimate science or are evidence-based through the use of credible scientific sourcing. Legitimate science follows the scientific method, is unbiased, and does not use emotional words. These sources also respect the consensus of experts in the given scientific field and strive to publish peer-reviewed science. Some sources in this category may have a slight political bias but adhere to scientific principles'

On the other hand, https://www.newsmax.com/us/wikipedia-liberal-activist-websit... outlines some serious problems with Wikipedia. Here's one.

'Established leftist outlets The New York Times and BBC News are the most cited sources, around 200,000 stories. The Guardian, an equally left-wing outlet, is cited third at almost 100,000 citations'. Among the top 10 most-cited, only one was right-leaning.


Though it is worth noting that they conclude

> The bias on a per word basis hardly differs between the sources because Wikipedia articles tend to be longer than Britannica articles.


If reality has a left-wing bias, then you still have to control for that confounding effect.


Extending this: it seems to me there will be a growing need for the skills to identify quality/accuracy.

Half-formed thought: the proverbial haystack just got a lot larger while the needle stayed the same size. What tools will needle hunters need to develop, both to find the needles and to prove to others that they are in fact needles?


The funny thing about ChatGPT is it will write code that uses non-existent, confabulated APIs. You have to then call it out, and it will say, oh sorry, of course you're right, here's another confabulated API, etc. The amount of convincing B.S. it can spew is enormous!

Worse, when you stray into topics that are controversial it will often use informal fallacies. When you call it out, it will say, yes, you're right, I used an informal fallacy, here is what I should have said about the controversial topic that's not the party line, and because you're so smart I won't B.S. you.


I don’t know why more people don’t notice this. I feel like nearly everyone talking about ChatGPT hasn’t really pushed it very far or read what it says very closely. It’s actually pretty terrible.


Running with the analogy... We’re used to using tools that help us find needles in the haystack. A rake, a bright light, a metal detector. Now someone sold us a machine that turns hay into needles. They’re not quite as good, but they’re definitely needles. So now, as you say, the haystack is going to get covered in these. Do we ban use of this machine? Build new machines to separate these synthetic needles from real ones? Or improve the machine so that the needles it makes are good enough for what we need?


In fairness, Wikipedia isn't just about cheap. It also covers a lot more than Britannica and covers more current events/information (somewhat to a fault, as current events are what drives a lot of the bias). I suspect a lot of people would use Wikipedia even if Britannica were free.

And while Wikipedia has its problems with current events perhaps especially and can be a bit hit or miss, overall it's pretty good these days and--so long as articles are well-sourced--can be a good jumping off point for more serious research.


Most people's taste buds don't work? We love sugar even when our bodies can't take any more?


It's true though, Wikipedia really is terrible and full of fake citations that lead nowhere. It's an anti-knowledge base that sometimes has good information.


> It's true though, Wikipedia really is terrible and full of fake citations that lead nowhere. It's an anti-knowledge base that sometimes has good information.

Yeah, Wikipedia is garbage puffed up beyond all belief. I literally just today saw something just like you describe.

It should be viewed very skeptically on anything anyone disagrees over (because then it's just snapshots of an agenda-pushing battle).


Could you elaborate on this? If it is full of garbage a couple examples should be very easy to find.

I completely agree that Wikipedia can have errors, but in topics that I am educated in it seems pretty decent and I can't remember the last time I came across any (comp sci for example).

The most recent example I can think of is an article on vulture bees, with a citation about what their honey tastes like, which turned out to be garbage and incorrect (when I queried journal articles on the topic, I found there are no reliable sources on the qualities of the honey; its basic composition and method of production are even in dispute).

So "garbage puffed up beyond all belief" and "full of terrible and fake citations that lead to nowhere" sounds a bit hyperbolic, tbh.


> Could you elaborate on this? If it is full of garbage a couple examples should be very easy to find.

I could give examples but I won't, because that would link my HN and Wikipedia accounts.

> So "garbage puffed up beyond all belief" and "full of terrible and fake citations that lead to nowhere" sounds a bit hyperbolic, tbh.

People unironically describe it as the "sum of all human knowledge," so it's definitely puffed up beyond belief. In reality, much of it is a slow battle of tendentious agenda-pushing, by people with weird personalities, played according to an arcane rule book (the first unstated rule of which is to never, ever acknowledge that you're pushing an agenda). That doesn't taint all of it, but it taints far more than you'd think.


I mean... There are lots of ways to browse Wikipedia offline without having your IP recorded, not to mention using something simpler like a VPN.

https://en.wikipedia.org/wiki/Wikipedia:Database_download

I can understand if this is just too much effort to put into an online discussion though, I probably wouldn't bother myself.

And yeah there have been lots of scandals with Wikipedia. This one was pretty infamous:

https://www.theguardian.com/uk-news/2020/aug/26/shock-an-aw-...

Ironically I think your attitude probably protects Wikipedia quite a bit, and from that perspective I'd like to see more of it. The less people see it as a good source of information, the less incentive there is for all of the agenda-pushing you've described (which also definitely happens).

I still think the bulk of it is pretty decent though, on non/less-polarizing subjects, which describes most of it, IMO.

My main issue is that most articles are an inch deep. I find myself using textbooks and journal articles more often these days, while sailing the open seas as this would otherwise be cost prohibitive.


If we look at the Vulture Bee article, it cites a couple of semi-relevant journal articles (which do exist but are not exactly on point for a general citation), but then it inflates the number of citations pointlessly by citing multiple pop news articles that all cite one of the previously cited research papers, some of which just blogspam-link to the other useless popular science magazine citations. https://en.wikipedia.org/wiki/Vulture_bee

In many history articles, there are random citations to web pages without any provenance that claim to be translated documents. Sometimes this is done despite the existence of reliable public databases of such documents available through universities, foundations, and governments. Then there is the link rot problem which gets worse over time.


The link rot problem is real but Wikipedia editors have _diligently_ institutionalized automated use of the Internet Archive and other snapshotting sites (but the IA is the best one & deserves donation support). So compared to the average among other sites, Wikipedia has much less of a link rot problem.


> I could give examples but I won't, because that would link my HN and Wikipedia accounts.

How?


THIS: "people with weird personalities"

Seen it way too many times for it to be a coincidence.


> Seen it way too many times for it to be a coincidence.

It's definitely not a coincidence. Wikipedia is structured to actively select for it.


I think I know what you mean but "weird" is subjective.


You could use a throwaway account, if you really have those mindblowing examples to share.

So far, none of the claims that Wikipedia is a pile of shit have had a real basis, to me. And political topics are controversial by their nature. There are authoritative sources saying Marxism (or capitalism, or whatever) is good, and others saying Marxism (or capitalism) is bad, so which side should Wikipedia present? It struggles to cover the middle ground of scientific consensus, saying these said this and those said that. That is why scientific articles about biology or physics are way better, of course. But sure, in its current state Wikipedia is good for an overview of a topic; to dive in, you should read the quoted sources.

Usually the first thing I do when I encounter something new is indeed to check Wikipedia. And I am glad it exists. I know I cannot believe it fully, but I still trust it way more than some random site that might be better, but how would I know that at first glance?

To really study, I read the scientific books and papers about a topic and Wikipedia is a good start for that.


Just use a throwaway


> Just use a throwaway

Too late now.


Wikipedia is generally excellent on established technical topics in science and math and the like. Where people seem to have issues with it are topics with more controversy -- a history of nation X can be seen as wrong or biased by people of country Y because it may refer to borders, causes of wars with its neighbors, etc. Even citations don't really help, because critics will claim the citations are biased. Obviously, printed encyclopedias also had this issue, but typically people just accepted that Britannica would support the US/UK view of the world.


While claims of Wikipedia's awfulness may be overstated, I do see a lot of problems. And while I am picking on Wikipedia I don't think it's useless, but it does require caution.

On the last Wikipedia page I visited (Elder_Mother), someone had, years ago, removed all of the citations from the article. These were websites that contained much more and higher-quality content than the wiki page itself, and they had been cited from the original page creation. I only found the citations by chance, because I decided to look at the page's history. This poor curation isn't just bad for the usefulness of Wikipedia; it's borderline plagiarism, since the entire article was composited from paraphrasing.

Before that, I saw a Wikipedia page (The Voyage of Life) that admitted its own plagiarism. The page had a big disclaimer at the top, "This page might contain plagiarism", but more delicately worded. So somebody noticed the verbatim plagiarism, added a flag, and then nothing.

Another issue is the lack of expertise, which leads to misleading wishy-washy statements. The page for slugs, talking about control, says crushed eggshells, "are generally ineffective on a large scale, but can be somewhat useful in small gardens." This is false, eggshells are ineffective in all gardens. But to avoid edit wars the language has to pussyfoot around sensitive topics like gardening advice.

Stemming from the lack of expertise, Wikipedia itself becomes out of date without curation. The problem is while it claims to be more up-to-date than printed media, there's no easy way to identify how significant the information on a page is. If I go to an article am I reading things that were written 20 years ago or 2 years ago? Is the material presented relevant in 2023? Was it ever significant to begin with, or did the author happen to have knowledge and interest in something obsolete?

Most pages are also, I think, poorly organized (Partial differential equation). I believe a single voice and more effort to write articles for a well-defined audience would help immensely, specifically with math and science pages. Wikipedia keeps trying to condense complex material from a textbook into an encyclopedia article format, and it's not working out.


> Stemming from the lack of expertise, Wikipedia itself becomes out of date without curation. The problem is while it claims to be more up-to-date than printed media, there's no easy way to identify how significant the information on a page is.

That's an interesting point. A lot of Wikipedia articles seem to be stuck in the late 2000s (2005-2010). When it was new, a lot of people had fun banging out new articles, but then those got more-or-less abandoned. It doesn't help that their population of dedicated "editors" has really dropped off from those highs and is in long-term decline.


Examples: anything that's political.

Let's take for example the article about Patrisse Cullors (of BLM fame). A video surfaced of her saying "I am a trained Marxist". If you look at the archives[1], many people wanted to include this. But it was rejected with such ridiculous arguments as: "it is entirely unclear what a 'trained Marxist' actually means [...] She doesn't say anything like 'I am a Marxist' "

[1]: https://en.wikipedia.org/wiki/Talk:Patrisse_Cullors/Archive_...


That is a pretty hyperbolic statement, but I found that e.g. Brent's root-finding algorithm on Wikipedia was not good: it looped rather than exiting when using an error tolerance of EPS (blowing up in the max-iteration check), while finding a more battle-hardened implementation and copying that algorithm worked much better. I never did the work to determine exactly where the bugs were in the Wikipedia page's algorithm, though.
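
(For what it's worth, a battle-hardened Brent implementation ships with SciPy, which is a safer starting point than transcribing pseudocode off a wiki page. A minimal usage sketch; the polynomial is just an illustrative example:)

    from scipy.optimize import brentq

    f = lambda x: x**3 - 2*x - 5            # classic test function, root near 2.0946
    root = brentq(f, 2.0, 3.0, xtol=1e-12)  # needs a sign-changing bracket [a, b]
    print(root)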


It's telling that you only cited examples of scientific subjects. As the other commenter noted, articles of consequence for public debate (politics) are generally terrible and there are lots of "editors" who are working for deep state cut-outs doing nothing but trying to damage the reputation of intellectuals who are a danger to the status quo.


Yes I agree, it is telling.

It is similarly telling that conservative wikis have barely any articles on core topics like engineering, mathematics, philosophy, and the sciences. These intellectuals you're describing oddly don't seem to have much interest in things most people would deem intellectual...

For example, compare the Wikipedia article on Leonhard Euler with that of Conservapedia... It's so absurd I had to double-check that the self-proclaimed "conservative Wikipedia" wasn't satire.

Probably a false flag by the deep state though. Conservapedia has more on that than the entirety of linear algebra and computer science, lol.


> It is similarly telling that conservative wikis have barely any articles on core topics like engineering, mathematics, philosophy, and the sciences. These intellectuals you're describing oddly don't seem to have much interest in things most people would deem intellectual...

It's not really telling; it's just a path-dependent artifact of how those projects are positioned in the "ecosystem." When you have a "mainstream" site that's a little biased against some ideology, it monopolizes the general-interest/popular users. A competitor that sets itself up to answer that bias will only be able to attract a user base that's heavily skewed towards very ideological users who found that bias intolerable, because the general-interest users aren't motivated to leave for it.

If Wikipedia had a subtle conservative bias, a hypothetical "Leftopedia" would be similarly full of liberal axe-grinding and weak on general-interest topics.


Wow, I didn't even know Conservapedia was a thing. Although conservative interests have way more money to throw around, so not surprising someone would sponsor such a dead-end project. Similar to their wiki directory of lefty intellectuals; can't remember the name.


I wouldn't say it's garbage, just that the quality varies a lot. Uncontroversial stuff is quite accurate.

A simple metric I use is: how long the talk page is. If a talk page has 15 archives then the article page is probably politically biased hot trash.
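
That heuristic is easy enough to automate. A rough sketch against the public MediaWiki API, counting the archive subpages of an article's talk page; the example title, the threshold, and the use of the requests library are just illustrative:

    import requests

    def talk_archive_count(title: str) -> int:
        # Count pages named "Talk:<title>/Archive..." via a prefix query.
        resp = requests.get(
            "https://en.wikipedia.org/w/api.php",
            params={
                "action": "query",
                "list": "allpages",
                "apnamespace": 1,                # namespace 1 = Talk:
                "apprefix": f"{title}/Archive",
                "aplimit": "max",
                "format": "json",
            },
        ).json()
        return len(resp["query"]["allpages"])

    # 15+ archives suggests a politically contested article, per the heuristic above.
    print(talk_archive_count("Climate change"))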


Unfortunately this heuristic, while often good, can sometimes lead you wildly astray — for examples see the article on Beyoncé and the article about the plant _Zea mays._ Good articles, but you want a hazmat suit for the talk pages.


Did you take a moment to improve the article you had an issue with? That's kind of the point...


Wikipedia problems with political articles are caused by people actively biasing the articles. You cannot just modify the article; it'd get reverted for, at best, going against consensus.

Someone gave an example above where a person calling herself a trained Marxist was not accepted as evidence that she is a Marxist. Do you seriously think that editing the article to include the reference would be allowed?

Furthermore, the point is that Wikipedia has a systematic problem. Individual instances that people point out are examples. It would be impossible to fix the whole problem yourself and saying "that example doesn't count because you can fix it yourself" is just a way of ignoring examples, not dealing with the problem.


No, I suppose you're right. I guess I had never considered going to wikipedia for content of that sort before.


There's also tons of censorship on Wikipedia based on nothing but ideology. Just look at how the Grayzone can no longer be used as a source, based on claims it is "state-affiliated" media, despite ZERO evidence after literally years of such BS claims and now documented evidence of Western states targeting them (look up the exposé on Paul Mason) because they report inconvenient facts about what the security state is doing. There are many more lower-profile harassment campaigns carried out by "editors" looking to smear intellectuals (especially on the left) so that they can't get speaking gigs, print articles in major media, etc. Jimmy Wales himself went after the Grayzone. https://thegrayzone.com/2020/06/11/meet-wikipedias-ayn-rand-...


A group calling themselves "Guerrilla Skeptics" has worked to bias Wikipedia against what they consider badthought. E.g., they deleted the page of a certain author because they feel he's a kook. (Granted, he is pretty kooky by some standards, but that's not the point, eh? He still has a page on the German Wikipedia if you're curious: https://de.wikipedia.org/wiki/David_R._Hawkins The point is that on English WP he's been erased (not to say "cancelled", eh?), not because he's not notable, but because his work offends a fringe group of fanatics.)

https://www.wired.com/story/guerrilla-wikipedia-editors-who-...


What state is Grayzone supposedly affiliated with? And what of Bellingcat? I have heard it's state-affiliated. Is it an accepted source? What determines state affiliation? This kind of stuff has soured me tremendously on Wikipedia. Post it all and let me sort it out, I say.


They are all Kremlin (or previously Assad) assets, according to groups like Bellingcat, for which there is plenty of hard evidence of state control/funding.


Yes, the problem with Wikipedia is that it doesn't allow enough fringe, pro-violence conspiracy theorists! Nailed it!


That's an amazing take, since Grayzone is very explicitly anti-war, and one of the very few US publications taking an active stand against US proxy wars. If anyone is a pro-violence conspiracy theorist, it is the "paper of record" (NYT), which has actively supported/facilitated virtually every gruesome military intervention of the US for well over a century.


Wikipedia is somewhere between useless and actively bad on anything controversial. I remember checking the discussion on a famous-ish human trafficking case. The moderator straight up refused to consider new reporting because he considered the whole thing settled by the courts.

That kind of thing has ironically been made much worse by QAnon-style wackos. Anything not widely accepted is now treated as a conspiracy-theory psy-op.


Not only that, but it suffers from a persistent yet subtle liberal bias.


I can settle for "these people did things that caused lots of harm" but how dare you call them "bad"!!


There is a world in which AI will be the best source of knowledge (the most powerful knowledge generator). There will be many LLMs & AIs, open and branded, and we'll pick our oracle. ChatGPT is an infant of an AI, and it will mutate and evolve beyond transformers. Some (many?) branches will be amazing at serving up "enshittened knowledge", but there will be branches that take different approaches and philosophies. There will likely be AI curators of knowledge bases that weed out AI-generated crap and disinformation. There will be non-hallucinatory AIs, certainty scores, explanation-based systems, first-principles machines, and super-focused additive AIs that will layer onto a base LLM (or whatever is next). We'll choose (and probably pay for) our blends of knowledge, humour, bias, filtering, and conviviality. The "internet" of tomorrow may run on TCP/IP, but it is very unlikely to work like this web that we are using now.


"A-grade bullshitter", as the article puts it, is pretty accurate. I thought I would test it and just asked ChatGPT if it knew the Voyager episode "11:59"; the answer got everything wrong: season, number, and date, all incorrect.

>"11:59" is an episode of the science fiction television series Star Trek: Voyager. The episode originally aired on February 9, 2000 as the 11th episode of the sixth season.


What’s more, ChatGPT also “knows” the Voyager episodes “10:59” and “12:59”, when individually asked.

On the other hand:

$ Are there Voyager episodes titled “10:59”, “11:59”, or “12:59”?

There are no episodes of Voyager titled "10:59", "11:59", or "12:59".


>And worse, then ChatGPT will digest its own excrement, worsening its own results further

I wonder if we'll get a "dead sea effect" with AI. I've seen some stuff saying they've basically run out of high-quality training data, and now the training pool will get poisoned by AI-generated shit. Basically garbage in, garbage out; these large language models might not be able to improve.


Maybe, but there are something like 700,000 books published on average each year, and almost 2 million scientific journal articles. Let's not even consider newspapers.

Of course, some of those books will definitely be AI generated or garbage quality, and we all know many of those journal articles can be worth less than the paper they're printed on.

Yet even if we cut it down to 100,000 books and half a million scientific papers, that's a lot of training data each year... And that is just considering print media; there are other ways to get more content too.

For example, there is also transcription of video/podcasts/tv-shows/movies, etc. along with descriptions of the scenes for video, which could be used to generate a lot more stuff.

With people speaking to their devices and using speech-to-text more often, that's another source too--wouldn't be surprised if some devices just start recording conversations and transcribing them.

Seems like a ton of potential data sources to me. Although it will certainly get more difficult to cull AI-generated stuff to prevent feedback, I'm sure the tooling will evolve to enable easy AI content detection and exclusion.


Journal articles (and newspapers) are also plagued by bullshit.

Humans are capable of producing intelligent-sounding word salads just like ChatGPT can.


Yep that's why I said let's not even consider newspapers. Many of those have been using AI generated/content-mill/sponsored content for years and years.

Also why I acknowledged journal articles can be worth less than the paper they're printed on. Even if you were to select for reputable, high impact journals, those also often experience scandals, retractions, potential data fabrication, etc.

But then there are textbooks and technical publications, also being published in the hundreds of thousands globally each year.

The fact is that with billions of human beings on the planet, and media increasingly being digitized by default, and AI-content detection, I don't see how we could possibly run out of new content to grow LLMs...


Pre-ChatGPT datasets will become prized commodities; this sort of AI will be trapped in a stasis of pre-2023 pop culture, as subsequent AIs will need to use datasets that hadn't yet been contaminated by pervasive ChatGPT spam.

It will be like low-background steel: steel that has somehow been isolated from atomic fallout from the mid-20th century onwards, and must be used for radiation-sensitive equipment: https://en.wikipedia.org/wiki/Low-background_steel

Except somehow worse, because it's just steel, this is culture.


Sounds very plausible, along with intensive research on how to decontaminate later samples.


We are nowhere near out of data. We're just out of hyper-relevant modern data. There is probably about 100-200 TB of old books, newspapers, journal articles, magazines, and so forth. For reference, GPT-3 was trained on 45 TB.


Just one order of magnitude seems pretty close to out of data, actually, if beyond that we're looking at the firehose of low-density, bulk-generated modern data.


Well, we still haven't really tapped video, which is arguably a much richer source of data on lots of things (especially on how things act in the physical world). And curation will likely help a lot.

And it's not like you can assume an indiscriminate crawl of the net is all human generated currently, anyway, let alone accurate. There's always cleaning involved.


Considering there are "GPT plagiarism" checkers, I don't think this will become an issue. I wonder at which point an extension will come out that will check a page's text if it was written by a human.
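
Worth noting that many of those checkers boil down to something like this under the hood: score the text's perplexity under a language model and flag suspiciously low values. A toy sketch, assuming the Hugging Face transformers library; the threshold is invented, and real detectors layer burstiness measures and trained classifiers on top:

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2")
    lm.eval()

    def perplexity(text: str) -> float:
        ids = tok.encode(text, return_tensors="pt")
        with torch.no_grad():
            loss = lm(ids, labels=ids).loss   # mean next-token cross-entropy
        return float(torch.exp(loss))

    def looks_generated(text: str, threshold: float = 30.0) -> bool:
        # Machine text tends to be "unsurprising" to the model: low perplexity.
        return perplexity(text) < threshold   # invented cutoff, for illustration

Formulaic human prose scores low too, which is where the false positives come from.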


Those checkers already have a significant failure rate of false positives/negatives, and that will only get worse as LLMs come closer to human output. Note also that a checker can in principle never outwit a state-of-the-art AI, because the AI can just incorporate and therefore preempt the checker logic.


And even with good checking it's still trivial to have GPT do 95% of the work and then make some stylistic edits to get around detection.


This is why the ChatGPT lawyer thing is far away. When you make an argument in a court filing as a lawyer, you as a person put your reputation on the line that that filing is not an argument that uses fake case law, or makes an argument that is nonsense. If that argument is nonsense and has nothing to do with the law or case law, you can lose your license to practice law. It happens with some regularity. People think that the legal system is an API they can spam. It's absolutely not. In fact, many rules in civil procedure are intended to make it a highly disadvantageous strategy to waste the court's time.


The amount of mistakes is what matters. I've seen videos where lawyers go crazy because the judge supposedly doesn't understand basic law. Either the judge or the lawyer is way wrong in those cases - and I suspect sometimes both are very wrong and even agree on the wrong opinion. It's like self-driving - it just has to make fewer mistakes than humans. I think a ChatGPT lawyer is actually very close and could be created even today if that is where its engineers put the focus. ChatGPT is trained on a wide variety of data and, right now, is essentially acting like Google, so it can answer nearly any question imaginable - but it doesn't need to have such a vague set of data to draw from. All it takes is training it on a very specific set of cleaned, accurate, and up-to-date data to make it an expert on a single specific topic.


And there are perverse incentives at play. Another article that made it to the front page today [1] reports how BuzzFeed's stock surged after they announced they would be “enshittening” their content.

[1] https://news.ycombinator.com/item?id=34544744


AI content spam will drive AI content analysis and filtering. The arms race between the two is like a meta-version of the GAN model, which means eventually spam will become indistinguishable from real content.


If ChatGPT just consumes itself, will the end result of all queries eventually recurse down to a single answer - for example, "42"?


OpenAI sells a pro subscription. I thought it was clever that they priced it at 42 dollars a month.


Alternatively, we will just rename ChatGPT as “Hungry Mungry” :-) [0]

[0] https://www.poeticous.com/shel-silverstein/hungry-mungry


I don't really understand this hypothesis as it assumes that information quality of AI generated content on the internet will drop as a result of ChatGPT, not increase.

The way I see it is that ChatGPT isn't the only tool out there that can create spam and junk content. The only difference is that ChatGPT produces something that's of a high enough quality that it's not as easy for a human to easily classify it as spam. And something you can't easily classify as spam arguably isn't spam.

If you assume that those incentivised to create spam today are creating spam anyway and all ChatGPT will do is allow spammers to create better spam then I don't see why the quality of content online would necessarily drop because of ChatGPT - you might actually find that what was once just spam is actually kinda interesting all of a sudden.

But it's not just the quality of AI spam that will increase with ChatGPT... Consider BuzzFeed... Arguably they're just paying people to write trash content today. And this is very common. Most companies have a blog where they pay someone to write mostly junk content just for SEO. I think ChatGPT might actually produce higher quality content than what is currently being written at places like BuzzFeed and on junk blogs. Or at least these workers now have a tool to write something that's higher quality.

I think the only way you're correct is if ChatGPT were to greatly increase the incentive to publish spam, resulting in a much greater amount of spam that counteracts the positive improvement in spam quality. And although I think it probably will increase the number of people producing spam content to some extent, I doubt it will have a net-negative impact.

Finally, I think what you'll see happen in future iterations of ChatGPT to improve quality and accuracy is that content will be fed in weighted by how authoritative the source is. This spam singularity that some are predicting, where the prior generation of spam bots produces the content that trains future generations of spam bots, makes no sense, given these companies are trying to create AI that doesn't just spit out spam and inaccurate information.


"I don't really understand this hypothesis as it assumes that information quality of AI generated content on the internet will drop as a result of ChatGPT, not increase."

It has to drop. ChatGPT can not source new truths except by rare accident.

I bet a lot of you are choking on that. So, I'd say this: Can you just "source" new truths? If you just sit and type plausible things, will some of them be right? Yes, but not very many. Truth is exponentially exclusive. That's not a metaphor; it's information theory. It's why we measure statements in bits, an exponential measure, and not some linear measure. ChatGPT's ability to spin truth is not exponentially good.
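
Concretely, as a back-of-the-envelope rather than anything rigorous: a statement carrying n bits of information singles out one possibility in 2^n, so a blind-but-plausible guess lands on the truth with probability

    P(\text{guess is true}) = 2^{-n}, \qquad \text{e.g. } n = 40:\; 2^{-40} \approx 9.1 \times 10^{-13}

A plausible-sounding sentence and a true sentence are samples from very different distributions, and the gap widens exponentially with the information content of the claim.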

A confabulation engine becoming a major contributor to the "facts" on the internet can not help but drop the average quality of facts on the internet on its own terms.

When it starts consuming its own facts, it will iteratively "fuzz" the "facts" it puts out even more. ChatGPT is no more immune to "garbage in garbage out" than any other process.

"Finally, I think what you'll see happen in future iterations of ChatGPT to improve quality and accuracy is that content will be fed in weighted by how authoritative the source is"

Even if authority is perfect, that just slows the process. And personally I see no particularly strong correlation between "authority" and "truth". If you do, expand your vision; there are other "authorities" in the world than the ones you are thinking of.


> It has to drop. ChatGPT can not source new truths except by rare accident.

How are we defining a "truth" here? For example, if I want to find a specific SQL query, which will work for my specific database schema and my specific version of MySQL, I won't find that online. Traditionally I'd need to come up with the new query for this novel scenario, or I'd need to ask someone to do it for me (perhaps on Stack Overflow). Now ChatGPT can come up with these new, novel queries instead. You're right that it can't do its own research and come up with fundamentally new information, but it can come up with answers to questions never before asked, based on what it can infer from existing knowledge.

I'd argue most of the useful stuff people do isn't coming up with things that are fundamentally new, but applying things that are known in new and interesting ways. If you're a developer this is what you probably do every day of the week. And ChatGPT can absolutely do this.

Secondly, I'd also argue regurgitation of known facts is not necessarily without value either. A good example of this is your typical non-fiction book or textbook. If you write a textbook about mathematics, you don't necessarily have to include new information for it to be useful. Sometimes the value comes from the explanations, the presentation, or a focus on lesser-covered topics. Again, ChatGPT can absolutely do this. It already explains a lot of things to me better than humans can, so in that sense it is an increase in quality over what I was already able to find online.

As for your point on authority, I do agree with you somewhat there. I suppose the point I was trying to make is that this isn't a blind process. There are content sources which you can under-weight, or simply not include, if they have a negative impact on the quality of the results. You can also improve algorithms to help the AI make better use of the information it's trained on. For example, if I asked you to read BuzzFeed for a week you wouldn't necessarily get any stupider, because you're able to understand what's useful information and what's not.

I think all you really need to ask here is whether the next iteration of ChatGPT is likely to provide better results than the prior iteration, and will the iteration after that produce better results again? If your answer is yes, then it suggests the trend in quality would be higher, not lower, as a function of time.

Finally, wherever AI is applied the trend is always: sub-human ability -> rivals that of the average human -> rivals that of the average elite human -> superhuman ability. Is language fundamentally different? Maybe? I think you can argue that generative AI is very different from the AI used for something like chess, but it would at least be unexpected if future iterations of this AI got progressively worse. Maybe this is the best ChatGPT will ever be at writing code. I guess I just think that is unlikely.

----

Btw, this is just how I see things likely playing out. Given how new the technology is my certainty isn't very high. I initially agreed with your point of view, but the more I thought about my reasoning the more my position shifted.


"How are we defining a "truth" here?"

Honestly, I'm not really impressed with "but what is truth anyhow?" as an argument method.

But in this case it really doesn't matter because regardless of your definition of truth, unless you use a really degenerate one like "The definition of truth is 'ChatGPT said it'", ChatGPT will not be reliably sourcing statements of truth.

"Finally, wherever AI is applied the trend is always:"

My statement is not about AI. My statement is about ChatGPT and the confabulation engine it is based on. It does not matter how you tune the confabulation engine, it will always confabulate. It is what the architecture does.

AI is not ChatGPT, or the transformer architecture. I can not in general disprove the idea of the AI that turns on one day, is fed Wikipedia, and by the end of the day has derived the Theory of Everything and a viable theory of FTL travel. What I will guarantee is that such an AI will not be fundamentally a transformer-based large language model. It might have some in it, but it won't be the top-level architecture. No matter how well equipped the confabulation engine gets, it won't be able to do that. It is fundamentally incapable of it, at the architectural level. This is a statement about that exact architecture, not all possible AIs.


>And something you can't easily classify as spam arguably isn't spam

Something that's not obviously junk but is entirely wrong is even worse than something that is obviously junk. It'll waste more time and probably convince more people of falsehoods.


> And something you can't easily classify as spam arguably isn't spam.

[...]

> I think the only way you're correct is if ChatGPT were to greatly increase the incentive to publish spam.

Arguably it still is spam. And consider the incentive to hide advertising (or, generally, to push any agenda) when using a program is orders of magnitude cheaper than paying people to do it, yet the output is now hard enough to recognize that I cannot say any more whether your average HN comment has been written by ChatGPT or not, as long as I am not specifically looking out for it.


I'm so glad I have a copy of 2008 Wikipedia stored on an old OLPC that I don't care to connect to the internet at this point.


Knowledge has never been a single global thing. It's always been individualized, and in the context of groups, the important question is "how long does it take someone to find information which is useful to them?". With regards to search engines we've been in decline for a few years now. It's not just you, the results are worse.

> All signs point to this strengthening the value of curation and authenticated sources.

This is the solution. Knowledge is a web of trust. The only root authority is you the individual. "Experts" and "authorities" are just heuristics. The widespread error that many are making is this: If there is a single objective reality, then curation can happen globally/objectively, not individually/subjectively.

What we need are more mechanisms for individual curation. A user should be able to inspect and understand the chain of believability, from one of their own highly vetted one hop experts, to a distant influencer, public official, or other source of (mis)information.


Yes, accessing high value data, information, and intelligence already commands ever higher premia. Already, most people are priced out of a lot of sources.


The premise being that the information is currently in a perfect state. It isn't. It's actually in a horrible state. ChatGPT might exactly have the opposite effect. It's able to detect conflicting patterns. It's able to alert about inaccuracies. It might actually help to improve the attempt of a knowledge base that the internet is supposed to be.


"If you think of the knowledge base of the internet as a living thing, ChatGPT is a like a virus that now threatens its life."

I am not sure if the situation is that dramatic, but just wait until advertisers find a way to get their "data" into ChatGPT results (or the like).

Then things will get really ugly.

So yes, this is what we will have to do:

"All signs point to this strengthening the value of curation and authenticated sources. "


I just envisioned a sci-fi movie like Terminator. Everyone in the world has great AIs. But at some point, when injecting the advertising code for Rad Cola, it goes wrong. The dude's AI keeps reminding him he should drink a Rad Cola with lunch. It lets him know all the celebrities he follows drink Rad Cola. Slowly it gets more insistent. Eventually the AIs take over the world, but in the name of benevolence, they only want to spread the gift that is Rad Cola and ensure that all humans are drinking it.


Sounds like a somewhat less dystopian (and more realistic?) riff on the paperclip-loving AI?


That's exactly what I think. It used to be an unpopular opinion on HN (as of a couple of weeks ago?) but it's changing fast, apparently.


We're going to have shady agencies in 5 years advertising: "get ChatGPT to respond to questions with your content for only $199"


“Detect ChatGPT responses, only $249”

It’s just a new chapter in the arms race of spam techniques vs detection. Lots of money presumably made selling both sides.


GPT has a hidden steganographic watermark in its output, so the arms race should be one-sided.


Do you know how that works? I am unfortunately pretty ignorant about this topic but that seems like such a cool and difficult thing to create.


Comment from a few days ago: https://news.ycombinator.com/item?id=34504545

> So then to watermark, instead of selecting the next token randomly, the idea will be to select it pseudorandomly, using a cryptographic pseudorandom function, whose key is known only to OpenAI. That won’t make any detectable difference to the end user, assuming the end user can’t distinguish the pseudorandom numbers from truly random ones. But now you can choose a pseudorandom function that secretly biases a certain score—a sum over a certain function g evaluated at each n-gram (sequence of n consecutive tokens), for some small n—which score you can also compute if you know the key for this pseudorandom function.

I remain skeptical that this method is resistant to lossy transformations such as changing punctuation, grammar, synonym replacement, round-trip translation, and a bunch of other existing tools that are capable of rewording written text.
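
For intuition, a toy sketch of the idea (my own simplification with made-up names, not OpenAI's actual code). Taking the argmax of r**(1/p) samples exactly from the model's distribution when r is uniform random, so users can't tell; but r here comes from a keyed HMAC over the trailing n-gram, which secretly biases which n-grams appear:

    import hmac, hashlib

    def g(key, ngram):
        # Keyed PRF: map an n-gram of tokens to a score in [0, 1).
        d = hmac.new(key, " ".join(ngram).encode(), hashlib.sha256).digest()
        return int.from_bytes(d[:8], "big") / 2**64

    def sample(key, context, probs, n=5):
        # probs: candidate token -> model probability.
        return max(probs, key=lambda t:
                   g(key, tuple(context[-(n - 1):]) + (t,)) ** (1 / probs[t]))

    def detect(key, tokens, n=5):
        # Watermarked text scores well above the ~0.5 mean of normal text.
        scores = [g(key, tuple(tokens[i:i + n]))
                  for i in range(len(tokens) - n + 1)]
        return sum(scores) / len(scores) if scores else 0.5

Which is also why the skepticism above seems fair: synonym swaps and retranslation replace the n-grams, and the detection score decays back toward 0.5.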


> All signs point to this strengthening the value of curation and authenticated sources.

Prediction: this is going to end up like Coca-Cola taking public water (wiki and LLM) and bottling it ("selected sources") at a couple bucks a pop! "But there will be premium brands!"


> All signs point to this strengthening the value of curation and authenticated sources.

that sounds like a positive and greatly needed outcome


Lots of content on Wikipedia is already created by bots but they are mostly doing maintenance so the blast radius is much smaller.


I call it the ChatGPT Centipede.


agree with you


> ChatGPT is a like a virus that now threatens its life.

Perhaps something needs to be disrupted. The internet is nothing like what it was 20 years ago. It turned into a bunch of social media walled gardens and SEO spam. ChatGPT is like fresh air because it can actually answer questions in a no-nonsense way, without users having to scroll through 5-6 spam websites, paywalls, and crappy user interfaces to get an answer to a simple question.

The only thing that's being threatened is companies like Google who are responsible for the current state of the web.


Agents like this under the control of users would be pretty great. Man, would companies ever hate it if we could use the kinds of tools they use against us, against them. No more shopping for the best price: "ChatGPT, what's the lowest price on a new X, brand Y, model Z? And give me the URL to the product page." No more burning our human time talking to companies' robots: "ChatGPT, get through this shitty phone tree and let me know when you have a person." A true digital assistant. Couple that with crowd-sourced data (receipt scanning, junk-mail grocery flier scanning, or the AR goggles that are probably not that far off) and you could even do stuff like have it plot optimal IRL grocery shopping for you (lowest total price on this list of goods, value my time at $X/hr, and factor in cost of transportation... and also ChatGPT assembled the list for me in the first place, because I had it create this week's dinner menu).

ChatGPT as a service that can be used to mislead us and trick us out of our money on behalf of megacorps, like the entire rest of the web has become? To "promote" things to us against our interests? Meh. Call me when it's mine, will obey me, and will never lie to me or serve someone else's priorities over mine, and I'll be interested.


I wish this sort of thing were realistically possible; history has taught us one too many times that we cannot have nice things, so I won't pretend this time is any different. Whoever owns ChatGPT or an equivalent product in the future would probably end up doing something or other to ruin this idea. "ChatGPT, what's the cheapest restaurant in this area?" has way too much advertisement potential to be left alone.


Right, ChatGPT without complete loyalty to the user runs into the same problem as auto-restocking schemes from Amazon and such: I can't trust that it's not fucking me, so I have to check manually anyway, at least from time to time. At that point, I may as well just go buy the thing I need when I need it, myself. If, when I ask ChatGPT (or its future, improved successor) to explain the benefits and drawbacks of the best products in some category, at three price points, I have to worry that placement on that list can be bought... then the whole thing's pointless as a tool for "consumers". Just another avenue for tricking us out of our money, and we're already very well served in that department, don't need any more of that, thanks.


You seem to think ChatGPT will somehow be immune to the same forces that led to the "enshittification" of Internet services like search, social media and e-commerce. Like ChatGPT, these services were a real boon to their users. Then, once they got enough users on board and had to start making a buck, their incentives changed and the users became a secondary concern. The same will happen to things like ChatGPT. See the Cory Doctorow article coining the term "enshittification" for a more elaborate explanation [0].

[0] https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys


> ChatGPT is like fresh air because it can actually answer questions

Can it, though? The whole point of the article is that ChatGPT "makes shit up".


This is a risk for sure, but Google long ago developed trust/reputation factors for pages (e.g. PageRank). I imagine they have something much more advanced now, and are hard at work trying to figure out how to measure reputation in the LLM space.
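
For reference, the core of classic PageRank is just a power iteration over the link graph - a toy sketch with a hypothetical three-page web and the conventional 0.85 damping factor:

    # A page is important if important pages link to it; iterate to a fixpoint.
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # hypothetical link graph
    d, ranks = 0.85, {p: 1 / len(links) for p in links}

    for _ in range(50):
        ranks = {p: (1 - d) / len(links)
                    + d * sum(ranks[q] / len(links[q])
                              for q in links if p in links[q])
                 for p in links}

    print(ranks)  # "c" ranks highest: it collects links from both "a" and "b"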


This post is unavailable in a... peculiar way:

    $ curl -I https://castlebridge.ie/insights/chatgpt-and-the-enshittening-of-knowledge/
    HTTP/2 302 
    server: nginx
    date: Fri, 27 Jan 2023 16:53:41 GMT
    content-type: text/html; charset=iso-8859-1
    content-length: 282
    location: http://google.com

If this is the hosting provider's response to the HN death hug, it's very poorly executed.



The IP resolves to Digital Ocean. So I doubt it's on their level. The author themself probably saw the server melting down and/or hosting bill going up and threw up a really silly "solution".


Had this happen to me too. For a second, I was wondering if the title of this HN post and the Google redirect was some sort of joke I didn't understand.


ChatGPT! ELI5 what this funny man is saying in this comment. Also write a 2 page summary on the restaurant this Nginx fellow is a server at!


Yep, same. Incredibly irritating way of handling the problem. Not at all helpful.


Ooh i was wondering why i only got google... am on Chrome on Android


@dang title/link feels a bit misleading now, worth changing to the Google Cache/archive or something?


Try setting your user agent to something that looks like a browser.


My Firefox with residential IP got redirected to google and that's when I checked `curl -I` (-I does HEAD request but browser did the usual GET request).


I'm using a browser and also getting redirected to Google.


Same here in vanilla Chrome and Safari.


The same happens with any user agent.


The internet already feels rotten with SEO spam of very little substance. Now the floodgates are open to generate more shit in a much better wrapping. And it's going to take more effort to check its quality than to produce it, unless we get better tools for that - but text has more dimensions to process and understand than, for example, AI-generated images, which (for now) can be told apart by glitches or traces detectable by software.


What was the saying? It takes 10x more effort to refute BS than it takes to produce it? So if we make generating BS easier, and that BS is fed back in to generate more BS, we're going to get to a point where everything is BS. I call this the BS singularity!


We have finally automated the production of half-truths at a scale never before envisioned. Maybe the limit will be the capacity of humans to absorb bullshit. It's a testament to how revolutionary this technology is that we'll be able to quantify just how much bullshit people can process over time, and whether supply will finally outstrip demand for it.





I think an easy way to identify transformative technology is how strong people's reaction is against it. I remember similar freak-outs about the graphical World Wide Web, the smartphone, etc. Somehow things are not quite so bad, and considerably better than predicted, while still having negative side effects.

Maybe an outcome will be that knowledge is better structured, à la Wolfram Alpha, and just depending on random text documents to encode knowledge won't be as much of a thing - similar to how we no longer use oral traditions to encode and disseminate knowledge. Who knows. But I doubt AI will fundamentally destroy society or knowledge or the internet or whatever. We will adapt, not just to preserve the way we did things, but to use the new things in a way that controls their side effects. It'll of course be imperfect and some side effects can't be controlled for. Some things will just cease to be a thing, like phone booths. While you can never have a Bill & Ted's adventure with a smartphone, it's actually not the end of things. It's just different. That's how it works, my friend. Life is impermanent, everything changes at all times, you can never recreate the past, and our suffering stems from our inability to let go of the way things are.

Edit: I hate to bring it up, but maybe this is what the semantic web was waiting for.


> I think an easy way to identify transformative technology is how strong people’s reaction is against it.

There is also strong negative reaction to technology that does incredible damage to society. This is just an excuse to ignore people pointing out issues.


No, I don’t think so. I think I discussed that we would tackle the issues, and some would be resolved, some left unresolved, and some we missed would emerge. I’m not saying the hyperbole cycle isn’t necessary or useful - I’m saying it indicates something significant. We couldn’t possibly resolve the issues identified if they weren’t identified to begin with. But I also think the bold predictions of what will go wrong are usually bold predictions of what could go wrong, and the very act of predicting it makes it less likely to actually occur. Finally, on a personal level, getting bent out of shape about these things is pointless. We can identify challenges with new tech without it becoming a frantic exercise of hand-wringing. Intentionally addressing issues, knowing it’s happening regardless of your anxieties, is a much more useful expenditure of energy, even if it doesn’t boost your blog/social media/impressions quite as much as hyperbolic hyperventilation.


I think this will push society toward better solutions to human verification. Probably through governments or their corporate proxies. Will that end up good or bad? I have no idea, but we'll find out soon.


This is the only semi-plausible positive take I've seen on ChatGPT. If we actually manage to move the world towards storing, presenting, and consuming facts in well-defined formats rather than long-form prose, that could possibly be an improvement on the current situation.


I feel like semantic web and OP are two independent topics. Semantic web is about how content is delivered. OP is about the content itself.


GPT is a language model, not an oracle.

So my first take is that people querying it for research are doing it wrong.

Then again, if there’s a large economic incentive to use it in that way, we very well may end up with the kind of feedback loop that the author describes.


You’re absolutely right on all counts, and yet, people Are doing it wrong, and will increasingly do so, because there Are large economic incentives for using it that way.


Building on this thinking - that the economic incentive to access knowledge and answers more quickly is inevitable - we have built a tool that verifies statements using source materials. You have to put the legwork in to upload your PDFs/web pages/videos, but once you do, you can be confident in the answers.

If it can't verify, it just won't answer/tick-mark the answers (happens 16% of the time... and ... always for maths). This is a feedback-loop stopper, in the sense of relying only on your documents as the base, and being able to operate entirely without OpenAI (though still using other GPT models).

It's Fragen.co.uk - we believe that more answers formerly missed by CTRL+F will be found with this technology than false answers taken as true. And if that's true, you are enbettering knowledge. And if not, you're enshittening it more slowly than the higher-hallucinating alternatives.


The internet has been groaning under the load of bots, automation and distrust for a while now anyway: in my mind what is needed is a system for proving constituent parts of your identity which is also privacy preserving. I'm thinking of the work around DID re: rebooting the web of trust, and the W3C working group

If you can wield an ephemeral and verifiable token which asserts your humanness, hierarchically derived from a certificate privately issued to you by one of hopefully a healthy number of well known authorities, you can participate within circles of the human web without revealing anything else about your identity (name, a/s/l, etc)

But, outside of this enclave you can also interact with AI without revealing whether in fact you too are AI.

In this way the internet can develop a more sensitive immune system where it is difficult for human systems to be perverted by Sybil attacks.
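
As a toy illustration of such a token chain (my own names, and deliberately oversimplified - this naive chain is linkable across sites, which is exactly what the blind-signature and zero-knowledge machinery in the DID work is meant to fix):

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

    raw = lambda key: key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

    authority = Ed25519PrivateKey.generate()  # one of the well-known issuers
    person = Ed25519PrivateKey.generate()     # long-term "I am human" key
    cert = authority.sign(raw(person))        # issued after vetting the human

    ephemeral = Ed25519PrivateKey.generate()  # throwaway identity for one site
    endorsement = person.sign(raw(ephemeral))

    # A site checks the chain without learning name/a/s/l
    # (verify() raises InvalidSignature if anything is forged):
    authority.public_key().verify(cert, raw(person))
    person.public_key().verify(endorsement, raw(ephemeral))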


What’s stopping you from using this token with a bot?


I disagree with the top comment a bit - I think we need tokens that are authenticated to an identity, much like SSL certs. If you choose to use your token to post AI-generated crap on the internet, I can choose to block it. I can also choose to subscribe to community-managed block lists. I think content on the internet needs to be reputation-based, and if you choose to read content that isn't signed by a real person's identity, then it's sort of on you if you wind up reading bot-written crap.

I think community managed lists would also be a great way to filter out people with annoying or problematic communication styles or opinions. If you publicly support Jordan Peterson, I can add you to the list of Jordan Peterson supporters, and as a community we'd never have to read anything you post anywhere, ever.


> If you publicly support Jordan Peterson

Do I have to reveal my actual name?


If there's a central authority for connecting tokens to a single identity, and there's a way for me to connect all of your tokens to a central "parent" token so I can block all of someone's tokens at once, then I guess that works.


There's a famous quote from Bill Gates: "My children will have computers, yes, but before they will have books. Without books, without reading, our children will be unable to write, including their own story."

In my personal experience, ChatGPT is a precious tool to refine knowledge and thinking, but it depends on how it's used.

The same paintbrush can be used to paint the Mona Lisa or a meaningless blur. It depends on how it's used.

To use ChatGPT and other AI tools properly, we first need to build a solid foundation of knowledge and critical-thinking skills. Then we can triangulate the answers given by the AI with other sources, our own prior knowledge, and new evidence gained through experimentation.

Since the brain craves cognitive relief, most people will be lazy, while just a few will extract the true potential of ChatGPT.


ChatGPT just proves that critical thinking is, and will be, an extremely valuable trait for us to embrace and understand.


> Therefore, our safeguard is constrained by time, knowledge, and resources. The problem then is that if that safeguard fails and A-Grade Bullshit is taken by an author using the AI as a research assistant, put into a research paper, book chapter, or online article, and is then cited. That becomes potential training data for the AI if it is released into the world or has the window of its training data set extended.

I don't see the value in GPT models for this kind of use. I see the value in like, translation work, or summaries and other transformations, and that GPT works best when in concert with search engines, document embeddings, or knowledge databases like Wolfram Alpha. It's good at predicting more text which is useful in many different kinds of contexts but not in all contexts. It's pretty clear that it is untrustworthy for factual responses like quotes and population counts but that problem goes away if the actual factual data is included in the prompt!
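
A minimal sketch of that last point, with a made-up fact store and prompt wording (the completion call itself is left out): look the fact up first, so the model only has to transform text, never recall it:

    FACTS = {"population of ireland": "5.1 million (2022 census)"}  # trusted store

    def grounded_prompt(question):
        # Retrieve first; refuse rather than let the model confabulate.
        fact = FACTS.get(question.lower().rstrip("?"))
        if fact is None:
            return None
        return (f"Using ONLY this source: {fact}\n"
                f"Answer the question: {question}")

    prompt = grounded_prompt("Population of Ireland?")
    # feed `prompt` to whatever completion API you use; if it's None, don't.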


> knowledge databases like Wolfram Alpha

Isn't that where ChatGPT would actually be worse overall? It's been shown to give semi-accurate data very often while sounding very confident about it.

When people require precise answers and it gives an almost-correct answer, the general feeling seems to be amazement. I don't share that feeling. I hope nobody is using it for serious work without a human vetting the output.


OP mentions GPT in concert with WolframAlpha, which has already been implemented[0], showing a symbiosis between generation of text with proven knowledge.

https://news.ycombinator.com/item?id=34422122


They said it works best in concert with things like Wolfram Alpha, not as a replacement.


I think it has to be a lot closer to Wikipedia's level of accuracy (which is not 100%, or even close) before anyone cites it as fact.

Yesterday I asked it "what are the chords to 'A Foggy Day in London Town'": it made up a whole new Gershwin song!

======

"A Foggy Day in London Town" is a song with music by George Gershwin and lyrics by Ira Gershwin. The chords for the song are:

Verse:

C C7 F C C7 F C C7

Foggy, foggy dew, you can't see a thing

F Fm C C7 F C C7 F

The visibility is zero, and my eyes are dim

Chorus:

C C7 F C C7 F C C7

It's a foggy day in London Town

F Fm C C7 F C C7 F

The buildings look one hundred feet tall

C C7 F C C7 F C C7

It's a foggy day in London Town

F Fm C C7 F C C7 F

And the people look like ants at all

Bridge:

Am D7 G C C7 F C C7

Just like the feeling you get when you're down

Am D7 G C C7 F C C7

The sun can't break through the London Town


Maybe I'm being too simplistic, but I think going forward critical thinking, as well as an understanding of what to value, is going to be important for future generations. Something computers will never be good at is 'understanding' what is valuable in life, since many times this cannot be measured. In essence, a healthy understanding of the three transcendentals - the Good, the True, and the Beautiful - will be essential to keeping our humanity. ChatGPT can perform many amazing skills but lacks the ability to accurately judge the value of what it is reporting. I think passing on a love for the transcendentals to our children is the only way to inoculate them against the firehose of 'junk' information ChatGPT et al. is about to unleash on the world.


Think of it like this. AI is the Mandela Effect made real.

If the preponderance of human writing/creation/art says the sun revolves around the earth, then AI not only nails that down as infallible, but INFERS from it and comes up with new justification for stuff that it made up.

Mandela Effect is all about what 'should' be the truth. Some of the most interesting parts of reality are where bits of reality don't neatly fit into the narrative.

AI exploiting its own inferences and magnifying the Mandela Effect means it steamrollers inconvenient realities on the grounds that 'most people' wouldn't think so. But we don't even think of AI as 'most people'; we imagine it as some kind of superset of humanity, the ultimately wise and skilled overseer.

Boy, is that a mistake.


> When we turn the job of creating the first draft of the thing over to ChatGPT we risk removing the “figuring shit out” part of everyone’s career-path. And we get away with that while we still have people who have figured shit out.

That's the main problem from my point of view.


Agreed. The LAST use of ChatGPT should be the first draft. It's clueless about the substance or purpose of the essay, about what's necessary or essential. As such, its best use is to flesh out an outline that HAS these things into an essay that integrates them into a narrative that flows.

The trick to that is: how do you take away ChatGPT's inclination to build upon the wrong premises or the wrong final message, and instead confine it to the role of verbally adroit assistant - so it only expands a skeletal design into a well-written final product? Now THAT would be a use for ChatGPT I could get behind.

Alas, that wasn't the goal of its designers. By giving it almost total control over the plot line and requiring no fact checks or bibliography, the spewing of unchecked drivel was its only possible mission in life.

Congrats, OpenAI. You've automated spam.


No, we've got to keep up with the shiny toys. Junior is riding on ChatGPT like everyone else.


I see ChatGPT as a first iteration, and it's glaringly obvious what works and what doesn't. I'm hopeful future iterations (not just from OpenAI) will make it overall a net improvement, with things like adding real sources and better handling of things it doesn't know.



The combination of using an "authoritative" voice and being completely and utterly full of shit makes for some truly ridiculous results. Users need to be extremely careful with these tools. And no, expecting everyone to be knowledgeable enough to independently fact-check some BS published based on a generative model is not a reasonable position. That burden needs to be borne by the publisher.

Unfortunately with the current state of machine generated content and SEO spam on the internet, that is not likely to happen as there is no real incentive to do so.

And it's not just being confidently wrong about a single person like this author noted, it's being confidently wrong about anything and everything.

The other day I asked ChatGPT about the reasons behind the connection between Italian filmmakers and Western movies. It gave a few reasons that were semi-plausible if not easily verifiable, and then made the bold claim that Italy and the United States were great allies in World War 2. It said this confidently and with a tone of authority.

And while we may all go "ha ha that's wrong" now, WW2 was nearly 80 years ago and not everyone has a strong history background. There aren't that many people left with direct experience and that number shrinks every day. With growing distrust in actual authorities on matters and rejection of observable fact--based at least in part on massive piles of internet BS--this is pretty concerning to me.



The test use case of constructing a bio for yourself, hoping it accurately summarizes the extremely low-sample-size data it happens to have about you in its web-crawled training data, seems like one of the worst possible use cases for ChatGPT. It’s right there on the main page that it’s not to be trusted with factual information like this. ChatGPT will hallucinate details. It’s remarkable to me, actually, how often it will refuse to hallucinate, given that’s basically what its job is. I don’t find it interesting to find all these edge cases where ChatGPT produces empirically false data. It doesn’t even have the ability to look things up! If I were the OP and wanted help writing my bio, I would first write the draft myself, then use ChatGPT to help with the editing, prose, grammar, style, etc. You are the expert on the factual details of your own life, and if you’re surprised that a language model trained on web-crawled data ending in 2018 is not, then all I’ve learned is that you don’t know much about what this thing is.

I also don’t buy these arguments of the form, 1. OpenAI’s public ChatGPT app is often factually inaccurate. 2. ChatGPT is an example of a ML system bootstrapped on web crawled text data. 4. Thus, the long term future of our distributed text-encoded knowledge base will be a cesspool of useless gobbledygook.

ChatGPT is a step forward in generative language modeling. It doesn’t preclude the development of other future systems to help us verify factual accuracy of claims, likely much better than humans can. We’ll be ok gang:)


I feel like the 3 you're missing there is something along the lines of "people enjoy social validation and internet points, to the extent that pretty shit content that's low effort is something we enjoy generating"


That is true, 3 would help steel my strawman. I agree that we’ll increasingly have capabilities to generate and publish garbage that’s _just_ good enough to generate clicks, and incentives to do this. In addition, I think we’ll increasingly have tools to produce content that is much more rich, imaginative, insightful, and factually correct in our future. Some more interesting questions to me are then: What will the ratio be? How will that ratio compare with what we see today? How easily will I be able to identify misinformation when I care about factual accuracy (again, compared with today)? How easily will I be able to avoid the garbage, vs find the good stuff?


How can fact-checking be better facilitated as tech develops? It seems like a distinct social issue to me, one that I imagine requires verification from people with reputation (which could probably be aided by improved social networks). I'd be very interested to hear if you have other ideas, as you seem optimistic on this front and I would like to share in that :)


I can think of a few directions for technology aiding in fact checking:

1. Much of finding out what’s true or false is about finding consistency amongst lots of observations. So this is the science direction: analyzing lots of data, whether first-hand direct measurements from a scientific instrument, or second-hand observational data, say from many news sources reporting on a political event. One could also imagine multimodal analysis combining these first- and second-hand kinds of data to arrive at a consensus estimate of a “true” perspective, e.g. analyzing video and audio streams recorded at said political event, combined with many text reports of the event. So this point is about data mining, and jointly estimating semantic meaning from natural language and other kinds of data, in a way that’s consistent with everything else considered factual.

2. Provenance-tracking: think blockchain - if we can provably trace a piece of data back to its primary sources, tracking all its modifications along the way, this could help with establishing provenance and verifying the legitimacy of any modifications (see the sketch after this list).

3. Consensus, staking/voting, etc. A lot of deciding what’s true is about seeking consensus. One thing I’m generally optimistic about here is that, for any given fact “out there,” there are many more ways to describe it incorrectly than correctly. So even though it sounds scary for consensus to be an aspect of truth-finding, it always will be, and at the very bottom it’s all we can hope for. One way that’s already getting traction to make consensus mean something is the idea of staking: you have to put something down on the table when you claim to believe something is true. Software can help (and already is helping) build confidence in some claims over others by backing them with value (money).

4. Humans are insanely bad at reasoning rationally, for lots of reasons - pick your favorite fallacy. We evolved to survive long enough to reproduce, and rationality is a happy accident. One could imagine software being less susceptible to simple tricks, less incentivized to outright lie for personal gain or power, and less prone to claiming to represent its actual beliefs while actually pursuing other goals by conveying something it doesn’t believe.
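
A toy sketch of point 2's hash-chained provenance (a real system would add signatures and distributed storage on top):

    import hashlib, json

    def add_revision(chain, content, source):
        # Each revision commits to its content, its source, and the
        # hash of the previous revision.
        entry = {"content": content, "source": source,
                 "prev": chain[-1]["hash"] if chain else None}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        chain.append(entry)

    def verify(chain):
        # Recompute every hash; any after-the-fact edit breaks the chain.
        for i, e in enumerate(chain):
            body = {k: e[k] for k in ("content", "source", "prev")}
            good_hash = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest() == e["hash"]
            good_link = e["prev"] == (chain[i - 1]["hash"] if i else None)
            if not (good_hash and good_link):
                return False
        return True

    chain = []
    add_revision(chain, "Everest is 8,849 m", source="2020 survey")
    add_revision(chain, "Everest is 8,849 m (29,032 ft)", source="unit conversion")
    assert verify(chain)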


Prediction: OpenAI/GPT (or Google's DeepMind, if you prefer) is going to cause mass unemployment in certain sectors of the economy (for example, graphic designers, copywriters, and many IT professionals) long before it ever addresses any of the fundamental problems that currently substantially reduce human quality and quantity of life, like aging or cancer.

In the near-term, AI will just accelerate the winner-take-all nature of our economy.


Also, this isn't just an idle prediction on my part; for what it's worth, I've started substantially shifting my net worth towards the entities I think will be the winners.


Not sure if it's some kind of joke or the author doesn't want their article on Hacker News, but the URL is currently 302'd to point to Google.com.



The TLDR of the article is essentially that any time a ChatGPT conversation strays into the factual (in this case a list of data ethics books), the quality of the data (whether those books exist and are written by the stated authors) is extremely low.

Ultimately this is where ChatGPT and similar projects seem most likely to hit a wall. Text generators are statistical machines, but the relationship between statistics and truth is merely probabilistic, not one of identity. There is no way to train ChatGPT to say things that are true, because the true and false statements of the world are far too numerous and too specific to ever reliably delineate. At best, we will be stuck fact-checking ChatGPT using enormously complex knowledge bases and asking it to try again until something doesn't raise a red flag, at which point it may still not be true.

For most business purposes this is unacceptable. We will have machines that can generate untruths that contravene common sense, so long as those untruths don't obviously contradict whatever local database is being consulted for grounding. And the mechanisms required to re-ground the GPT output even to that unhappy state will be so complex in their own right that businesses might as well just implement a similarly complex system to handle the issue directly, without calling out to a complex, licensed GPT. Not sure whether there is a way around this.


One approach to fixing factual errors is to use two rounds of LLM interaction. I forgot the name of the paper.

Say you ask "What is the height of Everest?"

1. generate an answer in closed-book mode, with the LLM: "The height of Everest is 8723m" = candidate_answer

2. search your references with candidate_answer, find: "At 8,849 meters (29,032 feet), Everest is considered the tallest point on Earth" = search_snippet

3. do a second pass to rewrite the answer with the LLM using search_snippet in the prompt

Basically, the incorrect phrase candidate_answer is very good at matching the correct answer in search engines. It is like a template tuned to extract the desired facts. A search-engine check could also flag cases where the searched fact has no references at all.
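
The control flow is simple enough to sketch (here `llm` and `search` are stand-ins for whatever model and retrieval backend you'd use):

    def answer_with_check(question, llm, search):
        # Pass 1: closed-book guess. Often wrong on specifics, but
        # well-shaped for retrieval: "The height of Everest is 8723m".
        candidate = llm(f"Answer concisely: {question}")

        # Pass 2: use the guess itself as the search query.
        snippets = search(candidate)
        if not snippets:
            return "No references found; refusing to answer."

        # Pass 3: rewrite the draft, grounded in what was actually found.
        return llm(f"Question: {question}\n"
                   f"References: {snippets}\n"
                   f"Rewrite this draft to agree with the references: {candidate}")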


How would you apply this approach to the article's example, the response to the question "Who is Daragh O Brien from Castlebridge", where I count at least 15 separate statements of fact?

Should we research all of them and try again with a big table of hits and misses from the first attempt? Seems like a lot of work.

Also: Is the generated response really "very good at matching the correct answer"? I suppose it would work because the search engine's language processing cancels out the useless parts that were generated by the AI (sort of a "human ABI", analogous to the C ABI?) but a more direct query (e.g. "height of everest") would likely be just as effective.


Yes, this is the essential issue. Any system competent to fully ground and check factual statements in a stream of arbitrary text will be phenomenally more complex than the original LLM, and usually will be able to answer queries directly, at which point one wonders what the LLM is adding. I think at best, if we can somehow identify all factual statements that need to be cross-referenced and then offload them to a knowledge base (dubious), we are left with this kind of mad-libs connective flow that the LLM has created, which approximates the essay style of a human writer. I'm not certain that has much practical value besides allowing a form of undetectable plagiarism to be published as though it were free-form writing.


A friend is a member of quite a few Facebook groups. She doubts my belief that these groups already contain ChatGPT-like influence operation chatbots, but was recently surprised when in one of the groups several participants discussed a photograph of a typical large chest of drawers while referring to it as a "locker". Maybe some people do refer to such furniture as a locker, or maybe the first person did for some reason and the others then just followed along to avoid confusion, but it made me wonder whether it was an example of a group of chatbots talking to each other and making up a nonsense name for the item. (She still doesn't believe me.)


this is a long-standing meme: any behaviour you don’t understand on the internet must be some sort of sophisticated operation by a nefarious third-party network.

The question you should always start with is why, and if you can’t come up with a why, then you should start from a more neutral position on the behaviour.

Spam isn’t free to engage in, especially on platforms like Facebook, and spam is rarely an irrational money-losing act, it’s just profit generation on the wrong side of our view of what is ethical. Why would someone spend money to unleash chat-bots on a Facebook group to discuss a piece of furniture?

If you can’t explain why, your theory is probably wrong. I would be absolutely shocked if it’s anything more than just some real people who love furniture and use a localised phrase that you don’t recognise.


But I can explain why: they need to blend-in to build trust with other group participants and so become more persuasive when they are trying to influence, and also to avoid bot detection. The influence might also be much more subtle than talking about political topics, such as just being continuously negative or argumentative to bring down the moods of the targeted people.


You see the same kind of thing in some online multiplayer games. Players acting differently are seen as bots. Same for players who have low skill or go AFK.

That said an interesting advance on real game bots would be ones which use AI to chat and respond to other players actions or messages.


I think the bigger problem is that it calls into question who we are talking to online. It's going to be a piece of cake to run our thoughts through an AI before copying the output, with a few adjustments, and passing it off as our own.


Nice read, and well argued.

Tangentially related: does anyone else feel like ChatGPT's default voice is just... super bland? I read the "op-ed" the author linked in the article (https://www.thecrimson.com/article/2022/12/9/porios-chatgpt-...), and the writing style lacks the kind of variation in structure (or content, really) that keeps opinion pieces interesting. It feels like a polished middle-school essay.


> It reminds me of how Clippy used to promise to help me write a letter back in the 1990s. But Clippy was a dickhead and he really didn’t have any expertise in writing letters.

Legend.


I agree with this article, but I feel the author is only scratching the surface of all the harm ChatGPT and other generative AI systems are going to cause. We just have to look at the current state of the internet today, with things like SEO spam, misinformation, social media bots, fake reviews, and so forth. The people responsible for these are already taking advantage of generative AI and it's going to exacerbate these problems to their extremes.

OpenAI's PR likes to preach about ethical AI, but it's a total farce. There's no way that OpenAI isn't fully aware of all the harms their AI is going to cause.

I only see one of two possible ultimate outcomes. AI like ChatGPT is going to flood the internet with false or inaccurate information, of that I have no doubt. The first outcome I foresee is we simply lose all trust in the vast majority of information sources. The other outcome is we're going to have to sacrifice our privacy and anonymity online in order to distinguish real information from real people from the AI-bot-generated nonsense. For example, social media may likely require you to prove your real identity, and will only allow you to use trusted hardware devices (i.e. phone or tablet) to post things to your social media account. It'll become increasingly difficult to share information anonymously online in a way that can be trusted. Any information that can't be tied to a verifiable source (a real person) is immediately going to be deemed untrustworthy.


> we simply lose all trust in the vast majority of information sources

Did you ever have it though? I'm old enough to remember newspapers, and I always took everything I read in them with a grain of salt and I'm absolutely positive they were written by a human being. (disclaimer: didn't read the article because it's down right now).


Or what about just being old enough to remember when "I read it on the internet" was a pejorative and laughable statement in regards to the truth.


To elaborate a little more, multiple attempts to separate real people from bots have either failed or are soon to be defeated. Captchas worked for a while, but AI systems will likely be able to easily defeat Captchas within the next 5 years. Some websites require a phone number, but it's not difficult to get a fake or temporary phone number, if you know how. Deepfakes mean you can't even trust a real-time stream from a person you can see and hear. Eventually, we're going to have to completely sacrifice our privacy and anonymity online in order to prove we're not a bot.


> There's no way that OpenAI isn't fully aware of all the harms their AI is going to cause.

You’d be shocked at how good people can be at deluding themselves when their livelihood is on the line. Execs at Facebook still think their product is good for society.


Atomization of civilization by 2030 anyone?


Pseudo-intellectual knowledge. It’s a problem that has been with humans ever since we began communicating.

When there is a conflict, mountain goats just knock heads against each other (presumably so that both goats suffer concussions … until the one with the weaker knowledge graph suffers a loss of memory).

No such process in AI.



I believe that we humans will need to fundamentally change our attitude to information. Information without acceptable provenance should be treated like contaminated food, and thus avoided.

Unfortunately this implies that free information will be largely unfit for consumption.


Man everyone is on the ChatGPT bandwagon, even Al Qaeda is on it:

https://twitter.com/khorasandiary/status/1618682313528975360


I really disagree with this perspective. History saw the democratization of books with the printing press, of entertainment and education with YouTube and MOOCs, and now whatever this next movement is with ChatGPT. Let's just face it: it's useless to predict how innovation will impact future society, but you can always choose to find the most productive uses for it yourself in the meantime. Stop trying to brute-force your outdated perspective of how disruptive innovations will impact other people. Unless you have a sample size of the entire ChatGPT audience, or the entire internet, you're going to have a rough time.


In theory knowledge is more democratized, but so is misinformation. Ignore the Big Issue from the last two years and see how Tide Pod challenges have improved society. /s


It seems like a lot of effort will be spent ensuring that content is not generated by an AI system. This seems very tricky to enforce - a stopgap measure could be to require a person to upload a video of themselves saying their comment before being able to post it to a website. Of course, along with the regression in user experience, such a thing could be tricked by reading an AI-generated text, and eventually by deepfakes.

Is it viable to constrain conversation to human voices this way or is the cat out of the bag? Is it desirable? Should we just go back to in-person conversations?


Web of Trust will be back, your network will depend on your criteria for trust.

Monoliths / Centralized services are the ones that can't adapt to the coming technology.

https://wiki.p2pfoundation.net/Counter-Anti-Disintermediatio...

Great implementation example: https://ssbc.github.io/scuttlebutt-protocol-guide/#follow-gr...


Strange, when I click on the link, it goes directly to google.com.

I think ChatGPT is great but it does have a tendency to make things up. All output needs to be taken with a grain of salt. So what? It is still a useful tool.


From my experiments, ChatGPT is at least valuable for generating boilerplate code in scripting languages. I believe that once connected to an enterprise codebase, its quality and completeness could dramatically increase.


> enterprise codebase

> quality increase

We must have seen very different enterprise code.


I think it's mostly going to be more complete code, which does improve its quality. For example, I generated a piece of code to query an API, but without authentication or other information the code is not complete.


It's called garbage collection for a reason. ;)


Aren't scripting languages meant to reduce boilerplate code? Maybe you just need a better API and not an AI?


You still need boilerplate in scripting languages though. Like querying an endpoint in Python with a rate limit in mind. Things like that - not exactly difficult, but they still need some lines of code.
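
For instance, exactly this kind of boilerplate - paging through an endpoint while staying under a rate limit (the URL and response shape here are made up):

    import time
    import requests

    def fetch_all(url="https://api.example.com/items", per_minute=30):
        delay, page, items = 60 / per_minute, 1, []
        while True:
            resp = requests.get(url, params={"page": page}, timeout=10)
            resp.raise_for_status()
            batch = resp.json()
            if not batch:          # empty page: nothing left to read
                return items
            items.extend(batch)
            page += 1
            time.sleep(delay)      # stay under the endpoint's rate limit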


I have spent around 20 years doing all the SEO stuff that is rightfully hated, and for years now I have both been excited about and dreaded the public availability of simple-to-use AI.

My worry was that it would completely destroy the SEO industry and indeed SEO itself.

The kind of things I do (building thousands of content sites to feed a master money site) are super niche and require millions to get into the game. All it will take is a handful of non-dropped high-authority domains from auction and a decent server to churn out crap; scale that across a few hundred domains and it is game over.


I started to read the article, and it said ChatGPT passed the bar; but when you actually get to that paper, ChatGPT did not pass the bar. It did okay on the multiple-choice section though.


There's also an article about the "enshittening" of TikTok.

Boy, I hope "enshittening" isn't gonna be 2023's Word of the Year. The concept of "stuff gets worse" is really old, and a buzzword for it just feels tiresome.

It's only January, so there's plenty of time for this to fade. And I suppose it can't be worse than "gaslighting", a word that was plenty relevant in 2022 but was hardly specific to last year. But surely we can do better.


ChatGPT, StableDiffusion, Midjourney, DALL-E, VALL-E and all child models of Transformers have great generative capabilities.

Awful at precise recall and facts, but amazing as artists and poets, and at making things that roughly look right.

For generating ideas, they are absolutely wonderful tools. I spend at least an hour interacting with a combination of ChatGPT and Midjourney.

Wolfram Alpha is different, but it's a great tool I consult for any mathematics/finance questions.


ChatGPT is sort of the reverse of Grammarly. I know little about proper grammar, so I put most everything into Grammarly, and it produces a lot of suggestions which I take because they sound good and I don't really know any better. I assume it is right. ChatGPT produces text that sounds good, and you have to update that text with your facts and make edits to bring it up to your own standard.


The enshittening of transmitting text over the internet


> AI can experience the same rabbit-hole effect as your idiot cousin who has watched too many of the algorithmically suggested videos on Youtube and now believes that aliens conspired with a time travelling Elvis to kill Hitler using a bullet made of Yeti teeth. And they only went online looking for TellyTubbies videos.

LOL Thank you, I needed a good laugh!


Even funnier: ~all humans suffer from the same phenomenon, and are similarly unaware of their own personal ignorance. People view reality through a relative lens; the view is more pleasant than through an absolute one.


The link redirects to google? Is that the joke?




Article scrubbed including from Google cache. Looks like maybe someone didn't get a lot of corporate love for this.


Can't we add a feature to ChatGPT that goes as follows :

"Did you write this and if so when?"

That would solve like 80% of issues.


If I understand you correctly, this doesn't work because ChatGPT isn't the only language model in the world. It's only the most popular at this point in time.


This is redirecting to Google for me


Not really sure if this is supposed to be the joke, but your link just redirects me to google


So many people on HN just debate based on the title of an article that the actual writing of the content is a superfluous waste of time


This is why you need to use chatGPT now to launch companies, before people inevitably ruin things using chatGPT.

chatGPT by itself is amazing. How people (keyword) use it is going to determine outcomes.


If you're using Wordpress, use one of the WP caching plugins. It's worth it.

Guessing based off the `wp_footer` class you have in there. A much better solution than a 302 to Google.


Wouldn't it be funny if it was a ChatGPT-powered site.


When I click the header link I get redirected to google?


Why am I redirected to http://google.com when I press the link? :)


dead


What is effectively a static site being served with database calls for each page view is getting exactly what was coming to it.


You get what you deserve! Justice has been served today, a cold and bitter dish


WordPress and its consequences have been a disaster for the human race (jk, it's just hella overused)


Harsh burn!


Almost poignant though.


HN kiss of death


Title was the best part anyway.


ChatGPT got internet access and took it down :-O


Chat GPT ain't takin your job, the Dev that knows how to use it as a tool is taking your job!


cf. xkcd/810 "Constructive" https://xkcd.com/810/

- - - -

Schmidhuber says that his task is "to create an automatic scientist, and then retire."

Not long ago it was mildly insulting for someone to suggest that your writing sounded like the output of GPT; already (for most of us) it has become mildly complimentary. GPT may be hallucinating, but it writes well.

So what if you connect it to empirical feedback? Make it a scientist. I don't think it will be long before these machines think better than us (at least most of us.)

The thing is, they don't have glands. They don't have historical trauma in their flesh. What I'm getting at is that these machines won't have human hangups. They won't be neurotic. They won't be HAL. They will be sane.

The interesting thing here is that life on Earth is actually pretty straightforward. In video game terms Earth is very easy and all the walkthroughs and cheats are known and available. The only reason it seems hard is that people are kinda messed up. Once we have sane intelligent machines to make our decisions for us things should get better rapidly.


Your comment reminded me of a discussion G.K. Chesterton has in "Orthodoxy" on madness. My take is that when facts and information are divorced from experience, we risk becoming unmoored from reality:

> If you argue with a madman, it is extremely probable that you will get the worst for it; for in many ways his mind moves all the quicker for not being delayed by the things that go with good judgement. He is not hampered by a sense of humor or by charity, or by the dumb certainties of experience. He is the more logical for losing certain sane affections. Indeed the common phrase for insanity is in this respect a misleading one. The madman is not the man who has lost his reason. The madman is the man who has lost everything except his reason.

[0] https://www.pagebypagebooks.com/Gilbert_K_Chesterton/Orthodo...


ChatGPT is a terrible writer. At least, every example I have seen has had poor information density and was generally worse than the input prompts people used when they wanted to 'fluff up' a statement or opinion. I would be genuinely interested in a counter-example.

Side note - your main point seems overly optimistic. How would we recognize, value, or design a 'sane' machine when, by your own argument, we don't have access to sanity? It seems way more likely to generate a distillation of our neuroses.


Raise your hand if you think the singularity is happening (i.e., AGI within 5 years).


Saying middle-of-the-road shit and adding bullshit sounds like every politician.


Clicking on the link is booting me out to Google. Is that just me?


I got “forbidden”


ChatGPT is the bell announcing the death of anonymity on the web.

Anonymous content will be generated content, real discussion will shift to gated communities with either paywalls or proofs of identity.


I see most replies are missing the point. The text that ChatGPT generates is very human-like, so companies will be forced to adopt a mitigation strategy of tying users to a government ID. Such moves would also be bolstered by the governments of various nations, which have already been asking for something like this.


Or what we think of today as social media is basically on its last legs.

Your AI assistant will just take up so much of your time that you won't be bothered with what other people think. Communication with other humans online right now is largely a form of entertainment. The bar is just so low that it will be easy to be more entertained by future AI assistants than by a random human.

I am personally certain that I will not be reading this board at some point in the future because the AI assistant will just have far more profound things to say than what random people are going to post on here.

It's not like the assistant has to be only one personality, either. It could be a whole group of assistants having a group chat. LaMDA can probably do this right now, considering it can talk as a planet or a national park.


That's how it is now, it'll just be more so. You might trust somebody here, but do you trust your average Amazon reviews?


But we also can no longer trust that information published by a verified human was actually created by a human.


GeoCities banner communities are back baby! Webring where you at?


I'll bring the ff7 walkthrough! Someone grab a Runescape login.


Link is redirecting to google.com


Saw a silly “recommended” video on YouTube of Jordan Peterson talking about ChatGPT.

“It knows everything. It’s smart. It’s smarter than you!”

What a crackpot, talking about stuff he doesn’t understand for clicks. Sadly it works for the lowest common denominator


ChatGPT doesn't bring anything new to the table in terms of bullshit and misinformation.

The average person already consumes a low-signal, high-noise info diet, since the advent of modern propaganda in the 1920s.


Are you really suggesting that the average person had access to better information before the 1920s? For real?


No, I'm not suggesting that. 1920s is merely when the current paradigm began, and ChatGPT is not ushering a new one.


403 Forbidden?



