
You're being a bit uncharitable in your interpretation of my argument here, but I get where you're coming from now.

I'm not an LW hater. For a long time, I didn't really have an opinion on LW, either as a community or as a philosophical framework. I do consider myself a transhumanist, though. There are three concepts I do know from and about LW: their take on rationality, the top secret AI unboxing strategy, and the Basilisk.

I have a very poor opinion of the concept of the Basilisk (and yes, as someone pointed out, that opinion is basically the same as the one I have about Pascal's wager) - a concept that has been given additional, undeserved credibility by the reactions of Yudkowsky and LW.

As for the AI escape chat, it's a social experiment. People can be talked into making mistakes, or at least making risky judgement calls, whether they operate on a rational framework or not. I have no problem with that thesis. What I object to is the "magic trick" aura surrounding this experiment, including the insinuation that at the core there is an argument so profound and unique and potent, it cannot be allowed to escape Yudkowsky's head. Oh, and by the way, the trick can never be repeated, but all you laymen out there are welcome to devise your own version at home. This whole thing comes across as humongously self-important: there is a secret truth that has been privately revealed to our leader.

To me, and I recognize I may well be alone in this opinion, the more rational assumption is that there is no such magical argument at all, and that the prime reason for not publicizing it is to keep it from being deflated by public critique - in the same way the inventor of a perpetual motion machine keeps the inner workings of his contraption a closely held secret, because ultimately the device doesn't exist as stated. The amazing part of this very old trick is that, even in 2015, it still works on otherwise smart people.

I get that my opinions on both the Basilisk and the AI Chat are extreme outliers, and to my knowledge I have never met anyone who shares them - it would probably have been advisable to keep them to myself, but honestly I wanted to see if like-minded people exist.




> a concept that has been given additional, undeserved credibility by the reactions of Yudkowsky and LW.

For the record, EY agrees with you and says he mishandled the original comment. Also for the record, the reasons why the Basilisk does not work are _not trivial_ - it's not a simple Pascal's Wager, because with Pascal's Wager, we don't have the ability to actually create God.

> I have no problem with that thesis. What I object to is the "magic trick" aura surrounding this experiment, including the insinuation that at the core there is an argument so profound and unique and potent, it cannot be allowed to escape Yudkowsky's head.

Personally, I never got that impression. My sense, from looking at the psychological state of Gatekeepers and AIs after games, was always that playing as the AI involved some profoundly unpleasant states of mind, and that not publicizing the logs probably comes down largely to embarrassment.

For the record, Eliezer never claimed to have "one true argument", and in fact publicly stated that he won "the hard way", without a one-size-fits-all approach. A lot of the mythology you describe is utterly independent of LessWrong.

> Oh, and by the way, the trick can never be repeated, but all you laymen out there are welcome to devise your own version at home.

It probably helps that I've met other AI players, and their post-game state matched EY's.

I think in summary you're mixing up stuff you've read on LessWrong and stuff you've read about LessWrong. The latter is often inaccurate.


> I think in summary you're mixing up stuff you've read on LessWrong and stuff you've read about LessWrong. The latter is often inaccurate.

That may well be the case, but my only other information source is HN comments, and not those made by detractors either. If there are sites or articles dedicated to the deconstruction of LW ideas, I'm not privy to them, nor am I interested in seeking them out. Basically, I only remember LW's existence when it comes up on HN, always accompanied by fawning comments.

> the reasons why the Basilisk does not work are _not trivial_ - it's not a simple Pascal's Wager

Correct. While my value judgement of both is the same, my reasoning about why the Basilisk is not a thing ultimately consists of more components. That doesn't mean it's worthy of more consideration though.

> because with Pascal's Wager, we don't have the ability to actually create God

I would not say this is centrally important, because the processes leading to the creation of AGI are in all likelihood not going to be influenced by the existence of the Basilisk thought experiment either way.

> and in fact publicly stated that he won "the hard way", without a one-size-fits-all approach.

Again, I have to take my cues from the perspective of an outsider looking in, and several people who commented in this thread alone described it very, very differently. Of course, a movement is not directly responsible for all its fans and members - but among the advocates for the validity of the AI Chat experiment, the idea that a mystical one-size-fits-all rhetorical exploit exists out there seems very much alive. It may be cynical, but I can't help noticing how this aura of mystique and secret knowledge seems to work very well when it comes to attracting fans.

Of course, ultimately, these are just memes - and like many memes they propagate best when reduced to an absurd core. It doesn't even require intent.


> Of course, a movement is not directly responsible for all its fans and members - but among the advocates for the validity of the AI Chat experiment, the idea that out there is a mystical one-size-fits-all rhetorical exploit seems very much alive.

I agree, and I am totally with you on this - I disagree with that interpretation wherever I see it. :) That's not exactly Eliezer's fault though, and I guess it's to be expected that geeks attach to "clever" answers. I do think it's a bit unfair to judge the entire site by the two posts out of hundreds that happen to be in all the news articles - which LW has no influence on.

Inasmuch as _fans_ judge the site by these two articles, I'm just as much against that. I don't want LW to have an aura of mystery; that largely defeats the point!

[edit] I think a big part of the problem is that online reporting selects for clickbait.


I'm thankful you took the time to engage with me and explain things from an insider perspective (instead of just downvoting me like the others did). You are absolutely right that the entire site shouldn't be judged on two "meme-affine" topics and headlines, which I hope is clear was never my intention. You provided some insight into the two subjects that irked me, where nobody else in this thread could or would step up. I still find the nature of the HN-based fanclub bothersome, but I do see a larger disconnect between unreflective fans and actual LW members now.


> instead of just downvoting me like the others did

FWIW, I downvoted your original comment on this thread (and only that one) for being vague, snarky and dismissive. If you wish people to engage with you, I recommend not starting off like that, although it seems to have turned out okay in this case.


> If you wish people to engage with you, I recommend not starting off like that

And I recommend you give people the benefit of the doubt, though honestly I have to say I frequently fail at that myself. For example, your comment could be perceived as somewhat condescending, but I force myself to categorize it differently. I also know that I can come across as far more negative than I intend to; I apologize for that, and I'm working on it.

For what it's worth, I do think my original comment was snarky and dismissive, but somewhat counterintuitively, that's not usually what gets people downvoted and flagged on HN. People can and do get away with artful personal attacks here all the time; in my defense, I can at least say I attacked an idea rather than a person.

It may well be the case that my insufferability amplified the reaction, but I posit the root cause was disagreement about the message, not its format.

> although it seems to have turned out okay in this case.

It turned out okay because a decent dialogue emerged from it, one of the very few in this entire thread. But it was controversial enough to attract downvotes that left my comments teetering around 0 points, plus a few flags. Thanks to some recent updates to HN's comment ranking and voting algorithms, I'll be regretting taking this stance for some time, which may or may not be of some comfort to you.


Yay! Thanks for listening :)

[edit] To be honest, I didn't even know you could downvote on HN. I've never seen a downvote button.

[edit] Not that I would have.


> [edit] To be honest, I didn't even know you could downvote on HN. I've never seen a downvote button.

It's off-topic, but when you reach 500 karma (or thereabouts), you gain the ability to downvote comments you feel are a net detriment to the discussion. Sometimes, when I express a particularly unpopular opinion, I also get flagged. The result is what I call "the doghouse": among other things, it causes my comments to sink to the bottom like stones for a few months. HN moderation has hidden mechanics, and the results can be very frustrating. But in 99% of cases, when those special mechanics are not in effect, the system works pretty well.


I do not think the trick cannot be repeated; in fact, it is the other way around.

For the AI to be infallible, it needs a network of tricks and arguments, and each of them has to be crafted for a very particular person, even if it can later be repeated on others.

It is like Christianity. There is no single simple belief that everyone accepts and that's that; there is only a façade of simplicity, and in reality it is anything but simple.

There's a network of related explanations and rationalizations that has been expanding for centuries, and every time someone appears who will not accept the network of arguments, a new argument or explanation has to be added to account for that person.

For example, if Christianity had not needed to deal with Gnosticism or Arianism, the resulting network of beliefs and explanations would have been a noticeably different one.


I myself share your views on the Basilisk and the AI chat, and I don't think they are as "extreme outliers" as you think.



