Stack Overflow: the Blind Leading the Blind

kylec · on Sept 1, 2008

This is interesting - my StackOverflow answer (XOR) is now the subject of a blog post that's reached the top of Hacker News.

To clarify part of the article, the time granularity on the site isn't enough to pinpoint in what order the comments were made on the thread. Daniel Jennings' edit to his post existed before I posted my answer, so it's not his reply to my answer.

Every part of Stack Overflow is run and moderated by the community except one: the selection of the accepted answer. As long as you have one person in charge of selected answers you will always have to deal with their particular biases, which decrease the value of Stack Overflow as an objective reference to programming questions.

It might be a good idea that in addition to the answer selected by the asker that there be a community selected answer. This would be completely orthogonal to the up/down votes for the questions, and would only allow one selected answer per person per question. An approach like this would greatly serve to make Stack Overflow a more equitable and fair system.

In any case, I've had tons of fun learning new information and teaching people some of the tricks I know. I hope to see some of you there!

tlrobinson · on Sept 1, 2008

Clever solution.

Sometimes XOR seems magical (like with the XOR swap trick). Though in hindsight, it seems like the question was designed for the solution, not vice versa. XOR just happens to do exactly what we need in this case.

easyfrag · on Sept 1, 2008

I know in the podcast Spolsky and Attwood have discussed making a "community answer" a feature.

As I recall there were some issues around community answers in general, because of the wiki-like behavior of allowing others to edit answers there needs to be some threshold where an answer belongs to the community rather than the user who originally posted it.

bootload · on Sept 1, 2008

"... a working template that they could use to base Stack Overflow on: Usenet. Usenet has been around for almost 30 years now, and newsgroups like comp.lang.c++ are still active and full of experts discussing a wide variety of topics. But no, Jeff and Joel would rather make an Ajaxy shithole with no reason to draw true experts in. ..."

It's not great to see an example of the Nerd bully at work. Anonymous, negative and probably wrong.

This is a bogus article. Why? Well firstly you can't gauge the writer? Who has written this? They have an opinion but everyone has opinions. So what is the meat of the writers idea:

  "Usenet is an active hive of technical 
   users who know all the right answers"

Usenet is an example of a broken model. Sure you can ask a question and get an answer. You might get 50. So how do you work out who's is right and the best? Force of personality? Discussion? By the time you have followed this discussion the noise/sig has diluted the answer possibly for the person who asked the question. What about others who ask the same question 5 years down the line? Where is the knowledge trapped? How are the experts identified? What you get is a unfocused threaded discussion arguing the merits of proposed solutions punctuated by the random Star Trek pun or maybe a lame Simpsons joke.

The ideas behind Stack Overflow are worth looking at. Forget who is doing it. Don't like Joel or Atwood, Fine. But finding answers to specific questions, finding good answers, evaluating responders and filtering the best results and preserving them is a worthy idea. The filtering of information and building the reputations of users who contribute is proving to be a real problem. Something to look at a bit closer.

The questions and answers might be crap at the moment. But maintaining that an archaic system is the best way to do things is making a statement about the writer.

thorax · on Sept 1, 2008

I agree wholeheartedly. Definitely a bully sort-of post for no good reason. Reminds me a bit of the sort of chip-on-their-shoulder hackers that aren't very fun to work with.

I think it's fine for people to offer constructive criticism to those trying to help the development community, but I don't see any reason to be so mean about it. Belittling them is a great way to discourage people from exploring better/more approaches down the line. I don't want to see that as I'm particularly fond of ways of capturing knowledge that old technologies left on the table-- our bug.gd error search site is built entirely on that premise.

Towards his arguments, Even if the world's best hackers won't have the time to toss in help on the site, I know I've had some of the best learning experiences in my life because I took some time to explain something I didn't know 100% before answering. That's part of how you get to be an expert. Even if the site doesn't produce the world's most amazing answers, perhaps it will help build the next generation of developers into solid hackers.

Anyway, enough with that kind of meanness. I know from doing thousands of code reviews in my career that there is no developer who codes things perfectly even 80% of the time. Lighten up and stop judging people.

If you want to show "the blind" how awesome you are then hop in there and toss in your better answers.

bootload · on Sept 1, 2008

"... Anyway, enough with that kind of meanness. I know from doing thousands of code reviews in my career that there is no developer who codes things perfectly even 80% of the time. Lighten up and stop judging people. ..."

Good point. The telling point wrt StackOverflow I've been learning listening to the development converstations ~ http://blog.stackoverflow.com/2008/08/podcast-17/ , then watching how the problems are solve and watching the results. Talk alone is rather hollow.

orib · on Sept 1, 2008

How about figuring it out by reading the answers and analyzing them?

Using my brain to critically assess what people tell me has always worked for me, and I usually -- for any nontrivial problem -- end up combining the ideas from all the (correct) answers. As for finding them later, I find searchable archives work wonders for that.

bootload · on Sept 1, 2008

"... How about figuring it out by reading the answers and analyzing them? Using my brain to critically assess what people tell me has always worked for me ..."

Sure you can. But is there a better way? Can your way scale? Will your brainpower & intelligence be recorded and replayed for others?

Look at it another way. Would you take this approach to evaluating other types of information, links for example? Would you make a list of all the good websites and organising them for others to read? Or would you organise links utilising computers, software & inferred information form humans to do the same thing.

Yahoo vs Google?

orib · on Sept 1, 2008

Would you make a list of all the good websites and organising them for others to read?

Aren't you posting on a site that does just that? Manual contributions of stuff that people found interesting? Or were you thinking more of something like bookmarks in my web browser?

bootload · on Sept 1, 2008

"... Aren't you posting on a site that does just that? Manual contributions of stuff that people found interesting? ..."

And at the same time illustrating the point of a Usenet Q drifting from the point and increasing the noise :)

fendale · on Sept 1, 2008

The thing that got me excited about StackOver flow was that it was announced just a few days after I spent days attempting to dredge information out of those Bulletin board type forums where there are 100's of non threaded replies/discussions to a question (and the experts are reluctant to help the noobs).

If only the correct answer could bubble to the surface, or the question asker would take the time to update the question with the discovered answer, so I reckoned Stack Overflow sounded like a great idea.

I have not used it yet, but I reckon it will solve more problems than it causes.

Jeff and Joels podcast and current following didn't have solve the chicken and egg problem either, so they are off to a good start!

ardit33 · on Sept 1, 2008

Interesting post. I agree with this, and I see at work often. The blind leading the blind, especially when you have new people in the team comming in. A smart newcomer will eventually realise the advice they are getting is not good, and move on to somebody else.

"What would be the best algorithm for finding a number that occurs only once in a list which has all other numbers occurring exactly twice. So, in the list of integers (lets take it as an array) each integer repeats exactly twice, except one. To find that one, what is the best algorithm."

I want the hacker community to answer this question (which seems more like a typical interview question). Ideas?

thorax · on Sept 1, 2008

First thing that came to mind for me (just assuming there was needed a practical way to do it) was the following (translated into Python):

  # associative array / dict / hash
  store = {}
  for elem in inputlist:
    if elem in store:
      del store[elem]
    else:
      store[elem] = True
  # return the only key left
  return store.keys()[0]

Now that I've looked at the post, the XOR approach is quite efficient and a more fun answer.

river_styx · on Sept 1, 2008

I must admit that I didn't understand his XOR solution. Can someone explain it?

adr · on Sept 1, 2008

Suppose x = 0 and your list is [1,2,3,2,1]. If you xor 'x' with every number in your list then x = 1^2^3^2^1, which is the same as x = (1^1)^(2^2)^3 = 0^0^3 = 3. So, x=3 is the number that wasn't repeated. This works because a^b = b^a and a^a = 0.

sah · on Sept 1, 2008

Iterate through the list bitwise-XORing the numbers together into an accumulator. Because x ^ x = 0, and x ^ 0 = x, and ^ is commutative (x ^ y = y ^ x), the final value in the accumulator for a list like [x, y, y, x, z] is:

    x ^ y ^ y ^ x ^ z 
  = x ^ x ^ y ^ y ^ z
  = 0 ^ 0 ^ z
  = z

brianr · on Sept 1, 2008

Here's an example in python:

  >>> import random, operator
  >>> nums = (range(5)*2)
  >>> random.shuffle(nums)
  >>> nums.pop() 
  1
  >>> nums
  [0, 4, 0, 3, 2, 4, 3, 1, 2]
  >>> reduce(operator.xor, nums)
  1

thorax · on Sept 1, 2008

To clarify for non-Pythoners/non-coders: If you assume "nums" already contains the list, most of his code above is setup. You only need these two lines of his code to get the answer:

  import operator
  answer = reduce(operator.xor, nums)

gopher · on Sept 1, 2008

What are the constraints? If it's processing time or memory, I'd choose the XOR solution (which is brilliant). But it could also be the time to solve the problem. This would mean that the first working solution that comes to mind would be the best. Or the solution should be transparent to non-programmers, which probably eliminates the XOR hack. It could also be the joy while coding...

petercooper · on Sept 1, 2008

That trick of his at the end of the post to give you access to StackOverflow is a classic. Funnier, though, if once you've done it, you use any OpenID at your disposal to log in - it generates an empty account for you within their system.

Of course, this could actually work out well as only people with a smidgen of interest in using the site will bother to try it.

statictype · on Sept 1, 2008

Its not that big a deal. The cookie just gives you access to the private beta (which you could have gotten by just sending them a mail).

Once you have access to the beta, you can always create a new account whenever you want.

IsaacSchlueter · on Sept 1, 2008

Wow. This guy really hates Jeff Atwood.

Must be flattering to have someone go to so much trouble just to troll you. Nicely done, Jeff.

raganwald · on Sept 1, 2008

There seem to be a few different ways to use XOR on sequences of integers (or for exchanging the contents of two registers), and they're worth learning if you ever need to go on a job interview with someone who uses "Aha!" style technical questions.

Beyond that, however, I'm left scratching my head. Just the other day zenspider and I were debating whether the k combinator is a useful tool for programming or whether it is "clever" code. Am I wrong in thinking that using XOR for finding an integer that only appears once is rather clever?

kylec · on Sept 1, 2008

You are correct - the problem was probably constructed with that intended result. The specifications are unusually precise when it came to how the input was formatted. If they were any less precise XOR wouldn't work.

    ..a number that occurs only once in a list which has all other numbers occurring exactly twice..

raganwald · on Sept 1, 2008

I think somewhere in the "drafts" of blog posts that never made the light of day is one about "programming by isomorphisms."

One of the greatest intellectual feats we have is to look at a problem and realize that it is isomorphic to some other solved problem.

A lot of "Aha" programming challenges either test for the ability to recognize these isomorphisms (or more likely, they are culturally biased towards people with a lot of social exposure to older programmers).

I suspect this kind of thing might be a test for a specific kind of intelligence. I do believe--in general--that all other things being equal, the smarter programmer is the best hire.

But gosh, I admit to fearing this kind of actual code in production. It looks like exactly the kind of thing that breaks--possibly in subtle ways--when one of the underlying assumptions turns out to be false or is changed.

Of course, changing assumptions are never part of the tests, so it might be "fair dinkum." So the follow-up question probably ought to be, "when would you actually use this device?"

mercurio · on Sept 1, 2008

Wait... don't the two solutions have the same time complexity?

The bitwise XOR approach should depend on the maximum size of the numbers (say k) along with the length of the array (say n) as O(nk), and just using radix sort on the array will take time O(nk) too.

If you assume that bitwise XOR is constant time, that implies that the numbers in the array are bounded by a constant, and therefore Radix sort can sort the array in O(c.n) = O(n). In either case, they have the same running time.

ntoshev · on Sept 1, 2008

Good point, but no - XOR is clearly better. It is simpler and its constant factor is less than the constant of a Radix sort.

philf · on Sept 1, 2008

I wouldn't even start picking on the answers but I don't think that there are a lot of quality questions on stackoverflow. I quickly browsed through the ruby section and it was mostly questions that could be answered by googling or reading the documentation.

Maybe this will change after the beta is over..

petercooper · on Sept 1, 2008

To this guy's point.. yes, the blind can be leading the blind, but any answer is better than no answer.

Agathos · on Sept 1, 2008

Here's an answer that's worse than no answer (a classic):

http://groups.google.com/group/comp.lang.lisp/msg/4c95766a8b...

michael_dorfman · on Sept 1, 2008

Not when the answer is seductively wrong. "No answer" means "keep on looking".

baha_man · on Sept 1, 2008

I think the site is supposed to be for very specific technical questions, in which case there is less likely to be a 'seductively wrong' answer. Unfortunately most of the questions don't fall into this category, and either can't be answered objectively or leave more scope for 'bad' answers.