The quote I'm referring to mentions pseudocode, which makes integer overflow and the like irrelevant, since that depends on the integer representation you're using.
It's not integer overflow that makes the problem difficult. It's off-by-one errors in the predicates (< vs. <=) and indexing operations (n/2 vs (n+1)/2).
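To make those trip points concrete, here's a minimal sketch in Java of the closed-interval variant (the names binarySearch, a, and key are mine, not from the quote); the comments mark exactly the predicate and index choices in question:

    // A minimal sketch of the closed-interval variant of binary search.
    static int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;              // closed interval [low, high]
        while (low <= high) {                 // <= not <: a one-element range must still be tested
            int mid = low + (high - low) / 2; // rounds down; also dodges (low + high) overflow
            if (a[mid] < key) {
                low = mid + 1;                // mid + 1, not mid, or the loop can spin forever
            } else if (a[mid] > key) {
                high = mid - 1;               // mid - 1 for the same reason
            } else {
                return mid;                   // found
            }
        }
        return -1;                            // not present
    }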
I'm not arguing with you, as this trips up many good people during interviews. But I'll offer some anecdotal evidence to the contrary ...
In binary search this problem comes from the array length, which is either odd or even, combined with a division by 2 -- a combination that should set off alarms in your head. And just as you were taught in your first CS class (high school or college), you pick two simple examples, one for each case, and test with pen and paper (tools provided in any interview).
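That pen-and-paper check translates directly into code. A quick sketch reusing the hypothetical binarySearch above (the test values are arbitrary; run with java -ea so the asserts fire):

    // One odd-length and one even-length array, probing both ends plus an absent key.
    public static void main(String[] args) {
        int[] odd  = {1, 3, 5};             // length 3: first mid is index 1
        int[] even = {1, 3, 5, 7};          // length 4: first mid is also index 1 (rounds down)
        assert binarySearch(odd, 1) == 0;   // first element
        assert binarySearch(odd, 5) == 2;   // last element
        assert binarySearch(even, 7) == 3;  // last element, even length
        assert binarySearch(even, 4) == -1; // absent key must terminate, not loop
    }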
When I did interviews on the employer's side, the people who didn't do this were either totally unprepared (failing most other questions) or brilliant (and lazy) people who considered the problem too easy to bother checking their implementation.
Sure, put into context it can be a valuable filter, but it doesn't say much about the top X% of the talent pool, which is what my post was about.
It depends on how the test was given. For example, did they ask you to write quicksort, where you submit a version and then they grade it? If so, I can imagine a lot of little mistakes: the integer overflow issue (if they did pick a language), off-by-one errors, not handling empty lists, or the case where the item isn't in the list -- errors that, if someone said "it doesn't work with this input", you could easily fix, but that, without a chance for correction, many could easily get wrong.
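The integer overflow mistake in particular is concrete enough to show in a few lines. A Java sketch with made-up index values (this is the (low + high) / 2 bug Joshua Bloch famously wrote about in 2006, which sat in Java's own Arrays.binarySearch for years):

    // Made-up indices on a hypothetical huge array, just to trigger the bug.
    int low = 1_500_000_000;
    int high = 1_600_000_000;
    int brokenMid = (low + high) / 2;       // low + high overflows int: mid comes out negative
    int safeMid   = low + (high - low) / 2; // stays in range: 1_550_000_000
    int alsoSafe  = (low + high) >>> 1;     // unsigned shift reinterprets the overflowed sum correctly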