MySQL Doesn’t Always Suck; This Time it’s AMD (timetobleed.com)
63 points by ice799 on April 6, 2009 | 12 comments



I would say MySQL's response sucked plenty. No effort was made to document the issue for end-users in the release notes. No effort was made to document a workaround that end-users could implement. No attempt was made to provide a workaround in the code; if a workaround was truly impossible as claimed, then why not implement a killswitch to prevent MySQL from running in such a configuration?


Workaround? Are you suggesting that MySQL somehow hotpatch the kernel (which is what would be needed to fix the problem)? When was the last time you saw a userland process hotpatch a kernel? This issue belongs to the kernel hackers, not userland people. And a killswitch? Great. Then what? Go buy new hardware?


I don't know the code. Maybe they could work around it by using a different primitive. They could change the multiprocessing technique in these configurations. They could change the RPM so that it has a dependency on a kernel version where this bug is fixed, or a dependency on the specific patch that fixes it.

I'd rather have MySQL refuse to start in a known-bad configuration than crash during runtime. Maybe I would have to buy new hardware; most likely I'd just have to patch the kernel.

Remember, MySQL is a database. For many of its applications, it is important that it doesn't crash.


From what I understand, this has to be in the userland threading library, unless you're a Windows developer. Wouldn't this functionality be provided by glibc (i.e. pthread_mutex_lock) on Linux? It's not reasonable to expect them to touch that library, whether it is glibc or the Solaris counterpart.

So correct me if I'm wrong, but then the solution would be to either hack their own version of mutex_lock or change the call. Actually, both involve changing the call, since you'd have to call your new in-house mutex_lock.

That is really not reasonable either. I would expect that they'd have to do it anyway, since MySQL is that mission critical, but the Proper solution following that is to make MySQL dependent on a version of glibc that has a mutex_lock that is patched against the damned Opteron, and issue an immediate patch release.


First of all, this was on OpenSolaris; I don't think it uses RPM packages at all.

Beyond just not crashing, any good database will already have a LOT of code dedicated to verifying data consistency. The best thing MySQL can do in that situation is loudly warn about running on known buggy hardware and then continue checking that its data hasn't been corrupted.


The MySQL bug report is for x64 Linux. OpenSolaris is used as an example of where AMD's bug has been detected and worked around.


I must have misread, sorry.


MySQL didn't need to do the best thing, it needed to NOT do nothing.


Indeed.


And that's what it did. As the article said, an assertion failed and the application died (instead of silently corrupting your data).


Quick reference: cat /proc/cpuinfo

Your processor is buggy if it's an Opteron family 15, models 32-63 inclusive.
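
For anyone who would rather script the check than read the output by eye, here is a minimal sketch in Python. It assumes the usual x86 Linux /proc/cpuinfo layout (one "key : value" pair per line, processors separated by blank lines). The function name is_affected is made up for illustration, and the vendor_id test is my own assumption, added because Intel also has a family 15. It only matches family and model; it does not check that the part is actually sold as an Opteron.

```python
def is_affected(cpuinfo_text):
    """Return True if any processor entry is AMD family 15, model 32-63.

    The family/model range comes from the comment above; the vendor_id
    check is an added assumption to skip Intel family 15 parts.
    """
    # Each processor is described by a block of lines separated by a blank line.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields.get("vendor_id") != "AuthenticAMD":
            continue
        try:
            family = int(fields["cpu family"])
            model = int(fields["model"])
        except (KeyError, ValueError):
            # Missing or non-numeric fields: cannot classify this entry.
            continue
        if family == 15 and 32 <= model <= 63:
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        print("could not read /proc/cpuinfo (not Linux?)")
    else:
        if is_affected(text):
            print("family 15, model 32-63 found: in the affected range")
        else:
            print("no processor in the affected range found")
```

A "not found" result only means no entry matched the range quoted above; it says nothing about other errata.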


Good call. I should have included something like this in my post. I'll update it and give you a shout-out. Thanks for commenting!



