Forget the typo. This is the slowest, most naive implementation of memset possible.
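For anyone who doesn't want to chase the changeset: the shape of the code under discussion is roughly this (my reconstruction, not the literal diff):

    #include <stddef.h>

    void *memset(void *dst, int c, size_t n)   /* 'c' is accepted... */
    {
        char *q   = dst;
        char *end = (char *) dst + n;

        while (q < end)
            *q++ = 0;   /* ...and then ignored: every byte becomes 0 */

        return dst;
    }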
Is this actual shipping code? I've followed the changeset up to the top and it looks like it's in platform/bootable/bootloader/legacy. Hopefully this isn't actually called anywhere, but the review date says May 13th.
What's the point of optimizing the Java VM when functions like memset, malloc, memcmp, memcpy, memchr, etc. haven't even been optimized for the platform?
EDIT: I understand that 95% of your time will be spent writing in a language like Ruby, Python, JavaScript or a C-based language. But dear god, please add Agner Fog's Optimization Manuals to your reading list.
http://www.agner.org/optimize/
Because in boot code you generally don't care about speed, but size is a very important factor. I've personally had to rewrite standard library functions to be intentionally small and stupid just so a piece of boot code could fit into ROM. It's not fun, but there is a certain satisfaction when you rewrite putchar and think, I just saved 10k of code!
Practicality. Generally your stdlib is part of your cross compiler install. So now you'd need 2 versions of your stdlib--the small version and the speedy version. So now you aren't using -lc (which is usually built-in to the linker) for your boot code but you are for your regular firmware. Now your makefile is more complex. And you are running a modified standard library, so it's a pain to upgrade. Blah blah blah, the list goes on.
Of course you could do all that, but it's so much easier to just override the 5 functions you actually use in your code. Because of the way linkers work and the way standard libraries are designed, if you define your own version of a library function then the linker won't pull in the library version--effectively choosing your local version over the one in the library. That keeps the standard library clean of patches and keeps the small, tight code near the project that actually needs it.
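A sketch of what that looks like in practice (the UART address is made up; the point is that defining the symbols locally means the linker never pulls the big versions out of -lc):

    /* boot_overrides.c -- deliberately small and stupid */
    #include <stddef.h>

    #define UART_TX (*(volatile unsigned char *) 0x10000000)  /* hypothetical MMIO register */

    void *memset(void *dst, int c, size_t n)
    {
        unsigned char *p = dst;
        while (n--)
            *p++ = (unsigned char) c;   /* one byte at a time, on purpose */
        return dst;
    }

    int putchar(int ch)
    {
        UART_TX = (unsigned char) ch;
        return ch;
    }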
Compilers long ago memorized the code for memset (and memcpy and the rest) and substitute optimized algorithms, partly as a way of padding their benchmark stats for marketing purposes. So no worries, this code should result in kick-butt optimized assembler. Except for the obvious bug.
Optimized for what? Speed? While I can picture that happening, it has been my experience that this is not the case. The memory squishing optimizations I'm talking about require dumping the assembly output and I've never seen my stupid memcpy() routine suddenly turn into optimized-for-speed code. I'd be pretty upset if that were to happen...
Given that it is in "platform/bootable/bootloader/legacy", I'm not sure there is much point getting all worked up about performance -- presumably something much more efficient would be used for the vast majority of cases.
I haven't updated my source tree in a long time, but the ARM devices were using bionic/libc/arch-arm/bionic/memset.S. This is pure ARM assembly optimized for the pipeline and cache.
My ARM assembly is a little rusty, but it looks like they implement memset and bzero, the latter just calling memset with R1=0.
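In C terms, what the parent describes amounts to:

    #include <string.h>

    /* the assembly entry point just shuffles arguments (R1 <- 0)
       and falls through into memset */
    void bzero(void *p, size_t n)
    {
        memset(p, 0, n);
    }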
Don't underestimate the compiler optimizations. GCC 4.3.1 on i386 with -O3 generates pretty involved optimized code that actually stores whole aligned 32-bit words (and I strongly suspect that this generated code is faster than rep stosb).
If I understand hoelle's point, it is that you shouldn't need to fill the buffer with a constant value in the first place. Why do people do it? The only time you generally need to prefill something is if you're using some algorithm whose initial state includes a prefilled array. It's hard to find a place where a byte array needs to be memset.
If you need an array zeroed on first use, why didn't you use calloc? The only reason memset has for existing is when you need to reinitialize an existing array to zero (see the snippet below). If things are so tight that you're wiping out an existing array and reusing it, which usually involves being deliberately less abstract than you'd naturally be, then, well, that's a pretty unusual situation.
(While we're on memset, I'd like to point out that the weirdest thing is that while memset initializes value byte by byte, the places where you do need a prefilled array are always places where the array is an array of larger values.)
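To make the calloc-vs-memset distinction concrete (a hypothetical snippet):

    #include <stdlib.h>
    #include <string.h>

    void example(size_t n)
    {
        /* zeroed on first use: calloc, no memset required */
        long *counts = calloc(n, sizeof *counts);
        if (counts == NULL)
            return;

        /* ... use counts, then reset the existing buffer for reuse --
           the one job memset actually has */
        memset(counts, 0, n * sizeof *counts);
        free(counts);
    }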
Sometimes you manage your own memory, so you can't use calloc. Sometimes you're zeroing out stack allocated memory, such as the common practice when doing socket programming in C.
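The socket case looks like this, for instance (the port is made up):

    #include <string.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>

    void init_addr(struct sockaddr_in *addr)
    {
        memset(addr, 0, sizeof *addr);        /* clear padding and unused fields */
        addr->sin_family      = AF_INET;
        addr->sin_port        = htons(8080);  /* hypothetical port */
        addr->sin_addr.s_addr = htonl(INADDR_ANY);
    }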
Setting memory you own to a known value is just good, defensive programming. If I screw up - and in C, you're going to screw up - it's good to see a known value rather than unknown values.
I took it as saying you should remember how much of the buffer you have actually used. If you only read what you write, it doesn't matter whether the rest of the buffer is filled with \0 or not, so you can skip the memset.
With today's CPUs, particularly with branch prediction and caches, loop unrolling is not always better.
In fact I bet the performance would be worse, especially because you have to worry about setting areas of memory that are not a multiple of the loop unroll size. (I guess you could use Duff's device, sketched below.)
I actually tested it while ignoring that, and the unrolled version took 34.551 seconds vs 34.239 for the regular version (but those numbers are meaningless, since the variation in time between runs is greater than the difference in runtime between versions).
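For reference, Duff's device is an unrolled fill whose switch jumps into the middle of the loop to absorb counts that aren't a multiple of the unroll factor:

    #include <stddef.h>

    void fill8(char *p, char c, size_t n)
    {
        if (n == 0)
            return;
        size_t iters = (n + 7) / 8;
        switch (n % 8) {            /* jump into the unrolled body */
        case 0: do { *p++ = c;
        case 7:      *p++ = c;
        case 6:      *p++ = c;
        case 5:      *p++ = c;
        case 4:      *p++ = c;
        case 3:      *p++ = c;
        case 2:      *p++ = c;
        case 1:      *p++ = c;
                } while (--iters > 0);
        }
    }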
I would imagine writing the native register size bytes per iteration would be a win, with a little cleanup for the remaining bytes. For a 32-bit architecture, you'd typically be doing 1/4 the iterations.
Far more involved than that. There are alignment issues. It's several machine cycles faster to do an aligned store of a large scalar, e.g. 32-bit.
So a responsible implementation would do a head check for alignment, store 1, 2 or 3 bytes efficiently, then do aligned stores of 32-bit values (or 64-bit or whatever native size is appropriate), then a tail check for the small change. Or better yet, switch (p & 3) and handle each case with unrolled code.
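Roughly that, in C (a sketch assuming a 32-bit native word, and glossing over the strict-aliasing fine print):

    #include <stddef.h>
    #include <stdint.h>

    void *memset32(void *dst, int c, size_t n)
    {
        unsigned char *p = dst;
        uint32_t word = (uint8_t) c * 0x01010101u;   /* replicate the fill byte */

        /* head: 1-3 byte stores until p is 4-byte aligned */
        while (n && ((uintptr_t) p & 3)) { *p++ = (unsigned char) c; n--; }

        /* body: aligned 32-bit stores */
        for (; n >= 4; n -= 4, p += 4)
            *(uint32_t *) p = word;

        /* tail: the small change */
        while (n--)
            *p++ = (unsigned char) c;

        return dst;
    }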
Many modern platforms contain logic that can be (ab)used to do memset() directly in hardware without processor involvement (i.e. in parallel with other code). But this tends to require so much setup and I/O overhead that it simply isn't worthwhile to do.
Integrating memset/bzero logic into the memory array itself would increase its price drastically (but in some special cases it is done, generally when the memory array already contains some other expensive non-memory logic).
Aren't there processor-specific assembler versions of memset for common chips? There are usually assembler versions (for example using REP MOVS on Intel, which is light years faster) or even unrolled C versions (typically setting 8 or 16 bytes per iteration) that are used. I suspect this might be "fallback code" that doesn't actually get executed on any chips that anybody really uses.
For this case, the required x86 opcode would be "rep stosX" (just write), not "rep movsX" (copy). Anyway, on x86 processors "rep movsX" was faster from the 8086 to the 80386, but since the 80486 it has been faster to unroll in assembly to reduce the jump penalty (as "rep movsX" doesn't unroll, its conditional jump becomes a penalty). On CPUs with hardware prefetch support -e.g. SSE2-, you could still speed it up a bit more by fetching future cache lines before they're needed, avoiding pipeline stalls.
The shown example, which just writes to RAM, is probably fast enough (read cases are harder to optimize), although a bit of unrolling could reduce the jump penalty impact (especially on non-superscalar or in-order superscalar CPUs -the typical ARM included in handheld devices-).
While we're jumping down the architecture pedant rabbit hole: a simple loop like that will be trivially predicted, so the branch will be basically free. In addition, hardware prefetchers do a much better job at predicting linear memory access than manual prefetch instructions. On Core 2, iirc, if you have 2 or 3 L2 misses at fixed offsets either direction from each other, the hardware will automatically begin prefetching memory so it's there when you need it. The problem with manual prefetch instructions is their high latency. They're best for hinting to the processor that you're about to make an unpredictable load.
The thing that shocks me about this bug is that it should have been caught when the compiler issued a warning about an unused function parameter. Either the warning was not enabled, or the programmer just ignored it; either of those options indicates a pretty severe lack of professionalism.
"Pretty severe" lack of professionalism? It's not unprofessional not to turn the warnings all the way up and sift through each and every one. Some of us work code inherited from other people with different coding guidelines, and some of the warnings will generate loads of messages for code that is not only technically correct but stylistically correct as well. The unused parameter warning is one of the biggest offenders in this area. And if you have a hundred warnings for unused parameters, which ones do you pay attention to?
Another potential issue is that the code in the patch might almost never be called in the first place. Most libc implementations (three that I know of, where I've examined the "memset" source) have a "memset" like that as a fallback only for platforms without an assembly version.
I'd say the problem is that there are several different ways this error could have been detected sooner: warnings, test, static analysis tools, and review, but none of them worked. But again, it's also possible that the code is never called because assembly versions are always available.
| The unused parameter warning is one of the biggest offenders in this area. And if you have a hundred warnings for unused parameters, which ones do you pay attention to?
Go look at Google code search, which includes vast quantities of code written by professional teams and (often later) open sourced. Very little of it casts expressions to (void). When I started at Arbor, I got rid of all the (void) casts. What an eyesore.
Very little C code is const-correct either. Most of the C devs I know make fun of const-correctness.
It's only an eyesore if you don't acknowledge the benefits that strict warnings can bring to the table.
| Most of the C devs I know make fun of const-correctness.
I for one don't mind putting in a tiny bit of upfront effort so that my compiler can double-check my work. Do the C devs you know also make fun of assert?
If assertions disappear in certain optimized builds, then what do you do, or recommend doing, to deal with warnings about unused parameters or values, if you don't believe that casting to void has value?
I don't understand how the question you ask even follows from the premise you put forward. The fact that memset is probably always dead code here is why it's not really a bug. The fact that warnings about unused return values are useless 100-1000 times for every time they're potentially useful is why I don't litter my code with pointless casts.
If what you say is true, then you have to either put up with the warnings, compile at a level that doesn't have the warnings, or figure out a hack around them - I don't like any of those options.
I'm not as dogmatic as some others are in the thread. I just think it's a good idea to be warned when a variable is unused, enough that I'll mark my very few that might be #ifdef'ed out. If you're using a compiler that doesn't let you mark variables as such... well that sucks. You do what you can.
So what if the code doesn't get called? That doesn't prevent it from being built. Considering that memset() is part of libc, which is a shared library, presumably it's going to get compiled regardless of whether it's called.
As a point of pride, none of them were created by me, but it is what it is. Some of them come from since-deprecated functions and nobody wants to go back and mess with old code; some of them come from sloppiness; some of them come from Java's collection handling.
And yet this codebase gets a lot of work accomplished anyway.
The "unused parameter" error is one of the most useless of C compiler errors. There are dozens of perfectly valid reasons why a parameter would go unused, so the occasional error just gets lost in the noise.
It's more helpful in C++, which lets you omit a parameter's name to indicate that it's intentionally unused. But then you're using C++; not worth the tradeoff.
It's not hard to type a statement like "(void)v;" to explicitly say, "v is intentionally unused." I fail to see the drawback of this approach. Of course you should only do this when (1) you can't remove the parameter from the function's prototype and (2) there is a good reason for it to be unused. But once you develop that habit, the compiler can help you catch bugs that are otherwise easy to miss. Sounds like a worthwhile tradeoff to me.
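For example (the names are hypothetical):

    /* the API fixes the callback signature; this handler happens
       not to need the user-data pointer */
    static int on_tick(int elapsed_ms, void *user_data)
    {
        (void) user_data;   /* intentionally unused; silences the warning */
        return elapsed_ms > 1000;
    }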
Then you get a "statement with no effect" warning, which is usually more serious. I'll take a bogus warning in real code over a bogus warning in contrived code any day.
I've never seen a compiler issue a warning about "(void)foo;". Even compilers which issue 'statement with no effect' warnings normally recognize deliberate casts to void as a way of saying "shut up, I know what I'm doing".
1. An API-defined callback signature where you need to use (for example) the first and third parameters, and not the second.
2. A public API function that has become deprecated and the previous behavior is emulated in a way that doesn't require all the parameters.
3. Stub/wrapper functions.
4. Init/cleanup functions for modules written to a plugin API, when all parameters passed are not needed in all cases.
5. Language binding closure/thunk functions.
Most of the uses, I think, boil down to backwards compat and fixed callback/plugin interfaces. In those cases I think an __attribute__((unused)) or void cast aren't a huge burden. I usually turn on -Wunused-parameter just in case.
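The GCC spelling, for reference (a hypothetical plugin hook):

    struct app;                            /* opaque type from the plugin API */
    void register_hooks(struct app *app);  /* hypothetical helper */

    void plugin_init(struct app *app, int flags __attribute__((unused)))
    {
        register_hooks(app);   /* 'flags' exists only to match the interface */
    }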
Huh? Any halfway decent software shop will have a policy of compiling with all warnings enabled, and fixing all sources of warnings (with only very rare exceptions). It is extremely unprofessional to disable or ignore warnings such as the "unused function parameter" warning. If you disagree, I respectfully submit that I would never let you near a C project that I was working on.
| Any halfway decent software shop will have a policy of compiling with all warnings enabled
FWIW, your definition of "halfway decent" would exclude most software shops in the world.
| It is extremely unprofessional to disable or ignore warnings such as the "unused function parameter" warning
Unused function parameter warnings are often noise (e.g., due to #ifdefs, unimplemented APIs, backward compatible APIs, etc). I agree that code should compile w/o warnings, but the benefits of tracking down and squelching unused function argument warnings are pretty marginal in my experience.
All I can say is that in 15 years (fuck.) I've never worked on a team that would have caught a bug like this in dead code. -Wall has been the gold standard on the teams I've worked on.
I would probably agree that most software shops in the world are not halfway decent.
The benefits of tracking down and squelching such warnings would have very likely caused this bug to never exist. The point is that you make "compiles without warnings" a policy, and you build a habit of maintaining that policy -- after a while it begins to take almost no effort at all.
The unused function parameter warning is usually one of the most annoying and least useful. I routinely compile with "-Wall -Wno-unused" because I don't want to litter my code with casts to void.
You have to know that -Wextra exists, which I for one didn't until just now when I ran a sample program. All my projects just compiled with -Wall, which doesn't catch that error.
Also, unless every programmer on the project is being scrupulous about finding and correcting such errors, it quickly becomes line noise, and a lot of it completely unimportant stuff.
-Wextra is total insanity. I just compiled some C code I've been working on with that flag, and got the following: "warning: signed and unsigned type in conditional expression." I was testing if a variable I declared as an integer was less than 0. To my knowledge, every number literal in C is assumed to be of a signed type unless followed by a 'u'. But now I know better. 0 is the exception.
Yeah, that's why every programmer on the project has to be scrupulous about finding and correcting such errors. That's professionalism 101 -- if your team is not keeping the build warning free, things like this slip through.
Things like what, a blatantly obvious bug that would crash virtually every C program running on the system were it not for the fact that it was dead code?
This is practically a documentation bug.
[edit]
I kind of take it back, after mahmud's bzero() joke... out of oniguruma, Python, hexfiend, Apache, nginx, openssl, redis, regexkit, saxon, memcached, subversion, and valgrind, this hypothetical bug would have broken Python, redis, subversion, several apache modules, and memcached. The contents of my codebase/3p directory, FWIW.
It was. That's why this bug hasn't been caught before: the buggy code was never actually compiled as part of a production build. Real builds always use the version optimized for the actual hardware platform.
memset sets a block of memory to a given value. It is often used to initialize an uninitialized block of memory to 0. So often, in fact, that this bug where it always initialized the memory to 0 survived so long.
| So often, in fact, that this bug where it always initialized the memory to 0 survived so long.
Given that the bug went undetected for so long, I think you could almost reasonably claim that calling memset with a nonzero parameter is a "corner case" ;)
It is a "corner case," or close to it, but that's not why it wasn't caught here. I don't think this code is ever called, as explained above. It's default code in case there isn't a platform specific version, but there pretty much always is a platform specific version.
memset's job is to fill a block of memory with a given byte. The most common fill byte is 0, to initialize or otherwise clear a buffer. It looks like it is so common, in fact, that a programmer accidentally made Android's memset _always_ fill with 0, no matter what byte was passed in, and presumably no one noticed for a while. (I can't figure out when this patch was introduced.)
Memset (according to the C++ reference) is supposed to: "Sets the first num bytes of the block of memory pointed by ptr to the specified value (interpreted as an unsigned char)."
So, give it a pointer to some memory, give it a value and the number of bytes you want to set, and it'll go wild. In this case, memset was just blindly dropping zeroes into the memory block.
The function was always setting the memory to 0 instead of what was passed in. I think most uses of memset are to initialize memory to 0, but any other use would have had unexpected results.
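A two-line test would have tripped over it (hypothetical, of course):

    #include <assert.h>
    #include <string.h>

    int main(void)
    {
        char buf[8];
        memset(buf, 0x5A, sizeof buf);
        assert(buf[0] == 0x5A);   /* a correct memset passes; the buggy
                                     version leaves buf all zero */
        return 0;
    }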
No, this is The Daily WTF. "Oh look, someone wrote some really stupid code!" Who cares? How does it enrich us to know that someone mis-implemented memset() in a bootloader that possibly never even calls it, or if it does, probably passes zero as the argument anyway? (Or is dead code because on any arch anyone uses, memset() is implemented in asm.)