Does someone know why these are not reentrant without the GNU extension? It seems like a really weird choice to make a library for hash tables and then make the state global like that.
You can look them up here in System V Interface Definition Issue 2 from 1986: http://www.bitsavers.org/pdf/att/unix/SVID/System_V_Interfac.... It has a note “Future Directions: The restriction on only having one hash search table active at any given time will be removed.” Which of course couldn’t be done without breaking compatibility.
Most programs at the time probably didn’t need more than one hash table, and it made for a simpler interface. (A program could in principle also work around the limitation by storing multiple values per hash table entry.) Most Unix libraries worked with global state at the time, since multi-threading also didn’t exist yet. I remember running into the limitation of only being able to have one lexer/parser grammar (via lex/yacc) per program.
Back when these functions were created, it was normal and expected for libc to store global state for a variety of things. Reentrancy wasn't really an issue that was considered then.
(Though this particular API has an even weirder limitation: you can only create a single hash table at a time in your entire program!)
probably they were added in 2.9bsd or something when programs couldn't be more than about 5000 lines of code before you ran out of the 16-bit address space
> The hcreate(), hdestroy(), and hsearch() functions first appeared in AT&T System V UNIX.
which... does seem a bit late for thinking it was a good idea to only support one hash table per process? that was in 01983, 6 years after the vax was introduced and 4 years after the 68000. i mean awk supported arbitrarily many hash tables in a single process in 7th edition unix, even before it had subroutines
> i mean awk supported arbitrarily many hash tables in a single process
This API supports that, too, as long as you don’t call one of the functions while another call is in progress. You get that for free if your process is single threaded.
SunOS 4.x implemented light-weight processes or LWPs. NetBSD 2.x+, and DragonFly BSD implement LWPs as kernel threads (1:1 model)”
According to Wikipedia, SunOS 4.0 is from 1988, NetBSD 2.0 from 2004, so I guess chances are good that AT&T System V UNIX didn’t support threads in user space programs.
you seem to be discussing what does or doesn't support multiple threads, but what we were talking about is what does or doesn't support multiple simultaneously existing hash tables, which hcreate() doesn't. i've edited my comment upthread to clarify
(it's also true, as kimixa points out, that a single-threaded process with signal handlers isn't really single-threaded, but i don't think that's really the important consideration in this context)
for what it's worth, it's not very difficult to use alarm() to invoke a context-switching subroutine to get preemptive user-space multithreading on unix without any kernel support. in theory you have to write the context-switching subroutine in assembly language, which isn't difficult at all; here's one i wrote in arm assembly for the arm eabi, for cooperative multithreading:
.thumb_func
.globl einyield
einyield:
push {r0, r4-r11, lr} @ save all callee-saved regs plus r0
ldr r0, =current_task_pointer
ldr r1, [r0]
str sp, [r1] @ save stack pointer in current eintask
ldr r1, [r1, #4] @ load pointer to next eintask
str r1, [r0]
ldr sp, [r1] @ switch to next eintask's stack
pop {r0, r4-r11, pc} @ return into einyielded context there
but in practice you can usually just use longjmp()
preemptive multithreading requires you to save all the registers, not just the callee-saved registers, but so does signal handling, so you can usually just use the signal-handling machinery and use the above code (or its equivalent on your vax or 68000 or whatever) to context-switch between signal-handler stack frames
Or a signal handler? Even if there's a single kernel thread, if it can be interrupted while inside one of these functions you can hit the same sort of problem.
Reentrancy was an issue on consumer devices well before multithreading was commonly supported there, after all
this isn't about safety, though; hcreate() and hsearch() don't support having two hash tables in the same process at the same time, not even in an unsafe way. that also makes them non-reentrant, but that isn't the main problem!
Probably started as a function in one program. Somebody copied it into a second program. Somebody wanted it in a third program and copied it into a library.
I did a quick search on GitHub for today's code that uses it, and
actually found out that people do use it today, but generally I agree,
this is a legacy API, legacy implementation, and the best thing we can
do for it is to forget about it.
The `hcreate(3)` API is interesting only because it has a very limited
use, yet it still exists in libc. So it's kind of interesting in a
circus way; we want to see it, but then we want to get out of it, and we
want to continue our rational lives without clowns and bearded ladies.
There are probably many thirdparty hashtables for C which work much
better than this.