I haven't really studied this either so I could be mistaken, but I think it's because OpenGL does the perspective divide between the vertex and fragment shaders, going from clip space to NDC space (which is in the [-1,1] interval and can only be changed by glClipControl()).
The value in gl_FragDepth, if written to, is final, but gl_Position is not and will go through the clip/NDC transformation and since the trick to get the extra precision is to put the depth range into [0,1] instead of [-1,1], this would fail.
So my guess is, it probably wouldn't work on WebGL/OpenGLES without also setting gl_FragDepth, which is, as you mentioned, impractical performance-wise.
The value in gl_FragDepth, if written to, is final, but gl_Position is not and will go through the clip/NDC transformation and since the trick to get the extra precision is to put the depth range into [0,1] instead of [-1,1], this would fail.
So my guess is, it probably wouldn't work on WebGL/OpenGLES without also setting gl_FragDepth, which is, as you mentioned, impractical performance-wise.