In practice, it's not "~500 lines". You have whole control-flow statements with ...

edflsafoiewq · on Feb 18, 2020

The line count isn't "not real" just because it isn't how a mindless autoformatter would do it. The formatting conveys actual information. A line expresses one "thought". Laying it out horizontally allows the vertical direction to be used to visually convey the repeating pattern:

    else if (tk == Mul) { next(); *++e = PSH; expr(Inc); *++e = MUL; ty = INT; }
    else if (tk == Div) { next(); *++e = PSH; expr(Inc); *++e = DIV; ty = INT; }
    else if (tk == Mod) { next(); *++e = PSH; expr(Inc); *++e = MOD; ty = INT; }

kerkeslager · on Feb 18, 2020

You know what else would communicate that? A function or macro.

    else if (tk == Mul) { applyOperator(MUL); }
    else if (tk == Div) { applyOperator(DIV); }
    else if (tk == Mod) { applyOperator(MOD); }

But then you're not conforming to their arbitrary idea of "minimalism = fewer functions".

I definitely have some admiration for their picking a goal and following through on it, and there are a few tricks in there that are downright brilliant, but let's not pretend this is about effective communication.
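
For concreteness, here is a minimal sketch of what such a helper might look like. `applyOperator` is just the hypothetical name from the snippet above. The stubbed `next` and `expr`, the opcode and token values, and the fixed-size buffer are all placeholders for illustration, not c4's actual definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the compiler's globals; only the shape matters. */
enum { PSH = 1, MUL, DIV, MOD };    /* opcodes (values are made up) */
enum { INT = 1 };                   /* type tag (value is made up) */
enum { Mul = 100, Div, Mod, Inc };  /* tokens / precedence levels (made up) */

static int code[64];   /* emitted code buffer */
static int *e = code;  /* points at the last emitted slot, so writes use *++e */
static int ty;         /* type of the expression just compiled */

static void next(void) {}                  /* stub: would advance the lexer */
static void expr(int lev) { (void)lev; }   /* stub: would compile the right operand */

/* The body shared by the Mul, Div and Mod branches, with the opcode as a parameter. */
static void applyOperator(int opcode) {
    assert(e + 2 < code + 64);  /* room for the two slots written below */
    next();
    *++e = PSH;      /* save the left operand */
    expr(Inc);       /* compile the right operand */
    *++e = opcode;   /* combine the two */
    ty = INT;
}
```

Each branch then reduces to a single call such as `applyOperator(MUL);`, and the repetition lives in one place instead of being spread across a dozen lines.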

bluetomcat · on Feb 18, 2020

I can agree that there's a repeating pattern in handling each of the binary operators (there are around a dozen), but I'm failing to see a pattern in fragments like these:

      if (tk == ']') next(); else { printf("%d: close bracket expected\n", line); exit(-1); }
      if (t > PTR) { *++e = PSH; *++e = IMM; *++e = sizeof(int); *++e = MUL;  }
      else if (t < PTR) { printf("%d: pointer type expected\n", line); exit(-1); }

ufo · on Feb 18, 2020

This code is implementing pointer indexing: p[i]. The first line is reading the closing `]`. The second line is computing the pointer offset, which is equal to `i * sizeof(int)` when `p` points to anything wider than a char (`t > PTR`); for a char pointer the index is used unscaled. The third line is producing an error if `p` does not have a pointer type.

I think I agree with you that this part could be refactored a bit. I would be tempted to put the "PSH" corresponding to the "i" next to when we parse the "i". I would also write the check that "p" has a pointer type before the code that indexes it.

    else if (tk == Brak) {
      next(); *++e = PSH; expr(Assign); *++e = PSH;
      if (tk == ']') next(); else { printf("%d: close bracket expected\n", line); exit(-1); }
      if (t < PTR) { printf("%d: pointer type expected\n", line); exit(-1); }
      *++e = IMM; *++e = (t > PTR) ? sizeof(int) : sizeof(char); *++e = MUL; *++e = ADD;
      *++e = ((ty = t - PTR) == CHAR) ? LC : LI;
    }
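
To spell out the scaling rule on its own, here is a rough sketch. `index_offset` is a made-up helper, not something in the compiler. It assumes the type tags are ordered `CHAR < INT < PTR` (which the `t > PTR` and `t - PTR` arithmetic implies) and that every non-char element is `sizeof(int)` wide, as the quoted fragment does.

```c
#include <assert.h>
#include <stddef.h>

enum { CHAR, INT, PTR };  /* assumed ordering of the type tags */

/* Byte offset of element i for a pointer whose type tag is t.
   t == PTR is a char pointer; anything above it points to an int-sized element. */
static size_t index_offset(int t, size_t i) {
    assert(t >= PTR);  /* indexing a non-pointer is the "pointer type expected" error */
    size_t elem = (t > PTR) ? sizeof(int) : sizeof(char);
    return i * elem;
}
```

So `index_offset(PTR, 3)` is 3 bytes, while `index_offset(PTR + INT, 3)` is `3 * sizeof(int)` bytes.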

nitrogen · on Feb 19, 2020

This is why I mostly hate (but partly love for the sake of reducing bikeshedding) it when teams add an autoformatter as a mandatory part of a code pipeline -- it destroys relevant spatial information.

If we are going to force autoformatters, we might as well just use annotated ASTs instead of text so we all see our own chosen view of the code.

tptacek · on Feb 18, 2020

The number of lines isn't really the point; it's C in 4 Functions, not C In 500 Lines.

What's more impressive is that it's self-hosted and implements just the subset of C required to compile itself, which makes it harder to keep the code short, but it manages anyways.

mannykannot · on Feb 18, 2020

Splitting on non-line-end semicolons (other than in literals) gives you a functioning program of 942 lines, 62 of which are either blank or comments.
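
A rough sketch of how such a count could be made, in case anyone wants to reproduce it. `count_inline_semicolons` is a hypothetical helper, and it makes simplifying assumptions: it skips string and character literals but does not treat comments specially, so it is not guaranteed to reproduce the figures above.

```c
#include <assert.h>
#include <stddef.h>

/* Count semicolons that sit outside string/char literals and are followed
   by more non-whitespace text on the same line, i.e. the places where a
   one-statement-per-line layout would add a line break. */
static size_t count_inline_semicolons(const char *s) {
    size_t n = 0;
    char quote = 0;  /* 0 outside a literal, else the opening ' or " */
    for (; *s; s++) {
        if (quote) {
            if (*s == '\\' && s[1]) s++;       /* skip the escaped character */
            else if (*s == quote) quote = 0;   /* literal closed */
        } else if (*s == '"' || *s == '\'') {
            quote = *s;                        /* literal opened */
        } else if (*s == ';') {
            const char *p = s + 1;
            while (*p == ' ' || *p == '\t') p++;
            if (*p && *p != '\n' && *p != '\r') n++;  /* something follows on this line */
        }
    }
    return n;
}
```

The reformatted line count is then roughly the original line count plus this number.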