Introduction to Compilers and Language Design (2021) (nd.edu)
293 points by fcambus on May 18, 2022 | 68 comments


Oh wow! Great resource. I think it strikes the right balance between lexing/parsing vs code generation/optimization.

I see that not even in modern books do things like PEG parsers or Pratt parsing make the cut, which is a pity IMHO. As an aside, I have yet to find a great book/resource on garbage collectors that I can understand. It is such a fascinating subject!

If you want to dive into compilers/interpreters but do not want to go straight into the computer science of it, I wholeheartedly recommend:

https://compilerbook.com and https://interpreterbook.com

And, of course, the incomparable book by Robert Nystrom, "Crafting Interpreters":

https://craftinginterpreters.com/
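
For anyone who hasn't seen Pratt parsing (mentioned above), here is a minimal sketch in Python. It is my own illustration, not taken from any of these books: each infix operator gets a binding power, and the parser keeps consuming operators that bind more tightly than its caller's. The names `BINDING_POWER`, `PrattParser`, and `parse` are hypothetical, and the tokenizer only handles integers, `+ - * /`, and parentheses.

```python
import re

# Binding power per infix operator: a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
UNARY_POWER = 30  # unary minus binds tighter than any infix operator


def tokenize(text):
    """Split input into integer literals, operators, and parentheses."""
    return re.findall(r"\d+|[-+*/()]", text)


class PrattParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_expression(self, min_power=0):
        # Prefix position: a literal, a parenthesised group, or unary minus.
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            left = self.parse_expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif token == "-":
            left = ("neg", self.parse_expression(UNARY_POWER))
        elif token.isdigit():
            left = int(token)
        else:
            raise SyntaxError(f"unexpected token {token!r}")

        # Infix loop: consume operators that bind tighter than min_power.
        while True:
            op = self.peek()
            power = BINDING_POWER.get(op)
            if power is None or power <= min_power:
                break
            self.advance()
            # Passing the operator's own power makes it left-associative.
            right = self.parse_expression(power)
            left = (op, left, right)
        return left


def parse(text):
    parser = PrattParser(tokenize(text))
    tree = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

With this sketch, `parse("1 + 2 * 3")` yields the nested tuple `("+", 1, ("*", 2, 3))`, and `parse("8 - 3 - 2")` groups to the left as `("-", ("-", 8, 3), 2)`. Adding an operator is a one-line change to the table, which is a large part of the technique's appeal.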


If you like books that are light on theory, I might add:

Practical Compiler Construction (http://t3x.org/reload/index.html)

Write your own Compiler (http://t3x.org/t3x/book.html)

Both written by me.


Nice! Is there coverage of x64 assembly as well? If not, do you know of something equivalent that covers it? :-)


Thanks! Will definitely check them out too!


Great initiative. More professors should do this; write your own material suitable to teach a course, and make it freely available.

What I also like is that if you find errata, you have a place to send them, with a reasonable expectation that they will be picked up in a new version.


A shame that it uses C though, imo, instead of being language agnostic.


In the introduction (1.4. What language should I use?) - the author explains why he used C. Please read it.


I mean, isn't C ubiquitous enough that it is essentially language agnostic? Or is your suggestion that it should be purely pseudocode?


With C you need to pay attention to things like manual memory management, which aren't necessary in other languages or for understanding the material.


Aye but these lads are building a compiler. Probably shouldn't hide things like 'memory management' at the introduction stage as use of memory is inherent to all computing and programming.


Sure, that makes sense, but you can write a compiler in languages that don't require it themselves. That's what I meant.


> Sure, that makes sense, but you can write a compiler in languages that don't require it themselves.

I don't follow your rationale. With C the developer manages heap memory by basically calling malloc and free.

This means that once you have the memory model set, all you need to do to roll your compiler is to implement the interface.

A language that "don't require it themselves" is a language which provides only high level constructs and dumps all the memory management logic to the compiler/runtime, from object allocation/deallocation to lifetime management.

How is that simpler to pull off by a compiler writer?


The target language generally has nothing to do with the language of the compiler, where automatic memory management is just as useful (and also helps to minimize the amount of unrelated technical detail) as anywhere else.


Manual memory management is optional. IIRC some version of the D compiler was written assuming an infinite virtual address space (basically they just did not bother calling free() ever)!
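
The "never call free()" approach is essentially a bump (arena) allocator. Here is a rough sketch of the idea, modeled in Python over a fixed byte buffer. This is not the D compiler's actual code; the `Arena` class and its `alloc` method are hypothetical names, and a real compiler would do this over raw memory in its implementation language.

```python
class Arena:
    """Bump allocator over one fixed buffer: allocate only, never free."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)
        self.offset = 0  # index of the next unused byte

    def alloc(self, size, align=8):
        # Round the current offset up to the requested alignment.
        start = (self.offset + align - 1) // align * align
        end = start + size
        if end > len(self.buffer):
            raise MemoryError("arena exhausted")
        self.offset = end
        # A memoryview slice aliases the buffer instead of copying it.
        return memoryview(self.buffer)[start:end]


arena = Arena(64)
name = arena.alloc(5)
name[:] = b"hello"       # write through the returned slice
other = arena.alloc(3)   # starts at offset 8 because of alignment
print(arena.offset)      # 11
```

Allocation is just an addition and a bounds check, and everything is released at once when the process exits. The trade-off is that memory use only ever grows.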


There's a chapter in the book dedicated to memory layout of programs that discusses program segments, paging, implementation of malloc/free, etc that is more or less required before getting to the assembler and codegen stages.

Even if you can write a compiler assuming infinite memory that generates programs that also assume it, it would be a disservice to readers to ignore that for brevity. It's an important part of understanding how programs are created.


Is that why D is so prevalent among (some) quant shops?


Yes. Pseudocode is better for teaching than C. Plus, translating pseudocode into a language of choice is a good exercise.

Another good idea is to write a compiler in the target language.


Pseudocode can be easy to understand but you (also) need a real language to cement your understanding of the subject.


> you (also) need a real language to cement your understanding of the subject.

That's why I'd like to use a language that I already know instead of learning a new one while also learning new material.


Well, you can. If you understand the algorithm, you can write it in any language you know. Additionally, C is at a basic level just loops, conditionals and function calls, so it will be easy to translate from it if you need to. And then there are other books using other languages out there, and people can choose.


It is also manual memory management, lack of generics, of polymorphism, and what not else. Even C++ would be a better choice.


Basic C is not that hard. Probably one of the easiest languages to get into.


Sure, but it’s a hard language to write good code in.


> A shame that it uses C though, imo, instead of being language agnostic.

What did you have in mind?


Honestly, I think I'd prefer pseudocode that I can read to understand the idea and then work in my language of choice rather than C.


Pseudocode means ambiguity, which means misunderstandings. As soon as you remove all ambiguity, any pseudocode just becomes a language.


Good point, although I think it should be doable when describing input and output well enough.


That's been delegated to the diagrams and algebra -- and there's also the usual English pseudocode interspersed throughout the chapters (presumably more so in the theoretical chapters, e.g. chapter 4)


> Honestly, I think I'd prefer pseudocode that I can read to understand the idea and then work in my language of choice rather than C.

What stops you from understanding the idea by reading C? It's a tried-and-true language whose K&R version fits entirely in an easy to read ~180pg book which has real world applications, unlike pseudocode.


C code is usually rife with unrelated technical detail like manual memory management. It also lacks ways to build useful abstractions that exist in other languages.


Agreed! I would have used ML family languages. Of course, that implies learning ML before reading the book, whereas C is definitely known by more developers as it’s been in all of our courses.


Sure, then it would be the same thing just with ML (which I personally would prefer to C, but yeah). Pseudocode to explain the idea would be ideal, then you can work through the book in a familiar language.

C is fine, it's just not ideal for me to learn C which I have no use for and learn a new skill at the same time.


ML-like pseudocode would be much different from C-like or python-like pseudocode (e.g. heavy use of ADTs)
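
To illustrate what "heavy use of ADTs" looks like, here is a small sketch of an expression type with one class per variant and an evaluator with one branch per variant. It is written in Python with dataclasses rather than ML so that it runs as-is; the names `Num`, `Add`, `Mul`, and `evaluate` are made up for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The "sum type": an expression is exactly one of these variants.
Expr = Union[Num, Add, Mul]


def evaluate(expr: Expr) -> int:
    # One branch per variant, like the arms of an ML `match`.
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Mul):
        return evaluate(expr.left) * evaluate(expr.right)
    raise TypeError(f"not an expression: {expr!r}")


print(evaluate(Add(Num(1), Mul(Num(2), Num(3)))))  # 7
```

In C-like pseudocode the same tree would typically be a struct with a tag field and a `switch` over that tag, so the two styles do read quite differently.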


I think C is basically language agnostic, isn't it? Approximately everyone working at the level of a course on compilers knows C.


I don't know C and have no intention of learning it, so I'd prefer pseudocode that I can read without already knowing C or investing time into learning C just for this book.


I really struggle to see how someone can be ready to learn about compilers without already having enough context that they know C. If you know enough to be doing compilers, you can pick up C in 15 mins.


I don't know how this is difficult to imagine. I have no need for C and am interested in interpreters and compilers.


> have no need for C and am interested in interpreters and compilers

But you do have a need for it, because it’s the lingua franca of conversations about compilers and interpreters… as you’ve just found out with this course.

Like saying you want to work in the Vatican but you have no need for Latin.


> because it’s the lingua franca of conversations about compilers and interpreters

There are many materials, compilers, and interpreters that use OCaml and Haskell, for example. C isn't the only one. That's why I'd prefer pseudocode instead of C, so that I could use this book without learning a specific language only for it.


I really don’t understand what you’re planning to do the first time someone says to ‘check the header file’ to understand how some system call works.


For that, you don't need to know C the language, but C the calling convention https://gankra.github.io/blah/c-isnt-a-language/


This doesn't make any sense. Even if you knew the C calling convention, but not C, where are you going to get documentation for system calls that doesn't use C code? The Linux, macOS, etc. documentation uses C.


Reading the headers and prototypes in the man pages doesn't require "knowing" C. You just need to know a very simple subset of C's syntactical rules and the calling conventions. Anyway, reading C is far easier than writing C, since most popular languages use the same syntax and operators.


I don't know what that has to do with learning about compilers and interpreters.


> I don't know what that has to do with learning about compilers and interpreters.

Compilers and interpreters, unless they're running on bare metal, which is very niche, talk to the operating system using something called 'system calls' to ask for resources that they can't provide themselves, like virtual memory space and IO. These system calls are documented in C.

For example I wrote a Ruby compiler in Ruby, but I still need to know about the mmap system call, which is documented in C, to allocate my memory.

https://github.com/chrisseaton/rhizome/blob/main/doc/memory....

https://man7.org/linux/man-pages/man2/mmap.2.html

https://github.com/chrisseaton/rhizome/blob/main/lib/rhizome...
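
To make that concrete: the man page gives the C prototype `void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);`, and higher-level languages wrap that same call. Here is a minimal sketch in Python (not the Rhizome code) using the standard `mmap` module to request anonymous memory. `allocate_pages` is a hypothetical helper name.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE  # page size reported by the operating system


def allocate_pages(count):
    # A fileno of -1 requests an anonymous mapping (no backing file),
    # the same idea as passing MAP_ANONYMOUS to mmap(2) from C.
    return mmap.mmap(-1, count * PAGE_SIZE)


region = allocate_pages(2)
region[0:5] = b"hello"   # write through the mapping
print(region[0:5])       # b'hello'
region.close()           # unmap the memory again
```

Even here, what the length and the anonymous mapping actually mean is only spelled out in the C-based man page, which is the point being made.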


Now I can finally reply: that makes sense, and in hindsight is quite obvious, thanks. My general lack of knowledge in this area is a part of why I'd like to study it more, but preferably I'd like to use a language I'm already comfortable with. C is just very far removed from everything I have use for and am interested in.


> These system calls are documented in C.

Sure, but the compiler will still have to generate machine code for those.


Well, the author shouldn't have to potentially learn another language just to write a book on compilers and interpreters.

Maybe... MAYBE you don't have to read this book.


> Maybe... MAYBE you don't have to read this book.

Where did I say I have to read it or that the author should learn another language? Nowhere. I said I wish it wasn't using C so that I could read it more easily.


You will have to learn whatever it is you do not like about C anyway if you want to generate machine code.


Out of curiosity, could anyone explain to me the reason they think programming languages and the parts that combine into making them work are interesting?

I feel like I have an interest in it, but I'm having a hard time figuring out _why_ I find it so appealing. I know the why doesn't matter as long as I enjoy it, but I'm curious what others think.


There's something inherently satisfying anytime you take simple constructs and fashion them together to model complex systems cleanly.

From chaos, order emerges, and a well-designed language is the medium through which you draw out that order (of course, libraries, frameworks, DSLs are also a part of that story, but the language is the "base", and thus the most impactful in doing so). Languages also have the most potential for a small simplification to produce massive results, as it cascades through the other semantics and into the libraries and ecosystem. Of course, language changes also have the most potential to fuck everything up, but that's why one should always strive to avoid putting it into production, or really ever just using it period, if you want to enjoy the making of the thing.


Part of it may be the high degree of leverage. A compiler potentially reduces the effort required to get a computer to complete a desired task by several orders of magnitude.

It's also kind of exciting to bootstrap a machine as well. For example going from bare metal to a C compiler to a Linux kernel to being able to browse the web.


I like to know "why" things work even when I can mostly get away with knowing "how" they work.


Personally I find it kind of exciting to be able to design and build a microprocessor and a compiler for it, magically turning logic gates and silicon (or your preferred implementation tech) into a usable system.

It's also nice to be able to understand a system from the device physics level up to the user interface (and maybe beyond into networked/distributed as well as sociotechnical systems.)


For me, part of it is a simple fondness for understanding things and a separate but closely related joy in making things that "go", things that act "on their own" as it were. I always kind of assumed it was an innate propensity, a kind of natural monkey curiosity. I was always taking things apart as a little kid. They say I disassembled the clothes dryer one time, but I don't remember it.

I like it when something goes from being mysterious magic to a familiar tool. (Like compilers.) And it's even more fun when you can use your tools and knowledge to create some new useful or beautiful (or both) thing with them.


Maybe it’s the illusion of power that comes from rethinking fundamentals? (In practice, it’s often a path to obscurity.)


I’ll be bookmarking this for use in the future. I teach a programming languages course so I’ve looked at a number of these texts, and this seems like a good new one but I have to say I don’t see much that differentiates it from other recent texts out there. It seems well written and organized, but what’s new?

I would say the best part about this book is the author made it freely available. But if I had to choose a newish compilers book I’d choose Crafting Interpreters, which is also available for free.

One thing that I don’t like so much is the word “design” in the title, as there’s really not much content in the book on how to design a language; most of it is devoted to implementing an already designed language. I’m not sure anyone who learns from this book would be able to design a language unlike C.


I just read through a couple of the chapters, imho the sections on codegen/assembly/memory layout are very useful. CI doesn't get into that, and it's necessary for modern language implementations from scratch (it seems everything is a JIT these days)


when clicking the link, my first thought was "introduction to web-design" ^_^ looks interesting though. Thanks.


You know how if you walked into a mechanic's shop and it looked like an Apple store, something would be amiss? If the mechanic were any good, the shop should be covered in grease, adorned with the sort of decor that you wouldn't care about getting covered in grease, and staffed by the sort of people that aren't averse to getting covered in grease. It should be awash in the unintentional indicators of a working class establishment, very much the opposite of a conventionally "well designed" store. A mechanic's shop that looks like an Apple store suggests an establishment that is trying to sell you on style rather than substance. The indicator of quality for a mechanic's shop is an aesthetic that is the opposite of quality.

Same thing applies to certain types of website. A CS professor and textbook author that has the time / interest to make their website "well designed" isn't covered in grease. The 90s DIY HTML adds to the credibility.


The front ends of all the mechanic shops I have been in for the past decade look just like the waiting room of a doctor's office.

In reality, mechanics do bookkeeping, track inventory, take orders, organize Excel files, and ask customers to sign contracts just like any other business. Thus their front ends look just like any other.

> A CS professor and textbook author that has the time / interest to make their website "well designed" isn't covered in grease.

They could just drop it in a well designed template.

Also, the idea that aesthetics and engineering cannot be conjoined is disproved by both Ferraris and 3D graphics programming.


I can't decide if this says more about the doctors you visit or the mechanics. There are also CVS minute-clinic doctors near me, where the "front-end waiting room" is the CVS store. There similarly are mechanics near me where the "front-end waiting room" is the gas station it is connected to.


Probably regional differences. I live out in a rural township / suburb. If nothing else, we don’t lack for land to have waiting rooms.

The CVS had a waiting room too until Covid.


If your mechanic's shop looks like a doctor's office, they’re more than likely ripping you off. The best mechanics are smaller shops that don’t have time to make their waiting room look that nice.


From talking with the above poster, I think the problem is that my doctor's room doesn’t look fancy, not that my mechanic's office looks too nice.


I couldn't disagree more. Everything about this site is perfect.

Sorry it doesn't use enough frameworks or whiz-bang scroll-hijacking flyovers or email-harvesting popups, but it conveys everything I want to know quickly and efficiently and distraction-free. In recent memory this is the fastest I've found a "download PDF" button. By far.


>In recent memory this is the fastest I've found a "download PDF" button. By far.

That leapt out at me too. Pretty sad it's so rare that it stood out so starkly to both of us.



