Oh wow! Great resource. I think it strikes the right balance between lexing/parsing vs code generation/optimization.
Great initiative. More professors should do this: write their own material suitable for teaching a course, and make it freely available.
What I also like is that if you find errata, you have a place to send them, with a reasonable expectation that they will be picked up in a new version.
Aye but these lads are building a compiler. Probably shouldn't hide things like 'memory management' at the introduction stage as use of memory is inherent to all computing and programming.
> Sure, that makes sense, but you can write a compiler in languages that don't require it themselves.
I don't follow your rationale. With C the developer manages heap memory by basically calling malloc and free.
This means that once you have the memory model set, all you need to do to roll your compiler is to implement the interface.
A language that "don't require it themselves" is a language which provides only high level constructs and dumps all the memory management logic to the compiler/runtime, from object allocation/deallocation to lifetime management.
How is that simpler to pull off by a compiler writer?
The target language generally has nothing to do with the language of the compiler, where automatic memory management is just as useful (and also helps to minimize the amount of unrelated technical detail) as anywhere else.
Manual memory management is optional. IIRC some version of the D compiler was written assuming an infinite virtual address space (basically they just did not bother calling free() ever)!
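To make the "never call free()" strategy concrete, here is a minimal sketch of a bump allocator, assuming a fixed-size buffer stands in for the address space. `BumpArena` and its methods are hypothetical names for illustration and are not taken from the D compiler; Python is used only to keep the idea readable.

```python
class BumpArena:
    """Bump allocator: hands out offsets into one buffer and never frees."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)  # stand-in for the process heap
        self.top = 0                    # next free offset

    def alloc(self, size, align=8):
        # Round the current top up to the alignment (must be a power of two).
        start = (self.top + align - 1) & ~(align - 1)
        if start + size > len(self.buf):
            raise MemoryError("arena exhausted")
        self.top = start + size
        return start  # the caller treats this offset like a pointer

    # Deliberately no free(): everything is reclaimed when the process exits.


arena = BumpArena(64)
a = arena.alloc(5)     # first allocation starts at offset 0
b = arena.alloc(16)    # next one is rounded up to the 8-byte boundary
arena.buf[a:a + 5] = b"hello"
```

The appeal for a short-lived batch program such as a compiler is that allocation becomes a single addition, and the process usually exits before the wasted space matters.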
There's a chapter in the book dedicated to the memory layout of programs that discusses program segments, paging, implementation of malloc/free, etc., which is more or less required before getting to the assembler and codegen stages.
Even if you can write a compiler assuming infinite memory that generates programs that also assume it, it would be a disservice to readers to ignore that for brevity. It's an important part of understanding how programs are created.
Well you can. If you understand the algorithm you can write it in any language you know. Additionally, C is at a basic level just loops, conditionals, and function calls, so it will be easy to translate from it if you need to. And then there are other books using other languages out there and people can choose.
That's been delegated to the diagrams and algebra -- and there's also the usual English pseudocode interspersed throughout the chapters (presumably more so in the theoretical chapters, e.g. chapter 4).
> Honestly, I think I'd prefer pseudocode that I can read to understand the idea and then work in my language of choice rather than C.
What stops you from understanding the idea by reading C? It's a tried-and-true language with real-world applications, unlike pseudocode, and its K&R version fits entirely in an easy-to-read ~180pg book.
C code is usually rife with unrelated technical detail like manual memory management. It also lacks ways to build useful abstractions that exist in other languages.
Agreed! I would have used ML-family languages. Of course, that implies learning ML before reading the book, whereas C is definitely known by more developers as it’s been in all of our courses.
Sure, then it would be the same thing just with ML (which I personally would prefer to C, but yeah). Pseudocode to explain the idea would be ideal, then you can work through the book in a familiar language.
C is fine, it's just not ideal for me to learn C which I have no use for and learn a new skill at the same time.
I don't know C and have no intention of learning it, so I'd prefer pseudocode that I can read without already knowing C or investing time into learning C just for this book.
I really struggle to see how someone can be ready to learn about compilers without already having enough context to know C. If you know enough to be doing compilers, you can pick up C in 15 mins.
> have no need for C and am interested in interpreters and compilers
But you do have a need for it, because it’s the lingua franca of conversations about compilers and interpreters… as you’ve just found out with this course.
Like saying you want to work in the Vatican but you have no need for Latin.
> because it’s the lingua franca of conversations about compilers and interpreters
There are many materials, compilers, and interpreters that use OCaml and Haskell, for example. C isn't the only one. That's why I'd prefer pseudocode instead of C, so that I could use this book without learning a specific language only for it.
This doesn't make any sense. Even if you knew the C calling convention, but not C, where are you going to get documentation for system calls that doesn't use C code? The Linux, macOS, etc. documentation uses C.
Reading the headers and prototypes in the man pages doesn't require "knowing" C. You just need to know a very simple subset of C's syntactical rules and the calling conventions. Anyway, reading C is far easier than writing C; most popular languages use the same syntax and operators.
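As an illustration of using a man-page prototype without writing any C, here is a sketch that calls `getpid` from Python through the standard `ctypes` module. It assumes a POSIX system where the C library can be loaded. The prototype in the man page is `pid_t getpid(void);`, and treating `pid_t` as a C `int` is an assumption that holds on common Linux and macOS targets.

```python
import ctypes
import ctypes.util
import os

# Locate the C library. If find_library returns None, CDLL(None) exposes
# the symbols already loaded into the current process on POSIX systems.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Man page prototype:  pid_t getpid(void);
libc.getpid.argtypes = []            # (void): takes no parameters
libc.getpid.restype = ctypes.c_int   # assumption: pid_t is an int here

pid = libc.getpid()
print(pid == os.getpid())
```

The only C knowledge used is how to read the return type and the parameter list from the prototype.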
> I don't know what that has to do with learning about compilers and interpreters.
Compilers and interpreters, unless they're running on bare metal, which is very niche, talk to the operating system using something called 'system calls' to ask for resources that they can't provide themselves, like virtual memory space and IO. These system calls are documented in C.
For example I wrote a Ruby compiler in Ruby, but I still need to know about the mmap system call, which is documented in C, to allocate my memory.
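For a concrete picture, the prototype in the Linux man page is `void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);`. The sketch below requests an anonymous mapping through Python's standard `mmap` module, which on Unix is a wrapper around that call. It is an illustration in Python, not the Ruby compiler mentioned above, and the one-page size is an arbitrary choice.

```python
import mmap

# Ask the OS for one page of anonymous memory (fd -1 means no backing file).
PAGE = mmap.PAGESIZE
region = mmap.mmap(-1, PAGE)

# The mapping behaves like a mutable byte buffer.
region[0:4] = (42).to_bytes(4, "little")
value = int.from_bytes(region[0:4], "little")

region.close()  # unmap when done
```

Even here the parameters only make sense after reading the C-flavoured documentation for the underlying call.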
Now I can finally reply: that makes sense, and in hindsight is quite obvious, thanks. My general lack of knowledge in this area is a part of why I'd like to study it more, but preferably I'd like to use a language I'm already comfortable with. C is just very far removed from everything I have use for and am interested in.
> Maybe... MAYBE you don't have to read this book.
Where did I say I have to read it or that the author should learn another language? Nowhere. I said I wish it wasn't using C so that I could read it more easily.
Out of curiosity, could anyone explain to me the reason they think programming languages and the parts that combine into making them work are interesting?
I feel like I have an interest in it, but I'm having a hard time figuring out _why_ I find it so appealing.
I know the why doesn't matter as long as I enjoy it, but I'm curious what others think.
There's something inherently satisfying anytime you take simple constructs and fashion them together to model complex systems cleanly.
From chaos, order emerges, and a well-designed language is the medium through which you draw out that order (of course, libraries, frameworks, and DSLs are also a part of that story, but the language is the "base", and thus the most impactful in doing so). Languages also have the most potential for a small simplification to produce massive results, as it cascades through the other semantics and into the libraries and ecosystem. Of course, language changes also have the most potential to fuck everything up, but that's why one should always strive to avoid putting it into production, or really ever just using it period, if you want to enjoy the making of the thing.
Part of it may be the high degree of leverage. A compiler potentially reduces the effort required to get a computer to complete a desired task by several orders of magnitude.
It's also kind of exciting to bootstrap a machine as well. For example going from bare metal to a C compiler to a Linux kernel to being able to browse the web.
Personally I find it kind of exciting to be able to design and build a microprocessor and a compiler for it, magically turning logic gates and silicon (or your preferred implementation tech) into a usable system.
It's also nice to be able to understand a system from the device physics level up to the user interface (and maybe beyond into networked/distributed as well as sociotechnical systems.)
For me, part of it is a simple fondness for understanding things and a separate but closely related joy in making things that "go", things that act "on their own" as it were. I always kind of assumed it was an innate propensity, a kind of natural monkey curiosity. I was always taking things apart as a little kid. They say I disassembled the clothes dryer one time, but I don't remember it.
I like it when something goes from being mysterious magic to a familiar tool. (Like compilers.) And it's even more fun when you can use your tools and knowledge to create some new useful or beautiful (or both) thing with them.
I’ll be bookmarking this for use in the future. I teach a programming languages course so I’ve looked at a number of these texts, and this seems like a good new one but I have to say I don’t see much that differentiates it from other recent texts out there. It seems well written and organized, but what’s new?
I would say the best part about this book is the author made it freely available. But if I had to choose a newish compilers book I’d choose Crafting Interpreters, which is also available for free.
One thing that I don’t like so much is the word “design” in the title, as there’s really not much content in the book on how to design a language; most of it is devoted to implementing an already designed language. I’m not sure anyone who learns from this book would be able to design a language unlike C.
I just read through a couple of the chapters; IMHO the sections on codegen/assembly/memory layout are very useful. CI doesn't get into that, and it's necessary for modern language implementations from scratch (it seems everything is a JIT these days).
You know how, if you walked into a mechanic's shop and it looked like an Apple store, something would be amiss? If the mechanic were any good, the shop should be covered in grease, adorned with the sort of decor that you wouldn't care about getting covered in grease, and staffed by the sort of people that aren't averse to getting covered in grease. It should be awash in the unintentional indicators of a working-class establishment, very much the opposite of a conventionally "well designed" store. A mechanic's shop that looks like an Apple store suggests the establishment is trying to sell you on style rather than substance. The indicator of quality for a mechanic's shop is an aesthetic that is the opposite of quality.
Same thing applies to certain types of website. A CS professor and textbook author that has the time / interest to make their website "well designed" isn't covered in grease. The 90s DIY HTML adds to the credibility.
The front ends of all the mechanic shops I have been in over the past decade look just like the waiting room of a doctor's office.
In reality, mechanics do bookkeeping, track inventory, take orders, organize Excel files, and ask customers to sign contracts just like any other business. Thus their front ends look just like any other.
> A CS professor and textbook author that has the time / interest to make their website "well designed" isn't covered in grease.
They could just drop it in a well designed template.
Also, the idea that aesthetics and engineering cannot be conjoined is disproved by both Ferraris and 3D graphics programming.
I can't decide if this says more about the doctors you visit or the mechanics. There are also CVS minute-clinic doctors near me, where the "front-end waiting room" is the CVS store. There similarly are mechanics near me where the "front-end waiting room" is the gas station it is connected to.
If your mechanic's shop looks like a doctor's office they’re more than likely ripping you off. The best mechanics are smaller shops that don’t have time to make their waiting room look that nice.
I couldn't disagree more. Everything about this site is perfect.
Sorry it doesn't use enough frameworks or whiz-bang scroll-hijacking flyovers or email-harvesting popups, but it conveys everything I want to know quickly and efficiently and distraction-free. In recent memory this is the fastest I've found a "download PDF" button. By far.
I see that not even in modern books do things like PEG parsers or Pratt parsing make the cut. Which is a pity IMHO. As an aside, I have yet to find a great book/resource on garbage collectors that I can understand. It is such a fascinating subject!
If you want to dive into compiler/interpreters but do not want to go straight into the computer science of it I wholeheartedly recommend:
https://compilerbook.com and https://interpreterbook.com
And, of course, the incomparable book by Robert Nystrom, "Crafting Interpreters":
https://craftinginterpreters.com/
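Since Pratt parsing comes up above, here is a minimal sketch of the idea for arithmetic expressions: each operator gets a binding power, and the parser keeps absorbing operators while the next one binds tighter than the current minimum. The token format, the `BINDING` table, and the function names are all assumptions made for illustration, not taken from any of the books linked.

```python
import re

# Binding powers: higher binds tighter. Hypothetical table for this sketch.
BINDING = {"+": 10, "-": 10, "*": 20, "/": 20}


def tokenize(src):
    return re.findall(r"\d+|[-+*/()]", src)


def parse(tokens, min_bp=0):
    """Parse tokens (consumed from the front) into a nested tuple AST."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # discard the closing ")"
    elif tok == "-":
        # Prefix minus binds tighter than any infix operator in the table.
        left = ("neg", parse(tokens, 30))
    else:
        left = int(tok)

    # Keep absorbing operators that bind tighter than the caller's minimum.
    while tokens and tokens[0] in BINDING and BINDING[tokens[0]] > min_bp:
        op = tokens.pop(0)
        # Passing the operator's own power makes it left-associative.
        right = parse(tokens, BINDING[op])
        left = (op, left, right)
    return left


ast = parse(tokenize("1 + 2 * 3 - 4"))
```

The whole precedence and associativity scheme lives in one table and one loop, which is why the technique is popular for hand-written expression parsers. This sketch does no error handling for malformed input.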