I'm a big fan of incremental builds - no point in recompiling 1M LOC when only a few source files have changed. Image-based systems (like Common Lisp) are particularly nice because you can just recompile the functions that have changed and have the changes instantly take effect on a running image. Unfortunately, ASDF (the most common build/packaging system) doesn't have particularly good support for incremental builds yet, but it's an area of active development (particularly with XCVB and other new build systems).
But don't GCC, Java, and Visual Studio all support parallel builds pretty easily? I've never looked into it, but I know I've read articles about it.
What sort of language/env. does FogCreek use? I vaguely remember hearing that they had written their own compiler in Java that transformed ASP to PHP or some other such ludicrous thing.
Edit: Some googling showed that they decided their software had to run on Windows and Unix boxes, and that the best way to do that was to support ASP (pre .NET) on Windows and PHP on *nix. They already had a large ASP code base, so they wrote a compiler (named Thistle) that could translate a subset of ASP into PHP. Then, it seems like MS deprecated ASP (and they started to realize how much it sucked) so they wrote their own language (Wasabi) that is based on VBScript and can be compiled into ASP, PHP, or JS. My God. I would gouge my eyes out if I had to work there. From the fragments of code I've seen, it frankly seems horrible. See http://www.fogcreek.com/FogBugz/blog/category/Wasabi.aspx
Although, I guess I can't really judge. I recently wrote some ridiculously convoluted Java code that can automatically and transparently trick the JVM into making certain recursive functions tail-call optimized (i.e., use constant stack space, regardless of how deep the recursion is) by transforming the function to throw and catch exceptions that manually unwind the stack when I want to. Ah, I miss Common Lisp - I never knew how good I had it.
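Stripped of the automatic-transformation machinery, the core of the trick looks roughly like this (a minimal hand-written sketch of the idea, not my actual code - the names are made up):

    // Sketch of exception-based "tail calls": instead of recursing, the tail
    // position throws an exception carrying the next arguments; a driver loop
    // catches it and re-enters, so the Java stack stays at a constant depth
    // no matter how deep the logical recursion goes.
    class TailCall extends RuntimeException {
        final long n, acc;
        TailCall(long n, long acc) {
            super(null, null, false, false); // no stack trace -> cheap to throw
            this.n = n;
            this.acc = acc;
        }
    }

    public class Factorial {
        // The "recursive" step: in tail position we throw instead of calling.
        private static long step(long n, long acc) {
            if (n <= 1) return acc;
            throw new TailCall(n - 1, acc * n);
        }

        // The driver: catch the unwound "call" and loop.
        public static long factorial(long n) {
            long curN = n, curAcc = 1;
            while (true) {
                try {
                    return step(curN, curAcc);
                } catch (TailCall t) {
                    curN = t.n;
                    curAcc = t.acc;
                }
            }
        }

        public static void main(String[] args) {
            System.out.println(factorial(20)); // 2432902008176640000
        }
    }

The throw in tail position discards step's frame and the driver loop re-enters it, which is why the stack depth stays constant; disabling stack trace capture in the constructor keeps the throws from being absurdly expensive.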
Seriously? You would gouge your eyes out? Most good hackers that I know actually prefer to work on compilers. We like the fact that we can change the compiler and control our destiny... we've added code generation, active records, lambdas, picture functions, and a lot of other nice stuff to our code without having to rewrite or port any working code.
Does your language let you write one function and have it run on the browser or the server? When you want to change something about your language, can you?
People seem to be pretty quick to criticize the decisions we made without really having all the facts, based on a couple of bits and pieces of obsolete info they gathered from the internet.
I do respect that you had a huge (and effective) code base in ASP that you (for very valid business reasons) didn't want to rewrite from scratch. I understand that each of the decisions you have made has probably made you successful and profitable - and in your shoes I probably would have done something similar. From your point of view, I completely understand that your choices were to do a complete rewrite (which was obviously unacceptable) or to come up with these sorts of hackish workarounds.
However, I think you have to understand that as a hypothetical programmer thinking about working at Fog Creek, my choice is to work on your large legacy code base in (what appears to me to be) a not very pleasant language - or to work on fun little 3-month AI projects in whatever language I choose (which is roughly what I do now). The fact that your compiler infers type information from a variable's name is an ugly hack - and frankly it scares me (not only in itself, but also because of what it suggests about other design decisions you've made). The hoops you had to jump through with Wasabi to do metaprogramming would annoy me - I'd rather focus on the fun stuff.
I recognize that, as a business, you made good decisions. But do you understand why I, as a programmer, would not want to work with those sort of technologies?
Oh, and for what it's worth, my language has all the features you've mentioned. For example, we used Parenscript (http://common-lisp.net/project/parenscript/) at my last job - it's a subset of Common Lisp that can be compiled to JavaScript and runs on either the client or the server.
Also - sorry about the 'gouge my eyes out' bit. I meant it as a bit of funny hyperbole, but I guess I failed to convey that properly (humor is hard on the internet). Fog Creek, in my mind, would definitely be above the average Java or .NET shop, but probably below a few fun research labs, prototyping contractors, or small startups.
It doesn't infer type information from variable names. (An old version had to do that in one particular case to work around a VBScript ambiguity). It does type inference like Haskell, from the actual types that are assigned to things, so DIM a=9 means a must be an integer.
Again, you're assuming a lot about Fog Creek without a lot of information. The dev team here is extremely happy to be working at Fog Creek. In 8 years only one developer ever left, because he had to move away. Your theory that Fog Creek is "not a very fun environment" seems utterly at odds with what everybody else tells us, which is that they would give their left ear to work here.
By "not a very fun environment" I was referring to the Wasabi programming environment, not the people/social/office environment. Sorry for the ambiguity. I've read most of your essays, and the FogCreek environment seems amazing - you've clearly done a wonderful job with that.
And you're right - I am inferring a lot from not very much information. But, imagine that I'm a hypothetical programmer that was considering working at Fog Creek (I'm probably not up to your standards, but bear with me ;-)). You publicly stated that your compiler infers type information from variable names. That, in conjunction with some other worrisome things I've read about Wasabi, might be enough to push me some other way.
Most of the other places I'd consider working have decent enough benefits (salary, offices, food, etc) that they really aren't a deciding factor. When I look for a job, my primary concern is how interesting/fun the problems/technologies are. As long as all the other stuff is decent, I kind of disregard it.
var in C# is only allowed for local variables, which means you can easily infer the type by analysing the assignment expression, which will always depend on known types. For example, method parameters are always explicitly typed.
Haskell-style type inference is trickier, since it also tries to infer the types of function signatures from the different contexts in which the function is called. This can lead to circular type dependencies.
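To illustrate the difference (a made-up snippet; shown in Java, whose local var - a later addition to that language - behaves just like C#'s here):

    // Hypothetical illustration of local-variable inference.
    import java.util.ArrayList;

    class InferenceDemo {
        public static void main(String[] args) {
            // The easy case: the type comes straight from the initializer
            // expression, which is always already known.
            var n = 9;                            // n is an int
            var names = new ArrayList<String>();  // names is an ArrayList<String>
            names.add("n = " + n);
            System.out.println(names);            // [n = 9]
        }
        // The harder, Haskell-style problem is inferring a whole signature from
        // a function's body and call sites (e.g. deducing the parameter and
        // return types of   f x = x + 1   ), which is where the circular
        // dependencies mentioned above come from.
    }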
"Easily"? I suspect that since C#3 offers lambda expressions that you'd need to use something close to (although probably different from) full H-M inference... in fact, it seems like it may need to be nearly as complicated as Scala's type inference, which is saying something. =)
If you define your dependencies correctly, make can do parallel builds (using the -j option). Of course you will have to use Makefiles in that case, which a lot of people don't like.
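A toy example of what that looks like (hypothetical file names) - with the dependencies spelled out per object file, make -j4 is free to compile foo.o and bar.o at the same time:

    # Toy Makefile (hypothetical file names). With per-object dependencies
    # spelled out, "make -j4" can build foo.o and bar.o in parallel.
    # (Recipe lines must start with a tab.)
    CC     = gcc
    CFLAGS = -O2 -Wall

    app: foo.o bar.o
    	$(CC) -o app foo.o bar.o

    foo.o: foo.c common.h
    	$(CC) $(CFLAGS) -c foo.c

    bar.o: bar.c common.h
    	$(CC) $(CFLAGS) -c bar.c

make walks the dependency graph and runs independent recipes concurrently, up to the -j limit.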
Maybe part of that is the pain of dealing with configure / autoconf / autotools?
I've seen a lot of C/C++ programmers write their own build utilities/scripts and lose proper deps in the process.
I found CMake a reasonable alternative. SCons apparently is too, although I hesitate because it seems to encourage writing your own Python build plugins, which can be too tempting for people who like to tinker.
agreed, make -j works a treat, and presumably will gain more as cores multiply.
It's kind of nice to work in a scripting environment and avoid some of these problems [via sensible defaults].
I remember a chapter (or half a chapter) in a compiler construction book or article that was effectively titled "Why spend time on compiler optimization" - a delightfully ambiguous phrase. The point was that doing so sped up the compiler, which saved time for a large number of users (this was back in the days of compilers as commercial products). Clearly the discussion was aimed at self-compiling compilers.
But the lesson here would be that some effort spent on compile speed could positively impact developer productivity, or at least negatively affect swordfight time. And yes, I think that very short compile times, like you see in CL (SBCL as an example) or Smalltalk, can have an avalanche effect on development. And yes, I do believe that small incremental changes are the best way to do software development. And yes, I would expect to recompile a function in a running image serving web pages as a routine part of development.
Ah, the cost of premature optimization: a couple thousand bucks, and a couple days of monkeying around. Better to first figure out where the bottleneck is (and maybe listen to your developer, since he thinks parallelization will help.)
Listen to the last Stack Overflow podcast. Every single month that they ship FogBugz earlier is an extra $200k in revenue share for the developers.
The developer may know how to solve the problem in code. But, Joel is the CEO. He has a better idea of how this affects the Balance Sheet and Income Statement. It's kinda, sorta his job.
Except that (a) he spent a couple days of his time being wrong about it, and (b) he then wrote a blog entry about how he was wrong about it.
Imagine if instead he'd said to the guy "okay, spend three hours profiling the build process and get me some suggestions with time estimates", and they'd found some likely prospects, and three days from now he gets to post about how they'll be shipping the next release a month earlier because of the improvements they made to the build process.
No matter how much you speed up a 30 second build, you're not going to save a month on the release process. Be realistic, now.
SSDs provide so many other performance benefits--even just launching apps--that they're going to make our developers a lot happier anyway. And my time is far less valuable than a developer who is in the critical path to shipping.
A 15 second build over a 30 second build saves way more time than 15 seconds a build. It means that I'm more likely to hit compile after a smaller change before moving on to the next thing. It means I can iterate more rapidly on my approach to fixing a bug, keeping the issue hotter in my mind. It means I have half the time to get distracted by something shiny. Any time you can make the build perceptibly faster, you win big.
For the Arora project, one of my default git hooks is to build the project, tools, manualtests, and matching autotests (and run them for regressions) before each commit. This slowly increased to the point where it was taking a minute or so (over the course of just a few weeks). Taking a few hours, I cleaned it up and got it back down to a few seconds at most, and on average less than a second. I did it the more correct way: some quick profiling to see where the time was spent, then fixing that (mostly object files that weren't being reused across different projects when they could have been). Making sure that I never break the build on any commit really pays off. There have been fewer than a dozen build breakages in the entire Arora commit history thanks to this git hook, and those I believe were either breakages introduced on OS X, where a.cpp == A.cpp (breaking the build on win/linux), or cases where we broke the build against older versions of Qt, so it built against 4.5 but not 4.4. When you have a quick build time, things like build hooks become very practical and useful.
Playing the devil's advocate, I would say that you can also look at it the other way and say that a developer might not build as often but instead make sure his/her code is right before building, thus being more careful about what s/he's writing. (just for the sake of argument ;))
Right, because I was advocating typing line noise until it makes it past the compiler.
I can't count the number of times I used to make a tiny change that I didn't think was worth running the build for, only to have it be the first thing to pop up as wrong next time I compile (now I always run a build, because we made it super fast).
And iterating over a bug, that can involve a lot of little changes, it can involve writing a ton of unit tests trying to duplicate the problem, it can involve subtle interactions that all seem right until you figure out what the issue is. Yes, you have to think, but sometimes you just need to churn through it, too, and frankly it's ridiculous to suggest that having a faster build process wouldn't help this.
And I, personally, get distracted pretty easily. This comment courtesy the 5 minute test suite I'm plowing through in the background right now.
...Now let's move them into separate offices with walls and doors. Now when Mutt can't remember the name of that function, he could look it up, which still takes 30 seconds, or he could ask Jeff, which now takes 45 seconds and involves standing up (not an easy task given the average physical fitness of programmers!). So he looks it up. So now Mutt loses 30 seconds of productivity, but we save 15 minutes for Jeff. Ahhh!... It is from one of your writings...
Listen to the Stack Overflow podcast. The developer estimated that it would take "a few weeks". A day and a half of Joel's time seems like a fair trade to do this experiment.
Listen to the podcast. The conversation starts at about 16:00 mark if that helps. I am seeing a ton of comments that don't understand the context of the article.
But the developer wasn't going to take this brain-dead approach; he was going to do something that would actually help. Joel spent time and got nothing; the developer wanted to spend time doing something that would probably speed up the build.
Except that he didn't "get nothing" out of the exercise. As he expected it would, the new drive did provide significant speedups for a variety of tasks and will undoubtedly improve his productivity enough to have been worth the time and money he spent on it. As it happens, it didn't improve the build speed specifically, but so what?
What language is your compiler written in? Most languages have an off-the-shelf implementation of map / map-reduce or similar these days, which would seem well-suited to compiler tasks and greatly simplifies parallelizing jobs. We use map and map-reduce patterns heavily in C++, which is quite nice: if you've got that relatively small core of code thread-safe, you can mostly ignore thread safety in the major portions.
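Roughly the shape of it (a made-up sketch, in Java rather than our C++ just for brevity; compileOne and the file names are hypothetical):

    import java.util.List;

    // Hypothetical sketch of the map / map-reduce pattern described above,
    // applied to per-file compilation: compileOne() is a pure function of its
    // input, so only the stream plumbing has to be thread-safe and the rest of
    // the code can ignore threading entirely.
    public class ParallelCompile {

        // Stand-in for the real per-file compile step (assumed side-effect free).
        static boolean compileOne(String file) {
            return !file.isEmpty();   // pretend empty names fail to compile
        }

        public static void main(String[] args) {
            List<String> sources = List.of("a.src", "b.src", "c.src", "");

            // Map: compile each file in parallel. Reduce: count the successes.
            long ok = sources.parallelStream()
                             .filter(ParallelCompile::compileOne)
                             .count();

            System.out.println(ok + " of " + sources.size() + " files compiled");
        }
    }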
You also can often speed up I/O that way since you can saturate both your I/O and CPU bandwidth, rather than alternating between waiting on them.
I also take it that you're not compiling to individual object files, since usual compile parallelization is based on farming out single object file compilations to multiple machines or processes.
I wrote some small portions of "icecream" which is used for compilation parallelization by the guys at SUSE and there it would actually transmit over the network a full chroot environment so that even different Linux installs working on compatible architectures (later cross-compiler support was added as well) could execute the compiler jobs of rather different OSes.
I can say my build times went down considerably. However I moved to a completely new machine.
I didn't plan on writing an article, so I don't have any official benchmarks. I'm guessing they dropped from over a minute and a half for a full recompile to under 30 seconds. Incremental builds used to take about 20 seconds and now take between 2 and 10 seconds.
I went from an Athlon X64 at 2.2GHz with 2GB of RAM, a 74GB Raptor, and another 74GB Raptor for code, to a Core i7 Extreme at 4GHz with 12GB of RAM, 2x80GB Intel SSDs in RAID 0, and one 80GB Intel SSD for code.
A couple of important things to keep in mind. Intel is still the way to go. The other SSDs can actually be slower in write performance than a normal hard drive if the usage pattern is small files. That is pretty much what you get when you compile.
One other point is that crappy anti-virus software can affect IO performance: anywhere from 12x for Trend Micro to 24x for Norton. Yes, that means Norton will reduce your IO by 2400% in my tests. AVG Free is 19%. So he could have negated the performance gain from the SSD by running bad anti-virus software.
I can say my productivity feels like it nearly tripled.
The important measurement from a programming perspective is not necessarily just the second or two here or there. My workflow changed. I used to hit build, walk out to get a drink from the fridge in the garage, and it would still be compiling when I got back. Now I can't even do anything but glance at my email. These interruptions sometimes take you out of the zone, and it can take a long time to get back in. Getting up could easily cost me fifteen minutes or even half an hour, just because I lost my train of thought.
Update: I'm running Windows 7 and Visual Studio 8 on this machine.
My guess is that this was the cheaper option. It only cost Joel a couple of days and several hundred dollars on the drive. What it did not do is interrupt his programmers. As Joel has already described, he is now a grunt worker; his job is to make it as easy as possible for others to do their jobs. Their time is probably worth 10 to 100 times his, because they are the ones actually making something that directly results in more sales and money for the company. So all Joel had to do was save a couple of hours of work for the programmers and he would come out ahead.
Plus these additional side benefits:
The programmer who was complaining does not feel like he is being ignored.
Joel is now going to upgrade most of their computers, making it better for the entire company.
So lots of warm fuzzies to go around, leading to happier programmers, leading to a better product.
The cheaper option would be to run a compile and open any of the many different programs in Windows that tell you whether you're using lots of CPU, RAM, or IO wait.
Then you go AHA! I'll spend money on CPUs or making my compiler/build parallel.
So he didn't fix the compile speed but he fixed the speed of everything else. I know every second I have to sit and wait for my hard drive to catch up is a second I'm spacing out instead of working. As a developer I just want to get my idea out and code. Anything that can be done to make that faster makes me a happier coder.
He doesn't mention that the "compiler" that they're bottlenecking on is the one they wrote themselves, for "Wasabi" -- compiling their private language into VB or PHP.
On top of that, you don't fucking parallelize compilers!
You use a goddamn build system to do that (make, et al.). Of course, that requires that your compiler be decent enough to be able to compile modules independently, and not have to recompile unmodified source. What do you want to bet that their compiler is just ridiculously awful?
If disk sizes are different, which they almost always are, AFAIK dd doesn't quite hack it, otherwise I would have just booted off a Ubuntu live CD and done that.
If the new volume is smaller (which it almost always is when switching to a SSD), you'd need to shrink the filesystem before you block-copied it with dd.
Indeed. Without this, though, I think his system still would have booted. Incremental progress is certainly better than wasting two days pressing buttons in a GUI, at least IMHO. (Once the system boots, you can worry about using the extra bits on the disk.)
You don't usually migrate between volumes of the same size. If the new one is bigger, you've now got unused space on the end. If the old one was bigger, you've now just fucked your filesystem -- not only will the metadata about the volume size be wrong, but all the inodes stored in blocks past the end of the new disk will be missing.
It's trivial to folks familiar with nix-a-likes but for others it is too unlike the abstractions they're used to for it to be entertained as anything other than some sort of bizarre incantation.
With the current abundance of incremental and/or parallel tools, the fact that the compilation process takes up a significant amount of time is your first clue that there is something fundamentally wrong with the technology selection and overall thought process at this shop.
Also, he could have saved $350 and bought the 80 GB drive (unless he just HAS to store 40+ feature-length pirated movies on his boot/app drive?).
So... he concludes that CPU bound tasks are CPU bound, and that parallelizing compilation (technically it would be the dispatch of compilation) is better?
Every time I read this guy I fail to get what his allure to developers is.
What's wrong with that? All else being equal, compiling will generally be faster. Most languages (Python, Java, Ruby, most Common Lisps, etc) do compile - just (usually) to some intermediate and portable bytecode rather than raw machine code. That's probably what their compiler (Wasabi) does too (except, it seems like they are using PHP/VBScript/JS as their bytecode ;-)).
Sounds good. But, there is nothing wrong with using some other language as your intermediate 'bytecode.' If memory serves, early versions of C++ compiled to C, Arc (basically) compiles to Scheme, Python can compile to C, and many others.
Either way is good in my mind. But, I too, would appreciate some technical details about Wasabi (just out of academic curiosity). I feel like a lot of things I know about it are now outdated.
I hate explaining myself, especially when it shouldn't be necessary. Yes, I know Java et al compiles to bytecode. But in practice you never hit a "compile" button because it compiles continuously.
Java still has javac. I suppose an IDE could hide this from you. Of course, Java also has lots of static analysis tools written for it so many errors are caught way before compilation.
There's no supposing about it. All major IDEs do incremental builds and no one needs "actual, regular, compile processes" as with a command line compiler.
Joel runs his successful business, bothers to blog about it since he founded it years ago, graciously comes on here and responds to people - nice of him, huh? So why so bitchy?
You clearly disagree with his company's engineering decisions that you can't possibly have been privy to the reasoning behind. What's your motivation?
First of all, his business is so successful because he blogs about it, and establishes himself as a pundit. It isn't a 'bother', it's one of his primary responsibilities as the CEO.
Why so bitchy? What's my motivation? -- this whole thing is call and response, we're just filling roles. Spolsky's slipped into punditry, you're the sycophant, and I'm the prick popping your balloons. Joel certainly wrote his post knowing that he was teasing for a "WTF Wasabi" response -- why don't you see that as valid?
Any community where the only valid 'response' is fawning fanboy is not one I want to participate in, and I think spolsky would say the same. There's a reason he doesn't just post in his own forums.
I disagree with his company's engineering decision based on the reasoning he's publicly given for it (which happens to also run counter to his past advice). He hasn't said anything about it for a while, though he dropped a promising tidbit in this thread: http://news.ycombinator.com/item?id=536023
Instead of addressing my questions, you went with the ad hominem and called me a sycophant. Nice one.
Love your generalisation about blogging and (alleged) punditry being a primary responsibility too - because clearly SO many CEOs take this responsibility in their stride.
I wasn't talking about "many CEOs", just Joel. If you bother to read his blog posts, you'd notice that he constantly talks about how he's delegated away a lot of decision-making responsibility.
He functions as the public face of his company. His company's products are marketed towards developers, and I gather that most find out about them because of Spolsky's blogging - he's a very effective salesman.
2. He is describing a stupid decision he made, which is framed around an unacknowledged stupid decision he made years ago (see 3).
3. His company uses a private language they wrote themselves, for totally ridiculous reasons, despite having written essays about how "you shouldn't let your programmers use any non-mainstream languages".
When I first read about Wasabi, I kind of wondered. But then in a previous lifetime, I was with a company that wrote its own compiler. A few years after that, the CEO said "thank god for that compiler" as it relieved pressure on a target chip decision. Similarly, Wasabi has targeted a completely different end architecture with apparently little fuss. This is not a trivial result.
I think people are upset that Joel markets it as a good idea when, in fact, it is probably not that good of an idea.
(Yes, he gets that 1% of the market that can't run whatever language they pick, but he has to maintain his own in-house programing language and toolchain, and train new developers to use a language that exists nowhere else in the world.)