
I gave haskell a shot as some of my earlier github repos indicate: https://github.com/substack. I even wrote my blog in haskell with happstack, since snap hadn't gotten popular yet.

Haskell is very hard, but even after 3 years of pretty intensive use, I never really felt productive with haskell in the same way that I've felt in other languages. Anything I did in haskell required a lot of thinking up-front and tinkering in the REPL to make sure the types all agreed and it was rather time-consuming to assemble a selection of data types and functions that mapped adequately onto the problem I was trying to solve. More often than not, the result was something of an unsightly jumble that would take a long time to convert into nicer-looking code or to make an API less terrible to use.

I built an underwater ROV control system in haskell in 2010 which went well enough, but I had to tinker with the RTS scheduling constantly to keep the more processor-hungry threads from starving out the other ones for CPU. The system worked, but I had no idea what horrors GHC was performing on my behalf.

Later I built the first prototype of my startup with haskell, but the sheer volume of things that I didn't know kept getting in the way of getting stuff done in a reasonable time frame. Then we started to incrementally phase out haskell in favor of node.

I write a lot of node.js now and it's really nice. The whole runtime system easily fits into my head all at once and the execution model is simple and predictable. I can also spend more time writing and testing software and less time learning obscure theories so that the libraries and abstractions I use make sense.

The point in the article about haskell being "too clever for this benchmark" sums up haskell generally in my experience.



It's pretty much what I mean when calling Haskell hard to learn. For me it's also been a steep learning curve, and my experience hasn't been altogether different than yours.

I started out maybe 5 years ago following tutorials, reading up on all the metaphors about Monads and doing project Euler problems.

After a while I started to tackle some small web-related things with Haskell and had exactly your experience of running into a lack of understanding of how the system works, and of struggling to wrap my head around functional data types.

I pretty much gave up on Haskell as a practical language at that point, but something kept me coming back once in a while.

Then at a point I had a use for making a small web service fast and the Node prototype I made performed badly and crashed in spectacular ways under high loads. I found Snap and made a quick prototype in Haskell. At that point the experience of years of small experiments must finally have made something click. In a very short time I had a very fast service using almost no memory. It's deployed in production (as a part of http://www.webpop.com) and has been extremely stable.

By now I think I've crossed some kind of barrier, and feel like I'm both being productive and having fun when writing Haskell, but it really didn't come easy to me and all else being equal my experience tells me that a good deal of my colleagues would have an even harder time.


I think part of the issue with learning haskell is that it seems to invert the typical learning strategy for programming languages. Usually the best advice is read a little, then write a lot. Typically you can just look at some published code and go "ah yes, that's how you do it". But I find, for better or worse, Haskell really requires you to understand before you code. Which in the end means your study-to-code ratio is very different than for almost any other language.

Most languages, even lisps, are somewhat tolerant of 'programming by guessing' for beginners. Usually you write terrible code that works, learn more, and see what you did wrong. Haskell is very unforgiving of this: if you don't understand why it works, it probably won't.


I think you're wrong at "if you don't understand why it works it probably won't".

While I long ago gave up PUI (Programming Under the Influence), I still occasionally do some in Haskell. After a litre of beer I am pretty dumb, but I can follow clues from the compiler to get something working.

Most of the time, it still works the next day, when I sober up. That's in contrast with C/C++. Scripting languages give some power like that, but I can screw myself with them much more violently.

In my humble opinion, Haskell is the language of choice for drunken programmers.


I recently had that same epiphany. For fun, I decided to re-implement a simple chat server I wrote a while back in Erlang. I found that everything clicked - the type system worked with me instead of against me, and I was able to create prototypes as quickly as I could think of them. But, it took two years of thinking about Haskell before I could synthesize code in it. (code here: https://github.com/dmansen/haskell-chat)


Did you perhaps jump into the water too quickly? I'm currently learning a couple of functional languages (including Haskell) and using it in production environments but my current use is restricted to "I have an input that will always produce a certain output. There are no database or environmental dependencies, this is straight computation. I want to never have to worry about this function ever again". And so far, knock on wood, haskell has been killer for that scenario. I'll probably eventually transition a lot more of my code to functional languages, but will do so slowly (using Go otherwise).


Haskell is beautiful, I love it, but I can easily see his points. There are a few traps one can easily fall in:

- Reach a point in a complex application where it becomes hard to reason what laziness will do to performance.

- End up in type-hell. E.g. some libraries extensively use existential quantification of type variables. Before you know it, you are chasing "type variable x would escape its scope"-type error messages in perfectly fine-looking code.

- Pattern matching is nice, but if you extensively use it, adding a constructor argument is a lot of work.

- No-one uses the same data type for common things. For instance, for textual data, there is String, Data.Text, Data.Text.Lazy, Data.ByteString, and Data.ByteString.Lazy. These days there is more or less consensus on when to use which type, but you are often converting things a lot (see the small conversion sketch after this list). There are also types of data for which no consensus has yet been reached (e.g. lenses).

- Artificially pure packages. There are some packages that link to C libraries, but (forcefully) provide a pure interface. (Or in other words: purity is just convention).

- For a lot of code you end up using monads plus 'do' notation, making your programs look practically imperative, but an oddball variation of it.

- Using functions with worse time or space complexity, to maintain purity.

- I/O looks simple, but for predictable and safe I/O you'd usually end up using a library for enumerators. Writing enumerators, enumeratees, and iteratees is unintuitive and weird, especially compared to (less powerful) iterators/generators in other languages.
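
Since the string-type point above bites almost everyone, here is a minimal conversion sketch, assuming only the standard text and bytestring packages (the values themselves are toy data):

    import qualified Data.Text as T
    import qualified Data.Text.Lazy as TL
    import qualified Data.ByteString.Char8 as BC

    s :: String
    s = "hello"

    t :: T.Text
    t = T.pack s           -- String -> strict Text

    tl :: TL.Text
    tl = TL.fromStrict t   -- strict Text -> lazy Text

    b :: BC.ByteString
    b = BC.pack s          -- String -> ByteString (raw bytes; no encoding handling)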

Learning Haskell is something I'd certainly recommend. It provides a glimpse of how beautifully mathematical programs could be in a perfect world. Unfortunately, the world is not perfect, and even Haskell needs a lot of patchwork to deal with it.


> Artificially pure packages. There are some packages that link to C libraries, but (forcefully) provide a pure interface. (Or in other words: purity is just convention)

Explain? What would the alternative be?

> Using functions with worse time or space complexity, to maintain purity.

This seems like the opposite of your previous complaint.

> For a lot of code you end up using monads plus 'do' notation, making your programs look practically imperative, but an oddball variation of it.

This seems to be a "psychological problem" with Haskell: the idea that because Haskell supports declarative programming, it's not OK to be imperative. It makes beginners tear their hair out looking for 'do'-free solutions when they could just use 'do'. Cf. "Lambda: the Ultimate Imperative" (and the rest of that series of LtU papers): http://dspace.mit.edu/handle/1721.1/5790
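
For what it's worth, the two styles are mechanically interchangeable; here is a trivial sketch (the greeting functions are invented for illustration):

    greetDo :: IO ()
    greetDo = do
      name <- getLine
      putStrLn ("hello, " ++ name)

    greetBind :: IO ()
    greetBind = getLine >>= \name -> putStrLn ("hello, " ++ name)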


> Explain? What would the alternative be?

Box the value that is the result of evaluating an expression that calls impure code in IO?

This is what I'd expect for calling impure code in third-party libraries.


If the library developer can prove that a C operation is pure, why shouldn't he tell Haskell about that?


And if you don't trust the developer, it's easy to fix his mistake:

    unUnsafePerformIO :: a -> IO a
    unUnsafePerformIO = return


Sorry, that doesn't actually work.

    ezyang@ezyang:~$ cat Test.hs
    import System.IO.Unsafe
    unUnsafePerformIO = return
    main = do
      let a = unUnsafePerformIO (unsafePerformIO (putStrLn "boom"))
      a
      a
    ezyang@ezyang:~$ runghc Test.hs
    ezyang@ezyang:~$


Yes, true. Patch the library if the annotation is wrong :)


I would say that after a little more experience, space leaks are the only thing that really worries me in Haskell. It's one of those things that I have to think about a little too much to really feel "safe" about. (The other worry is expressions that evaluate to ⊥ at runtime, but it's been shown that static analysis can solve that problem. I don't actually use those tools, though, so I guess I'm a tiny bit afraid of those cases. Like with other languages, write tests.)
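
For the record, the kind of space leak I mean is usually the classic lazy-accumulator one; a minimal sketch (the numbers are arbitrary):

    import Data.List (foldl')

    leaky, ok :: Integer
    leaky = foldl  (+) 0 [1 .. 10000000]   -- builds ten million (+) thunks before forcing anything
    ok    = foldl' (+) 0 [1 .. 10000000]   -- strict accumulator, constant space

    main :: IO ()
    main = print ok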

Your other concerns don't seem too worrisome to me. Type hell doesn't happen very much, though there are some libraries that really like their typeclass-based APIs (hello, Text.Regex.Base) which can be nearly impossible to decipher without some documentation with respect to what the author was thinking (``I wanted to be able to write let (foo, bar) = "foobar" =~ "(foo)(bar)" in ...'').

The data type stuff can be confusing for people used to other languages, where the standard library is "good enough" for most everything people want. A good example is Perl, which uses "scalar" for numbers, strings, unicode strings, byte vectors, and so on. This approach simply doesn't work for Haskell, because Haskell programmers want speed and compile-time correctness checks. That means that ByteString and Text and String are three different concepts: ByteString supports efficient immutable byte vectors, Lazy Bytestrings add fast appends, Text adds support for Unicode, and String is a lazy list of Haskell characters.

All of those types have their use cases; for a web application, data is read from the network in terms of ByteStrings (since traditional BSD-style networking stacks only know about bytes) and is then converted to Text, if the data is in fact text and not binary. Your text-processing application then works in terms of Text. At the end of the request cycle, you have some text that you want to write to the network. In order to do that, you need to convert the Unicode character stream to a stream of octets for the network, and you do that with character encoding. The type system makes this explicit, unlike in other languages where you are allowed to write internal strings to the network. (It usually works since whatever's on the other end of the network auto-detects your program's internal representation and displays it correctly. This is why I've argued for representing Unicode as inverse-UTF-8 in-memory; when you dump that to a terminal or browser, it will look like the garbage it is. But I digress.)

I understand that people don't want to think about character encoding issues (since most applications I use are never Unicode-clean), but what's nice about this is that Haskell can force you to do it right. You may not understand character sets and character encodings, but when the compiler says "Expected Data.ByteString, not Data.Text", you find the Text -> ByteString function called "encodeUtf8" and it all works! You have a correct program!
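
A rough sketch of that boundary (decodeUtf8/encodeUtf8 are the real functions from Data.Text.Encoding; shout is a made-up processing step):

    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import Data.Text.Encoding (decodeUtf8, encodeUtf8)

    handleRequest :: B.ByteString -> B.ByteString   -- bytes in, bytes out
    handleRequest raw = encodeUtf8 (shout (decodeUtf8 raw))
      where
        shout :: T.Text -> T.Text                   -- all the "real work" happens on Text
        shout = T.toUpper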

With respect to purity: purity is a guarantee that the compiler tries to make for you. When you load a random C function from a shared library, GHC can't make any assumptions about what it does. As a result, it puts it in IO and then treats those computations as "must not be optimized with respect to evaluation order", because that's the only safe thing it can do. When you are writing an FFI binding, though, you may be able to prove that a certain operation is pure. In that case, you annotate the operation as such ("unsafePerformIO"), and then the compiler and you are back on the same page. Ultimately, our computers are a big block of RAM with an instruction pointer, and the lower you go, the more the computer looks like that. In order to bridge the gap between stuff-that-haskell-knows-about and stuff-that-haskell-doesn't-know-about, you have to think logically and teach the runtime as much about that thing as you know. It's hard, but the idea is that libraries should be hard to write if they'll make applications easier to write. If everyone was afraid to make purity annotations, then everything you ever did would be in IO, and all Haskell would be is a very nice C frontend.
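
Concretely, the purity annotation can be as simple as giving a foreign import a non-IO type; a minimal sketch binding plain old cos from libm (not any particular library's actual binding):

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.C.Types (CDouble)

    -- The non-IO result type is the binding author's promise to GHC that this
    -- call has no observable side effects, so it may be shared and reordered.
    foreign import ccall unsafe "math.h cos"
      c_cos :: CDouble -> CDouble

    main :: IO ()
    main = print (c_cos 0)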

> For a lot of code you end up using monads plus 'do' notation, making your programs look practically imperative, but an oddball variation of it.

That's really just an opinion, rather than any objective fact about the language. I find that do-notation saves typing from time to time, so I use it. Sometimes it clouds what's going on, so I don't use it. That's what programming is; using the available language constructs to generate a program that's easy for both computers and humans to understand. Haskell isn't going to save you from having to do that.

> Using functions with worse time or space complexity, to maintain purity.

ST can largely save you from this. A good example is Data.Vector. Sometimes you want an immutable vector somewhere in your application (for purity), but you can't easily build the vector functionally with good performance. So, you do an ST computation where the vector is mutable inside the ST monad and immutable outside. ST guarantees that all your mutable operations are done before anything that expects an immutable vector sees it, and thus that your program is pure. Purity is important on a big-scale level, but it's not as important at a "one-lexical-scope" level. Haskell lets you be mostly-pure without much effort; other languages punt on this by saying "nothing can ever be pure, so fuck you". I think it's a good compromise.
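
A small sketch of that pattern with Data.Vector (squares is invented for illustration; create/new/write are the real vector API):

    import qualified Data.Vector as V
    import qualified Data.Vector.Mutable as MV

    -- Mutate freely inside the ST computation; the result escapes as an
    -- ordinary immutable Vector, so callers only ever see a pure value.
    squares :: Int -> V.Vector Int
    squares n = V.create $ do
      mv <- MV.new n
      mapM_ (\i -> MV.write mv i (i * i)) [0 .. n - 1]
      return mv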

> I/O looks simple, but for predictable and safe I/O you'd usually end up using a library for enumerators. Writing enumerators, enumeratees, and iteratees is unintuitive and weird, especially compared to (less powerful) iterators/generators in other languages.

IO is hard in any language. Consider a construct like Python's "with":

    with open('file') as file:
        return file

That construct is meaningless, since the file is closed before the caller ever sees the descriptor object. But Python lets you write it, and guaranteeing correctness is up to you. In Haskell, that's not acceptable, and so IO works a little differently. Ultimately, some things in Haskell are a compromise between simplicity of concepts and safety guarantees at compile time. You can write lazy-list-based IO in Haskell, but you can run out of file descriptors very quickly. Or, you can use a library like iteratees, and have guarantees about the composability of IO operations and how long file descriptors are used for. It's up to you; you can do it the easy way and not have to learn anything, or you can do some learning and get a safer program. And that's the same as any other programming language.
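
The closest Haskell analogue of that Python snippet is lazy IO, which has exactly the same trap; a sketch (not from any real codebase):

    import System.IO

    -- hGetContents returns a lazy String, but withFile closes the handle as
    -- soon as the action returns, so the caller may see truncated contents.
    readLazily :: FilePath -> IO String
    readLazily path = withFile path ReadMode hGetContents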


Haskell is great for pure algorithms like that.

As for jumping in too quickly? I was a pretty heavy haskell user for about 3 years.


I've also never been productive with Haskell. It's cute, it raises interesting problems if you enjoy wrangling with mathy problems for the sake of it, but when it comes to getting stuff done in a deeply imperative, eager world, the impedance mismatch is simply overwhelming.

Moreover, I was very proficient in OCaml before I discovered Haskell, and it just spoiled me. It has all of Haskell's qualities which matter (type inference, algebraic data types, a naturally functional mindset) without the parts you regularly have to fight (mandatory monads and monad transformers, algorithmic complexity in a lazy context, tedious interfacing to the underlying OS).

If you felt like Haskell had many amazing qualities, spoiled by a couple of unacceptable flaws, especially when it comes to acknowledging how the real world works, I'd suggest that you give a try to OCaml. You should be proficient with it within a couple of days.


I believe you are attributing a library issue to a language. Before today (and by today I literally mean a month ago when Yesod released a cross-platform development server that automatically re-compiles your web application) there wasn't a productive set of libraries and tools to build a web application with in Haskell. 3 years ago when you started, and even until 1-2 years ago the library situation was absolutely horrible. Web frameworks with very little to offer, mediocre templating languages, not even an attempt at a database ORM. Tutorials would have you write a bunch of code to achieve a detail taken for granted in libraries used in web frameworks of other languages.

Please take a look at doing real-world, productive web development with Yesod. http://www.yesodweb.com

You are still going to take a productivity hit in Haskell due to lack of libraries in comparison to Ruby, Python, etc. So the practical reason for using Haskell today is to take advantage of the amazing performance, take advantage of Haskell non-web libraries in the backend, or for a high assurance project where its type system can rule out most common web development bugs.

oh, and Yesod is even faster than the mentioned Snap framework which is already much faster than Node (and unlike Haskell, Node does not scale to multi-core). Although Yesod isn't going to automatically cache the fibonacci sequence for this artificial benchmark because in the real world I have never once been tasked with writing code like that for a web application.


> I believe you are attributing a library issue to a language.

Reasoning about laziness? Polymorphism that can only be implemented using existential types plus Typeable? Even purity is a double-edged sword (some algorithms are inherently mutable)[1]. Some of Haskell's problems in real-life projects can definitely be attributed to the language itself.

> So the practical reason for using Haskell today is to take advantage of the amazing performance,

My experience with everything from simple checksum functions to parameter estimators (ML) is that Haskell is generally at least 2-10x slower than C (even when introducing strictness where necessary, unboxing constructors, etc.). So, in practice you'll often end up doing heavy lifting in C anyway (whether it is a database server or a classifier that works in the background), and in the end it doesn't matter so much whether you use Haskell or a dynamic language (performance-wise) if a significant amount of time is required processing requests.

> where its type system can rule out most common web development bugs

Right, this is where Haskell currently has an edge, because it not only makes it easy to write DSLs (as, e.g., Ruby does), but typechecks everything as well.

> oh, and Yesod is even faster than the mentioned Snap framework which is already much faster than Node

Yes, but the benchmark you implicitly point to (the pong benchmark) is very synthetic and says fairly little about real-life use. Until we see Snap and Yesod more in production, the jury is still out.

[1] Sure, you can do quicksort in the ST monad, but it will require a lot of unnecessary copying.


Yes, reasoning about laziness and difficulties using types are library issues. Particularly if a library is forcing you to learn about existential types. In Yesod we are very conscientious about what types (even just polymorphism) are exposed to the user, because they can make error messages, etc., difficult.

I don't think the Pong benchmark http://www.yesodweb.com/blog/preliminary-warp-cross-language... is that synthetic - I think it demonstrates concurrency capabilities fairly well. We just have to keep in mind which web applications benefit from high concurrency.

As for raw performance of a single request, I agree that the average web application won't see a great difference for the 80% case. However, for most Ruby web applications that I have worked on I have had to spend time re-writing slow parts of the application because Ruby was truly the bottleneck, and I would have been much better off using almost any compiled language with types.

Ruby applications I have worked on always have more complicated deployments, worse response times, and huge memory usage due to the lack of async IO. Async IO is possible in Ruby & Python, but it still sucks because it is extra work and you have to always be on guard against blocking IO. So I hope we can at least agree that async IO is a big win, and that Haskell & Erlang are the best at async IO because it is built into the runtime and no callbacks are required. And likewise deployment to multi-core is no extra effort in Haskell/Erlang, whereas in Node, Ruby, or Python you will need to load balance across multiple processes that are using more RAM.


> Yes, reasoning about laziness and difficulties using types are library issues.

I disagree; if the language were strict by default, this would not be an issue. It is a language problem that is forced on libraries.

> However, for most Ruby web applications that I have worked on I have had to spend time re-writing slow parts of the application because Ruby was truly the bottleneck,

My point was that Haskell is often a lot slower than C or C++, so people will rewrite CPU-intensive code anyway. Look at many of the popular Haskell modules where heavy-lifting is done (from compression to encryption), most of them are C bindings. That code will be nearly equally fast in Haskell as in, say, Python.

BTW, I am not arguing that Haskell is not faster than Python, Ruby, Clojure, etc. But for computationally intensive work C/C++ are still the benchmark, and that is what people will use in optimized code. Whether it is Haskell or Python.

> Particularly if a library is forcing you to learn about existential types.

But why is that? Because the language does not support, in an intuitive fashion, the kind of polymorphism that is commonly needed. In some applications, people need containers of mixed types that adhere to an interface. And a commonly-used method to realize this in Haskell is existential types.

> we can at least agree that async IO is a big win

Yes.

> And likewise deployment to multi-core is no extra effort in Haskell/Erlang, whereas in Node, Ruby, or Python you will need to load balance across multiple processes that are using more RAM.

Since most modern Unix implementations do COW for memory pages in the child process of a fork, this is not so much of an issue as people make it out to be. The fact that you mention Erlang is curious, since spawn in Erlang forks a process, right? Forking is more expensive than threading, but again, in most applications negligible compared to handling of the request.


The biggest reason why there are Haskell packages wrapping C libraries is not for performance, but to reuse good C libraries, and because Haskell has an excellent interface for C libraries. Many people prefer to write Haskell rather than C/C++ for computationally intensive tasks. Depending on the problem, it is possible to get within 2x the raw speed of C, and you get much nicer code to maintain and much easier concurrency/parallelism opportunities.

I have not found it to be the case that existential types are commonly needed (and need to be forced on the user). Maybe you are in a different problem domain. I find Haskell's regular polymorphism to work very well for 95+% of my use cases.

Fork is not negligible to handling a request, but pre-forking theoretically could be. In practice, COW fork does not automatically solve multi-core. The Ruby garbage collector is not COW friendly and thus there is little memory savings from COW (unless you use the REE interpreter which has a slower COW friendly garbage collector but saves on memory and GC time). I haven't looked at this for other languages but I assume this is still a limiting issue. Also, you are still stuck doing load-balancing between your processes, which will limit or complicate your deployment. I don't know much about Erlang other than async IO is built into the language, which is why I mention it in the same breath as Haskell.


In case anybody is wondering: both Erlang and Haskell have very lightweight user-space threads built in, which are mapped onto a small pool of OS threads to take advantage of however many cores you have. It's very slick and fast, and probably the Right Thing.
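
In GHC that looks like spawning as many green threads as you like and writing blocking-style code in each; a toy sketch (the delay just stands in for a blocking call):

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Monad (forM_)

    main :: IO ()
    main = do
      forM_ [1 .. 10000 :: Int] $ \i ->
        forkIO $ do
          threadDelay 100000     -- "blocks", but only parks this green thread
          putStrLn ("worker " ++ show i ++ " done")
      threadDelay 1000000        -- crude wait so the workers get a chance to finish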


Yes, there's also Akka, which is being incorporated into the Scala standard library, and F#'s MailboxProcessor. The thing is that Erlang/OTP and its behaviors have many people pounding on heavily loaded apps in production and improving its toolchain, whereas GHC and Akka have only recently (the last couple of years, I think) been working to get the stack right: dispatchers and load balancing (like Erlang reds), and bringing GC up to snuff.


Akka looks pretty sweet, but it looks like you still have to worry about blocking code in external libraries. In Haskell (and Erlang, IIRC), blocking code is deferred to a background thread automatically so you don't have to be consciously on-guard for it. You also get proper pre-emptive multithreading, while Akka looks like a hybrid of an event loop and a thread pool.

Is this a substantial headache with Akka, in practice?


(late reply)

it's pretty hard to google akka deployments but:

http://www.quora.com/What-companies-are-using-Akka-commercia...

http://groups.google.com/group/akka-user/browse_thread/threa...

and in terms of memory overheads and how many erlang process-type things you can spin up:

http://akka.io/docs/akka/1.1/scala/tutorial-chat-server.html


Why use quicksort over arrays when you can do mergesort over lists and get 1) stable behavior and 2) solution to maximum and k-max problems due to laziness? Do you really need arrays?

And quicksort for arrays in ST monad wouldn't copy anything unnecessary.

Actually, I've seen many claims that some algorithms are inherently mutable. So far none stand close scrutiny.

Matrix operations? You better copy intermediate results, that way you'll be safer and faster (parallel algorithms). Good compilers do that behind the curtain (array privatization).

Sorting? Use maps or lists, that way you won't forget something important.

Graph operations? Immutable (inductive) graphs are slower by a constant multiplier and sometimes are faster than their mutable counterparts (tree-based maps are faster for changes than arrays).

The last one is even more amusing when applied to compiler optimizations (i.e., to non-trivial graph algorithms): http://lambda-the-ultimate.org/node/2443 The pure version is less buggy, faster (!), and allows more optimizations.


> Why use quicksort over arrays when you can do mergesort over lists

Sure, you can do merge sort. Except that the list split step in Haskell is O(n) in time, while it is constant when using arrays. The same goes for merging lists, since you have to 'reattach' the second list as the tail of the first.

> And quicksort for arrays in ST monad wouldn't copy anything unnecessary.

You have to copy the data from whatever representation you had to something that lives in a memory block in the ST monad.

> Actually, I've seen many claims that some algorithms are inherently mutable. So far none stand close scrutiny.

You have probably never read Okasaki...

The rest of your argument proposes that slow is better because of persistence. First, persistence is often not required; second, persistence can also be implemented in a mutable language.


> Except that the list split step in Haskell is O(n) in time, while it is constant when using arrays.

Oh, no. You shouldn't split a list by calculating its length.

Try this instead:

    import Prelude hiding (even, odd)

    even (x:_:xs) = x : even xs
    even xs = xs
    odd = even . drop 1

    splitList xs = (even xs, odd xs)

Voila! Completely lazy, O(1).

As for merge, see here: http://lambda-the-ultimate.org/node/608?from=0&comments_... The solution contains a proper merge algorithm.

And yes, I never read Okasaki in full. But I have used Haskell semi-professionally since 1999 and professionally since 2006.


I agree with you in most cases.

> Sure, you can do merge sort. Except that the list split step in Haskell is O(n) in time, while it is constant when using arrays. As well as merging lists, since you have to 'reattach' the second list as the tail of the first list.

It's no problem writing a merge-sort in Haskell that uses O(n log n) time. So who cares what the asymptotics of the individual elements of the algorithm are? (You may care about the actual speed of the whole thing and its parts, though.)
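
For reference, a straightforward bottom-up merge sort over lists, O(n log n) overall (a generic textbook version, not tuned code):

    mergeSort :: Ord a => [a] -> [a]
    mergeSort = go . map (: [])
      where
        go []  = []
        go [x] = x
        go xs  = go (pairUp xs)

        pairUp (x:y:rest) = merge x y : pairUp rest
        pairUp xs         = xs

        merge xs [] = xs
        merge [] ys = ys
        merge (x:xs) (y:ys)
          | x <= y    = x : merge xs (y:ys)
          | otherwise = y : merge (x:xs) ys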


If you want a canonical example of an algorithm that's imperative and harder to do in a functional setting, cite union-find (See "A Persistent Union-Find Data Structure"). Searching is optimal in a functional setting, too. Just not the classic quicksort.


> Polymorphism that can only be implemented using existential types plus Typable?

I'm curious where you came across this. In an external library you were using, or in the process of trying to implement some kind of dynamic typing in your own code?


Both :). To give one specific example: I was working on a transformation-based learner for learning tree transformations. Say that a rule consists of an action and a list of conditions that make the action fire if they are true for a particular tree node. Obviously, you'll want to be able to add new conditions, so you make a type class for conditions:

    class Cond a l where
      applies :: a -> TreePos Full l -> Bool

Now, say that a rule contains a list of conditions which belong to the type class Cond (Cond a l => [a]). You can see the problem coming. Say I provide a condition of type MyCondition; then the list will be of type [MyCondition]. However, in practice it would be inflexible to restrict a list of conditions to one type. You want to be able to add new conditions outside the module or package binary. So, instead I used existential typing for conditions:

    data Condition l =
      forall c . (Cond c l, Eq c, Show c, Typeable c) => Condition c
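
For readers who haven't met this pattern, here is a stripped-down, self-contained version of the same idea (the class and instances are invented for illustration, not from the project above):

    {-# LANGUAGE ExistentialQuantification #-}

    class Describable a where
      describe :: a -> String

    -- The existential wrapper erases the concrete type, so values of
    -- different Describable instances can live in one list.
    data Boxed = forall a. Describable a => Boxed a

    instance Describable Int where
      describe n = "int " ++ show n

    instance Describable Bool where
      describe b = "bool " ++ show b

    mixed :: [Boxed]
    mixed = [Boxed (3 :: Int), Boxed True]

    main :: IO ()
    main = mapM_ (\(Boxed x) -> putStrLn (describe x)) mixed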


I had a similar experience with Erlang. Node has totally taken over the Erlang-shaped hole in my life ;)



