I spent a couple of days reading Haskell tutorials, and writing a little
bit of code in it. I find it to be a very attractive and exciting language.
I put together a few of the reasons why I am interested in Haskell.
Strong Types, and Type Inference
When I used to program Java, I always hated the type system. It just felt like
it was constantly getting in your way, and it seldom gave you anything back.
In fact, it actively discouraged you from breaking things down into smaller
methods and shorter statements, because every time you did, you suddenly had to
add type declarations for everything again.
When I moved to Ruby, not having to declare types felt absolutely brilliant.
Of course, I knew what the type was, because I was the programmer, after all.
I was the master. It would be the type of whatever I assigned, duh. But from time
to time, I wished that I had declared types on things. It would be useful for
metaprogramming, you know. That’s about all it was ever good for in Java. And
having declared types on public library functions can’t do anything to harm
documentation efforts.
In Haskell, you really seem to have the best of both worlds. You can declare types
as much or as little as you want, and the type inference figures out the rest. In
general, though, you always declare inputs and outputs to public functions. The
supportive code for the high level functions can be totally undeclared, which is
awesome, because it dramatically reduces the friction in breaking down an algorithm.
Consider the following function.
toEnglish :: Int -> String
toEnglish x
| x < 0 = "negative " ++ toEnglish (-x)
| x < 20 = ones !! x
| mod x d == 0 && n == 0 = tens !! div x d
| mod x d == 0 = (toEnglish $ div x d) ++ delim ++ orderNames !! n
| otherwise = (toEnglish $ div x d * d) ++ delim ++ (toEnglish $ mod x d)
where n = length (takeWhile (x >=) orders) - 1
d = orders !! n
delim = if n == 0 then "-" else " "
orders = 10:100:[1000 ^ x | x <- [1..]]
ones = words "zero one two three four five six seven eight nine ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen"
tens = words "zero ten twenty thirty forty fifty sixty seventy eighty ninety"
orderNames = words "ten hundred thousand million billion trillion quadrillion quintillion sextillion septillion octillion"
It takes an Int
and returns a String
. What else do you need to know? Between the types
and the name, you probably don’t need a lot more documentation to be able to use this
function. But notice the where
section at the bottom. Those are helper functions and
constants, with no types to be found. However, the compiler does know what their types
are, even though they aren’t declared. That’s what type inference is all about. The high-level
declarations serve as a sanity check that you and the compiler are on the same page about
what the types are.
By the way, I really love the way you break down your function into little pieces and drop
them into the where
section. I know you can do this with procs in Ruby, or with functions
in Javascript, but no where is it this easy to use private helper functions without poluting a
larger namespace.
For polymorphism, Haskell provides Type Classes, which are a little bit like Java Interfaces,
only way better. You can declare multiple type classes on a type, such as declaring a function
that takes a parameter that can behave like three different things. In practice, this is a lot
closer to duck-typing than you might think.
The part that sort of blows my mind is that Haskell can be polymorphic on both ends of a function.
Consider. In C you might call a function like atoi
to convert a string to an integer. Both types
are declared in the name of the function (“AsciiToInteger”, as I understand it). In Ruby you would
call to_i on the string. The return type is declared in the name of the function, but the input type
is inferred. If it’s a string, it calls the string implementation, but if it’s a different type that
defines to_i, it calls that implementation instead.
In Haskell you can be polymorphic on either end of the function. (I don’t know whether you could do
both ends at the same time to define a universal coerce
function.) To convert a string to an integer
you call read
. It figures out what you want, whether an integer, float, boolean, or whatever, based
on what you do with it once you have it. (In the terminal you would have to declare the type, like
read "12" :: Int
.)
Taking this idea further, you can even have something like polymorphic constants. Instead of calling
Integer.MAX_BOUND
, you can simply call maxBound
, and it will return the max bound for integers,
or characters, or whatever you’re talking about that can be bounded. I know this doesn’t shorten the
code a ton or anything, but it seems really cool to me that if you defined a new type that had natural
bounds, you could define it in this way, and any method anywhere that took a bounded type could call
your bounds. I suppose you could do this in Ruby with a convention like object.class.max_bound
or
something, but it isn’t a part of the core language.
Algebraic Types
In addition to the type inference system, the type system itself is just really, really fascinating.
In a sense, everything is an enumeration.
data Day = Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday
This is an example of why you would want to implement the Bounded
type class. Each of those options
is a type constructor, so they’re in the containing namespace (you just say Sunday
, not Day.Sunday
).
Here’s a more interesting example.
data PayMethod = CreditCard CardNumber Name ExpirationDate VerificationCode
| Check CheckNumber
| Cash
The first word in each option is the name of the constructor, but the following words are properties.
They don’t have names in this syntax, they just are what they are. The type is the name of the property,
in a sense. There is another syntax where you give each property a separate name and type, but often
the type is enough. Obviously in this case you would probably just alias all of these types to String,
but I think it’s a really interesting tradeoff.
There’s more you can do with the type system, but what was the most interesting to me was that seeing
a couple of examples like this made me realize how much ORM has impovrished even our object-oriented
type systems. You could do a lot of this same stuff in Ruby, but you wouldn’t (at least in Rails) because
of the database. It’s kind of amazing to me how much programming languages end up being beholden to
persistance mechanisms, and relational databases in particular. It seems like robust persistance (not
serialization) would become a major topic during language design. But I understand why language-specific
persistance mechanisms are a poor solution.
Streaming
It’s always a good idea to write streaming code whenever you have a function that loads a lot of data.
In the MRI at least, memory is never released back to the operating system once it has been allocated
to the Ruby process. This is why people write monitoring packages that restart their Ruby servers if
they grow above a certain threshold.
In Haskell, everything is lazy by default, waiting until some other code asks for its return value
before it does any work. Monad code gets executed in order (at least IO code does), but the standard
IO operations stream automatically, making fast, memory-conscous IO code a piece of cake.
import Data.Char
main = interact $ map toUpper
This program reads standard in, and writes it out in all upper-case. Its memory usage is constant,
no matter how big the input is.
Point-free Style
Point-free style describes function composition where the arguments are passed through without having
to be declared. It is an example of DRY at the syntactical level. You do this in Ruby when you use
Symbol#to_proc
instead of a block.
Rather than
However, you can’t use Symbol#to_proc
when you need to do more than send a single method. I have seen
people hack around this limitation in Ruby, but it isn’t very well looked-on, and some people even denigrate
Symbol#to_proc
.
In Haskell, point-free style is well-loved and well-supported. Consider the following.
withA :: String -> String
withA s = unlines (filter (elem 'a') (lines s))
This creates a function withA
which takes a string and returns only the lines that contain the character 'a'
.
But this function is essentially a pipeline where the data passes through each function from right to left. (In
an object-oriented language, it would probably read from left to right.) You can remove the parameter (and some of
the parentheses, by rewriting the function in point-free style.
withA :: String -> String
withA = unlines . filter (elem 'a') . lines
Once you understand how the function composition syntax works, this is way easier to read and write than the version
with all the parentheses. To be honest, I consider this better, but still not as intuitive as an object-oriented version
which reads in the order the steps are performed. The cool thing is that we’ve eliminated the variable. Note that it’s
still clear that the function takes a string from the type declaration.