Consider this function:
tuple2Int :: (x -> Int) -> (a, b) -> (Int, Int)
tuple2Int f (a, b) = (f a, f b)
…which wouldn’t work. Why? Because of f a
, the type checker deduces that x~a
(x
is equivalent to a
); because of f b
, the type checker also deduces that x~b
. Therefore, a~x~b
, which is obviously false for most cases.
With RankNTypes
, we can instead write the type signature of the function as tuple2Int' :: (forall x. x -> Int) -> (a, b) -> (Int, Int)
. This way, the compiler knows that it should use two seperate x
’s for a
and b
. It therefore deduces that x1~a
and x2~b
and because it is not enforced that x1~x2
, there is no problem at all!
Go Deeper
What’s going on?
First, we need to learn the cruel and ugly fact that NO so-called “generic” function in Haskell is actually generic in essence. What does this mean? Let’s consider length :: [a] -> Int
. It looks pretty generic to me as I can use it on any type of list, you may say. This is true, but what happens under the hood is that the compiler specializes a version of the function with concrete type (i.e. no type variable like a
) everytime you use it. For example, when you do length [1, 2, 3]
, a specialized version length(Int) :: [Int] -> Int
is magically created by the compiler to accomodate you; when you do length "abcdefg"
, similarly a length(Char) [Char] -> Int
is created. Just remember this for now.
Just as we use \x -> ...
to define a lambda function with variable x
, we use forall a. ...
to define a type with type variable a
. Therefore, suppose that we can construct the type of a function by a data constructor FunctionType a b
, where a
and b
are values that represent types, our (forall x. x -> Int)
in the last paragraph would be equivalent to \x -> FunctionType x Int
. You also need to remember this.
And then, a crash course in lambda calculus. Consider this lambda function: \x y -> x + y + z
. We call x
and y
bound variables, because their appearance is bound by the \x y
part of the function. We call z
a free variable, because it is defined “elsewhere” and is thus not bound by the \x y
part. End of the crash course.
With the three things we just learned, we can touch on the problem. Firstly, we have a rule that no type variables in a type contruction can be free. For example, we can’t write a type-level function like \x -> FunctionType x y
, or formally forall x. x -> y
, because y
is free. This is a very strict rule and obviously we don’t want to be scolded by the compiler every now and then just because of it, so we have another catch-all rule: if a type variable is not explicitly bound, it is then treated as bound at the beginning of the expression. What does this mean? Suppose we have a type a -> a
(or FunctionType a a
with our syntax). Because you see no forall
in its type signature, technically a
is free. This is when the rule kicks in: because a
is not explicitly bound, we treat the whole type as forall a. a -> a
(or \a -> FunctionType a a
) and voila! The first rule is automatically satisfied. How nice.
Generic functions in Haskell are all defined this way. id
, for example, has type forall a. a -> a
. Our tuple2Int
, because it has x
, a
and b
that all appear to be free, has type forall x a b. (x -> Int) -> (a, b) -> (Int, Int)
by the second rule. This is where our problem arises: when we are using this function, the compiler needs to create a concrete version of it (in other words, it needs to assign types to x
, a
and b
). Furthremore, because x
is scoped on the whole type expression, it must takes the same value everywhere. An example:
foo = (\x -> x + (\y -> x + 3) 4) 5
bar = (\f -> f 2 + f 3 + f 4) (\x -> x + 2)
In foo
, because x
is bound at the very beginning, it is scoped over the whole expression. Therefore, it must always takes the value 5
. On the contrary, in bar
, because x
is bound only inside the second pair of brackets, it can take different values of 2
, 3
, and 4
elsewhere (i.e. inside the first pair of brackets).
Back to tuple2Int
. What type should the compiler assign to x
? It sees that a
can be applied to x -> Int
, so a
and x
must be the same. Similar deduction goes for b
and x
. Therefore, unless a
and b
are the same type, we would have a type contradiction.
Now make sure that you understand the problem, and we’ll proceed to its solution.