Cached at:
05/30/26, 02:08 PM
# Coalgebras and Automata
Source: [https://web.archive.org/web/20071014215938/http://homepage.mac.com/sigfpe/Computing/fold.html](https://web.archive.org/web/20071014215938/http://homepage.mac.com/sigfpe/Computing/fold.html)
The Wayback Machine \- https://web\.archive\.org/web/20071014215938/http://homepage\.mac\.com:80/sigfpe/Computing/fold\.html
```
COALGEBRAS AND AUTOMATA
-----------------------
That this document is literate Haskell which means that it
is a mixture of actual runnable code and comments.
(This web page used to have the wrong title. It has nothing to do with
the differentiation of datatypes and it's not really suitable for
non-functional programmers.)
(Also, I've since learnt that all of my definitions of fold and unfold
below can be wrapped up into a single pair of definitions. See, eg.
http://web.comlab.ox.ac.uk/oucl/work/jeremy.gibbons/publications/hodgp.pdf
)
An unHTMLified version of this document is available at
http://www.sigfpe.com/Computing/fold.lhs
which you can run in hugs or ghc.
INTRODUCTION
------------
Many years ago I tried to write some code for a finite state automaton. I
wanted to write a function that took as input a state label and input
and then returned a new state label. But then I wondered what the
point of the state label was. Couldn't I simply write a function that
took as argument some input and then returned another function of the
same type. In other words I wanted a type, X, such that X = A -> X.
Anyway, I decided that maybe I was being a bit perverse and ignored this
for many years. But then I tried playing around with generalisations of
fold and unfold and I was amazed to find this type reappearing in a very
natural way. What's more, it turns out that the function to convert a
state machine that uses state labels into one that makes no reference
to labels is actually a special case of the unfold operation.
F-ALGEBRAS
----------
To start at the beginning...
I was looking at problem 2.2.4.2 in Pierce's Basic Category Theory
for computer scientists. I constructed the required inverse but the
level of abstraction was getting me a bit confused and I was unable to
prove that it was the inverse. So I cheated and had a look at Wadler's
paper here:
http://homepages.inf.ed.ac.uk/wadler/papers/free-rectypes/free-rectypes.txt
I then decided that I ought to actually work out the details for some
examples and was surprised by what I found.
Given a functor F in a category C, an F-Algebra is an arrow
fin : F(X) -> X for some X. We'll also write an F-algebra
as (X,fin) to make the dependence on the object of the
underlying category clear.
We can draw an F-algebra as:
F(X)
|
| fin
V
X
We can define this in Haskell as follows:
class (Functor f) => Algebra f x where
fin :: (f x) -> x
(We use multiparameter classes so run this with
-fglasgow-exts in ghc or -98 in hugs.)
The F-Algebras form a category where the arrows are arrows
of C, h : X -> X, so that the following diagram commutes:
F(h)
F(X) --> F(Y)
| |
| |
V h V
X --> Y
Let's consider a simple example, F = Maybe. Remember that the Maybe
functor is defined by Maybe x = Nothing | Just x. An object of type
Maybe x is either the distinguished object 'Nothing' or an x.
We can write this as F(X) = 1+X where + is disjoint union and 1
is a terminal object (eg. a singleton set). An F-Algebra is
therefore a function fin : Maybe X -> X.
So when we apply fin to an element of F(X) we need to consider two
cases: the posibility that it's acting on Nothing, and the
possibility that it's acting on Just x. In the former case
fin(Nothing) is simply an element of X. Call it g. In the latter
case it's essentially a function X -> X which we call h. So
choosing a function Maybe X -> X is the same thing as choosing
an element of X and a function X -> X.
Let's define this example over the naturals:
instance Algebra Maybe Integer where
fin Nothing = 0
fin (Just a) = a+1
(Haskell doesn't give us a Natural class so we fake it with Integer.)
Notice how fin has been written as the two cases described above.
Now consider another Maybe-algebra (Y,f) with f(Nothing) = g and
f(Just x) = h(x).
Now define p : Integer -> Y by p(n) = h^n(g). (h^n means h applied
n times.) It should be clear that the following diagram
commutes:
1+N --> 1+Y
| |
|fin |f
V V
N --> Y
(Remember that a function f: X -> Y lifts to the function
f' : 1+X -> 1+Y where f(Nothing) = Nothing and
f(Just x) = Just f(x). See the definition of fmap for
Maybe in the Prelude.)
Let's make another Maybe-algebra:
instance Algebra Maybe Float where
fin Nothing = 1.0
fin (Just x) = 2.0*x
In this case we have
p :: Integer -> Float
p n = 1.0*2^n
We claim p is a Maybe-algebra arrow so we should compare
p . fin and fin . (fmap p) on a few values:
ex0 = map (fin . (fmap p)) [Nothing,Just 0,Just 1,Just 16]
ex1 = map (p . fin) [Nothing,Just 0,Just 1,Just 16]
You should see that ex0 = equals ex1.
So we have exhibited an F-Algebra arrow from the
F-Algebra (N,fin) to (Y,f). What's more, there is only one
possible F-Algebra arrow from (N,s) to (Y,f). As our construction
was completely general it applies to an (Y,f). This means that
for every F-Algebra there is a unique F-Algebra arrow to it
from (N,fin). In other words, (N,fin) is an initial object
in the category of F-Algebras. Suppose, for some F, we
have an initial F-Algebra (X,fin). Then for any F-Algebra
(Y,f) the unique map h making the following diagram commute
is clearly a function of f.
F(h)
F(X) --> F(Y)
| |
fin| | f
V h V
X --> X h = fold f
This function is called 'fold'. We may now define:
class (Algebra f x) => Initial f x | x -> f where
fold :: (f y -> y) -> (x -> y)
We have an arrow fin : F(N) -> N. As F is a functor we also have an arrow
F(fin) : F(F(N)) -> F(N). This is also an F-Algebra. So now consider
fold(F(fin)). This must map N -> F(N). It's not hard to show that this is
the inverse of fin. In effect it subtracts 1 but if we try to subtract 1
from 0 we get Nothing. It can be shown that in general fold(F(fin)) is the
inverse of fin for initial algebras. So initial algebras essentially give
a solution to the equation X = F(X) up to isomorphism. That fold(F(fin))
is the inverse to fout is essentially "Lambek's lemma". We may now
complete our definition of Initial by writing all this in Haskell as:
fout :: x -> (f x)
fout = fold (fmap fin)
fout is the inverse of fin. Haskell constructs fout for us
automatically using fold.
instance Initial Maybe Integer where
fold f 0 = f Nothing
fold f (n+1) = f $ Just $ fold f n
We can now demonstrate that fin is the inverse of fout by
applying it to an integer:
--ex2 = fin $ fout $ (4::Integer)
ex2 = map (fin . fout) [0,1,16 :: Integer]
ex3 = map (fout . fin) [Nothing,Just 0,Just 1,Just 16 :: Maybe Integer]
F-COALGEBRAS
------------
We can now start dualising our constructions to see what
new objects appear.
We start by defining F-coalgebras:
class (Functor f) => Coalgebra f x where
outf :: x -> f x
This time a F-coalgebra is an arrow f : X -> F(X) and
we'll write a coalgebra as (X,f) when we wish to show
the dependence on X. Arrows between F-coalgebras are
defined in the obvious way. We now consider, not
the initial objects in the category of F-coalgebras but
final objects. This time we have a unique arrow from any
F-coalgebra to the final F-coalgebra. If the F-coalgebra
is (Y,f) then the unique map from (Y,f) to the final
F-coalgebra is specified by a function we call 'unfold'.
Here is the complete signature for a final F-coalgebra.
class (Coalgebra f x) => Final f x | x -> f where
unfold :: (y -> f y) -> (y -> x)
Again we can construct the inverse to outf, this time using
unfold:
inf :: (f x) -> x
inf = unfold (fmap outf)
Time for an example. Define the following function on the
natural numbers.
collatz :: Integer -> Maybe Integer
collatz 1 = Nothing
collatz n
| even n = Just $ n `div` 2
| odd n = Just $ 3*n+1
This defines a Maybe-coalgebra
instance Coalgebra Maybe Integer where
outf n = collatz n
The famous Collatz conjecture states that if we repeatedly apply
collatz to any natural we eventually hit 1. Suppose
the Collatz conjecture is true. Then we can ask how many times
we need to apply collatz to a number before we reach 1.
Define:
count f x = case f x of
Nothing -> 0
Just x -> 1+count f x
This should also give a clue why we used Maybe Natural instead
of just Natural. We want to count the number of times we need
to iterate the function before we hit a chosen value, not just
zero. We can do this by saying our function maps our chosen
value to Nothing. So conveniently the function f serves to both
describe the iteration and say where it ends.
There is a catch however. If the Collatz conjecture is false
then c collatz x might not terminate. So the answer might be
'infinite'. So strictly speaking, c should map into the naturals
extended by infinity:
data Count = Infinity | Natural Integer deriving (Show)
plusOne (Natural n) = Natural (1+n)
plusOne Infinity = Infinity
The Count type is a Maybe-coagebra:
instance Coalgebra Maybe Count where
outf (Natural 0) = Nothing
outf (Natural (x+1)) = Just $ Natural x
outf Infinity = Just Infinity
So if we adjust count f so it is of type Count, instead of integer,
then it gives a Maybe-coalgebra arrow. It is the unique such arrow to
Count. We used nothing special about the collatz function so we have,
constructed unfold. We can now implement Count as a final Maybe-coalgebra:
instance Final Maybe Count where
unfold f x = case f x of
Nothing -> Natural 0
(Just x) -> plusOne $ unfold f x
Let's show that inf . outf is the identity:
ex4 = map (inf . outf) [Natural 0,Natural 1,Natural 17,Infinity]
You may notice a catch: it fails on the last element of the list.
unfold returns Infinity if repeated iteration of our function never
hits Nothing. But there's no way we can know this simply by iterating it.
So, mathematically speaking, inf is the inverse of outf. But we can't
always compute it.
LISTS AND TREES
---------------
At this point you're surely wondering why these functions are
called fold and unfold.
Well let's define a functor F:
data F a = One | Pair Integer a deriving (Eq,Show)
F a is essentially Maybe (Integer,a)
We need a function to extract data back out of F:
F is a functor because we can define:
instance Functor F where
fmap f One = One
fmap f (Pair a b) = Pair a (f b)
This satisfies the usual rules for functors.
It shouldn't be hard to see that ([Integer],fin), with fin
defined below, is actually an F-algebra:
instance Algebra F [Integer] where
fin One = []
fin (Pair a b) = a:b
(One slight proviso, we're really talking about
finite lists.)
fin glues together an Integer and a list of Integers to make
another list. It also gives a way to make an empty list. So as
before, we're using fin to encode a pair of things, a singled out
element [] and a function (:).
Suppose we have an element, g, of X, and a function
f: X -> Integer -> X. Then we can map any list of integers
to X using the standard prelude function foldr. We get
foldr f g : [Integer] -> X. In fact, if we view a list like
[1,2,3] as being a composition of functions:
(:) 1 ((:) 2 ((:) 3 [])) we get fold f g [1,2,3] be replacing
(:) with f and [] with g. It should now be easy to see that
([Integer],fin) is an initial F-algebra and that fold and foldr
are essentially the same thing:
instance Initial F [Integer] where
-- unpack f to be in a suitable form for foldr
fold f = foldr (cons f) (f One) where
cons f a b = f $ Pair a b
As usual we can check the isomorphism:
ex5 = fin $ fout [1::Integer,2,3,4,5]
Many papers discuss unfold. We'll call it unfoldr here because it
pairs naturally with foldr.
unfoldr p f g x = if p x then []
else f x : unfoldr p f g (g x)
g is the handle on a machine. Each time we crank the
handle g maps an object to a new object and we collect
an output from this machine by applying f to this object.
We collect up all of the outputs into a list until the
function p tells us to stop. For example
ex6 = unfoldr (>20) id (2*) 1
This returns the list of powers of 2 that don't exceed 20. We
crank the (2*) handle until the condition is satisfied.
unfoldr is essentially the unfold operation for lists (except
that this time we *are* allowing infinite lists - for example
what you get if you have p _ = False.
instance Coalgebra F [Integer] where
outf [] = One
outf (a:b) = Pair a b
instance Final F [Integer] where
-- unpack f to be in a suitable form for unfoldr
unfold f = unfoldr (isOne . f) (fst . pair . f) (snd . pair . f) where
isOne One = True
isOne _ = False
ex7 = inf $ outf [1::Integer,2,3,4,5]
We can now carry out a similar analysis for trees instead of lists.
Trees are G-algebras for the functor G:
data G a = One' | Triple Integer a a deriving (Show)
instance Functor G where
fmap f One' = One'
fmap f (Triple a b b') = Triple a (f b) (f b')
data Tree a = Empty | Tree a (Tree a) (Tree a) deriving (Show)
instance Algebra G (Tree Integer) where
fin One' = Empty
fin (Triple a b b') = Tree a b b'
instance Initial G (Tree Integer) where
fold f Empty = f One'
fold f (Tree a b b') = f (Triple a (fold f b) (fold f b'))
Just as fold can collapse down a list by applying a function
between successive elements, fold collapses down a tree in
just the same way.
Here's our usual example of the isomorphism at work:
ex8 = fin $ fout $ Tree
(6::Integer) (Tree 2 Empty Empty) (Tree 7 Empty (Tree 0 Empty Empty))
Now we get to the interesting example.
AUTOMATA
--------
Consider the functor H
data H x = H (Char -> Maybe x)
instance Functor H where
fmap f (H g) = H (fmap f . g)
For convenience we define unH:
An H-coalgebra is a function X -> H(X). In essence it's a function
X -> Char -> Maybe X. Forget about the Maybe for the moment. We have
a function f : X -> Char -> X. This is essentially an automaton with X
labelling the states and the function f defining the transitions.
This is the kind of regular expression you use to recognise strings of
characters. But we need one last thing, a way to mark that a string has
been recognised. This is why we use Maybe. A value of Nothing, instead of
a state, means the string has been matched. (This is known as a partial
automaton as it grinds to a halt at this point.)
Here's an example:
f :: Integer -> Char -> Maybe Integer
f 0 'c' = Just 1
f 0 _ = Just 0
f 1 'a' = Just 2
f 1 _ = Just 0
f 2 't' = Nothing
f 2 _ = Just 0
This function can be used to recognise the string "cat" when it appears
as a substring of another string. In fact, we can define
runMachine f state "" = False
runMachine f state (c:cs) = case f state c of
Nothing -> True
(Just state') -> runMachine f state' cs
ex9 = map (runMachine f 0) ["cat","dog","wildcats"]
The Integer labels are now safely locked up inside (runMachine f 0)
but the price we pay for that is that we can no longer refer to states.
We've managed to hide the labels at the expense of completely hiding
away the states. What we want is to manipulate staes without bothering
with labels.
So now we construct the H-coalgebra. I call this a Hypertree as it has
some similarities to the tree example above. In a sense a Hypertree is
a tree with one child for each element of X (though using Maybe means
that the child "may be" a dead end with no further children.)
data Hypertree x = HTree (x -> Maybe (Hypertree x))
Note that I didn't just pluck that definition out of nowhere.
I've explicitly defined something that looks like it ought to give a
solution to H(X) = X.
instance Coalgebra H (Hypertree Char) where
outf (HTree f) = H f
instance Final H (Hypertree Char) where
unfold f x = HTree (\c -> case unH (f x) c of
Nothing -> Nothing
Just n -> Just (unfold f n)
)
run :: Hypertree Char -> [Char] -> Bool
run _ "" = False
run (HTree f) (c:cs) = case f c of
Nothing -> True
Just x -> run x cs
This time we can write a version of runMachine that makes no reference
to the fact that our states were labelled with integers.
So here goes. We'll take the function f and turn in into a Hypertree
using unfold:
catDetector = unfold (H . f) 0 :: Hypertree Char
No reference to Integer anywhere in the type of catDetector. And yet
the catDetector itself represents an automaton state and we can obtain
more states from it. We have succeeded in hiding the labels away where
nobody can get at them. (There must be a connection with existential
types here.)
Let's see if it works:
ex10 = map (run catDetector) ["cat","dog","wildcat"]
And now we can test Lambek's lemma:
catDetector' = inf $ outf $ catDetector
(At this point I have no idea how inf . outf actually works...)
ex11 = map (run catDetector') ["cat","dog","wildcat"]
So there we have it. A common class for all partial automata (that accept
Char inputs) and a function, unfold, that converts a description of such
an automaton into an actual automaton.
EPILOGUE
--------
Can you see where this is going? I'll let you fill in the details below...
data Command = Halt | Push Integer | Pop | Add | Swap | Print
interpret Halt stack = Nothing
interpret (Push n) stack = Just (n:stack,"")
interpret Pop (_:stack) = Just (stack,"")
interpret Add (m:n:stack) = Just ((m+n):stack,"")
interpret Swap (m:n:stack) = Just (n:m:stack,"")
interpret Print (m:stack) = Just (stack,show m)
data Interface x = I (Command -> Maybe (x,String))
data Executable = E (Command -> Maybe (Executable,String))
instance Functor Interface
instance Coalgebra Interface Executable
instance Final Interface Executable
...but see http://www.cs.swan.ac.uk/~csmichel/articles/finco_revised.pdf
for the paper that inspired me to start on this last example.
I have to add - at first I thought I'd found something new but it turns
out that this is the very first baby step in a whole subfield of computer
science. The book "Vicious Circles" by Barwise and Moss also discusses
this in a slightly less computational context.
APPENDIX
--------
All of the examples in one big list. (And we may as well exploit
extensions while we have them.)
data AnyExample = forall a.Show a => AnyExample a
instance Show AnyExample where
show (AnyExample a) = show a
ex = [
AnyExample ex0,
AnyExample ex1,
AnyExample ex2,
AnyExample ex3,
AnyExample (take 3 ex4), -- because of that Infinity
AnyExample ex5,
AnyExample ex6,
AnyExample ex7,
AnyExample ex8,
AnyExample ex9,
AnyExample ex10,
AnyExample ex11
]
```