Write a Forth in Haskell: Part 03
Up until now, the execution of the interpreter was fairly straightforward.
We could fold over a string of instructions, applying our
process
function on each token.
initStack :: Stack
processBuffer :: Text -> Stack
processBuffer buffer = foldl' process initStack (words buffer)
│ │ └─── Input
│ └─── Initial state
└─── Business logic
But something isn’t quite right. Yes, we have implemented some stack operations, but there is a world beyond those.
In this part and the next, we will refactor the token processor from parts one and two, into an interpreter allowing operations on state, as well as IO capabilities.
Having a Monad Stack
Operating in a Monad is pretty awesome. You get access to cool functions like get
and put
if you are in the
State Monad
, or even proper error
management mechanisms that do not rely on runtime exceptions, in the case of the Error Monad
But what is better than one monad? Several of them. Hence the expression “Monad Stack”. You stack them on top of each-other, so you can enjoy their
capabilities in the same place.
In our case, we are interested in two things: state management, and IO. And by IO I mean “being able to output strings”, because right now we don’t
need more.
But Monad Stacks can sometimes be inconvenient. In our case, our stack would look like that:
StateT [Text] IO
And while there is nothing inherently wrong with this, the order in which we write the components of this stack matters, so when refactoring our program, we would have to change the order in which they appear, as it would have an impact on how we would access those monad’s features.
That is why we are going to use the MTL style of writing monad stacks.
MTL to the rescue!
MTL is the “Monad Transformers Library”. Sitting on top of its sibling transformers
,
it provides typeclasses that we can use to declare the shape of our monad stack, without having to think about the order in which we write its type signature.
As you (probably) know, a program that runs in IO while returning nothing has the following type signature:
returningNothing :: IO ()
Now, with the MTL style, our type signature would look like that:
returningNothing :: (MonadIO m) => m ()
Nothing really changes in terms of execution, but now we gain some control over the nature of the monad in which we operate.
Combining State and IO
I will go straight to the point : This will be no more complicated than
MonadState AppState m, MonadIO m
We combine the State Monad (containing our application state), and the IO Monad.
This enables us to write such functions:
add :: (MonadIO m, MonadState AppState m) => m ()
add = do
element1 <- pop
element2 <- pop
push (element1 + element2)
(Don’t mind the function body for now, we still have some stuff to cover before attaining this level of care-free stack programming)
Operating in a combination of MonadState
and MonadIO
allows us to use IO when needed for side operations like on-disk logging if you want to
maintain a history of the operations, or even launching nukes on the side.
Building our state
Our AppState
will be quite basic, and will grow as we add more features to the interpreter:
data AppState = AppState
{ buffer :: [Text]
, stack :: Stack
} deriving stock (Show)
buffer
will contain the forth instructions that are given to the interpreter, and stack
will be the stack we’re operating on.
We can put this definitions and the other ones in a separates Types.hs
files:
-- src/Types.hs
module Types where
import Control.Monad.State.Strict (MonadState)
import Control.Monad.State.Strict qualified as State
import Data.List (List)
import Data.List qualified as List
import Control.Monad.IO.Class (MonadIO)
newtype Stack = Stack {getStack :: [Integer]}
deriving newtype (Show, Eq)
data AppState = AppState
{ buffer :: [Text]
, stack :: Stack
} deriving stock (Show)
initStack :: Stack
initStack = Stack []
Stack operations
Now that we have the state available from the functions that require it, it has become possible for us to query it and modify it in a cleaner way.
So, let us rework the functions we have implemented in part 01,
starting with add
and sub
. In the classical Forth fashion, the addition is composed of two pop
s and a push
.
A bit like this, actually:
add :: (MonadState AppState m) => m ()
add = do
element1 <- pop
element2 <- pop
push (element1 + element2)
sub
is also much more straightforward:
sub :: (MonadState AppState m) => m ()
sub = do
element1 <- pop
element2 <- pop
push (element2 - element1)
And that’s it. But what of dup
, drop
and all the friends we made along the way?
Well they’re coming with us.
dup :: (MonadIO m, MonadState AppState m) => m ()
dup = do
element <- pop
push element
push element
drop :: (MonadIO m, MonadState AppState m) => m ()
drop = do
_ <- pop
pure ()
swap :: (MonadIO m, MonadState AppState m) => m ()
swap = do
element1 <- pop
element2 <- pop
push element1
push element2
over :: (MonadIO m, MonadState AppState m) => m ()
over = do
element1 <- pop
element2 <- pop
push element2
push element1
push element2
rot :: (MonadIO m, MonadState AppState m) => m ()
rot = do
element1 <- pop
element2 <- pop
element3 <- pop
push element2
push element1
push element3
Let’s get poppin'!
Finally. Clean code to express our operations on the stack.
Gone are the size checks on the stack and the manual splitting!
You can sweep all the nasty, list-manipulating code we wrote earlier under
the rug, because this Dark Age is over. Or is it?
We cannot entirely get rid of “low-level” operations on Lists. As Fred Hébert says so well, complexity has to live somwhere.
We can’t forget it’s there but we can manage to centralise it, and this is exactly
what we are going to do. In this new version, push
will make us of the
modify
function from
MonadState. We pass it a callback whose first argument is
the current state and returns the new state.
To avoid burdening the reading of the function, a new helper is defined:
addToStack :: Integer -> Stack -> Stack
addToStack element (Stack stack) = Stack $ element : stack
and we have push
:
push :: (MonadState AppState m) => Integer -> m ()
push newElement =
State.modify' (\state ->
let newStack = addToStack newElement (stack state)
in state {stack = newStack})
Its companion pop
does a bit more, though. This is the place where
bound checking happens, and where error
may be used:
pop :: (MonadState AppState m) => m Integer
pop = do
stack' <- gets stack
if (checkSize 1 stack')
then do
let (hd, tl) = List.splitAt 1 (getStack stack')
updateStack (Stack tl)
pure $ List.head hd
else
error "Stack underflow!"
It could have been written in a more “compositional” style, but control flow is
the main point of this function, so the if/then/else
construct is appropriate here.
To ease reading, two helpers are defined:
updateStack :: (MonadState AppState m) => Stack -> m ()
updateStack newStack = do
oldState <- get
put $ oldState{stack=newStack}
Its name is quite explicit: We pass in a Stack
, and the function will update our state with it.
checkSize :: Int -> Stack -> Bool
checkSize requiredSize stack =
(length $ getStack stack) >= requiredSize
This provides a predicate that checks whether or not our stack is full enough to allow operand to be applied.
Outputting elements
Empowered, and emboldened, by this addition to our collection, we can use our ability to do IO in controlled places to implement three new functions:
period
<- outputs the top of the stack on stdoutemit
<- outputs the ASCII character for the element on top of the stackcr
<- outputs a newline
Like what we wrote before, we are able to express those operations with the building blocks of stack manipulation.
period :: (MonadIO m, MonadState AppState m) => m ()
period = do
element <- pop
putStr $ (show element) <> " " -- Yes, we *do* want to add a space after each element
emit
is less straightforward since you need to know how to convert from an integer
to an ASCII char. Thankfully,
Data.Char.chr
has
our back. We can convert the ASCII code to a character.
However, chr
takes an Int
, and we have a List full of… Integer
s! Which
are not the same thing! Whereas Integer
is an arbitrary-precision number
(also called “bigint” in some other languages), Int
is a signed 64-bit integer.
There must be a conversion step from one to the other. And there is.
fromInteger :: Integer -> a
The documentation says “Conversion from an Integer”, so that must be it.
Let’s try it:
ghci> fromInteger (3 :: Integer) :: Int
3
it :: Int
And that is indeed what we want.
But beware, sized integers like Int
will “loop around” if you reach their limit, which is why
converting from arbitrarily big numbers to sized-constrained numbers can be a lossy operation:
ghci> fromInteger (9223372036854775802343224 :: Integer) :: Int
-5656776
it :: Int
Here is how we are going to do it:
- We get the top of the stack, with
pop
, and inline the conversion to anInt
withfromInteger
and(<$>)
, which is the infix version offmap
. - Then, in just one step, we convert the number into an ASCII character,
and print it without adding a newline. This requires a bit of plumbing
due to the fact that we are operating in a Monad
m
that is constrained onMonadIO
, and not directly onIO
. Hence the usage ofliftIO
to achieve such as thing:
emit :: (MonadIO m, MonadState AppState m) => m ()
emit = do
element <- fromInteger <$> pop
liftIO $ putChar (chr element)
And finally, cr
, short for carriage return,
shall be implemented as such:
cr :: (MonadIO m) => m ()
cr = putStr "\n"