Write a Forth in Haskell: Part 03

Up until now, the execution of the interpreter was fairly straightforward.
We could fold over a string of instructions, applying our process function on each token.

initStack :: Stack

processBuffer :: Text -> Stack
processBuffer buffer = foldl' process initStack (words buffer)
                                │       │          └─── Input
                                │       └─── Initial state
                                └─── Business logic

But something isn’t quite right. Yes, we have implemented some stack operations, but there is a world beyond those.

In this part and the next, we will refactor the token processor from parts one and two, into an interpreter allowing operations on state, as well as IO capabilities.

Having a Monad Stack

Operating in a Monad is pretty awesome. You get access to cool functions like get and put if you are in the State Monad, or even proper error management mechanisms that do not rely on runtime exceptions, in the case of the Error Monad But what is better than one monad? Several of them. Hence the expression “Monad Stack”. You stack them on top of each-other, so you can enjoy their capabilities in the same place. In our case, we are interested in two things: state management, and IO. And by IO I mean “being able to output strings”, because right now we don’t need more.

But Monad Stacks can sometimes be inconvenient. In our case, our stack would look like that:

StateT [Text] IO

And while there is nothing inherently wrong with this, the order in which we write the components of this stack matters, so when refactoring our program, we would have to change the order in which they appear, as it would have an impact on how we would access those monad’s features.

That is why we are going to use the MTL style of writing monad stacks.

MTL to the rescue!

MTL is the “Monad Transformers Library”. Sitting on top of its sibling transformers, it provides typeclasses that we can use to declare the shape of our monad stack, without having to think about the order in which we write its type signature.

As you (probably) know, a program that runs in IO while returning nothing has the following type signature:

returningNothing :: IO ()

Now, with the MTL style, our type signature would look like that:

returningNothing :: (MonadIO m) => m ()

Nothing really changes in terms of execution, but now we gain some control over the nature of the monad in which we operate.

Combining State and IO

I will go straight to the point : This will be no more complicated than

MonadState AppState m, MonadIO m

We combine the State Monad (containing our application state), and the IO Monad.
This enables us to write such functions:

add :: (MonadIO m, MonadState AppState m) => m ()
add = do
  element1 <- pop
  element2 <- pop
  push (element1 + element2)

(Don’t mind the function body for now, we still have some stuff to cover before attaining this level of care-free stack programming)

Operating in a combination of MonadState and MonadIO allows us to use IO when needed for side operations like on-disk logging if you want to maintain a history of the operations, or even launching nukes on the side.

Building our state

Our AppState will be quite basic, and will grow as we add more features to the interpreter:

data AppState = AppState
  { buffer :: [Text]
  , stack :: Stack
  } deriving stock (Show)

buffer will contain the forth instructions that are given to the interpreter, and stack will be the stack we’re operating on.

We can put this definitions and the other ones in a separates Types.hs files:

-- src/Types.hs
module Types where

import Control.Monad.State.Strict (MonadState)
import Control.Monad.State.Strict qualified as State
import Data.List (List)
import Data.List qualified as List
import Control.Monad.IO.Class (MonadIO)

newtype Stack = Stack {getStack :: [Integer]}
  deriving newtype (Show, Eq)

data AppState = AppState
  { buffer :: [Text]
  , stack :: Stack
  } deriving stock (Show)

initStack :: Stack
initStack = Stack []

Stack operations

Now that we have the state available from the functions that require it, it has become possible for us to query it and modify it in a cleaner way.

So, let us rework the functions we have implemented in part 01, starting with add and sub. In the classical Forth fashion, the addition is composed of two pops and a push.

A bit like this, actually:

add :: (MonadState AppState m) => m ()
add = do
  element1 <- pop
  element2 <- pop
  push (element1 + element2)

sub is also much more straightforward:

sub :: (MonadState AppState m) => m ()
sub = do
  element1 <- pop
  element2 <- pop
  push (element2 - element1)

And that’s it. But what of dup, drop and all the friends we made along the way?
Well they’re coming with us.

dup :: (MonadIO m, MonadState AppState m) => m ()
dup = do
  element <- pop
  push element
  push element

drop :: (MonadIO m, MonadState AppState m) => m ()
drop = do
  _ <- pop
  pure ()

swap :: (MonadIO m, MonadState AppState m) => m ()
swap = do
  element1 <- pop
  element2 <- pop
  push element1
  push element2

over :: (MonadIO m, MonadState AppState m) => m ()
over = do
  element1 <- pop
  element2 <- pop
  push element2
  push element1
  push element2

rot :: (MonadIO m, MonadState AppState m) => m ()
rot = do
  element1 <- pop
  element2 <- pop
  element3 <- pop
  push element2
  push element1
  push element3

Let’s get poppin'!

Finally. Clean code to express our operations on the stack.
Gone are the size checks on the stack and the manual splitting! You can sweep all the nasty, list-manipulating code we wrote earlier under the rug, because this Dark Age is over. Or is it?

We cannot entirely get rid of “low-level” operations on Lists. As Fred Hébert says so well, complexity has to live somwhere.

We can’t forget it’s there but we can manage to centralise it, and this is exactly what we are going to do. In this new version, push will make us of the modify function from MonadState. We pass it a callback whose first argument is the current state and returns the new state.

To avoid burdening the reading of the function, a new helper is defined:

addToStack :: Integer -> Stack -> Stack
addToStack element (Stack stack) = Stack $ element : stack

and we have push:

push :: (MonadState AppState m) => Integer -> m ()
push newElement =
  State.modify' (\state ->
    let newStack = addToStack newElement (stack state)
     in state {stack = newStack})

Its companion pop does a bit more, though. This is the place where bound checking happens, and where error may be used:

pop :: (MonadState AppState m) => m Integer
pop = do
  stack' <- gets stack
  if (checkSize 1 stack')
  then do
    let (hd, tl) = List.splitAt 1 (getStack stack')
    updateStack (Stack tl)
    pure $ List.head hd
  else
    error "Stack underflow!"

It could have been written in a more “compositional” style, but control flow is the main point of this function, so the if/then/else construct is appropriate here. To ease reading, two helpers are defined:

updateStack :: (MonadState AppState m) => Stack -> m ()
updateStack newStack = do
  oldState <- get
  put $ oldState{stack=newStack}

Its name is quite explicit: We pass in a Stack, and the function will update our state with it.

checkSize :: Int -> Stack -> Bool
checkSize requiredSize stack =
  (length $ getStack stack) >= requiredSize

This provides a predicate that checks whether or not our stack is full enough to allow operand to be applied.

Outputting elements

Empowered, and emboldened, by this addition to our collection, we can use our ability to do IO in controlled places to implement three new functions:

period <- outputs the top of the stack on stdout
emit <- outputs the ASCII character for the element on top of the stack
cr <- outputs a newline

Like what we wrote before, we are able to express those operations with the building blocks of stack manipulation.

period :: (MonadIO m, MonadState AppState m) => m ()
period = do
  element <- pop
  putStr $ (show element) <> " " -- Yes, we *do* want to add a space after each element

emit is less straightforward since you need to know how to convert from an integer to an ASCII char. Thankfully, Data.Char.chr has our back. We can convert the ASCII code to a character.

However, chr takes an Int, and we have a List full of… Integers! Which are not the same thing! Whereas Integer is an arbitrary-precision number (also called “bigint” in some other languages), Int is a signed 64-bit integer. There must be a conversion step from one to the other. And there is.

fromInteger :: Integer -> a

The documentation says “Conversion from an Integer”, so that must be it.

Let’s try it:

ghci> fromInteger (3 :: Integer) :: Int
3
it :: Int

And that is indeed what we want.

But beware, sized integers like Int will “loop around” if you reach their limit, which is why converting from arbitrarily big numbers to sized-constrained numbers can be a lossy operation:

ghci> fromInteger (9223372036854775802343224 :: Integer) :: Int
-5656776
it :: Int

Here is how we are going to do it:

We get the top of the stack, with pop, and inline the conversion to an Int with fromInteger and (<$>), which is the infix version of fmap.
Then, in just one step, we convert the number into an ASCII character, and print it without adding a newline. This requires a bit of plumbing due to the fact that we are operating in a Monad m that is constrained on MonadIO, and not directly on IO. Hence the usage of liftIO to achieve such as thing:

emit :: (MonadIO m, MonadState AppState m) => m ()
emit = do
  element <- fromInteger <$> pop
  liftIO $ putChar (chr element)

And finally, cr, short for carriage return, shall be implemented as such:

cr :: (MonadIO m) => m ()
cr = putStr "\n"