Cats is a library which provides abstractions for functional programming in the Scala programming language. The name is a playful shortening of the word category.
When you google functional programming, you can find many definitions. Most have different nuances, but purity is always present.
Functional programming is a way of writing software applications using only pure functions and immutable values.
This is often described in relation to inputs, outputs, mutability, and the outside world. In short, a pure function’s output should only depend on its input. Furthermore, the function shouldn’t affect or be affected by the outside world.
def sum(a: Int, b: Int): Int = a + b
assert(sum(1, 2) == 3)
assert(sum(1, 2) == 3)
...
Based on our definition, the following function isn’t pure.
def randomInt(): Int = {
scala.util.Random.nextInt()
}
randomInt
takes no inputs. It should always return the same value. The unpredictable output makes it hard to reproduce behavior, e.g. tests, debugging.
To make the function pure, it must take an input that will determine the output.
def randomInt(seed: Long): Int = {
val rnd = new scala.util.Random(seed)
rnd.nextInt()
}
By initializing Random
with a seed value, the result is reproducible. This makes testing this function very easy.
assert(randomInt(123L) == randomInt(123L))
assert(randomInt(123L) != randomInt(456L))
To build a bigger application, the function must be composable.
val seed = 123L
val a = randomInt(seed)
val b = randomInt(seed)
val c = randomInt(seed)
Because using the same seed always return the same value, it must change after use. To avoid mutability and impure functions, randomInt
must return the next seed with its result.
def randomInt(seed: Long): (Long, Int) = {
val rnd = new scala.util.Random(seed)
(rnd.nextLong(), rnd.nextInt())
}
val seed = 123L
val (seedA, a) = randomInt(seed)
val (seedB, b) = randomInt(seedA)
val (seedC, c) = randomInt(seedB)
A type alias can standardize the concept.
type State[A,B] = A => (A, B)
val randomInt: State[Long, Int] = { seed =>
val rnd = new scala.util.Random(seed)
(rnd.nextLong(), rnd.nextInt())
}
val randomBoolean: State[Long, Boolean] = ???
val randomChar: State[Long, Char] = ???
def randomString(length: Int): State[Long, String] = ???
This standard can be safer using a proper type.
case class State[A,B](run: A => (A, B)) {
def map[C](f: B => C): State[A, C] =
State({ a =>
val (a1, b) = run(a)
(a1, f(b))
})
def flatMap[C](f: B => State[A, C]): State[A, C] =
State({ a =>
val (a1, b) = run(a)
f(b).run(a1)
})
}
val randomInt: State[Long, Int] = State({ seed =>
val rnd = new scala.util.Random(seed)
(rnd.nextLong(), rnd.nextInt())
})
map
, and flatMap
allows State
to be used in for comprehensions. This hides the management of the seed and removes the risk of reusing an old value.
val abc = for {
a <- randomInt
b <- randomInt
c <- randomInt
} yield (a, b, c)
val seed = 123L
val (nextSeed, (a, b, c)) = abc.run(seed)
Random numbers are great, but they don’t help to highlight real-world uses cases for State
. Before jumping to the obvious “cats has it”, here is a more relatable example.
Tic-tac-toe, noughts and crosses, or Xs and Os, is a game that shouldn’t need an introduction. If your childhood was deprived of this masterpiece, or if you need a refresher, have a look at the wikipedia page.
The following information represents the game.
sealed trait Player
case object X extends Player
case object O extends Player
sealed trait Cell
case object AA extends Cell
case object AB extends Cell
case object AC extends Cell
case object BA extends Cell
case object BB extends Cell
case object BC extends Cell
case object CA extends Cell
case object CB extends Cell
case object CC extends Cell
sealed trait Outcome
case object Draw extends Outcome
case class Win(player: Player) extends Outcome
An object must record the game’s state after each turn. It can track whose turn it is, which player marked a cell, and other game-related information.
case class GameState(
player: Option[Player],
cells: Map[Cell, Player],
start: DateTime,
end: Option[DateTime],
...
)
A handful of functions must interact with the GameState
. It could be a global variable or stored in a database. This would need impure interactions. A better solution is to pass the latest GameState
as input, and expect the updated one as output. This looks a lot like State
.
// Please ignore the IO, it is hard to build an example without it
def drawCells: State[GameState, Unit]
def readCell: State[GameState, Cell]
def markCell(c: Cell): State[GameState, Unit]
def computeOutcome: State[GameState, Option[Outcome]]
A for comprehension will compose the functions. That will remove the risk of reusing an old GameState
by hiding intermediate values.
def turn: State[GameState, Outcome] =
for {
_ <- drawCells
cell <- readCell
_ <- markCell(cell)
outcome <- computeOutcome.flatMap {
case None => turn
case Some(outcome) => State((_, outcome))
}
} yield outcome
The application starts by calling turn with a new GameState
. The function calls itself until it finds an outcome. State
allows the whole flow to be pure, except for some silly user interaction.
As mentioned before, State
is available in cats. It comes with many bells, and whistles, but works very similarly to the one implemented above.
import cats.data.State
val randomInt: State[Long, Int] = State({ seed =>
val rnd = new scala.util.Random(seed)
(rnd.nextLong(), rnd.nextInt())
})
randomInt.run(123L).value
If you are encumbered with mutable variables, give State
a try. It will make your code cleaner, and safer.