Datatype-generic programming with Bifunctor 

Moving on the section 3.6 in Datatype-Generic Programming called “Datatype genericity.” Gibbons tries to call this Origami programming, but I don’t think the name stuck, so we’ll go with datatype-generic programming.

As we have already argued, data structure determines program structure. It therefore makes sense to abstract from the determining shape, leaving only what programs of different shape have in common. What datatypes such as List and Tree have in common is the fact that they are recursive — which is to say, a datatype Fix

data Fix s a = In {out :: s a (Fix s a)}

Here are three instances of Fix using different shapes: lists and internally labelled binary trees as seen before, and also a datatype of externally labelled binary trees.

data ListF a b = NilF | ConsF a b
type List a = Fix ListF a
data TreeF a b = EmptyF | NodeF a b b
type Tree a = Fix TreeF a
data BtreeF a b = TipF a | BinF b b
type Btree a = Fix BtreeF a

From Why free monads matter we saw on day 8 we actually know this is similar as Free datatype, but the semantics around Functor etc is going to different, so let’s implement it from scratch:

scala> :paste
// Entering paste mode (ctrl-D to finish)
sealed abstract class Fix[S[_], A] extends Serializable {
  def out: S[Fix[S, A]]
}
object Fix {
  case class In[S[_], A](out: S[Fix[S, A]]) extends Fix[S, A]
}

// Exiting paste mode, now interpreting.

defined class Fix
defined object Fix

Following Free, I am putting S[_] on the left, and A on the right.

Let’s try implementing the List first.

scala> :paste
// Entering paste mode (ctrl-D to finish)
sealed trait ListF[+Next, +A]
object ListF {
  case class NilF() extends ListF[Nothing, Nothing]
  case class ConsF[A, Next](a: A, n: Next) extends ListF[Next, A]
}
type GenericList[A] = Fix[ListF[+?, A], A]
object GenericList {
  def nil[A]: GenericList[A] = Fix.In[ListF[+?, A], A](ListF.NilF())
  def cons[A](a: A, xs: GenericList[A]): GenericList[A] =
    Fix.In[ListF[+?, A], A](ListF.ConsF(a, xs))
}

// Exiting paste mode, now interpreting.

defined trait ListF
defined object ListF
defined type alias GenericList
defined object GenericList

scala> import GenericList.{ cons, nil }
import GenericList.{cons, nil}

Here’s how we can use it:

scala> cons(1, nil)
res0: GenericList[Int] = In(ConsF(1,In(NilF())))

So far this is similar to what we saw with the free monad.

Bifunctor 

Not all valid binary type constructors s are suitable for Fixing; for example, function types with the parameter appearing in contravariant (source) positions cause problems. It turns out that we should restrict attention to (covariant) bifunctors, which support a bimap operation ‘locating’ all the elements.

Cats ships with Bifunctor:

/**
 * A typeclass of types which give rise to two independent, covariant
 * functors.
 */
trait Bifunctor[F[_, _]] extends Serializable { self =>

  /**
   * The quintessential method of the Bifunctor trait, it applies a
   * function to each "side" of the bifunctor.
   */
  def bimap[A, B, C, D](fab: F[A, B])(f: A => C, g: B => D): F[C, D]

  ....
}

Here is the Bifunctor instance for GenericList.

scala> import cats._, cats.data._, cats.implicits._
import cats._
import cats.data._
import cats.implicits._

scala> import cats.functor.Bifunctor
import cats.functor.Bifunctor

scala> :paste
// Entering paste mode (ctrl-D to finish)
implicit val listFBifunctor: Bifunctor[ListF] = new Bifunctor[ListF] {
  def bimap[S1, A1, S2, A2](fab: ListF[S1, A1])(f: S1 => S2, g: A1 => A2): ListF[S2, A2] =
    fab match {
      case ListF.NilF()         => ListF.NilF()
      case ListF.ConsF(a, next) => ListF.ConsF(g(a), f(next))
    }
}

// Exiting paste mode, now interpreting.

listFBifunctor: cats.functor.Bifunctor[ListF] = $anon$1@7a453673

Deriving map from Bifunctor 

It turns out that the class Bifunctor provides sufficient flexibility to capture a wide variety of recursion patterns as datatype-generic programs.

First, we can implement map in terms of bimap.

scala> :paste
// Entering paste mode (ctrl-D to finish)
object DGP {
  def map[F[_, _]: Bifunctor, A1, A2](fa: Fix[F[?, A1], A1])(f: A1 => A2): Fix[F[?, A2], A2] =
    Fix.In[F[?, A2], A2](Bifunctor[F].bimap(fa.out)(map(_)(f), f))
}

// Exiting paste mode, now interpreting.

defined object DGP

scala> DGP.map(cons(1, nil)) { _ + 1 }
res1: Fix[[α]ListF[α,Int],Int] = In(ConsF(2,In(NilF())))

The above definition of map is independent from GenericList, abstracted by Bifunctor and Fix. Another way of looking at it is that we can get Functor for free out of Bifunctor and Fix.

scala> :paste
// Entering paste mode (ctrl-D to finish)
trait FixInstances {
  implicit def fixFunctor[F[_, _]: Bifunctor]: Functor[Lambda[L => Fix[F[?, L], L]]] =
    new Functor[Lambda[L => Fix[F[?, L], L]]] {
      def map[A1, A2](fa: Fix[F[?, A1], A1])(f: A1 => A2): Fix[F[?, A2], A2] =
        Fix.In[F[?, A2], A2](Bifunctor[F].bimap(fa.out)(map(_)(f), f))
    }
}

// Exiting paste mode, now interpreting.

defined trait FixInstances

scala> {
  val instances = new FixInstances {}
  import instances._
  import cats.syntax.functor._
  cons(1, nil) map { _ + 1 }
}
res2: GenericList[Int] = In(ConsF(2,In(NilF())))

Intense amount of type lambdas, but I think it’s clear that I translated DB.map into a Functor instance.

Deriving fold from Bifunctor 

We can also implement fold, also known as cata from catamorphism:

scala> :paste
// Entering paste mode (ctrl-D to finish)
object DGP {
  // catamorphism
  def fold[F[_, _]: Bifunctor, A1, A2](fa: Fix[F[?, A1], A1])(f: F[A2, A1] => A2): A2 =
    {
      val g = (fa1: F[Fix[F[?, A1], A1], A1]) =>
        Bifunctor[F].leftMap(fa1) { (fold(_)(f)) }
      f(g(fa.out))
    }
}

// Exiting paste mode, now interpreting.

defined object DGP

scala> DGP.fold[ListF, Int, Int](cons(2, cons(1, nil))) {
         case ListF.NilF()      => 0
         case ListF.ConsF(x, n) => x + n
       }
res3: Int = 3

Deriving unfold from Bifunctor 

The unfold operator for a datatype grows a data structure from a value. In a precise technical sense, it is the dual of the fold operator.

The unfold is also called ana from anamorphism:

scala> :paste
// Entering paste mode (ctrl-D to finish)
object DGP {
  // catamorphism
  def fold[F[_, _]: Bifunctor, A1, A2](fa: Fix[F[?, A1], A1])(f: F[A2, A1] => A2): A2 =
    {
      val g = (fa1: F[Fix[F[?, A1], A1], A1]) =>
        Bifunctor[F].leftMap(fa1) { (fold(_)(f)) }
      f(g(fa.out))
    }
  // anamorphism
  def unfold[F[_, _]: Bifunctor, A1, A2](x: A2)(f: A2 => F[A2, A1]): Fix[F[?, A1], A1] =
    Fix.In[F[?, A1], A1](Bifunctor[F].leftMap(f(x))(unfold[F, A1, A2](_)(f)))
}

// Exiting paste mode, now interpreting.

defined object DGP

Here’s how we can construct list of numbers counting down:

scala> def pred(n: Int): GenericList[Int] =
         DGP.unfold[ListF, Int, Int](n) {
           case 0 => ListF.NilF()
           case n => ListF.ConsF(n, n - 1)
         }
pred: (n: Int)GenericList[Int]

scala> pred(4)
res4: GenericList[Int] = In(ConsF(4,In(ConsF(3,In(ConsF(2,In(ConsF(1,In(NilF())))))))))

There are several more we can derive, too.

Tree 

The point of the datatype-generic programming is to abstract out the shape. Let’s introduce some other datatype, like a binary Tree:

scala> :paste
// Entering paste mode (ctrl-D to finish)
sealed trait TreeF[+Next, +A]
object TreeF {
  case class EmptyF() extends TreeF[Nothing, Nothing]
  case class NodeF[Next, A](a: A, left: Next, right: Next) extends TreeF[Next, A]
}
type Tree[A] = Fix[TreeF[?, A], A]
object Tree {
  def empty[A]: Tree[A] =
    Fix.In[TreeF[+?, A], A](TreeF.EmptyF())
  def node[A, Next](a: A, left: Tree[A], right: Tree[A]): Tree[A] =
    Fix.In[TreeF[+?, A], A](TreeF.NodeF(a, left, right))
}

// Exiting paste mode, now interpreting.

defined trait TreeF
defined object TreeF
defined type alias Tree
defined object Tree

Here’s how to create this tree:

scala> import Tree.{empty,node}
import Tree.{empty, node}

scala> node(2, node(1, empty, empty), empty)
res5: Tree[Int] = In(NodeF(2,In(NodeF(1,In(EmptyF()),In(EmptyF()))),In(EmptyF())))

Now, all we have to do should be to define a Bifunctor instance:

scala> :paste
// Entering paste mode (ctrl-D to finish)
implicit val treeFBifunctor: Bifunctor[TreeF] = new Bifunctor[TreeF] {
  def bimap[A, B, C, D](fab: TreeF[A, B])(f: A => C, g: B => D): TreeF[C, D] =
    fab match {
      case TreeF.EmptyF() => TreeF.EmptyF()
      case TreeF.NodeF(a, left, right) =>
        TreeF.NodeF(g(a), f(left), f(right))
    }
}

// Exiting paste mode, now interpreting.

treeFBifunctor: cats.functor.Bifunctor[TreeF] = $anon$1@60e4a628

First, let’s try Functor:

scala> {
  val instances = new FixInstances {}
  import instances._
  import cats.syntax.functor._
  node(2, node(1, empty, empty), empty) map { _ + 1 }
}
res6: Tree[Int] = In(NodeF(3,In(NodeF(2,In(EmptyF()),In(EmptyF()))),In(EmptyF())))

Looking good. Next, let’s try folding.

scala> def sum(tree: Tree[Int]): Int =
         DGP.fold[TreeF, Int, Int](tree) {
           case TreeF.EmptyF()       => 0
           case TreeF.NodeF(a, l, r) => a + l + r
         }
sum: (tree: Tree[Int])Int

scala> sum(node(2, node(1, empty, empty), empty))
res7: Int = 3

We got the fold.

Here’s a function named grow that generates a binary search tree from a list.

scala> def grow[A: PartialOrder](xs: List[A]): Tree[A] =
          DGP.unfold[TreeF, A, List[A]](xs) {
            case Nil => TreeF.EmptyF()
            case x :: xs =>
              import cats.syntax.partialOrder._
              TreeF.NodeF(x, xs filter {_ <= x}, xs filter {_ > x})
          }
grow: [A](xs: List[A])(implicit evidence$1: cats.PartialOrder[A])Tree[A]

scala> grow(List(3, 1, 4, 2))
res8: Tree[Int] = In(NodeF(3,In(NodeF(1,In(EmptyF()),In(NodeF(2,In(EmptyF()),In(EmptyF()))))),In(NodeF(4,In(EmptyF()),In(EmptyF())))))

Looks like unfold is working too.

For more details on DGP in Scala, Oliveira and Gibbons wrote a paper translating the ideas and many more called Scala for Generic Programmers (2008), and its updated version Scala for Generic Programmers (2010).

Datatype-generic programming described here has a limitation of datatype being recursive, so I don’t think it’s very usable as is, but there are some interesting concepts that can be exploited.

The Origami patterns 

Next, Gibbons claims that the design patterns are evidence of a “evidence of a lack of expressivity in those mainstream programming languages.” He then sets out to replace the patterns using higher-order datatype-generic programming.