Here are a few questions I've been thinking about:
How should I express data or an API?
How should the data be represented in Java or Scala?
How do I convert the data into wire formats such as JSON?
How do I evolve the data without breaking binary compatibility?
limitation of case class
The sealed trait and case class are the idiomatic way to represent datatypes in Scala, but it's impossible to add fields in a binary compatible way. Take for example a simple case class Greeting, and see how it would expand into a class and a companion object:
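As a rough illustration, here is a hand-written approximation of that expansion. The single message: String field is an assumption made for this sketch, and the real compiler output contains more (Product, Serializable, and so on) than is shown here.

```scala
object GreetingExpansion {
  // Hand-written approximation of what the compiler generates for
  //   case class Greeting(message: String)
  // The `message` field is an assumption made for illustration.
  class Greeting(val message: String) {
    // The parameter list mirrors the constructor, so a new field changes it.
    def copy(message: String = this.message): Greeting = new Greeting(message)

    override def equals(other: Any): Boolean = other match {
      case that: Greeting => this.message == that.message
      case _              => false
    }
    override def hashCode: Int = message.hashCode
    override def toString: String = s"Greeting($message)"
  }

  object Greeting {
    // Factory method: its signature also follows the field list.
    def apply(message: String): Greeting = new Greeting(message)
    // Extractor: its result type encodes the number and types of the fields.
    def unapply(g: Greeting): Option[String] = Some(g.message)
  }
}
```

Adding a second field would change the signatures of the constructor, apply, and copy, as well as the result type of unapply, so code compiled against the old version would no longer link against the new one.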
If you're interested in gigahorse-github itself, README contains the full documentation.
I also wrote the Extending Gigahorse page, which gives an overview of how to write a Gigahorse plugin. This is more or less the same as how one would write a Dispatch plugin. As I wrote there, the JSON data binding is auto-generated from a schema.
For me, gigahorse-github was as much a proof of concept for sbt-datatype as it was for Gigahorse. It did end up exposing minor bugs in all components along the stack, so it was a fruitful exercise.
There's a "pattern" that I've been thinking about, which arises in some situations while persisting/serializing objects.
To motivate this, consider the following case class:
scala> case class User(name: String, parents: List[User])
defined class User
scala> val alice = User("Alice", Nil)
alice: User = User(Alice,List())
scala> val bob = User("Bob", alice :: Nil)
bob: User = User(Bob,List(User(Alice,List())))
scala> val charles = User("Charles", bob :: Nil)
charles: User = User(Charles,List(User(Bob,List(User(Alice,List())))))
scala> val users = List(alice, bob, charles)
users: List[User] = List(User(Alice,List()), User(Bob,List(User(Alice,List()))),
The important part is that it contains a parents field, which holds a list of other users.
Now let's say you want to turn the users list into JSON.
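To make the situation concrete, here is a naive sketch that renders each user with its parents nested inline. It assumes a hand-rolled string renderer (the names userToJson and usersToJson are hypothetical) rather than any particular JSON library, and it does no string escaping.

```scala
object NaiveUserJson {
  final case class User(name: String, parents: List[User])

  // Render one user, nesting every parent inline. No escaping is done,
  // so this is only suitable for simple illustrative names.
  def userToJson(u: User): String = {
    val parents = u.parents.map(userToJson).mkString("[", ",", "]")
    s"""{"name":"${u.name}","parents":$parents}"""
  }

  // Render a list of users as a JSON array.
  def usersToJson(us: List[User]): String =
    us.map(userToJson).mkString("[", ",", "]")

  val alice   = User("Alice", Nil)
  val bob     = User("Bob", alice :: Nil)
  val charles = User("Charles", bob :: Nil)
  val users   = List(alice, bob, charles)
}
```

With this approach Alice is written out three times: once on her own, once inside Bob, and once inside Charles via Bob.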
This is part 3 on the topic of sjson-new. See also part 1 and part 2.
Within the sbt code base there are a few places where the persisted data is on the order of hundreds of megabytes, and I suspect it becomes a performance bottleneck, especially on machines without an SSD.
Naturally, my first instinct was to start reading up on the encoding of Google Protocol Buffers to implement my own custom binary format.
microbenchmark using sbt-jmh
What I should've done first is start benchmarking. Using @ktosopl (Konrad Malawski)'s sbt-jmh, setting up a microbenchmark is easy. All you have to do is pop that plugin into your build and create a subproject that enables JmhPlugin.
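The setup described above might look roughly like the following build fragment. The plugin version and the subproject name benchmark are assumptions made for this sketch, so check the sbt-jmh README for the current version.

```scala
// project/plugins.sbt
// The version number here is an assumption; use whatever the sbt-jmh README lists.
addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.4.7")

// build.sbt
// A hypothetical subproject named `benchmark` that holds the JMH benchmarks.
lazy val benchmark = (project in file("benchmark"))
  .enablePlugins(JmhPlugin)
```

Classes with methods annotated with JMH's @Benchmark would then live in that subproject and be run through the plugin's jmh:run task.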
Two months ago, I wrote about sjson-new. I was working on that again over the weekend, so here's the update.
In the earlier post, I introduced the family tree of JSON libraries in the Scala ecosystem, and the notion of a backend-independent, typeclass-based JSON codec library. I concluded that we need some easy way of defining a custom codec for it to be usable.
roll your own shapeless
Between the April post and last weekend, there were flatMap(Oslo) 2016 and Scala Days New York 2016. Unfortunately I wasn't able to attend flatMap, but I was able to catch Daniel Spiewak's "Roll Your Own Shapeless" talk in New York. The full flatMap version is available on Vimeo, so I recommend you check it out.
sbt internally uses HList for caching using sbinary:
and I've been thinking that something like an HList or Shapeless's LabelledGeneric would be a good intermediate datatype to represent a JSON object, so Daniel's talk became the final push I needed.
In this post, I will introduce a special purpose HList called LList.
sjson-new comes with a datatype called LList, which stands for labelled heterogeneous list. The List[A] that comes with the Standard Library can only store values of one type, namely A. Unlike the standard List[A], LList can store values of different types per cell, and it can also store a label per cell. For this reason, each LList has its own type. Here's how it looks in the REPL:
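To illustrate the idea of a labelled heterogeneous list independently of sjson-new, here is a minimal self-contained sketch. The names LabelledList, Cell, and End are hypothetical and are not sjson-new's actual LList API; the point is only that each cell carries a label and its own value type, and that the type of the whole list records all of them.

```scala
object LabelledListSketch {
  // Hypothetical names for illustration; this is not sjson-new's actual LList.
  sealed trait LabelledList
  // One cell: a label, a value of type H, and the rest of the list of type T.
  final case class Cell[+H, +T <: LabelledList](label: String, head: H, tail: T)
      extends LabelledList
  case object End extends LabelledList

  // The static type records the value type of every cell.
  val person: Cell[String, Cell[Int, End.type]] =
    Cell("name", "Alice", Cell("age", 30, End))

  // Collect the labels by walking the cells.
  def labels(list: LabelledList): List[String] = list match {
    case Cell(label, _, tail) => label :: labels(tail)
    case End                  => Nil
  }
}
```

Because the cell types are part of the list's type, person.head is known to be a String and person.tail.head an Int at compile time, which is what makes such a structure usable as an intermediate representation of a JSON object.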