Gigahorse 

Gigahorse is an HTTP client for Scala that supports multiple backends. For the internal backend you can choose from Apache HTTP HttpAsyncClient, Async HTTP Client, Square OkHttp, or Akka HTTP.

Setup 

For Apache HTTP HttpAsyncClient:

libraryDependencies += "com.eed3si9n" %% "gigahorse-apache-http" % "0.7.0"

For Async HTTP Client:

libraryDependencies += "com.eed3si9n" %% "gigahorse-asynchttpclient" % "0.7.0"

For Square OkHttp 3.x Client:

libraryDependencies += "com.eed3si9n" %% "gigahorse-okhttp" % "0.7.0"

Akka HTTP support is experimental:

libraryDependencies += "com.eed3si9n" %% "gigahorse-akka-http" % "0.7.0"

Credits 

  • The implementation was originally based on the Play WS API, including the way AHC is called and the choice of default values. In particular, it uses Lightbend Config and @wsargent’s SSL Config, which uses more secure defaults.
  • API design is also strongly influenced by that of Dispatch Reboot by @n8han.
  • All datatypes are generated using Contraband, which @Duhemm and I worked on.
  • @alexdupre contributed AHC 2.0 migration and WebSocket support.
  • Finally, props to underlying HTTP libraries for the actual HTTP work.

License 

Apache v2

Quick start 

Here’s a quick example of how to make a GET call using Gigahorse:

scala> import gigahorse._, support.apachehttp.Gigahorse
scala> import scala.concurrent._, duration._
scala> val http = Gigahorse.http(Gigahorse.config)
scala> val r = Gigahorse.url("https://api.duckduckgo.com").get.
         addQueryString(
           "q" -> "1 + 1"
         )
scala> val f = http.run(r, Gigahorse.asString andThen {_.take(60)})
scala> Await.result(f, 120.seconds)
scala> http.close()

Basic concepts 

Gigahorse 

Gigahorse is a helper object to create many useful things.

  • For the Apache HTTP backend, use gigahorse.support.apachehttp.Gigahorse.
  • For the AHC backend, use gigahorse.support.asynchttpclient.Gigahorse.
  • For the OkHttp backend, use gigahorse.support.okhttp.Gigahorse.
  • For the Akka HTTP backend, use gigahorse.support.akkahttp.Gigahorse.

HttpClient 

The HttpClient represents an HTTP client that’s able to handle multiple requests. When it’s used it will spawn many threads, so the lifetime of an HttpClient must be managed with care. Otherwise your program will run out of resources.

There are two ways of creating an HttpClient. The first is to call Gigahorse.http(Gigahorse.config). If you use this with Apache HTTP or AHC, you must close the client yourself:

scala> import gigahorse._, support.apachehttp.Gigahorse
scala> val http = Gigahorse.http(Gigahorse.config)
scala> http.close() // must call close()

The second way is to use the loan pattern Gigahorse.withHttp(config) { ... }:

import gigahorse._, support.apachehttp.Gigahorse
Gigahorse.withHttp(Gigahorse.config) { http =>
  // do something
}

This guarantees that the HttpClient is closed, but the drawback is that it could be closed prematurely, before the HTTP processing is done, so you have to block inside the function body until all the futures complete.
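
The trade-off can be seen in a minimal sketch of the loan pattern itself. The withResource helper and the FakeClient class below are hypothetical stand-ins written for this illustration and are not part of Gigahorse. The point is only that the resource is closed as soon as the function body returns, so a Future started inside has to be awaited before leaving it.

```scala
import scala.concurrent.{ Await, Future }
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LoanPatternSketch {
  // Hypothetical stand-in for an HTTP client; not a Gigahorse type.
  final class FakeClient extends AutoCloseable {
    @volatile var closed = false
    def fetch(): Future[String] = Future {
      if (closed) throw new IllegalStateException("client already closed")
      "response body"
    }
    def close(): Unit = { closed = true }
  }

  // Lends the resource to `body` and always closes it afterwards.
  def withResource[R <: AutoCloseable, A](resource: R)(body: R => A): A =
    try body(resource)
    finally resource.close()

  /** Returns the fetched body and whether the client was closed afterwards. */
  def demo(): (String, Boolean) = {
    val client = new FakeClient
    val body = withResource(client) { c =>
      // Block here so the work finishes before close() runs.
      Await.result(c.fetch(), 10.seconds)
    }
    (body, client.closed)
  }
}
```

Returning the Future from the function body without awaiting it would hand the caller a value whose underlying client may already be closed.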

Config 

To create an HttpClient you need to pass in a Config. Gigahorse.config will read from application.conf to configure the settings if it exists. Otherwise, it will pick the default values.

scala> Gigahorse.config

Request 

The Request is an immutable datatype that represents a single HTTP request. Unlike HttpClient, this is relatively cheap to create and keep around.

To construct a request, call the Gigahorse.url(...) function:

scala> val r = Gigahorse.url("https://api.duckduckgo.com").get.
         addQueryString(
           "q" -> "1 + 1",
           "format" -> "json"
         )

You can chain calls like the above; each call returns a new Request value.

http.run(r, f) 

There are many methods on HttpClient, but probably the most useful one is the http.run(r, f) method:

abstract class HttpClient extends AutoCloseable {
  /** Runs the request and return a Future of A. Errors on non-OK response. */
  def run[A](request: Request, f: FullResponse => A): Future[A]

  ....
}

The first parameter takes a Request, and the second parameter takes a function from FullResponse to A. There’s a built-in function called Gigahorse.asString that returns the body content as a String.

Since this is a plain function, you can compose it with some other function using andThen:

scala> import gigahorse._, support.apachehttp.Gigahorse
scala> import scala.concurrent._, duration._
scala> val http = Gigahorse.http(Gigahorse.config)
scala> val r = Gigahorse.url("https://api.duckduckgo.com").get.
         addQueryString(
           "q" -> "1 + 1"
         )
scala> val f = http.run(r, Gigahorse.asString andThen {_.take(60)})
scala> Await.result(f, 120.seconds)
scala> http.close()

Note: When using OkHttp or Akka HTTP, if you don’t consume the response body, you must call the close() method on the FullResponse to release the resource.

Future 

Because run executes a request in a non-blocking fashion, it returns a Future. Normally, you want to keep the Future value as long as you can, but here, we will block on it to see the value.

One motivation for keeping the Future value as long as you can is working with multiple Futures (HTTP requests) in parallel. See Futures and Promises to learn more about Futures.
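
As a rough illustration of why holding on to Future values pays off, the sketch below starts three tasks before waiting on any of them and then combines the results with Future.sequence. The slowLength function is a hypothetical stand-in for an HTTP call such as http.run(r, f); the example uses only the Scala standard library.

```scala
import scala.concurrent.{ Await, Future }
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelFuturesSketch {
  // Hypothetical stand-in for a request: pretends to take a while, then returns a value.
  def slowLength(s: String): Future[Int] = Future {
    Thread.sleep(100)
    s.length
  }

  /** Starts all tasks first, then combines them into a single Future. */
  def totalLength(inputs: List[String]): Future[Int] = {
    val started: List[Future[Int]] = inputs.map(slowLength) // all submitted here
    Future.sequence(started).map(_.sum)
  }

  // Blocks only once, at the very end.
  def demo(): Int =
    Await.result(totalLength(List("a", "bb", "ccc")), 10.seconds)
}
```

Awaiting each Future right after creating it would instead run the tasks one after another.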

http.runStream(r, f) 

Instead of running on the full response, Gigahorse can also treat the incoming response as a Reactive Stream and process it chunk by chunk, for example line by line.

Building a Request value 

To construct a Request value, call the Gigahorse.url(...) function:

scala> import gigahorse._, support.asynchttpclient.Gigahorse
scala> val url = "https://api.duckduckgo.com"
scala> val r = Gigahorse.url(url)

Next you can chain the methods defined on Request to construct new values.

HTTP verbs 

There are methods for HTTP verbs: GET, POST, PATCH, PUT, DELETE, HEAD, and OPTIONS.

scala> import java.io.File
scala> Gigahorse.url(url).get
scala> Gigahorse.url(url).post("")
scala> Gigahorse.url(url).post(new File("something.txt"))

The post(...), put(...), and patch(...) methods have a variant that accepts a type parameter A with a context bound A: HttpWrite, so potentially this could be extended to accept any custom type.

  /** Uses GET method. */
  def get: Request                                   = this.withMethod(HttpVerbs.GET)
  /** Uses POST method with the given body. */
  def post[A: HttpWrite](body: A): Request           = this.withMethod(HttpVerbs.POST).withBody(body)
  /** Uses POST method with the given body. */
  def post(body: String, charset: Charset): Request  = this.withMethod(HttpVerbs.POST).withBody(EncodedString(body, charset))
  /** Uses POST method with the given file. */
  def post(file: File): Request                      = this.withMethod(HttpVerbs.POST).withBody(FileBody(file))
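
To make the context-bound mechanism concrete, here is a self-contained sketch of the same technique. The BodyWrite trait, the Greeting class, and the encode function are hypothetical names invented for this illustration; BodyWrite merely stands in for HttpWrite and does not reproduce its actual definition.

```scala
import java.nio.charset.StandardCharsets

object ContextBoundSketch {
  // Hypothetical type class standing in for HttpWrite; the real trait may differ.
  trait BodyWrite[A] {
    def toByteArray(a: A): Array[Byte]
    def contentType: Option[String]
  }

  // A custom type we would like to send as a request body.
  final case class Greeting(name: String)

  // Providing an instance is what makes Greeting acceptable as a body.
  implicit val greetingWrite: BodyWrite[Greeting] = new BodyWrite[Greeting] {
    def toByteArray(a: Greeting): Array[Byte] =
      s"""{"name":"${a.name}"}""".getBytes(StandardCharsets.UTF_8)
    def contentType: Option[String] = Some("application/json")
  }

  // `A: BodyWrite` is a context bound: an implicit BodyWrite[A] must be in scope.
  def encode[A: BodyWrite](body: A): (Option[String], Array[Byte]) = {
    val w = implicitly[BodyWrite[A]]
    (w.contentType, w.toByteArray(body))
  }

  def demo(): (Option[String], String) = {
    val (ct, bytes) = encode(Greeting("world"))
    (ct, new String(bytes, StandardCharsets.UTF_8))
  }
}
```

Calling encode with a type that has no instance in scope fails at compile time, which is how a context-bounded post can reject unsupported body types.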

Request with authentication 

If you need to use HTTP authentication, you can specify it in the request, using a username, password, and an AuthScheme. Valid case objects for the AuthScheme are Basic, Digest, NTLM, Kerberos, and SPNEGO.

scala> Gigahorse.url(url).get.withAuth("username", "password", AuthScheme.Basic)

There’s also an overload of the withAuth(...) method that accepts a Realm value, which you can use to specify more details.

Request with query parameters 

Parameters can be specified as a series of key/value tuples.

scala> Gigahorse.url(url).get.
         addQueryString(
           "q" -> "1 + 1",
           "format" -> "json"
         )

Request with content type 

The Content-Type header should be specified when posting text.

scala> import java.nio.charset.Charset
scala> Gigahorse.url(url).post("some text").
         withContentType(MimeTypes.TEXT, Gigahorse.utf8)

Request with additional headers 

Headers can be specified as a series of key/value tuples.

scala> Gigahorse.url(url).get.
         addHeaders(
           HeaderNames.AUTHORIZATION -> "bearer ****"
         )

Request with virtual host 

A virtual host can be specified as a string.

scala> Gigahorse.url(url).get.withVirtualHost("192.168.1.1")

Request with timeout 

If you wish to specify a request timeout, overriding the one specified by the Config, you can use withRequestTimeout to set a value. An infinite timeout can be set by passing Duration.Inf.

scala> import scala.concurrent._, duration._
scala> Gigahorse.url(url).get.withRequestTimeout(5000.millis)

Submitting form data 

To build a Request value for posting url-form-encoded data, a Map[String, List[String]] needs to be passed into post.

scala> val r = Gigahorse.url("http://www.freeformatter.com/json-validator.html").
         post(Map("inputString" -> List("{}")))

Submitting a file 

A Request value can be created for submitting a file using the post, put, or patch method.

scala> Gigahorse.url(url).post(new File("something.txt"))

Processing the FullResponse 

Once you build a Request value, you can pass it to HttpClient to execute the request using the run, download, processFull, or runStream methods.

http.run(r, f) 

There are many methods on HttpClient, but probably the most useful one is the http.run(r, f) method. As we saw on the Basic concepts page, this takes a Request value and a function FullResponse => A.

Gigahorse provides Gigahorse.asString function to return Future[String], but we can imagine this could be expanded to do more.

Another thing to note is that the run method will only accept HTTP 2XX statuses, and fail the future value otherwise. (By default, 3XX redirects are handled automatically.)

Post-processing a Future 

In addition to passing in a function, a Future can easily be post-processed by mapping inside it.

import gigahorse._, support.okhttp.Gigahorse
import scala.concurrent._, duration._
import ExecutionContext.Implicits._
val http = Gigahorse.http(Gigahorse.config)

val r = Gigahorse.url("https://api.duckduckgo.com").get.
  addQueryString(
    "q" -> "1 + 1"
  )
val f0: Future[FullResponse] = http.run(r, identity)
val f: Future[String] = f0 map { Gigahorse.asString andThen (_.take(60)) }
Await.result(f, 120.seconds)

Whenever an operation is done on a Future, an implicit execution context must be available. This declares which thread pool the callback to the future should run in.
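
If the global execution context is not appropriate, one can be supplied explicitly. The following is a minimal sketch using only the standard library; the pool size of 2 and the upper-casing step are arbitrary choices made for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ Await, ExecutionContext, Future }
import scala.concurrent.duration._

object ExecutionContextSketch {
  def demo(): String = {
    // A dedicated pool for callbacks instead of ExecutionContext.Implicits.global.
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val f0: Future[String] = Future.successful("hello")
      // map needs an ExecutionContext; the implicit `ec` above is picked up here.
      val f: Future[String] = f0.map(_.toUpperCase)
      Await.result(f, 10.seconds)
    } finally pool.shutdown() // release the threads once we are done
  }
}
```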

For convenience there’s an overload of run that takes only the Request parameter.

Lifting the FullResponse to Either 

A common processing step when dealing with a Future that can fail is to lift the inner A value to Either[Throwable, A].

There’s a convenient website called http://getstatuscode.com/ that can emulate HTTP statuses. Here’s what happens when we await on a failed Future.

val r = Gigahorse.url("http://getstatuscode.com/500")
val f = http.run(r, Gigahorse.asString)
Await.result(f, 120.seconds)

Gigahorse provides a mechanism called Gigahorse.asEither to lift the inner A value to Either[Throwable, A] as follows:

val r = Gigahorse.url("http://getstatuscode.com/500")
val f = http.run(r, Gigahorse.asEither)
Await.result(f, 120.seconds)

asEither can be mapped over as a right-biased Either.

val r = Gigahorse.url("http://getstatuscode.com/200")
val f = http.run(r, Gigahorse.asEither map {
          Gigahorse.asString andThen (_.take(60)) })
Await.result(f, 120.seconds)
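
Conceptually, lifting to Either moves a failure out of the Future and into its value. The sketch below shows the same idea with the standard library alone; liftToEither is a hypothetical helper written for this illustration and is not how Gigahorse.asEither is implemented.

```scala
import scala.concurrent.{ Await, Future }
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EitherLiftSketch {
  /** Turns a possibly failed Future[A] into a successful Future of Either. */
  def liftToEither[A](f: Future[A]): Future[Either[Throwable, A]] =
    f.map(a => Right(a): Either[Throwable, A])
      .recover { case e: Throwable => Left(e) }

  /** Returns the lifted success value and whether the failure became a Left. */
  def demo(): (Either[Throwable, Int], Boolean) = {
    // A successful Future: the Right side can be mapped over.
    val ok = liftToEither(Future.successful("42")).map(_.map(_.toInt))
    // A failed Future: awaiting it no longer throws, it yields a Left.
    val failed = liftToEither(Future.failed[String](new RuntimeException("boom")))
    (Await.result(ok, 10.seconds), Await.result(failed, 10.seconds).isLeft)
  }
}
```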

http.processFull(r, f) 

If you do not wish to throw an error on non-2XX responses, and want to, for example, read the body text of a 500 response, use the processFull method.

val r = Gigahorse.url("http://getstatuscode.com/500")
val f = http.processFull(r, Gigahorse.asString andThen (_.take(60)))
Await.result(f, 120.seconds)

Asynchronous processing with Reactive Stream 

Thus far we’ve been looking at processing FullResponse, which has already retrieved the entire body contents in-memory. When the content is relatively small, that’s fine, but for things like downloading files, we would want to process the content in chunks as we receive them.

Downloading a file 

A file can be downloaded using http.download method:

scala> import gigahorse._, support.okhttp.Gigahorse
scala> import scala.concurrent._, duration._
scala> import ExecutionContext.Implicits._
scala> import java.io.File
scala> val http = Gigahorse.http(Gigahorse.config)
scala> {
         val file = new File(new File("target"), "Google_2015_logo.svg")
         val r = Gigahorse.url("https://upload.wikimedia.org/wikipedia/commons/2/2f/Google_2015_logo.svg")
         val f = http.download(r, file)
         Await.result(f, 120.seconds)
       }

This will return Future[File].

http.runStream(r, f) 

We can treat the incoming response as a Reactive Stream, and work on it part by part using http.runStream(r, f).

  /** Runs the request and return a Future of A. */
  def runStream[A](request: Request, f: StreamResponse => Future[A]): Future[A]

Note that the function takes a StreamResponse instead of a FullResponse. Unlike the FullResponse, it does not have the body contents received yet.

Instead, StreamResponse can create Stream[A] that will retrieve the parts on-demand. As a starting point, Gigahorse provides Gigahorse.asByteStream and Gigahorse.asStringStream.

Here’s what Stream[A] looks like:

import org.reactivestreams.Publisher
import scala.concurrent.Future

abstract class Stream[A] {
  /**
   * @return The underlying Stream object.
   */
  def underlying[B]: B

  def toPublisher: Publisher[A]

  /** Runs f on each element received to the stream. */
  def foreach(f: A => Unit): Future[Unit]

  /** Runs f on each element received to the stream with its previous output. */
  def fold[B](zero: B)(f: (B, A) => B): Future[B]

  /** Similar to fold but uses first element as zero element. */
  def reduce(f: (A, A) => A): Future[A]
}

Using this, a stream can be processed with relative ease. For example, download is implemented as follows:

  def download(request: Request, file: File): Future[File] =
    runStream(request, asFile(file))

....

import java.nio.ByteBuffer
import java.nio.charset.Charset
import java.io.{ File, FileOutputStream }
import scala.concurrent.Future

object DownloadHandler {
  /** Function from `StreamResponse` to `Future[File]` */
  def asFile(file: File): StreamResponse => Future[File] = (response: StreamResponse) =>
    {
      val stream = response.byteBuffers
      val out = new FileOutputStream(file).getChannel
      stream.fold(file)((acc, bb) => {
        out.write(bb)
        acc
      })
    }
}

stream.fold will write into the FileOutputStream as the parts arrive.
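
To see how fold accumulates a result across parts, here is a small sketch that sums the sizes of incoming buffers. ListStream is a hypothetical in-memory stand-in that mirrors only the fold signature of Stream[A]; it is not a Gigahorse class, and a real stream would deliver its parts asynchronously.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import scala.concurrent.{ Await, Future }
import scala.concurrent.duration._

object StreamFoldSketch {
  // Hypothetical in-memory stand-in exposing the same fold signature as Stream[A].
  final class ListStream[A](parts: List[A]) {
    def fold[B](zero: B)(f: (B, A) => B): Future[B] =
      Future.successful(parts.foldLeft(zero)(f))
  }

  /** Counts the total number of bytes by folding over the parts. */
  def countBytes(stream: ListStream[ByteBuffer]): Future[Long] =
    stream.fold(0L)((acc, bb) => acc + bb.remaining())

  def demo(): Long = {
    val parts = List("Giga", "horse").map { s =>
      ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8))
    }
    Await.result(countBytes(new ListStream(parts)), 10.seconds)
  }
}
```

The accumulator plays the same role as the file value threaded through asFile, except that here it changes with every part.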

Newline delimited stream 

Here’s another example, this time using Akka HTTP. Suppose we are running $ python -m SimpleHTTPServer 8000, which serves the current directory over port 8000, and let’s say we want to take README.markdown and print each line:

scala> import gigahorse._, support.akkahttp.Gigahorse
import gigahorse._
import support.akkahttp.Gigahorse

scala> import scala.concurrent._, duration._
import scala.concurrent._
import duration._

scala> Gigahorse.withHttp(Gigahorse.config) { http =>
         val r = Gigahorse.url("http://localhost:8000/README.markdown").get
         val f = http.runStream(r, Gigahorse.asStringStream andThen { xs =>
           xs.foreach { s => println(s) }
         })
         Await.result(f, 120.seconds)
       }
Gigahorse
==========

Gigahorse is an HTTP client for Scala with Async Http Client or Lightbend Akka HTTP underneath.
....

It worked. This could be used to process an infinite stream of JSON.

Configuring Gigahorse 

Gigahorse.config will read from application.conf to configure the settings.

  • gigahorse.followRedirects: Configures the client to follow 301 and 302 redirects (default is true).
  • gigahorse.useProxyProperties: To use the JVM system’s HTTP proxy settings (http.proxyHost, http.proxyPort) (default is true).
  • gigahorse.userAgent: To configure the User-Agent header field.
  • gigahorse.compressionEnforced: Set it to true to use gzip/deflater encoding (default is false).

Configuring Gigahorse with SSL 

To configure Gigahorse for use with HTTP over SSL/TLS (HTTPS), see Play WS’s Configuring WS SSL. Just place the configuration under gigahorse.ssl:

gigahorse.ssl {
  trustManager = {
    stores = [
      { type = "JKS", path = "exampletrust.jks" }
    ]
  }
}

Configuring Timeouts 

There are 3 different timeouts in Gigahorse. Reaching a timeout causes the request to be interrupted.

  • gigahorse.connectTimeout: The maximum time to wait when connecting to the remote host (default is 120 seconds).
  • gigahorse.requestTimeout: The total time you accept a request to take (it will be interrupted even if the remote host is still sending data) (default is 120 seconds).
  • gigahorse.readTimeout: The maximum time the request can stay idle (connection is established but waiting for more data) (default is 120 seconds).

The request timeout can be overridden for a specific connection with withRequestTimeout() (see Building a Request value).

Advanced configuration 

The following advanced settings can be configured.

Refer to the AsyncHttpClientConfig Documentation for more information.

  • gigahorse.maxRedirects: The maximum number of redirects (default: 5).
  • gigahorse.maxRequestRetry: The maximum number of times to retry a request if it fails (default: 5).
  • gigahorse.disableUrlEncoding: Whether raw URL should be used (default: false).
  • gigahorse.keepAlive: Whether connection pooling should be used (default: true).
  • gigahorse.pooledConnectionIdleTimeout: The time after which a connection that has been idle in the pool should be closed.
  • gigahorse.connectionTtl: The maximum time that a connection should live for in the pool.
  • gigahorse.maxConnections: The maximum total number of connections. -1 means no maximum.
  • gigahorse.maxConnectionsPerHost: The maximum number of connections to make per host. -1 means no maximum.

Extending Gigahorse 

Gigahorse can be extended to provide specific support for some file formats, or even a suite of RESTful APIs. Once again, we come back to the basic pattern of Gigahorse: http.run(r, f).

A Gigahorse plugin should provide the following things:

  1. A helper to build the Request datatype, including the handling of authentication.
  2. Functions to process a Response into something more useful.

On this page, we will go through one way of writing a plugin.

Request builder 

First, we’ll define RequestBuilder with a method that returns a Request:

scala> import gigahorse._, support.okhttp.Gigahorse
scala> import scala.concurrent._, duration._
scala> :paste
abstract class RequestBuilder {
  protected val baseUrl = "https://api.github.com"
  def build: Request
}

To wrap GET /repos/:owner/:repo, we’ll define a case class representing the request as follows:

scala> :paste
case class Repos(owner: String, name: String) extends RequestBuilder {
  def build: Request = Gigahorse.url(s"$baseUrl/repos/$owner/$name")
}

Authentication wrapper 

Normally a RESTful API gives you some means of authorization. The following creates a wrapper that can provide OAuth handling for each request.

scala> import collection.immutable.Map
scala> :paste
/** AbstractClient is a function to wrap API operations */
abstract class AbstractClient {
  def httpHeaders: Map[String, String] =
    Map(HeaderNames.ACCEPT -> MimeTypes.JSON)
  def complete(request: Request): Request =
    if (httpHeaders.isEmpty) request
    else request.addHeaders(httpHeaders.toList: _*)
  def apply(builder: RequestBuilder): Request =
    complete(builder.build)
}
case class NoAuthClient() extends AbstractClient {
}
case class OAuthClient(token: String) extends AbstractClient {
  override def httpHeaders: Map[String, String] =
    super.httpHeaders ++ Map("Authorization" -> "bearer %s".format(token))
  override def toString: String =
    s"OAuthClient(****)"
}

Helper object 

Similar to the Gigahorse object, we can provide a Github object that puts all the useful functions in one spot.

scala> :paste
object Github {
  def noAuthClient = NoAuthClient()
  def oauthClient(token: String) =
    OAuthClient(token)
  def repo(owner: String, name: String): Repos =
    Repos(owner, name)
}

This can be invoked as follows:

scala> val client = Github.noAuthClient
client: NoAuthClient = NoAuthClient()

scala> val http = Gigahorse.http(Gigahorse.config)
http: gigahorse.HttpClient = gigahorse.support.okhttp.OkhClient@182ab1c0

scala> {
         val f = http.run(client(Github.repo("eed3si9n", "gigahorse")), Gigahorse.asString andThen (_.take(60)) )
         Await.result(f, 120.seconds)
       }
res0: String = {"id":64110679,"name":"gigahorse","full_name":"eed3si9n/giga

JSON databinding using contraband 

Next we would like to provide a parser for the returned JSON value. You could either manually define case classes and JSON codecs, or use the JSON data binding feature in Contraband. This will generate both the datatype and the codec from a schema that looks like this:

{
  "codecNamespace": "example.github.response",
  "fullCodec": "CustomJsonProtocol",
  "types": [
    {
      "name": "Repo",
      "namespace": "example.github.response",
      "type": "record",
      "target": "Scala",
      "fields": [
        {
          "name": "url",
          "type": "String",
          "since": "0.0.0"
        },
        {
          "name": "name",
          "type": "String",
          "since": "0.0.0"
        },
        {
          "name": "id",
          "type": "long",
          "since": "0.0.0"
        }
        .....
      ],
      "extra": [
      ]
    }
  ]
}

This will generate a pseudo case class named Repo, its JSON codec called RepoFormats, and the full codec CustomJsonProtocol that puts all the formats together.

/**
 * This code is generated using [[http://www.scala-sbt.org/contraband/ sbt-contraband]].
 */

// DO NOT EDIT MANUALLY
package gigahorse.github.response
final class Repo(
  val url: String,
  val name: String,
  ....) extends Serializable {
  ....
}

object Repo {
  def apply(url: String, name: String, id: Long ....
}

We can now define the asRepo function by composing it with a JSON parser.

import gigahorse._, support.asynchttpclient.Gigahorse
import github.{ response => res }
import sjsonnew.JsonFormat
import sjsonnew.support.scalajson.unsafe.Converter
import scala.json.ast.unsafe.JValue
import java.nio.ByteBuffer

object Github {
  import res.CustomJsonProtocol._
  def noAuthClient = NoAuthClient()
  def oauthClient(token: String) =
    OAuthClient(token)
  def repo(owner: String, name: String): Repos =
    Repos(owner, name)

  val asJson: Response => JValue =
    (r: Response) => {
      import sjsonnew.support.scalajson.unsafe.Parser
      val buffer = ByteBuffer.wrap(r.bodyAsBytes)
      Parser.parseFromByteBuffer(buffer).get
    }
  def as[A: JsonFormat]: Response => A =
    asJson andThen Converter.fromJsonUnsafe[A]
  val asRepo: Response => res.Repo = as[res.Repo]
}

This can be called as follows:

scala> Gigahorse.withHttp(Gigahorse.config) { http =>
         val f = http.run(client(Github.repo("eed3si9n", "gigahorse")), Github.asRepo)
         Await.result(f, 2.minutes)
       }
res0: Repo = Repo(https://api.github.com/repos/eed3si9n/gigahorse, gigahorse, 64110679,...

For more details, check out the source of gigahorse-github.