granularity of testing

In the context of sbt, Bazel, and likely many other build tools, the term test can refer to several different levels, and it’s useful to disambiguate them, especially when we want to configure pre- and post-hooks and parallel execution. In other words, what do we mean when we say “test”?

There are four levels to test:

  1. test command
  2. test module
  3. test class
  4. test method/expression

test as commandline interface

The top-most level is the test command that the build tools provide to the users.

test as module

The common theme is that test-as-command aggregates test modules, and runs them in parallel.

One thing to note about Bazel is that it’s good at handling test modules, like really good. Test results are cached by default, the cache can be configured to be remote, and execution can also be configured to happen on remote machines, which means hundreds of jobs can potentially be triggered from a laptop. Targets are often defined more granularly than in traditional build tools, and in theory you could declare scala_test(...) per .scala file to run them in parallel (on different machines).

test as class

In JVM test frameworks such as JUnit, MUnit, ScalaTest, Specs2, Hedgehog, and Verify, related test methods are grouped together in a class or an object. In Scala, these test classes are sometimes called suites, as in FunSuite. In JUnit, however, Suite is a special kind of test class created to aggregate multiple test classes.

test as method/expression

In JVM test frameworks, individual test code is written in a method, or an expression such as test("...") { ... }. To differentiate from test classes, test methods are sometimes called test examples as well.
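To make the distinction concrete, here is a minimal sketch of the same check written once as a JUnit 4 test method and once as an MUnit test expression. The class and test names are made up for illustration, and the snippet assumes that JUnit 4 and MUnit are both on the test classpath.

```scala
import org.junit.{Assert, Test}

// A test class containing a test *method* (JUnit 4 style).
class ArithmeticTest {
  @Test def additionIsCommutative(): Unit =
    Assert.assertTrue(1 + 2 == 2 + 1)
}

// A test class (suite) containing a test *expression* (MUnit style).
class ArithmeticSuite extends munit.FunSuite {
  test("addition is commutative") {
    assertEquals(1 + 2, 2 + 1)
  }
}
```

In both cases the class is the unit the build tool sees, while the method or expression is only known to the test framework.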

The parallelism of test method execution is up to the implementation of the runner.

The ability to select a specific test method and execute it remains an open question, which Kamil Podsiadło seems to be working on for Metals support.

sbt/sbt#911 should potentially be reopened to standardize test method selection from sbt. Currently, test frameworks implement this feature through argument processing: each framework interprets the arguments passed after testOnly -- in its own way.
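As one illustration of framework-specific argument processing, the sketch below passes ScalaTest's -z option (substring match on the test name) through sbt's Tests.Argument. The test name is hypothetical, and other frameworks use different options, which is exactly the lack of standardization described above.

```scala
// build.sbt
// Forward "-z <substring>" to ScalaTest only; other frameworks ignore it.
// Roughly what typing this in the sbt shell would do for one run:
//   testOnly -- -z "parses empty input"
// "parses empty input" is a made-up test name used for illustration.
Test / testOptions += Tests.Argument(TestFrameworks.ScalaTest, "-z", "parses empty input")
```

Putting the filter in the build makes it permanent, so in practice this kind of selection is more likely to be typed interactively after testOnly --.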

parallelism

As we have seen, we can consider parallelism at each of the four levels.

At the command level, parallelizing the test command can mean running it on different CI workers, for example to test multiple JDKs at the same time.

From the build tool’s point of view, module-level parallelism is the coarsest granularity, and both sbt and Bazel run independent test modules in parallel by default. How this parallelism is scheduled is up to the implementation, but with sbt, there’s an experimental feature to associate tasks with parallelism budget tags along with Global / concurrentRestrictions. This mechanism could be used to run specific modules exclusively, while running the rest in parallel. In terms of the execution environment, Bazel by default forks to another process, while sbt by default runs in a sandboxed thread.
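A minimal sketch of the tag mechanism in sbt follows. Note that the setting key is spelled concurrentRestrictions. This example uses the built-in Tags.Test tag to cap test tasks globally, which is simpler than running one specific module exclusively; that would additionally require tagging that module's tasks with a custom tag.

```scala
// build.sbt
// Allow at most one task tagged as a test to run at any time,
// across all subprojects in the build.
Global / concurrentRestrictions += Tags.limit(Tags.Test, 1)
```

Rules can also be written against custom tags created with Tags.Tag("..."), which is the route one would explore for per-module exclusivity.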

At the class level, sbt implements parallelism by default by mapping each test class to a task. In general, test-framework-specific knowledge is required to interact with this notion, which gives JVM-specific build tools an upper hand.
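For reference, these are the sbt settings that control this behavior for a subproject. The sketch assumes a build.sbt using the sbt 1.x slash syntax.

```scala
// build.sbt
// Run the test classes of this subproject one at a time instead of in parallel.
Test / parallelExecution := false

// Run this subproject's tests in a separate JVM rather than inside the sbt JVM.
Test / fork := true
```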

Method-level parallelism is implemented by the test framework. This means that parallelism at the method level is limited to threading, as opposed to forking.

As a reference, Maven’s Surefire Plugin provides a parallel parameter, which can be set to methods, classes, both, suites, suitesAndMethods, classesAndMethods, or all.

setup and teardown

We can also consider setup and teardown at the four levels.

A command-level setup does not exist, but it would be something like pretest;test;posttest;, where the pretest command would set up some environment, test would run the tests, and posttest would clean up.
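In sbt, something close to this can be approximated with a command alias. The sketch below is only an illustration: pretest, posttest, and testWithHooks are hypothetical names that sbt does not define, and the task bodies are placeholders.

```scala
// build.sbt
// pretest and posttest are hypothetical tasks; sbt does not provide them.
lazy val pretest = taskKey[Unit]("Prepare the environment before tests (hypothetical)")
lazy val posttest = taskKey[Unit]("Clean up the environment after tests (hypothetical)")

pretest := streams.value.log.info("starting test environment")
posttest := streams.value.log.info("stopping test environment")

// Typing testWithHooks in the sbt shell expands to the three commands in order.
addCommandAlias("testWithHooks", ";pretest;test;posttest")
```

One caveat to keep in mind: when a command in the sequence fails, sbt normally stops, so posttest would likely be skipped after a failing test run.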

At the module level, sbt can append Tests.Setup( loader => ... ) to Test / testOptions. In theory this could be used as a way of setting up some environment.
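A minimal sketch of module-level hooks in sbt, using the no-argument variants of Tests.Setup and Tests.Cleanup. The println bodies are placeholders for real environment setup.

```scala
// build.sbt
// Runs once before any test class of this subproject is executed.
Test / testOptions += Tests.Setup(() => println("setting up the test module"))

// Runs once after all test classes of this subproject have finished.
Test / testOptions += Tests.Cleanup(() => println("cleaning up the test module"))
```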

At the class level, the setups are sometimes called fixtures, and are often supported by the framework. JUnit 4, for instance, provides the @BeforeClass and @AfterClass annotations.

Method-level setups are called before each test method. JUnit 4 provides the @Before and @After annotations.
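The sketch below shows both levels in one place. It uses MUnit's beforeAll/afterAll and beforeEach/afterEach hooks rather than the JUnit annotations, since they are more natural to write in Scala. The suite name and the temporary-file scenario are made up for illustration, and MUnit is assumed to be on the test classpath.

```scala
import java.nio.file.{Files, Path}

class TempDirSuite extends munit.FunSuite {
  private var dir: Path = null   // shared by every test in this class
  private var file: Path = null  // recreated for each test

  // Class-level setup and teardown (compare @BeforeClass / @AfterClass).
  override def beforeAll(): Unit = {
    dir = Files.createTempDirectory("suite")
  }
  override def afterAll(): Unit = {
    Files.delete(dir)
  }

  // Method-level setup and teardown (compare @Before / @After).
  override def beforeEach(context: BeforeEach): Unit = {
    file = Files.createTempFile(dir, "test", ".txt")
  }
  override def afterEach(context: AfterEach): Unit = {
    Files.delete(file)
  }

  test("each test gets a fresh, empty file") {
    assertEquals(Files.size(file), 0L)
  }

  test("the file lives inside the shared directory") {
    assertEquals(file.getParent, dir)
  }
}
```

The directory is created once per class, while the file is created and deleted around every test expression.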

summary

When we discuss testing there are four different levels: test command, test module, test class, and test method/expression.

Potentially each of these levels can be parallelized, and depending on the build tool or the runner provided by the test framework, the execution might be forked into separate processes or executed as threads. While build tools are capable of parallelizing the execution of test modules, support for selection and parallelization of test classes and test methods remains patchy.

Bazel lacks testOnly, but it has powerful testing capability by supporting granular test targets, named aggregates, remote caching, and remote execution.