The Scala Library Author's Dilemma

If you write a library in Scala, you eventually have to answer an annoying question: which effect system do you target?

It’s not a small decision. The moment you pick one, you’ve drawn a line through your potential users. Whichever side you land on, the people on the other side either pull in a runtime they never wanted or simply move on. And lately the line has more sides than the traditional ZIO vs Cats Effect divide: Kyo is moving fast, direct-style with Ox is a real option, and plenty of people are still perfectly happy with Future.

Not everyone thinks the answer is to support all of them. Voytek Pituła argued recently that the plurality of effect systems is a problem for the ecosystem, a “price of anarchy” we’d be better off ending by converging on one (his pick is Cats Effect) and having everyone else provide a shim to it. I understand the feeling. The cost he describes is real, and I’ve paid it more than once. I don’t think library authors can wait for the ecosystem to pick a winner, though. This post is about what I tried instead.

Sage, the Redis/Valkey library that I recently released, is where I tried a different solution, and it’s the first time I feel like I got it mostly right, instead of just picking the least-bad compromise. Let me start with the approaches I’ve used or seen used before, because the new one only makes sense against them.

How it’s usually solved

One canonical effect

The first library I created, Caliban (GraphQL client and server in Scala), is ZIO-first. The core is built on ZIO, end to end, and that was a deliberate choice: ZIO was what I knew best and it made the internals and UX clean.

But people wanted to use it with Cats Effect, so there’s an interop module that lets you do exactly that. It works, and I know quite a few people who use it in production, but it is definitely not ideal for them. When a Cats Effect user runs Caliban, ZIO is still running underneath. The ZIO runtime is on their classpath, the fibers are ZIO fibers, and there’s a translation layer sitting at the boundary converting one world into the other. That comes with several costs: loss of performance, loss of fiber context, code complexity, and subtle differences in behavior.

That’s the first model, and the cost lands on the end user: runtime weight, awkward UX, and dependencies they didn’t ask for.

A neutral imperative core

Last year I developed Proteus, a Protobuf and gRPC library. The gRPC layer is built on top of grpc-java, which is the standard on the JVM (other Scala libraries use it underneath). grpc-java is effect-agnostic: it’s callbacks, listeners, and imperative code.

Supporting several backends became a matter of wrapping that imperative core for each one of them. It works, but that wrapping has a cost. The interesting logic that turns callbacks into a nice effectful API gets written again per backend, with small variations, and now I’m maintaining several copies of code that all do roughly the same thing. Fix a bug in one, remember to fix it in the others.
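To make the duplication concrete, here’s a minimal sketch using only the standard library. Everything in it is an assumption for illustration: CallbackApi is a hypothetical stand-in for an imperative core (it is not grpc-java’s API), and Lazy is a toy deferred effect playing the role of a second backend. The same adaptation logic ends up written twice.

```scala
import scala.concurrent.{Future, Promise}
import scala.util.{Failure, Success, Try}

// Hypothetical imperative core: the result arrives through a callback.
// For simplicity the callback fires synchronously on the caller's thread.
object CallbackApi {
  def call(request: String)(onDone: Try[String] => Unit): Unit =
    if (request.isEmpty) onDone(Failure(new IllegalArgumentException("empty request")))
    else onDone(Success(request.toUpperCase))
}

// Backend 1: adapt the callback to a Future by completing a Promise.
def callFuture(request: String): Future[String] = {
  val promise = Promise[String]()
  CallbackApi.call(request)(result => promise.complete(result))
  promise.future
}

// Backend 2: a toy lazy effect. Nothing runs until `unsafeRun` is invoked.
final case class Lazy[A](unsafeRun: () => Try[A])

// The same adaptation, written a second time for the second backend.
// It relies on the synchronous callback assumed above.
def callLazy(request: String): Lazy[String] =
  Lazy { () =>
    var out: Try[String] = Failure(new IllegalStateException("callback never fired"))
    CallbackApi.call(request)(result => out = result)
    out
  }
```

Two wrappers for one call is manageable. The trouble starts when the wrapped logic includes streaming, deadlines, and cancellation, and each copy drifts a little.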

That’s the second model, and the cost lands on me, the maintainer, as duplication.

What about a typeclass?

There’s a third option, and it’s the one a Scala FP person would likely suggest before either of the above: write everything against a small effect typeclass, and let each backend supply an instance. This is what sttp and tapir do for the part of their codebase that needs an effect abstraction. Their core is parameterized over a MonadError[F], and that trait is essentially the whole abstraction the library leans on:

trait MonadError[F[_]] {
  def unit[T](t: T): F[T]
  def map[T, T2](fa: F[T])(f: T => T2): F[T2]
  def flatMap[T, T2](fa: F[T])(f: T => F[T2]): F[T2]
  def error[T](t: Throwable): F[T]
  def handleError[T](rt: => F[T])(h: PartialFunction[Throwable, F[T]]): F[T]
  def eval[T](t: => T): F[T]
  def suspend[T](t: => F[T]): F[T]
  def ensure[T](f: F[T], e: => F[Unit]): F[T]
  // ...a few more in the same spirit
}

with one extra trait adding a single async for callback-based effects. That’s it. For an HTTP client this is a great fit: a call is usually one request and one response, the work is dominated by the network, so the per-operation dispatch through the typeclass instance barely registers and you often don’t need much beyond sequencing and error handling.
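As a rough illustration of the pattern, here’s a sketch with a trimmed-down typeclass, one instance, and one piece of logic written against it. MiniMonadError, TryMonadError, and parseStatus are hypothetical names made up for this example; this is not sttp’s actual trait or code.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical, trimmed-down effect typeclass (not sttp's actual trait).
trait MiniMonadError[F[_]] {
  def unit[T](t: T): F[T]
  def flatMap[T, T2](fa: F[T])(f: T => F[T2]): F[T2]
  def error[T](t: Throwable): F[T]
  def handleError[T](fa: => F[T])(h: PartialFunction[Throwable, F[T]]): F[T]
}

// One backend instance: Try used as a synchronous "effect".
object TryMonadError extends MiniMonadError[Try] {
  def unit[T](t: T): Try[T] = Success(t)
  def flatMap[T, T2](fa: Try[T])(f: T => Try[T2]): Try[T2] = fa.flatMap(f)
  def error[T](t: Throwable): Try[T] = Failure(t)
  def handleError[T](fa: => Try[T])(h: PartialFunction[Throwable, Try[T]]): Try[T] =
    fa.recoverWith(h)
}

// Library logic written once against the typeclass: parse a status code,
// falling back to a default when the input is not a number.
def parseStatus[F[_]](raw: String, default: Int)(M: MiniMonadError[F]): F[Int] =
  M.handleError[Int](
    M.flatMap[String, Int](M.unit(raw)) { s =>
      try M.unit(s.trim.toInt)
      catch { case e: NumberFormatException => M.error[Int](e) }
    }
  ) { case _: NumberFormatException => M.unit(default) }
```

Every operation goes through the instance M at runtime. That is the per-operation dispatch mentioned above, and it is also all the vocabulary the logic has: sequencing and error handling, nothing concurrent.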

A Redis client is a different animal. Pipelining, pub/sub dispatch, the multiplexed connection, cluster failover: that’s fibers, races, queues, and timeouts all over the place. MonadError gives me none of that, which leaves two ways forward.

I could grow my own typeclass until it has the concurrency primitives I need. I went that route at first, but soon stopped when I realized I was (poorly) rebuilding Cats Effect.

Or I could build directly on Cats Effect’s own typeclasses, Async and Concurrent, which already have all of it. But then two of my backends fall off the table. A lawful Async needs a lawful effect, and Future isn’t one (it’s eager and memoized, not referentially transparent). Ox is eager and imperative too, and because it is direct-style it isn’t an F[_] at all, so there’s nothing to write an instance for. Even for the backends that do fit, routing ZIO through a Cats Effect abstraction means running ZIO’s interop layer, which is the exact foreign-runtime tax I was trying to get away from.
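The point about Future is easy to check with the standard library alone. In this small sketch (demo is a hypothetical name), binding a Future to a value and reusing it runs the side effect once, while writing the same expression twice runs it twice, so the two programs are not interchangeable:

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

// Returns (sum from the shared Future, runs after sharing, runs after inlining).
def demo(): (Int, Int, Int) = {
  val counter = new AtomicInteger(0)

  // Version 1: bind the Future to a name and reuse it.
  // The body starts running immediately and its result is memoized.
  val shared: Future[Int] = Future(counter.incrementAndGet())
  val sumShared = Await.result(shared.zip(shared).map { case (a, b) => a + b }, 1.second)
  val runsAfterShared = counter.get()

  // Version 2: write the same expression at both use sites.
  // With a referentially transparent effect this refactoring would change nothing.
  def fresh(): Future[Int] = Future(counter.incrementAndGet())
  Await.result(fresh().zip(fresh()), 1.second)
  val runsAfterInline = counter.get()

  (sumShared, runsAfterShared, runsAfterInline)
}
```

Calling demo() returns (2, 1, 3): the shared Future incremented the counter once, and the inlined version incremented it two more times.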

So the typeclass route is either too small, and tempts me into rebuilding an effect system, or too demanding, and locks out the unlawful and direct-style backends.

What I wanted for Sage

None of those felt right for my new library.

A Redis client has a fair amount of genuinely hard code in it. Sockets, reconnection, request pipelining, pub/sub dispatch, cluster redirects, client-side caching. None of that is effect-system-specific, and all of it is the kind of code you do not want to write and debug five times. At the same time, I really didn’t want the Caliban outcome where one ecosystem is “real” and the others pay a tax to visit.

So the goal was: write the hard parts once, and still hand every user their own native effect type with nothing borrowed underneath. Those two goals usually pull against each other. The thing that let me get close enough was kyo-compat, a library recently created by Flavio Brasil as part of his Kyo project.

The idea behind kyo-compat

At its core, kyo-compat gives you a carrier type, CIO[A]. You write your logic against CIO, and CIO is an opaque type that resolves to a different concrete effect depending on which backend you compile for. For ZIO it becomes ZIO[Any, Throwable, A]. For Cats Effect it becomes cats.effect.IO[A]. For Future it’s a Future. For Kyo it’s Kyo’s own effect type, and for Ox it’s a plain direct-style computation inside an Ox scope. The operations on it (map, flatMap, zip, race, resource handling, and so on) are all inline def, so at the call site they lower to the backend’s native primitive. There’s no typeclass being threaded around at runtime, and no adapter object wrapping every value. The abstraction is mostly a compile-time thing, which is exactly what I wanted.
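Here’s a minimal sketch of that technique for a single backend, in Scala 3. FutureBackend, Carrier, and greet are hypothetical names; this shows the opaque type plus inline def idea only, and is not kyo-compat’s actual definition of CIO.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureBackend {
  // Outside this object, Carrier[A] is abstract: callers cannot see the Future.
  opaque type Carrier[A] = Future[A]

  object Carrier {
    inline def succeed[A](a: A): Carrier[A] = Future.successful(a)
  }

  extension [A](c: Carrier[A]) {
    // Inlined at the call site, so this becomes a direct Future.map call.
    inline def map[B](f: A => B)(using ec: ExecutionContext): Carrier[B] = c.map(f)
    // The one controlled exit from the carrier back to the native type.
    inline def lower: Future[A] = c
  }
}

import FutureBackend.*

// "Library" logic written once against the carrier, with no Future in sight.
def greet(name: String)(using ExecutionContext): Carrier[String] =
  Carrier.succeed(name).map(n => s"hello, $n")
```

Under these assumptions, greet("sage").lower is an ordinary Future[String]. A second backend would be another object with the same member names and a different right-hand side for the opaque type, selected at build time.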

None of this per-backend wiring is assembled by hand. Alongside the library, kyo-compat ships an sbt plugin built on sbt-projectmatrix: a compatLibrary step takes one source tree and stamps out a project cell per backend, each compiling the same sources against that backend’s kyo-compat artifact. In Sage’s build it’s two lines, and out come sage-client-zio, sage-client-ce, sage-client-ox, and the rest, each an ordinary published library with its own native types:

.compatLibrary(ZioLib, CeLib, OxLib, PekkoLib)(VirtualAxis.jvm)(Seq(scala3Lts))
.compatLibrary(KyoLib)(VirtualAxis.jvm)(Seq(scala3Next)) // Kyo builds on Scala Next

How Sage actually uses it

Sage is split in two. There’s a pure sans-IO core: the RESP protocol state machine, the command model, the codecs. It has no idea what an effect even is. Then there’s a single runtime layer, written once, where all the complexity lives.

That runtime is itself composed of two parts. One part is written against CIO and its streaming counterpart CStream, and covers everything where I want clean sequencing and resource handling. But the hot paths, for example socket read and write loops, buffering, and byte handling, don’t go through CIO at all. There I reach straight for low-level Java primitives, because on a loop that runs for every byte of every reply, the right effect abstraction is no effect abstraction. That split matters: I use the abstraction where it makes the code easier to reason about, and get out of its way where it would cost throughput.
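For a feel of what “sans-IO” means for the core, here’s a heavily reduced sketch of a decoder that handles only RESP simple strings and integers. The types and the decode function are hypothetical and far simpler than a real protocol state machine; it is not Sage’s code. The point is the shape: bytes in, a decision out, and no effect type anywhere.

```scala
import java.nio.charset.StandardCharsets.UTF_8

sealed trait Reply
final case class SimpleString(value: String) extends Reply
final case class IntegerReply(value: Long) extends Reply

sealed trait DecodeResult
case object NeedMoreBytes extends DecodeResult
final case class Decoded(reply: Reply, consumed: Int) extends DecodeResult
final case class Malformed(reason: String) extends DecodeResult

private val CR = '\r'.toByte
private val LF = '\n'.toByte

// Pure function over the bytes received so far. The caller owns the socket
// and decides what to do with each outcome.
def decode(buffer: Array[Byte]): DecodeResult = {
  // Index of the first CRLF terminator, or -1 if a full line has not arrived.
  val end = (0 until buffer.length - 1).indexWhere(i => buffer(i) == CR && buffer(i + 1) == LF)
  if (end < 0) NeedMoreBytes
  else if (end == 0) Malformed("missing type byte")
  else {
    val body = new String(buffer, 1, end - 1, UTF_8)
    buffer(0).toChar match {
      case '+' => Decoded(SimpleString(body), end + 2)
      case ':' =>
        body.toLongOption match {
          case Some(n) => Decoded(IntegerReply(n), end + 2)
          case None    => Malformed(s"bad integer: $body")
        }
      case other => Malformed(s"unsupported type byte: $other")
    }
  }
}
```

Because nothing here touches a socket or an effect, the same function can sit under any runtime, and it can be tested with plain byte arrays.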

Because CIO is opaque (outside its own module it has no operations), it isn’t a type I can hand to a user. That is why the runtime’s client is parameterized over its effect type: the shared code produces a Client[CIO, String], and each backend wraps that one object in a thin facade that lowers CIO to the backend’s native effect and exposes the result. Here’s the bridge, abridged, written a single time in the shared module:

abstract class LoweredClient[F[_]](underlying: Client[CIO, String]) extends Client[F, String] {

  protected def lower[A](c: CIO[A]): F[A]
  protected def lift[A](fa: F[A]): CIO[A]

  final def run[A](command: Command[A]): F[A] =
    lower(underlying.run(command))

  final def cached[A](command: Command[A], ttl: FiniteDuration): F[A] =
    lower(underlying.cached(command, ttl))

  // ...the rest of the surface, every method expressed through lower/lift
}

The whole client API is defined in terms of two natural transformations: lower from CIO to the backend effect, and lift back the other way. A backend supplies just those two. For Cats Effect, that’s the entire implementation:

final private class Lowered(underlying: Client[CIO, String]) extends LoweredClient[IO](underlying) {
  protected def lower[A](c: CIO[A]): IO[A] = c.lower
  protected def lift[A](fa: IO[A]): CIO[A] = CIO.lift(fa)
}

The c.lower call is where kyo-compat unboxes the carrier into the concrete IO, and because this is all inline, it does not become a runtime bridge object sitting between the user and IO.
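If you want to play with the shape of this outside of Sage, here’s a self-contained toy version in Scala 3. All names are hypothetical: Compat.Carrier stands in for CIO, Try stands in for the backend’s native effect, and KeyValue is a one-method stand-in for the client. It only demonstrates the pattern of a shared implementation, a shared bridge, and a tiny per-backend class.

```scala
import scala.util.{Success, Try}

object Compat {
  // Toy carrier: opaque outside this object, a plain Try inside it.
  opaque type Carrier[A] = Try[A]
  object Carrier {
    def pure[A](a: A): Carrier[A] = Success(a)
  }
  extension [A](c: Carrier[A]) def unwrap: Try[A] = c
}
import Compat.*

trait KeyValue[F[_]] {
  def get(key: String): F[Option[String]]
}

// Shared implementation, written once against the carrier.
object InMemory extends KeyValue[Carrier] {
  private val data = Map("k" -> "v")
  def get(key: String): Carrier[Option[String]] = Carrier.pure(data.get(key))
}

// Shared bridge: every method is the underlying call passed through `lower`.
abstract class LoweredKeyValue[F[_]](underlying: KeyValue[Carrier]) extends KeyValue[F] {
  protected def lower[A](c: Carrier[A]): F[A]
  final def get(key: String): F[Option[String]] = lower(underlying.get(key))
}

// Per-backend facade: the only backend-specific code is the unwrapping.
final class TryKeyValue(underlying: KeyValue[Carrier]) extends LoweredKeyValue[Try](underlying) {
  protected def lower[A](c: Carrier[A]): Try[A] = c.unwrap
}
```

With this toy, new TryKeyValue(InMemory).get("k") has the static type Try[Option[String]], with no carrier visible to the caller.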

The type a user actually sees is just an alias:

// Cats Effect
type SageClient = Client[IO, String]

// ZIO
type SageClient = Client[Task, String]

// Pekko
type SageClient = Client[Future, String]

// Kyo
type SageClient = Client[[A] =>> A < (Abort[Throwable] & Async), String]

// Ox, direct style
type SageClient = Client[[A] =>> Ox ?=> A, String]

So when a Cats Effect user writes client.get(key), they get back an IO[Option[String]] instead of a wrapper type or a CIO.

One test suite, five runtimes

There’s a payoff to this that I didn’t fully appreciate until I was deep in it, and it might be my favorite part: the same build setup runs my tests on every backend too.

My integration suite, the one that talks to real Redis and Valkey servers, is written once, in the shared module, against Client[CIO, String]. The tests read like ordinary client code, because that is exactly what they are:

test("COPY copies and only overwrites with replace") {
  withClient { client =>
    for {
      _         <- client.set("src", "v1")
      _         <- client.set("taken", "v2")
      fresh     <- client.copy("src", "dst")
      ontoTaken <- client.copy("src", "taken")
      replaced  <- client.copy("src", "taken", replace = true)
      copied    <- client.get[String]("dst")
    } yield {
      assertEquals(fresh, true)
      assertEquals(ontoTaken, false)
      assertEquals(replaced, true)
      assertEquals(copied, Some("v1"))
    }
  }
}

withClient connects to a container, runs the CIO body, and closes the client afterwards. No backend is named anywhere in the test.

That suite isn’t compiled just once. Every backend cell recompiles and runs it, so the same assertions execute against a real server with CIO lowered to a real IO on the Cats Effect cell, a real Task on ZIO, and so on. Nothing is mocked here: each run drives that backend’s actual runtime against actual Redis.

The payoff is that I never had to write a separate copy of the suite per backend. On top of that sit a few small per-backend smoke tests written in each ecosystem’s native idiom (real ZIO.scoped, real ZIO.foreachPar, and so on), just to prove the native surface feels right, but most of the behavior is written once and run everywhere.

Tradeoffs

This isn’t free, and I’d rather not pretend otherwise.

The first cost is the facade itself. Since CIO can never be exposed, every backend needs its own, so adding an ecosystem always means writing one. It isn’t only mechanical boilerplate, though: it is the right home for the things that differ between ecosystems and can’t live in a shared core. For Sage, each facade is somewhere around 250 lines, against 7,500 shared.

There are a couple of real limitations too. Errors are fixed to Throwable. The portable carrier doesn’t carry a typed error channel, so I don’t get ZIO’s typed errors even on the ZIO build, and there’s no per-backend environment or context typing either. The usual risk with an abstraction like this is that you’re left with the lowest common denominator across ecosystems, but here it isn’t a blocker: I can still expose backend-specific features in the facades.

Handling different behaviors

The most interesting facade is Pekko’s, because Future can’t be canceled, and that limitation can’t be ignored.

Most backends have an xConsume method that tails a Redis consumer group forever, and you cancel it by interrupting the effect. On ZIO, for example, you get back a Task[Unit] and .interrupt the fiber running it. With a Future, there’s no such thing: you can’t interrupt a Future that’s parked on a blocking read.

So the Pekko build is honest about it. xConsume there doesn’t return a bare Future; it returns a small RunningConsumer handle with a cooperative stop() built on a Pekko KillSwitch:

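import scala.concurrent.Future
import org.apache.pekko.Done
import org.apache.pekko.stream.UniqueKillSwitch

class RunningConsumer(killSwitch: Option[UniqueKillSwitch], val completion: Future[Done]) {
  // Signal shutdown (if a kill switch is present) and return the completion Future.
  def stop(): Future[Done] = {
    killSwitch.foreach(_.shutdown())
    completion
  }
}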

And because the loop can only stop between entries, an infinite blocking read would make stop() unresponsive. So the Pekko module flatly rejects a “block forever” timeout where the others happily accept it.
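The reasoning is easier to see in a stripped-down model. In the sketch below, PollingConsumer is a hypothetical class that polls a local queue instead of a Redis connection, and uses an AtomicBoolean instead of a Pekko KillSwitch; it is not the Pekko module’s code. The stop flag is only checked between reads, so the constructor refuses a timeout that would never return.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.duration.*

final class PollingConsumer(queue: LinkedBlockingQueue[String], block: Duration) {
  // A loop that can only observe `stopped` between reads must never block forever.
  require(block.isFinite && block > Duration.Zero, "block timeout must be finite and positive")

  private val stopped = new AtomicBoolean(false)

  // Cooperative stop: takes effect after the current read returns or times out.
  def stop(): Unit = stopped.set(true)

  // Runs on the caller's thread until stop() is observed; returns the number of entries seen.
  def run(onEntry: String => Unit): Int = {
    var seen = 0
    while (!stopped.get()) {
      val entry = queue.poll(block.toMillis, TimeUnit.MILLISECONDS) // null on timeout
      if (entry != null) {
        onEntry(entry)
        seen += 1
      }
    }
    seen
  }
}
```

With a finite timeout, the worst-case delay between calling stop() and the loop exiting is one timeout period. With an infinite one, stop() could wait indefinitely on an idle stream, which is why rejecting it up front is the more honest API.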

I think this is the right kind of leak. The abstraction handles everything that’s the same across backends, and where an ecosystem has a real semantic difference, that difference shows up in the API for that ecosystem instead of being faked. The shared core didn’t have to know or care.

Was it worth it?

Yes, definitely.

Five Scala backends (ZIO, Cats Effect, Kyo, Ox, Pekko), one implementation of every hard thing. When I fix a reconnection bug, I fix it once and every backend gets the fix. When I add a command, I add it once. The per-backend code is small enough that adding a sixth ecosystem someday would be only a couple hundred lines. I am quite happy with this approach so far.

If you want to see it in the wild, Sage is on GitHub. And if you’re about to start a Scala library and you’re staring at that “which effect system?” question, I hope this gives you another option worth considering.