GraphQL in Scala: Advanced Schema Generation

In my Beginner's Guide to GraphQL in Scala, I explained how Caliban can automatically transform Scala types into GraphQL types through a process called schema derivation. This mechanism enables you to generate a Schema for user-defined types using a simple import or instruction.

However, Scala types and GraphQL types sometimes need to differ. For example, you might want a type to have a different name in GraphQL than in Scala when you don't want to (or can't) rename the Scala type. There are also situations where your Scala types are not simple case classes, and the automatic derivation doesn't know how to handle them. In this article, we'll explore how to address these scenarios.

All code snippets can be reproduced in Scala CLI (using Scala 3, which is the default) by adding the following directive and imports (when more imports are required, they will be added to individual snippets):

//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*

Annotations as an Easy Way to Customize Schemas

The simplest way to customize the GraphQL types derived from Scala types is to add annotations on the Scala side. These annotations will be read at compile-time (with no runtime impact!) by the derivation logic and will alter the resulting GraphQL types (whether you use automatic or semi-automatic derivation). These annotations are available in the caliban.schema.Annotations object and are all prefixed with @GQL.

Let's examine a straightforward example.

case class Character(name: String) derives Schema.SemiAuto

println(render[Character])
// type Character {
//   name: String!
// }

As expected, the generated GraphQL type matches the Scala type precisely. However, what if we wanted to give this type a different name in GraphQL?

import caliban.schema.Annotations.*

@GQLName("Person")
case class Character(name: String) derives Schema.SemiAuto

println(render[Character])
// type Person {
//   name: String!
// }

Adding the @GQLName annotation before the type modifies the generated GraphQL type by giving it a different name. This annotation can also be used before field names.

Other commonly used annotations include:

@GQLDescription adds a description to the annotated type or field
@GQLExcluded excludes a field from the generated GraphQL type
@GQLDeprecated adds the @deprecated directive to the corresponding field in the generated GraphQL type, with a given reason
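As a minimal sketch of how these annotations combine, the snippet below applies them to a single case class. The field names (nickname, internalId) and the description text are hypothetical and only serve the illustration.

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*
import caliban.schema.Annotations.*

@GQLDescription("A character of the story")
case class Character(
  // Exposed as "fullName" in GraphQL while staying "name" in Scala
  @GQLName("fullName") name: String,
  // Still part of the schema, but flagged as deprecated with a reason
  @GQLDeprecated("Use fullName instead") nickname: String,
  // Hypothetical internal field, left out of the GraphQL type entirely
  @GQLExcluded internalId: Int
) derives Schema.SemiAuto

println(render[Character])
```

The rendered type should expose fullName and nickname but not internalId, while the Scala case class itself is unchanged.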

Another useful annotation is @GQLInputName, but it requires some explanation first. In GraphQL, there is a distinction between object types and input object types. Objects can only be returned as a response, while input objects can only be provided as arguments. This means that if you want the same object to be passed as an input and returned as a response, you need to create two different GraphQL types.

In Scala, however, you can use the same object as both an argument and a return type. Let's see how Caliban handles this.

case class Character(name: String)
  derives Schema.SemiAuto, ArgBuilder
case class CharacterArgs(character: Character)
  derives Schema.SemiAuto, ArgBuilder
case class Query(addCharacter: CharacterArgs => Character)
  derives Schema.SemiAuto

println(render[Query])
// input CharacterInput {
//   name: String!
// }
//
// type Character {
//   name: String!
// }
//
// type Query {
//   addCharacter(character: CharacterInput!): Character!
// }

As you can see, Caliban generates two GraphQL types: the regular object type is named after the Scala type (Character), while the input type has the Input suffix added (CharacterInput). This is Caliban's default behavior for naming input object types. However, by using the @GQLInputName annotation, you can rename the GraphQL type only when it is used as an input.

import caliban.schema.Annotations.*

@GQLName("Person")
@GQLInputName("IPerson")
case class Character(name: String)
  derives Schema.SemiAuto, ArgBuilder
case class CharacterArgs(character: Character)
  derives Schema.SemiAuto, ArgBuilder
case class Query(addCharacter: CharacterArgs => Character)
  derives Schema.SemiAuto

println(render[Query])
// input IPerson {
//   name: String!
// }
//
// type Person {
//   name: String!
// }
//
// type Query {
//   addCharacter(character: IPerson!): Person!
// }

See the Caliban documentation for a complete list of available annotations.

Using annotations is nice and easy, but there may be situations where we cannot use them. For instance, we might not own the type we want to annotate because it comes from a dependency, or we may not want to "pollute" our code with GraphQL-related elements. One option is to create dedicated types solely for defining your GraphQL API, but this approach requires additional boilerplate and transformations between your original types and your API types. A more efficient solution is to create custom schemas by reusing existing or automatically-generated ones.

Creating Schemas from Existing Schemas

Before delving deeper, you'll need a basic understanding of type classes in Scala. I recommend checking out this excellent introduction to type classes in Scala 3 as well as the dedicated section in the Scala documentation. Why is this relevant? Because Caliban's Schema is a type class: when you call the render function as we did earlier, it requires a given Schema (an instance of the Schema type class) in scope for the type passed as a parameter, and for all the nested types referenced within that type. In other words, when I call render[Character], it will only compile if there is a Schema for Character in scope.

Fortunately, most of the time we don't need to worry about this because Caliban already comes with type class instances for the most commonly used Scala types. Adding derives Schema.SemiAuto or import caliban.schema.Schema.auto.* in your code creates the required given for you transparently. However, creating a customized Schema will require that you provide your own given.
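For comparison with the derives Schema.SemiAuto style used so far, here is a small sketch of the automatic variant. The Address type is a hypothetical nested type added for the illustration; neither case class carries a derives clause, and the import alone is assumed to provide the givens.

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.Schema.auto.*

// No derives clause: the import above lets the compiler
// generate a Schema for these types wherever one is required
case class Address(city: String)
case class Character(name: String, address: Address)

println(render[Character])
```

The trade-off is mostly about control: automatic derivation requires less code, while the semi-automatic style makes it explicit which types have a Schema.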

Suppose you want to use a Scala type named Id that is actually an Int. Scala has a useful feature called opaque types that allows you to create an alias for a type that will be opaque to the rest of the code (the rest of the code won't know the underlying type; it is only known in the scope where the opaque type is defined).

object Types {
  opaque type Id = Int
}

This defines an opaque type Id that conceals the underlying Int. It's important to note that there is no runtime overhead (at runtime, the type will simply be Int).
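To make the opacity concrete, here is a small sketch in plain Scala 3 (no Caliban involved), written as a Scala CLI script. The Id.apply constructor and the value extension are hypothetical helpers added for the illustration; the original Types object does not define them.

```scala
object Types:
  opaque type Id = Int

  object Id:
    // Inside Types, Id and Int are interchangeable
    def apply(value: Int): Id = value

  // Gives the outside world a way to read the underlying Int
  extension (id: Id)
    def value: Int = id

import Types.*

val id: Id = Id(42)
// val n: Int = id  // would not compile: outside Types, Id is not known to be an Int
println(id.value)   // prints 42
```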

However, when attempting to use our opaque type with Caliban, we encounter an issue.

import Types.*
case class Character(id: Id) derives Schema.SemiAuto
//   1 |case class Character(id: Id) derives Schema.SemiAuto
//     |                                            ^
//     |Cannot find a Schema for type Types.Id.

The compiler complains that it cannot find a Schema for Id. This makes sense: Caliban has no idea what Id is (because it's opaque), so it cannot generate a Schema for it automatically or reuse an already defined one.

To solve this problem, we need to create a Schema for Id ourselves and use the given keyword to ensure it's in scope wherever we need a Schema. However, we don't have to write it from scratch: we can reuse the schema for Int provided by Caliban (defined by Schema.intSchema). For now, you can ignore Any in the following examples; I will address it in a future article.

object Types {
  opaque type Id = Int
  given Schema[Any, Id] = Schema.intSchema
}

import Types.*
case class Character(id: Id) derives Schema.SemiAuto

println(render[Character])
// type Character {
//   id: Int!
// }

Within our Types object, Id and Int are equivalent, so Schema[Any, Id] is the same as Schema[Any, Int]. This means we can simply assign Schema.intSchema as a value for our Schema[Any, Id]. This given is found when we attempt to create a schema for Character, preventing any compile errors, and we see that the corresponding GraphQL type is an Int as intended.

What if we wanted our Id to be a String in the GraphQL schema, but keep it as an Int in Scala? That's easy to achieve: instead of reusing Caliban's Schema for Int, we can reuse its Schema for String. The only additional step we need to take is to tell Caliban how to transform an Id into a String so that our resolver returns something that matches our schema. This can be done using the contramap function, as demonstrated below.

object Types {
  opaque type Id = Int
  given Schema[Any, Id] = Schema.stringSchema.contramap(_.toString)
}

import Types.*
case class Character(id: Id) derives Schema.SemiAuto

println(render[Character])
// type Character {
//   id: String!
// }

In summary, contramap enables the creation of a Schema for A given an existing Schema for B and a function from A to B. This makes it easy to support many "unknown" types as long as they closely resemble any type that already has a Schema.
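The same technique applies to ordinary wrapper types, not only opaque types. The sketch below assumes a hypothetical Email case class; the function passed to contramap goes from the new type (Email) to the type that already has a Schema (String).

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*

// Hypothetical wrapper type, not part of Caliban
case class Email(value: String)

// Reuse the Schema for String by explaining how to turn an Email into a String
given Schema[Any, Email] =
  Schema.stringSchema.contramap[Email](_.value)

case class User(email: Email) derives Schema.SemiAuto

println(render[User])
```

With this given in scope, the email field should appear as a plain String! in the rendered schema.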

For an additional example, here's how to define a Schema for Vector that simply reuses the Schema for List:

given vectorSchema[A](using Schema[Any, A]): Schema[Any, Vector[A]] =
  Schema.listSchema.contramap(_.toList)
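The same pattern should carry over to your own collection types. The following sketch assumes a hypothetical NonEmpty type (it is not provided by Caliban) and converts it to a List so that the existing list Schema can be reused.

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*

// Hypothetical collection type that always holds at least one element
case class NonEmpty[A](head: A, tail: List[A])

// A Schema for NonEmpty[A] exists whenever a Schema for A exists
given nonEmptySchema[A](using Schema[Any, A]): Schema[Any, NonEmpty[A]] =
  Schema.listSchema[Any, A].contramap[NonEmpty[A]](ne => ne.head :: ne.tail)

case class Team(members: NonEmpty[String]) derives Schema.SemiAuto

println(render[Team])
```

Note that the non-emptiness guarantee is not visible in the GraphQL schema: the field is exposed as a regular list.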

Another simple action you can perform with an existing Schema is renaming the type using the rename function. You can use Schema.gen to obtain the auto-generated Schema when it's not directly provided by Caliban as with Int and String above, and then apply rename or contramap to it.

case class Character(name: String)

given Schema[Any, Character] =
  Schema.gen[Any, Character].rename("Person")

println(render[Character])
// type Person {
//   name: String!
// }

However, there are still cases where this is not sufficient: what if there is no existing Schema that we can reuse, or if we cannot use schema derivation at all (a common issue is that derivation sometimes struggles with recursive types)? You can always fall back to creating your instance of Schema from scratch.

Manually Creating Schemas

Under the hood, the Schema type class is defined as follows:

trait Schema[-R, T] {
  def toType(isInput: Boolean, isSubscription: Boolean): __Type
  def resolve(value: T): Step[R]
}

A Schema of T with an incoming context R requires two things:

A function toType that defines the shape of the GraphQL type generated from T. It returns a __Type, which represents a GraphQL type as defined in the GraphQL spec.
A function resolve that describes how to resolve a value of type T. It returns a Step, which represents a tree of operations that will eventually produce a ResponseValue (the content of the data field in the GraphQL response).

In this article, R is always Any, meaning we're not making use of the context. I will cover this in a future article.

While you can create a new Schema by implementing these two methods yourself, doing so requires a deeper understanding of the GraphQL spec and how Step works. If you only want to define types manually, there are helper functions to make this process easier.

The most useful one is Schema.obj for creating a Schema for an object type. It takes a name as a parameter (along with an optional description and list of directives) and a list of fields, which you can create by calling Schema.field. The field helper requires the name of the field and a function defining how to resolve the field from its parent object.

case class Character(name: String, age: Int)

given Schema[Any, Character] =
  Schema.obj("Character") { case given FieldAttributes =>
    List(
      Schema.field("name")(_.name),
      Schema.field("age")(_.age)
    )
  }

println(render[Character])
// type Character {
//   name: String!
//   age: Int!
// }

The code above generates the same Schema that would be produced if you simply wrote given Schema[Any, Character] = Schema.gen or used other derivation methods.

This boilerplate-y method of creating a Schema is quite similar to what many GraphQL libraries require and what Caliban aims to avoid, thanks to its use of derivation. However, it is available for cases where other methods cannot be used.

Note that you can define a single type manually like this while keeping everything else defined automatically; you don't have to use the same generation method everywhere.
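As a sketch of this mix, the snippet below defines Character manually, including a hypothetical isAdult field that is computed from age and has no counterpart in the case class, while the Query type that references it is still derived.

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*

case class Character(name: String, age: Int)

// Manual definition: the exposed fields are chosen one by one
given Schema[Any, Character] =
  Schema.obj("Character") { case given FieldAttributes =>
    List(
      Schema.field("name")(_.name),
      // Hypothetical computed field: exposed in GraphQL, absent from the case class
      Schema.field("isAdult")(_.age >= 18)
    )
  }

// Derived definition: picks up the manual Schema for Character from the given above
case class Query(hero: Character) derives Schema.SemiAuto

println(render[Query])
```

This is one situation where the manual approach does something derivation alone would not: the GraphQL type no longer mirrors the Scala type field for field.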

Another useful helper function is scalarSchema for creating custom scalars:

import java.util.UUID
import caliban.Value.*

given Schema[Any, UUID] =
  Schema.scalarSchema(
    name = "ID",
    description = None,
    specifiedBy = None,
    directives = None,
    makeResponse = uuid => StringValue(uuid.toString)
  )

This example demonstrates how to use the ID scalar in GraphQL when using Java's UUID on the Scala side.
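The same helper can introduce a scalar of your own rather than reusing a built-in name. The sketch below assumes a hypothetical Isbn type and an "ISBN" scalar name chosen for the illustration.

```scala
//> using dep com.github.ghostdogpr::caliban-quick:2.5.0
import caliban.*
import caliban.schema.*
import caliban.Value.*

// Hypothetical domain type, used only for illustration
case class Isbn(value: String)

given Schema[Any, Isbn] =
  Schema.scalarSchema[Isbn](
    name = "ISBN",
    description = Some("An International Standard Book Number"),
    specifiedBy = None,
    directives = None,
    // The scalar is serialized as a string in responses
    makeResponse = isbn => StringValue(isbn.value)
  )

case class Book(title: String, isbn: Isbn) derives Schema.SemiAuto

println(render[Book])
```

Clients then see a dedicated ISBN type in the schema instead of a generic String, which documents the intent of the field.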

Going Further

Here are a few resources to explore for more advanced use cases of Schema derivation:

Caliban's documentation on schemas
Caliban's code defining Schema for the most commonly used Scala types

In conclusion, Caliban offers a powerful and flexible approach to generating GraphQL schemas. Through automatic schema derivation, compile-time annotations, schema transformations, and full manual schema definition, you can create GraphQL APIs that meet your specific needs.

Next time, we'll take a deep dive into laziness, asynchronous computations, and effects!

Pierre Ricadat's Tech Blog