2016-04-11

An API Design Conundrum

Over the past months, I've been working through a tricky API design problem in the longevity project. After a number of trials, I think I've finally come up with something that does not feel kludgy, and is easy for the user to understand and use. In this post, I will describe the problem I was having, and what makes it difficult. I will show some of my solutions that didn't work so well, and finally, present the solution I've landed on. It's not perfect, and never will be, so I am eager to hear any thoughts you might have on it.

The crux of the problem is this. I have two types: a Persistent and a PType. The Persistent is the thing you want to store in the database, and the PType contains meta-information about that type, such as which properties of the type are unique, and which properties need fast lookup. I'm calling these things keys and indexes. These kinds of issues are meta because they do not describe qualities of a single Persistent entity, but rather, qualities of collections of those entities. Here's my stock example, a user in a blogging application. We want the user's username and email to both be unique, and we want to be able to perform fast lookups on lastName/firstName combinations:

case class User(
  username: String,
  email: Email,
  firstName: String,
  lastName: String)
extends Persistent

object User extends PType[User] {
  // 1. we need some way to indicate that username and email
  // are unique
  // 2. we need some way to indicate that lastName/firstName
  // searches should be fast
}
I'm not entirely happy with the terms "key" and "index", because they sort of imply that these are database schema issues, whereas I am trying to present them as domain issues. And indeed they are domain issues. Ask yourself this question: would a domain expert care about whether a User's username is unique? Yes, of course they would. This is a domain issue, not a database issue. Similarly, if there is a requirement that fullname searches should be fast, this is a domain-level requirement. Remember that domain experts should not care about your database schema. But this does not make the database schema irrelevant! As with anything else, the database schema is there to support and implement the domain model. I'm sticking with these standard database schema terms mainly because I couldn't think of anything better, and the basic concepts behind the terms "key" and "index" will be immediately apparent to most engineers.

But here's the tricky thing: longevity repositories need a set of the keys, and a set of the indexes, in order to maintain the database schema, and to build lookup requests. On the other hand, the user needs each key to be individually named, so that they can use the key to build key values to retrieve entities. So on the one hand, longevity would like to see something like this:

trait PType[P <: Persistent] {
  val keySet: Set[Key[P]]
  val indexSet: Set[Index[P]]
}
Whereas the longevity user would like to see something like this:

object User extends PType[User] {
  val usernameKey = key("username")
  val emailKey = key("email")
  val fullnameIndex = index("lastName", "firstName")
}
(Full disclosure: the key and index methods don't actually take String arguments, they take Props. But I'd like to gloss over that point for the purposes of this post.)

The user wants to get a key value like so:

val username: String = getUsernameFromSomewhereElse()
val keyVal: KeyVal[User] = User.usernameKey(username)
They can then use the key value to look up a user:

val userWrapper = userRepo.retrieve(keyVal)
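
To make the two requirements concrete, here is a minimal in-memory sketch. Everything in it (this Key, KeyVal, InMemoryRepo, and UserKeys) is a hypothetical stand-in written for this illustration, not longevity's actual API. The repository only ever sees the set of keys, while the caller needs a named key to build a key value:

```scala
// Hypothetical stand-ins for illustration; longevity's real types differ.
final case class User(username: String, email: String)

final case class KeyVal[P](propName: String, value: Any)

final case class Key[P](propName: String, extract: P => Any) {
  // user-facing side: build a key value from a raw value
  def apply(value: Any): KeyVal[P] = KeyVal(propName, value)
}

// repository-facing side: it needs the whole set of keys
class InMemoryRepo[P](keySet: Set[Key[P]]) {
  private var entities = Vector.empty[P]

  // reject an entity that duplicates an existing value for any key
  def create(p: P): Unit = {
    keySet.foreach { k =>
      require(
        !entities.exists(e => k.extract(e) == k.extract(p)),
        s"duplicate value for key ${k.propName}")
    }
    entities = entities :+ p
  }

  // find the key that the key value belongs to, then the matching entity
  def retrieve(keyVal: KeyVal[P]): Option[P] =
    for {
      k <- keySet.find(_.propName == keyVal.propName)
      e <- entities.find(e => k.extract(e) == keyVal.value)
    } yield e
}

object UserKeys {
  val usernameKey = Key[User]("username", _.username)
  val emailKey = Key[User]("email", _.email)
  val keySet: Set[Key[User]] = Set(usernameKey, emailKey)
}
```

With this in place, new InMemoryRepo(UserKeys.keySet) plays the role of the repository, and UserKeys.usernameKey("smith") builds the key value passed to retrieve. Note that the sketch declares every key twice, once by name and once in the set, which is exactly the duplication the rest of this post tries to avoid.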
My first attempt at resolving the tension between these two requirements was to have PType methods key and index that both build the artifact and add it to the appropriate set. In other words, the call to key("username") above would build the key, add it to the keySet, and then return it. But there is a big problem with this: there is no obvious way to make sure that the calls to methods key and index have happened before the PType is fully initialized and passed to the repo. For instance, consider the following alternative to the example above:

object User extends PType[User] {
  object keys {
    val username = key("username")
    val email = key("email")
  }
  object indexes {
    val fullname = index("lastName", "firstName")
  }
}
The code here is much better organized. But because User.keys and User.indexes are nested objects, they are initialized lazily, on first access, independently of the containing object User! In this case, User.keySet and User.indexSet will both be empty sets when the repository gets its hands on them. Not good! You can work around this problem like so:

object User extends PType[User] {
  object keys {
    val username = key("username")
    val email = key("email")
  }
  keys // initialize keys now
  object indexes {
    val fullname = index("lastName", "firstName")
  }
  indexes // initialize indexes now
}
Of course, I do not want to ask my users to do this!
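
The initialization-order problem is easy to reproduce in a self-contained sketch. The names below (Key, RegisteringPType, LazyUser, EagerUser) are hypothetical and only mimic the "build it and register it" idea. Here key adds to a mutable set as a side effect, and the nested object is only constructed when something touches it:

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration; longevity's real types differ.
final case class Key(propName: String)

trait RegisteringPType {
  private val registered = mutable.Set.empty[Key]

  // builds the key and records it as a side effect
  def key(propName: String): Key = {
    val k = Key(propName)
    registered += k
    k
  }

  // snapshot of whatever has been registered so far
  def keySet: Set[Key] = registered.toSet
}

object LazyUser extends RegisteringPType {
  // nothing here runs until LazyUser.keys is first accessed
  object keys {
    val username = key("username")
    val email = key("email")
  }
}

object EagerUser extends RegisteringPType {
  object keys {
    val username = key("username")
    val email = key("email")
  }
  // does the same job as the bare `keys` reference in the workaround above:
  // touching the nested object forces it to initialize
  private val forced = keys
}
```

LazyUser.keySet is empty until some code touches LazyUser.keys, whereas EagerUser.keySet holds both keys as soon as EagerUser itself is initialized.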

What if I simply require the users to build the sets themselves? For example:

object User extends PType[User] {
  object keys {
    val username = key("username")
    val email = key("email")
  }
  val keySet = Set(keys.username, keys.email)
  object indexes {
    val fullname = index("lastName", "firstName")
  }
  val indexSet = Set(indexes.fullname)
}
This is a little less kludgy, but it is asking a lot of the user. They have to declare each key and index in two different places.

At this point, I decided I would build the keySet and indexSet with reflection, by scanning over the inner objects keys and indexes. This works well, and the User companion object once again looks like this:

object User extends PType[User] {
  object keys {
    val username = key("username")
    val email = key("email")
  }
  object indexes {
    val fullname = index("lastName", "firstName")
  }
}
In my mind, using reflection to solve an API problem is a last-resort kind of solution. But I take comfort in reminding myself that most frameworks solving these sorts of problems resort to some sort of extra-lingual technique, be it reflection, bytecode rewriting, or annotations. The user will get a runtime exception if the inner objects are missing, but I suppose I could use a macro to convert that into a compile-time error if I wanted to. (Or you could! I would be thrilled to see your pull request!)
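
For readers curious what such a scan might look like, here is a rough sketch using plain Java reflection. It is not longevity's implementation. It rests on an assumption about how the Scala compiler encodes an object nested in a top-level object: as a class named Outer$inner$ with a static MODULE$ field. All names here (Key, ScanningPType, ScannedUser) are hypothetical, and the scan is made lazy so that it runs only after the enclosing object has been constructed:

```scala
// Hypothetical stand-ins for illustration; longevity's real types differ.
final case class Key(propName: String)

trait ScanningPType {
  def key(propName: String): Key = Key(propName)

  // lazy, so the scan runs on first use rather than during construction
  lazy val keySet: Set[Key] = scanInnerObject("keys")

  private def scanInnerObject(name: String): Set[Key] = {
    // Assumption: `object keys` nested in `object ScannedUser` compiles to
    // a class named `ScannedUser$keys$` with a public static MODULE$ field.
    val innerClass =
      try Class.forName(getClass.getName + name + "$", true, getClass.getClassLoader)
      catch {
        case _: ClassNotFoundException =>
          throw new IllegalStateException(s"missing inner object '$name'")
      }
    // reading the static field initializes the nested object
    val instance = innerClass.getField("MODULE$").get(null)
    // collect the result of every zero-argument accessor that returns a Key
    innerClass.getDeclaredMethods
      .filter(m => m.getParameterCount == 0 &&
        classOf[Key].isAssignableFrom(m.getReturnType))
      .map(m => m.invoke(instance).asInstanceOf[Key])
      .toSet
  }
}

object ScannedUser extends ScanningPType {
  object keys {
    val username = key("username")
    val email = key("email")
  }
}
```

Under those assumptions, ScannedUser.keySet picks up both keys without the user listing them a second time, and a PType with no inner keys object fails at runtime with an IllegalStateException. A scan based on Scala's own runtime reflection would be less tied to the compiler's class-naming scheme.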

I really wish I could enforce the presence of the inner objects in a typesafe way, but it is simply not possible. I did consider doing something like this:

object User extends PType[User] {
  val keys = new AnyRef {
    val username = key("username")
    val email = key("email")
  }
  val indexes = new AnyRef {
    val fullname = index("lastName", "firstName")
  }
}
But this doesn't work particularly well, as User.keys now has a structural type, and accessing User.keys.username is going to be a reflective method call. Not only is this too much of a compromise in itself, it also asks the user to import scala.language.reflectiveCalls to silence the compiler's feature warning. That's a non-starter.

I'm still left with one problem: What if the longevity user wants to "do things their way", and is not happy about having to create the inner objects according to convention? It turns out that is not a problem at all! Internally, PType looks something like this:

trait PType[P <: Persistent] {
  val keySet: Set[Key[P]] = reflectivelyScanForKeys()
  val indexSet: Set[Index[P]] = reflectivelyScanForIndexes()
  private def reflectivelyScanForKeys() = // ...
  private def reflectivelyScanForIndexes() = // ...
}
If the longevity user doesn't want the reflective scanning to occur on the inner objects keys and indexes, all they have to do is override keySet and indexSet, so that the reflective scanning methods do not get called. For instance, suppose a user was adamant about declaring the keys directly in the companion object, as in my initial example:

object User extends PType[User] {
  val usernameKey = key("username")
  val emailKey = key("email")
  val fullnameIndex = index("lastName", "firstName")
}
They can easily make this work by providing two overrides:

object User extends PType[User] {
  val usernameKey = key("username")
  val emailKey = key("email")
  val fullnameIndex = index("lastName", "firstName")

  override val keySet = Set(usernameKey, emailKey)
  override val indexSet = Set(fullnameIndex)
}
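
Here is a small sketch of the override mechanism. It assumes the default is declared as a lazy val, so that the default initializer is evaluated only when nobody overrides it. That is an assumption made for this sketch, and the names Key, DefaultingPType, FlatUser and BareUser are hypothetical. The scan is replaced by a stub that always fails, which makes it obvious whether it was called:

```scala
// Hypothetical stand-ins for illustration; longevity's real types differ.
final case class Key(propName: String)

trait DefaultingPType {
  def key(propName: String): Key = Key(propName)

  // stand-in for the reflective scan: fails loudly whenever it is called
  private def scanForKeys(): Set[Key] =
    throw new IllegalStateException("inner object 'keys' not found")

  // default: scan on first access, unless a subtype overrides this member
  lazy val keySet: Set[Key] = scanForKeys()
}

object FlatUser extends DefaultingPType {
  val usernameKey = key("username")
  val emailKey = key("email")

  // a lazy val can only be overridden by another lazy val
  override lazy val keySet: Set[Key] = Set(usernameKey, emailKey)
}

// no inner object and no override: the default scan runs and fails
object BareUser extends DefaultingPType
```

FlatUser.keySet returns the two declared keys and never reaches the failing scan, while BareUser.keySet throws on first access.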
All in all, I'm pretty happy with the solution I came up with. This brings me a feeling of satisfaction, as it took me many months to get to this solution! But there are probably a dozen other potential solutions that are just as good, if not better. Do you have any ideas? What do you think about the inner object approach I settled on?

For more details on how to build longevity keys and indexes, please see the chapter on the persistent type in the longevity user manual.
