Simplifying internal validations using dry-validation

When building APIs for other developers, it’s often important to draw the line between other programmers’ input data and the internal world of your library. This process is called data validation, and you’re probably familiar with the name.

What you may not know is that it can be achieved in many ways.

One that I particularly like is using the dry-validation library. Here’s an example of how you can separate the validation from the actual business logic without changing the API of your library.

The inline way

The easiest way to provide validations is to embed the checks in a place where you receive the data.

This approach is great for super simple cases like the one below:

def sum(arg1, arg2)
  raise ArgumentError unless arg1
  raise ArgumentError unless arg2

  arg1.to_i + arg2.to_i
end

sum(2, nil) #=> ArgumentError (ArgumentError)

However, if you decide to follow this road, you will quickly end up with something really unreadable.

This code sample is taken from the ruby-kafka library and it’s used to validate the method input. I’ve removed the business logic parts as they aren’t relevant to the context of this article:

def build(
  ca_cert_file_path: nil,
  ca_cert: nil,
  client_cert: nil,
  client_cert_key: nil,
  client_cert_chain: nil,
  ca_certs_from_system: nil
)
  return nil unless ca_cert_file_path ||
                    ca_cert ||
                    client_cert ||
                    client_cert_key ||
                    client_cert_chain ||
                    ca_certs_from_system

  if client_cert && client_cert_key
    # business irrelevant to the checks
    if client_cert_chain
      # business irrelevant to the checks
    end
  elsif client_cert && !client_cert_key
    raise ArgumentError, "initialized with ssl_client_cert, but no ssl_client_cert_key"
  elsif !client_cert && client_cert_key
    raise ArgumentError, "initialized with ssl_client_cert_key, but no ssl_client_cert"
  elsif client_cert_chain && !client_cert
    raise ArgumentError, "initialized with ssl_client_cert_chain, but no ssl_client_cert"
  elsif client_cert_chain && !client_cert_key
    raise ArgumentError, "initialized with ssl_client_cert_chain, but no ssl_client_cert_key"
  end

  # business
end

Despite looking simple, the if-elsif validation is really complex and comes with several problems:

  • it checks several variables,
  • it mixes the checks together due to the if-flow,
  • in the end, it only checks the presence of the variables,
  • despite expecting string values, it will work with anything that is provided,
  • it forces us to spec out the validation cases with the business logic as they are coupled together.
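
To make the "presence only" point concrete, here is a minimal sketch. The `check_presence` method is a hypothetical, trimmed-down stand-in for the checks above (it is not code from ruby-kafka); it shows that any truthy value passes and that only the first failing branch is ever reported:

```ruby
# Hypothetical, simplified stand-in for the presence-only checks above.
def check_presence(client_cert: nil, client_cert_key: nil)
  if client_cert && !client_cert_key
    raise ArgumentError, "initialized with ssl_client_cert, but no ssl_client_cert_key"
  elsif !client_cert && client_cert_key
    raise ArgumentError, "initialized with ssl_client_cert_key, but no ssl_client_cert"
  end

  :ok
end

# Truthy non-strings are accepted, because only presence is checked.
check_presence(client_cert: 42, client_cert_key: [:not, :a, :string])

# A missing key is caught, but nothing is said about the value types.
begin
  check_presence(client_cert: "cert")
rescue ArgumentError => e
  puts e.message
end
```

The integer and the array go straight through to the business logic, where they would fail much later and with a far less helpful error.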

Luckily for us, there’s a better way to do that.

The private schema way

We can achieve the same functionality and much more just by extracting the validations into a separate internal class. Let’s build up only the interface for now.

Note that I’m leaving the ArgumentError and the external API intact, as I don’t want this change to impact anything that is outside of this class:

require 'dry-validation'

# Empty schema for now, we will get there
SCHEMA = Dry::Validation.Schema {}

def build(
  ca_cert_file_path: nil,
  ca_cert: nil,
  client_cert: nil,
  client_cert_key: nil,
  client_cert_chain: nil,
  ca_certs_from_system: nil
)
  input = {
    ca_cert_file_path: ca_cert_file_path,
    ca_cert: ca_cert,
    client_cert: client_cert,
    client_cert_key: client_cert_key,
    client_cert_chain: client_cert_chain,
    ca_certs_from_system: ca_certs_from_system
  }

  # Do nothing if there's nothing to do
  return nil if input.values.all?(&:nil?)

  results = SCHEMA.call(input)
  raise ArgumentError, results.errors unless results.success?

  # Business logic
end

We’ve managed to extract the validation logic outside.

Thanks to that, now we have:

  • a separation of responsibilities,
  • a business-logic method that we can test against valid cases only,
  • a validation object that we can test in isolation,
  • a much cleaner API that is easier to extend (new arguments, new data types, etc.) and/or replace,
  • a way to handle more complex validations (types, formats, etc.),
  • support for reporting multiple issues with the input at the same time.
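
To illustrate the "test in isolation" and "multiple issues at once" points without requiring the gem, here is a minimal sketch. `StandInResult` and `STAND_IN_SCHEMA` are hypothetical plain-Ruby stand-ins that only mimic the `call` / `success?` / `errors` interface used in `build`; they are not dry-validation, and the error messages are made up for the example:

```ruby
# Hypothetical result object exposing the same interface `build` relies on.
StandInResult = Struct.new(:errors) do
  def success?
    errors.empty?
  end
end

STRING_KEYS = %i[ca_cert client_cert client_cert_key].freeze

# Hypothetical stand-in validator: collects every problem instead of
# raising on the first one.
STAND_IN_SCHEMA = lambda do |input|
  errors = {}

  STRING_KEYS.each do |key|
    value = input[key]
    next if value.nil? || value.is_a?(String)

    errors[key] = ["must be a string"]
  end

  if input[:client_cert] && !input[:client_cert_key]
    (errors[:client_cert_key] ||= []) << "must be filled"
  end

  StandInResult.new(errors)
end

# The validator can be exercised on its own, with no business logic involved.
result = STAND_IN_SCHEMA.call({ ca_cert: 2, client_cert: "cert" })
result.success?    # => false
result.errors.keys # => [:ca_cert, :client_cert_key]
```

Both problems are reported in a single pass, which the if-elsif chain could never do.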

We can now perform all the checks and run the business logic only when everything is good. But what about the validation itself?

All of the validations below are copy-pasted from the karafka repository. For details on the DSL, see the dry-validation documentation.

require 'dry-validation'

SCHEMA = Dry::Validation.Schema do
  %i[
    ca_cert
    ca_cert_file_path
    client_cert
    client_cert_key
    client_cert_chain
  ].each do |encryption_attribute|
    optional(encryption_attribute).maybe(:str?)
  end

  optional(:ca_certs_from_system).maybe(:bool?)

  rule(
    client_cert_with_client_cert_key: %i[
      client_cert
      client_cert_key
    ]
  ) do |client_cert, client_cert_key|
    client_cert.filled? > client_cert_key.filled?
  end

  rule(
    client_cert_key_with_client_cert: %i[
      client_cert
      client_cert_key
    ]
  ) do |client_cert, client_cert_key|
    client_cert_key.filled? > client_cert.filled?
  end

  rule(
    client_cert_chain_with_client_cert: %i[
      client_cert
      client_cert_chain
    ]
  ) do |client_cert, client_cert_chain|
    client_cert_chain.filled? > client_cert.filled?
  end

  rule(
    client_cert_chain_with_client_cert_key: %i[
      client_cert_chain
      client_cert_key
    ]
  ) do |client_cert_chain, client_cert_key|
    client_cert_chain.filled? > client_cert_key.filled?
  end
end
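
A note on the `>` used in the rules: it is not a numeric comparison. As far as I understand this version of the dry-validation DSL, it reads as implication ("if the left side holds, the right side must hold as well"). The plain-Ruby sketch below only illustrates that truth table; `implies?` is a hypothetical helper and not part of dry-validation:

```ruby
# Hypothetical helper: logical implication, "if a then b".
def implies?(a, b)
  !a || b
end

# Walk through every combination for the first rule:
# client_cert.filled? > client_cert_key.filled?
[[true, true], [true, false], [false, true], [false, false]].each do |cert, key|
  puts "client_cert filled: #{cert}, client_cert_key filled: #{key} " \
       "-> valid: #{implies?(cert, key)}"
end
```

Only the "cert without key" combination is rejected by this rule, which is why the schema also declares the mirrored rule for "key without cert".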

The result at runtime is also much more informative:

build(ca_cert: 2) #=> {:ca_cert=>["must be a string"]} (ArgumentError)

Summary

Whenever you find yourself adding inline validations, stop and think twice: there’s probably a better and more extensible way to do it.