Languages and patterns

I’ll be demoing some work-in-progress tools I wrote to help myself learn languages, segueing into a discussion on agglutinative languages and vowel harmony, and using this as an opportunity to talk about how some tricks with patterns in Haskell (and and or patterns) can help us write cleaner code.

Links from the talk

Golden round-trip testing

In this talk I introduce golden round-trip testing.

Golden tests1 and property-based round-trip testing2 are two heavy-hitters that deserve to be in every programmer’s testing arsenal.

It turns out that a mashup of the two is even better!



A taste of topos theory

For a computer programmer, graphs might seem commonplace and not particularly interesting. But in fact they secretly come with a lot of extra structure. For instance:

  • we can add and multiply graphs
  • the graph homomorphisms between any two graphs form another graph
  • for every graph G, there’s a “powergraph” of subgraphs of G
  • subgraphs of G correspond to predicates on G, but where instead of the booleans we use for predicates on sets, there’s a special graph Ω to use instead.

So where does this structure come from? The answer is the mathematical field of topos theory.

There are many ways to approach topos theory. One view is that the concept of “topos” allows us to extract an interface from set theory. Now we can apply this interface to other domains — provided they also fit the definition of a topos. In other words, we can continue to use the language of sets in other, more complicated domains. Luckily, there are lots of other topoi!

This interface turns out to be a typed lambda calculus, and one with a lot of nice features: it has sums and products, it’s dependently typed, it has refinement types with intersection and union i.e. we can form subtypes by applying a predicate, and very frequently it comes with other goodies like modalities.

By happy coincidence some of the most common constructions in discrete maths and computer science - graphs (of multiple flavours), automata - are topoi, or at least quasitopoi, a minor variation.

I’ll show how to derive the special graph Ω that serves as a replacement for the booleans when forming predicates on graphs.

Dependent types made difficult

In the spirit of Mathematics Made Difficult and Monads Made Difficult, let’s take a deeper look at dependent types to see if we can come to understand them a little better.

The field of categorical semantics tells us that the simply typed lambda calculus has a natural interpretation in any Cartesian closed category: it is their “internal language”.

What’s an internal language good for? On one hand, having a high-level type theory for talking about low-level properties can greatly simplify proofs.

On the other hand, this language can have very practical consequences. As he describes in his paper “Compiling to Categories”1, Conal Elliott was able to put this correspondence to work in the form of a GHC plugin that allows the same program to be interpreted in multiple different CCCs — yielding applications including hardware circuits, automatic differentiation, incremental compilation and interval analysis.

What’s the story for dependent types i.e. is there a categorical semantics for dependent types?

It turns out that dependent type theory is the internal logic of locally Cartesian closed categories: ones where the pullback functor has both left and right adjoints.

Pullback corresponds to substitution into a dependent type. These left and right adjoints are called dependent sum and product, and correspond to dependent sum and product types.

I’ll explore what this looks like via explicit constructions in the category of sets (with pictures!).


Compile-time parsing

Sometimes, rather than constructing a domain object in code, it’s clearer to specify it as YAML or JSON. But at the same time, we don’t want to give up the compile-time guarantees we get from a programming language, as this would expose us to runtime errors if we made a mistake in the YAML/JSON.

Can we write in another language and then check it at compile time? We can, through the power of macros!

I’ll show how to build useful compiler checking of arbitrary strings.

Key ingredients will be:

  • macros that produce macros
  • string interpolators, and
  • compile-time evaluation.

Conditional contexts

Wouldn’t it be nice if Scala would let you write a parametric function that could do different things depending on whether or not the type parameter was an instance of some type class?

For instance

  • a Set[A](a: A*) function that creates either a TreeSet or a HashSet, depending on whether an Ordering[A] is in scope.
  • a numerical algorithm doCalc[N : Numeric](n: Matrix[N]) that works for all numeric types N, but uses a (more expensive, but numerically stable) algorithm for floating point numbers, but not integers (which don’t need it).

I’ll run through the use and implementation of a Haskell library (ifcxt) which provides this functionality, then show the (surprisingly simple) Scala implementation.

Notably, this has to be an extension to the type system in Haskell, but practically comes for free in Scala.

Accompanying Scala code allows you to play with three different implementations.

Monadic matching mishaps

The following Scala code fails to compile:

  // Search for a file by filename glob
  def find(glob: String): Error \/ File

  // retrieve the first and last lines
  def firstLastLine(file: File): Error \/ (String, String)

  // summarize the first text file
  val summary = for {
    file          <- find("*.txt")
    (first, last) <- firstLastLine(file)
  } yield file + ": " + first.take(20) + "..." + last.takeRight(20)

with the error

could not find implicit value for parameter M: scalaz.Monoid[Error]

but it’s not immediately clear why, or why it does work when we replace \/ with Option.

I’ll discuss why this occurs, a simple fix to make it work, and compare with similar situations in other languages with pattern-matching and monads (Haskell, Idris, Purescript), ending by identifying a Scala feature that perhaps could be improved in future compiler versions.

How do functional programmers do dependency injection?

How should we think about dependency injection?

In the context of imperative languages there are many well-established frameworks for doing dependency injection. Is there a functional programming equivalent?

We often hear that the FP answer is “use a Reader monad”. However, I’ll argue this isn’t a good answer.

A functional programmer would instead turn the question around and ask “What operations would you do if you had these dependencies?”. By reframing the original problem we reveal a more fundamental outlook. Namely, what are the languages of operations these dependencies allow us to express, and what are the interpreters for those languages?

To achieve reuse, both languages and interpreters will need to be modular, i.e. support good notions of composition.

So how do we go about doing this?

I’ll compare and contrast two solutions of this problem. The first, popularised in Wouter Swierstra’s Functional Pearl Datatypes à la Carte (and familiar to some through Rúnar Bjarnason’s talk “Compositional application architecture with reasonably priced free monads”) builds up a data structure of abstract operations, which are then run by an interpreter.

The second approach — popularised by Oleg Kiselyov as Typed tagless final interpreters — takes advantage of Scala’s support for higher-kinded types. Rather than building up and tearing down a data structure, we simply add a type class constraint to our program. “Interpretation” is now a compile-time operation: it’s just specialisation to a type that satisfies the constraints. This is both simpler and more efficient, and sufficient for almost all purposes.

Stop paying for free monads!

Free monads, as used in the Datatypes à la carte pattern, are a useful way to structure code, separating of specification from implementation and enabling modularisation.

But they also come with a runtime performance penalty…

The complementary typed tagless final interpreters approach (popularised in the Haskell and OCaml worlds by Oleg Kiselyov) offers a performance boost over free monads.

Yet, frustratingly, available sample code centres around simple expression languages, and “real world” examples that you might be able to put to use in your day job are hard to come by.

In this talk, we’ll compare and contrast the typed tagless approach with datatypes à la carte and explore how the typed tagless approach can be used in real production code — with configuration, state, side effects and error handling.

Haskell code for the associated LambdaJam workshop.

Macro-based smart constructors

Smart constructors are a nice pattern for building data types that need extra validation or calculation.

But in the case when we’re building our data type from statically known values, they leave us with two annoyances:

• they’re more clunky

• we have a run-time test for something that’s known at and should be checked at compile time.

We’ll investigate how Scala macros can help us address these problems and thus achieve both additional safety and greater convenience.

Accompanying Scala code demonstrating the techniques in action.

Seven trees in one

It’s a rather surprising fact that there’s an O(1) bijection between the datatype of unlabelled (planar) binary trees

data Tree = Leaf | Node Tree Tree

and 7-tuples of these trees. This presentation explores the fascinating story of how this bijection comes about, and the interesting connections to notions of isomorphism of types and distributive categories.

Haskell code containing a QuickCheck test of the bijection.

Getting higher: Higher-kinded and higher rank types in Scala

Some of us may have heard the terms “higher-kinded type” or “higher rank type” and wondered what they meant, or what the difference was.

In this talk I’ll attempt to demystify these terms and give a few examples of how they can be useful in practice.

Much ado about monoids

I’ll introduce the Monoid and Foldable type classes from Scala’s scalaz library and show how the compositionality property they satisfy means they can be used to do basic data analysis in a single pass over the data.

Blog post.

Spot an error or typo? Please let me know.