Haskell and Web Applications

Tuesday, February 6, 2007

Introduction

Lately I've been spending a lot of my free time on writing a web application in Haskell. Originally I planned to write one big article documenting the effort, but once in a while I end up doing things that are so cool I feel like a fourteen-year-old girl who just got her first cellphone for Christmas - I just can't shut up about it. One of these things is generic introspection of data structures (more commonly referred to as reflection) and its application to rendering data structures into HTML in a generic manner.

In most web applications I've worked on, a huge chunk of development time ended up going into repetitive work. Of course most smart people refuse to do what computers can do, and hence in recent years many frameworks have popped up that largely succeed in automating the repetitive chores involved in building web applications. Perhaps the most famous technique is RoR's "scaffolding" - introspecting data structures to generate boilerplate code that makes the process of building basic web front ends automatic. If you've ever wondered how to do something similar in Haskell you've come to the right place. Grab some coffee, sit back, and enjoy the ride.

The Problem

Essentially all web applications have some data structures that represent the concepts an application is attempting to model. This data needs to be presented on the front-end in various ways: as a detailed information pane, a form, a summary, etc. After writing code to transform data structures into HTML we end up realizing that much of the code is extremely repetitive - aside from rare special cases the code does the same thing over and over again. Consider the following data structure:

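-- Deriving Typeable and Data needs the DeriveDataTypeable extension
-- and an import of Data.Generics.
data User = User {
       firstName :: String,
       lastName :: String,
       title :: String,
       age :: Int
   } deriving (Read, Show, Typeable, Data)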

We've just defined a user that has a first and last name, a title, and an age. We may want to present this data structure in the web browser like this:

First Name: Joe
Last Name: Average
Title: Programmer
Age: 33

Using the following HTML:

<div>
   <span>First Name:</span>
   <span>Joe</span>
   <br />
   <span>Last Name:</span>
   <span>Average</span>
   <br />
   <span>Title:</span>
   <span>Programmer</span>
   <br />
   <span>Age:</span>
   <span>33</span>
</div>

This is simple enough. If we have to do this often (which, more than likely, we do) we'll write a routine to automate the task of rendering the User data structure in the way described above. However, consider what happens when we define another data structure (say, Company). The way we render it to HTML is not very different from the way we render User; only the fields change. If we want to abstract this behavior (you do abstract this behavior, don't you?) we have to write a generic routine that works on any data structure. This isn't very hard to do with reflection, but the trick is to automate the rendering in a way that lets us easily handle special cases when they arise. We need to be able to specialize our code for particular data structures, otherwise we can't use the framework for real-world applications - we'll be stuck the minute we hit a special case.

Fortunately all this (and a lot more) is very easy to do in Haskell. In a few dozen lines of code we'll create a flexible, extensible framework that allows us to do everything described in this section and more. Let's catch our breath and move on!

Type Reflection

In order to render an object of type User (or any type, for that matter) in a generic manner into HTML we need to be able to get a list of its field names at runtime. We also need to be able to correlate the field names with their respective values. In Haskell this is pretty simple. The following Haskell snippet takes a generic data structure and returns a list of tuples where each tuple contains a field name and its value in the structure:

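introspectData :: Data a => a -> [(String, String)]
introspectData a = zip fields (gmapQ showField a)
    where fields = constrFields $ toConstr a
          -- plain gshow would yield "\"Joe\"" and "(33)", not "Joe" and "33"
          showField :: Data d => d -> String
          showField = (showConstr . toConstr) `extQ` (id :: String -> String)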

If we run introspectData on a sample user we'll get a list that contains everything we need:

introspectData someUser
-> [("firstName", "Joe"),
    ("lastName", "Average"),
    ("title", "Programmer"),
    ("age", "33")]
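For readers who want to try the reflection step on its own, here is a minimal, self-contained sketch that needs nothing beyond base. It is not the exact code from this article: the names fieldPairs, showField and the sample value someUser are my own, and showField uses cast from Data.Typeable so that strings come out without quotes.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, constrFields, gmapQ, showConstr, toConstr)
import Data.Typeable (Typeable, cast)

data User = User
    { firstName :: String
    , lastName  :: String
    , title     :: String
    , age       :: Int
    } deriving (Read, Show, Typeable, Data)

-- Hypothetical sample value, only for illustration.
someUser :: User
someUser = User "Joe" "Average" "Programmer" 33

-- Show one field: strings pass through untouched, anything else is
-- shown via its constructor (an Int such as 33 becomes "33").
showField :: Data d => d -> String
showField d = case cast d of
    Just s  -> s
    Nothing -> showConstr (toConstr d)

-- Pair every record field name with its rendered value.
fieldPairs :: Data a => a -> [(String, String)]
fieldPairs a = zip (constrFields (toConstr a)) (gmapQ showField a)
```

Loading this into GHCi and evaluating fieldPairs someUser should give the same list of name/value pairs as shown above. Note that showConstr only prints the constructor name, so a field that is itself a record would lose its contents in this sketch.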

The hard part is done. We can now take this list and easily convert it to HTML. However, before we get to the renderers, let's discuss two more utility functions. The function renderFieldName converts a camel case field name into a human readable string:

renderFieldName :: String -> String
renderFieldName = capitalize . unwords
                      . groupBy (\_ c -> not (isUpper c))
    where capitalize (x:xs) = toUpper x : xs
          capitalize []     = []  -- don't crash on an empty name

It's easiest to describe its behavior with a couple of examples:

renderFieldName "firstName"
-> "First Name"
renderFieldName "title"
-> "Title"
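If the groupBy trick looks like magic, it may help to see the two steps separately. The sketch below splits the same logic into splitCamel and humanize (both names are mine, not part of the article's code) and notes one edge case in the comments.

```haskell
import Data.Char (isUpper, toUpper)
import Data.List (groupBy)

-- Step 1: keep adding characters to the current group until the next
-- upper-case letter shows up, which starts a new group.
splitCamel :: String -> [String]
splitCamel = groupBy (\_ c -> not (isUpper c))

-- Step 2: join the groups with spaces and capitalize the first letter.
humanize :: String -> String
humanize = capitalize . unwords . splitCamel
  where
    capitalize (x:xs) = toUpper x : xs
    capitalize []     = []

-- splitCamel "firstName" == ["first", "Name"]
-- humanize   "zipCode"   == "Zip Code"
-- humanize   "userID"    == "User I D"   (every capital starts a new word)
```

The last example shows a limitation of this approach: runs of capitals such as acronyms get split letter by letter, so field names are best kept in plain camel case.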

We also define a helper function mapFields that we'll be using in our renderers. It simply takes a function and a data object as arguments and applies the function to each field/value pair in the data object.

mapFields :: Data a => ((String, String) -> Html) -> a -> Html
mapFields renderField i =
    concatHtml $ map renderField (introspectData i)

Now that the utility code is out of the way let's get to the fun part - rendering HTML.

Generic Renderer

Let's define a function, renderData, that will take a data object and render it to HTML using helper functions defined in the previous section:

renderData :: (Data a) => a -> Html
renderData i = thediv << mapFields renderField i
    where renderField (name, val) =
              thespan << renderFieldName (name ++ ":")
              +++ thespan << val
              +++ br

The first line says our function will take any type that belongs to the Data class and will return a value of type Html. The Data class is really a set of functions that expose the reflection mechanism. Html is a type that's defined in a Haskell library along with a number of domain specific combinators and operators for expressing HTML within Haskell source code. Note how we use thediv, thespan, br, <<, and +++. They allow us to output HTML markup from within Haskell in a concise way without ever messing with strings.

The function renderField renders each field - we pass it to mapFields so it's called once for each field in our object. The function renderData merely wraps our field output in a <div> (we only need to specify the tags once, the library closes them for us automatically).
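To see the whole pipeline run end to end without installing an HTML library, here is a rough sketch that builds the markup as plain strings. This is a deliberate simplification and not the article's code: tag is my stand-in for the library's << operator, it does no HTML escaping, and Company, renderPlain, humanize and showField are names I made up for the example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Char (isUpper, toUpper)
import Data.Data (Data, constrFields, gmapQ, showConstr, toConstr)
import Data.List (groupBy)
import Data.Typeable (Typeable, cast)

-- Hypothetical second data structure, rendered by the same generic code.
data Company = Company
    { companyName :: String
    , employees   :: Int
    } deriving (Show, Typeable, Data)

-- Wrap content in a tag. Unlike a real HTML library this does NOT escape
-- its input, so it is only suitable for a demonstration.
tag :: String -> String -> String
tag t body = "<" ++ t ++ ">" ++ body ++ "</" ++ t ++ ">"

-- Camel case to a human readable label, as described earlier.
humanize :: String -> String
humanize = capitalize . unwords . groupBy (\_ c -> not (isUpper c))
  where
    capitalize (x:xs) = toUpper x : xs
    capitalize []     = []

-- Strings pass through as-is; other values are shown via their constructor.
showField :: Data d => d -> String
showField d = case cast d of
    Just s  -> s
    Nothing -> showConstr (toConstr d)

-- One labelled line per field, all wrapped in a single <div>.
renderPlain :: Data a => a -> String
renderPlain x = tag "div" (concatMap renderField pairs)
  where
    pairs = zip (constrFields (toConstr x)) (gmapQ showField x)
    renderField (name, val) =
        tag "span" (humanize name ++ ":") ++ tag "span" val ++ "<br />"
```

Evaluating renderPlain (Company "Acme" 12) produces a div with two label/value span pairs, and the very same function works for User or any other record that derives Data.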

We now have a nice way of rendering objects generically, but what about specialization? Suppose we want to render an Address data structure and we don't want the state, city, and zip code in separate spans because it makes styling our HTML uncomfortable. We could simply create a different function for rendering addresses, say renderAddress, but that would force us to use a different function name every time we want to render a specialized data structure. Can we do better?

Specialized Renderers

Now that we've dealt with the crux of the problem let's take care of the details. A major problem with rendering all structures in a generic manner is that the user interface will feel clunky - while most data structures are handled in the same way there are always special cases that need to be handled separately. Consider the address data structure mentioned above:

data Address = Address {
       street :: String,
       city :: String,
       state :: String,
       zipCode :: String
   } deriving (Read, Show, Typeable, Data)

If we use the generic mechanism outlined above to render an address we'll end up with city, state, and zip code presented on separate lines. This detail won't make or break our application but the UI won't have the polish our users have grown to expect.

These cases are very easy to handle by creating specialized instances of our renderers. Remember how the renderers defined earlier are based on a type variable - they take any type and render it using reflection. We can create specialized instances that work on specific types. All we have to do is add a type class for rendering data, and changing the behavior for Address becomes a piece of cake! [1]

class Data a => DataRenderer a where
    renderData :: a -> Html

instance DataRenderer Address where
    renderData i = ...

Now every time we evaluate renderData with a piece of data of type Address, Haskell will call our specialized instance! Every time we need specialized behavior we can simply define it once and it will automatically be used throughout our application. Note that the Haskell compiler figures out which instance to use at compile time. Not only do we get virtual functions on steroids, we also don't need to take the performance hit [2].
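Here is one way the generic fallback and a specialized instance might sit side by side in a current GHC. Treat it as a sketch under assumptions rather than the article's actual code: it relies on the FlexibleInstances and UndecidableInstances extensions plus an OVERLAPPABLE pragma, it renders to plain strings to stay dependency-free, and the class name Render and the helper showField are invented for the example.

```haskell
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, UndecidableInstances #-}
import Data.Data (Data, constrFields, gmapQ, showConstr, toConstr)
import Data.List (intercalate)
import Data.Typeable (Typeable, cast)

data User = User
    { firstName :: String, lastName :: String, title :: String, age :: Int }
    deriving (Show, Typeable, Data)

data Address = Address
    { street :: String, city :: String, state :: String, zipCode :: String }
    deriving (Show, Typeable, Data)

class Data a => Render a where
    render :: a -> String

-- Strings pass through as-is; other values are shown via their constructor.
showField :: Data d => d -> String
showField d = case cast d of
    Just s  -> s
    Nothing -> showConstr (toConstr d)

-- Generic fallback: used for any Data type without a more specific instance.
instance {-# OVERLAPPABLE #-} Data a => Render a where
    render x = intercalate "; " (zipWith pair names values)
      where
        names    = constrFields (toConstr x)
        values   = gmapQ showField x
        pair n v = n ++ ": " ++ v

-- Specialized instance: city, state and zip code share one segment.
instance Render Address where
    render a = street a ++ "; " ++ city a ++ ", " ++ state a ++ " " ++ zipCode a
```

With these definitions render (User "Joe" "Average" "Programmer" 33) goes through the reflective fallback, while render (Address "1 Main St" "Springfield" "IL" "62701") picks the hand-written instance and yields "1 Main St; Springfield, IL 62701". As far as I know, GHC makes this choice only where the concrete type is known, so a function that is itself polymorphic should ask for a Render constraint rather than just Data.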

It's rather interesting to contrast Haskell's type classes with Alexandrescu's techniques to achieve "compile time polymorphism" presented in Modern C++ Design. This comparison would be far beyond the scope of this article but I encourage everyone to do it on their own.

Multiple Renderers

What's interesting about web applications is that the same piece of data may be presented to the user in many different ways. We may present information about a user in a form, or simply as lines of text, or in a single row of a table. In order to illustrate the flexibility of our approach, let's create another generic renderer that renders Haskell data structures into HTML forms. We'll follow the process outlined above. First we'll create a type class for rendering forms:

class Data a => FormRenderer a where
    renderForm :: a -> Html

Then we'll create a generic renderer instance:

instance Data a => FormRenderer a where
    renderForm i =
        form ! [method "post"] << (mapFields renderField i
                 +++ button ! [name "action", value "submit"]
                     << "Submit")
        where renderField (name, val) =
                  thespan << renderFieldName name
                  +++ textfield name ! [value val]
                  +++ br

At this point we can take any Haskell data structure and render it into HTML in different ways in one line of code!

-- Render the data structure
renderData someUser

-- Render into a form
renderForm someUser

Sing, goddess, the eulogy to the old ways of writing web applications, which put pains thousandfold upon the programmers. Manual HTML generation is no more [3].

What's Next?

While the code presented in this article exposes a very powerful approach for generically rendering data into HTML, it currently has limitations that need to be addressed. It only renders "simple" data structures - ones with a single constructor and primitive fields. Additionally, the code does not contain a generic way of consistently dealing with relationships - an employee that contains a field with a company id will render an integer instead of the company name. Also, the renderers described in this article expose no mechanisms for customization - renderData renders structures in the only way it knows how. It seems fitting to parametrize the standard renderers to allow a larger degree of flexibility.
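To make the last point a bit more concrete, here is a rough sketch of what a parametrized renderer might look like. Everything in it is hypothetical: RenderOptions, labelFor, hidden, defaultOptions, fieldsWith and the Employee type are names I made up to illustrate the idea, and it stops at producing label/value pairs instead of HTML.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, constrFields, gmapQ, showConstr, toConstr)
import Data.Typeable (Typeable, cast)

-- Hypothetical knobs a parametrized renderer might expose.
data RenderOptions = RenderOptions
    { labelFor :: String -> String   -- how to turn a field name into a label
    , hidden   :: [String]           -- field names to leave out entirely
    }

defaultOptions :: RenderOptions
defaultOptions = RenderOptions { labelFor = id, hidden = [] }

data Employee = Employee
    { fullName  :: String
    , companyId :: Int
    } deriving (Show, Typeable, Data)

-- Strings pass through as-is; other values are shown via their constructor.
showField :: Data d => d -> String
showField d = case cast d of
    Just s  -> s
    Nothing -> showConstr (toConstr d)

-- Same reflection as before, filtered and relabelled according to the options.
fieldsWith :: Data a => RenderOptions -> a -> [(String, String)]
fieldsWith opts x =
    [ (labelFor opts n, v)
    | (n, v) <- zip (constrFields (toConstr x)) (gmapQ showField x)
    , n `notElem` hidden opts
    ]
```

For example, fieldsWith defaultOptions { hidden = ["companyId"] } (Employee "Joe Average" 7) keeps only the fullName pair, which is one simple way to avoid showing a raw company id until relationships are handled properly.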

Comments?

If you have any questions, comments, or suggestions, please drop a note at coffeemug@gmail.com. I'll be glad to hear your feedback.

[1] We'll also need to make the original renderData function an instance of the DataRenderer class.

[2] Note that in this particular case it doesn't matter because the reflection mechanism is considerably more expensive than a single virtual function call.

[3] Of course I'm not the first one to try publishing data to the web in this manner (though I may be the first to do it in Haskell). There are other ways to achieve similar effects with a comparable degree of elegance in other languages. It would be fun to do this in Lisp with multimethods.