What I Do, Part 1: Programming languages

[This is the second in an occasional series of posts explaining what I do in my “day job” as a computer science PhD student. The idea is to write a series of posts of increasing specificity, but all aimed at a general mathematical audience. Previous posts in the series can be found here: Part 0.]

In the zeroth post in this series, I explained what it is that computer scientists actually study—not computers, but computation. And since research is always driven by questions, I gave a list of questions that can be asked about computation. My research, broadly speaking, falls within the scope of two of those questions: What are different ways of describing a computational process? and How can information be structured to make computational processes easier to write, more efficient, or more beautiful? In this post, I’ll flesh out the first of those two questions; next time I’ll tackle the second.

What are different ways of describing a computational process?

My “subfield” within computer science is programming languages. A programming language is a system for describing computational processes, which (usually) can then be carried out by a computer. There are many, many different programming languages: Java, C, C++, C#, Python, Perl, Ruby, Scala, Clojure, Lisp, Racket, F#, Go, R, J, Haskell, OCaml, Prolog, Agda, and Coq, to name just a few. (Remember Coq?) LaTeX (used for typesetting, especially for documents involving mathematics) is actually a programming language. Mathematica and Maple each have their own built-in programming language. PostScript is a programming language for describing computations to be carried out by printers. JavaScript is a programming language for describing computations to be carried out by your web browser.[1] And this is really just the tip of the iceberg. Wikipedia’s list of programming languages has over six hundred entries, and those are only the “notable” ones!

So what does it look like to study programming languages? Is it like being an entomologist—collecting, studying, describing, cataloging, and comparing lots of different languages? Well… no. That could be interesting from a historical or sociological point of view—what sorts of programming languages do people invent? Why do some become popular and others languish in obscurity? etc.—but not really from a technical or mathematical point of view. Collecting butterflies is interesting because we didn’t create them, so there’s a lot we don’t know about them; there might even be some kinds we haven’t discovered yet! But the same isn’t true of programming languages.

Instead, it’s more like being an explorer: mapping out new mathematical territory in order to create new programming languages[2] (or add new features to existing ones) which allow thinking at a higher level, help prevent certain kinds of mistakes, and so on. In a way, it’s really about inventing new ways to think, and mathematics is at the very heart of the process.

As a simple but far-reaching example of the interplay between mathematics and inventing new ways to describe computational processes, consider the task of taking a list of integers and adding one to each number in the list. For example, if we start with the list 1,5,8,2,4 we want to end up with the list 2,6,9,3,5. In many languages this computational process can be described something like this:

int i;
for (i = 0; i < n; i++) {
    list[i] = list[i] + 1;
}

Essentially, this says to let i take on all the values from 0 up to n − 1 (where we’re assuming n is the length of the list), and for each value of i, to add one to the value at position i in the list.

If you have written this kind of program before, this probably makes sense to you. If you haven’t, you might be kind of confused and disappointed (or at least, you should be!). My description of the computational process in English was elegant and succinct: “adding one to each number in the list”. But the above program seems to have introduced lots of extra confusing details. What’s this i thing? Why does it start at 0 instead of 1? Why do we have to mention list[i] twice? And so on.

But there’s a new mathematical idea which can help: functions which take other functions as arguments. We often define functions like f(x) = x + 3; why not something like f(g) = g(3) + g(2)? This might seem strange… but then again, perhaps it is not so strange after all. For example, if you know some calculus, note that differentiation is exactly such a function-of-functions: the differentiation operator takes a function as input and produces another function (its derivative) as output! There is a mathematical theory describing these sorts of generalized functions, called the lambda calculus. I should point out that it was created in the early 1930s, about a decade before the first electronic digital computers, but it is now used as the basis for many programming languages.
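
To make f(g) = g(3) + g(2) a bit more concrete, here is a minimal sketch of it in Haskell, the language used for the example that follows. The helper names addThree and square are my own, chosen purely for illustration.

```haskell
-- f takes a function g as its argument and returns g(3) + g(2).
f :: (Integer -> Integer) -> Integer
f g = g 3 + g 2

-- Two ordinary functions to hand to f (hypothetical names, for illustration).
addThree :: Integer -> Integer
addThree x = x + 3

square :: Integer -> Integer
square x = x * x

main :: IO ()
main = do
  print (f addThree)  -- (3 + 3) + (2 + 3) = 11
  print (f square)    -- 9 + 4 = 13
```

The type of f says it all: its input is itself a function from integers to integers, and its output is a plain integer.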

Here’s how you would describe our example computational process in Haskell:

map (+1) list

That’s it! (+1) is a function which adds one, that is, it takes a number as input and outputs one more than that number. map is a function which takes two arguments: the second argument is a list, and the first argument is another function which describes some computational process which should be applied to every element of the list. So this mathematical idea of functions taking other functions as arguments turns out to bear practical fruit in designing languages for describing computational processes. This Haskell code is an almost direct transliteration of my English description, with only a bit of reordering: “add one” (+1) “to each number in” (map) “the list” (list).
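
If you are curious what might be inside map, here is a rough sketch of how such a function could be written by hand. The name myMap is hypothetical, picked so that it does not clash with the map that Haskell already provides; the definition in the actual standard library may differ in its details.

```haskell
-- A hand-rolled version of map (hypothetical name: myMap).
-- It applies the function f to every element of a list.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []                -- an empty list stays empty
myMap f (x : xs) = f x : myMap f xs  -- transform the first element, then the rest

-- The example list from the text.
list :: [Integer]
list = [1, 5, 8, 2, 4]

main :: IO ()
main = do
  print (map (+1) list)    -- [2,6,9,3,5]
  print (myMap (+1) list)  -- [2,6,9,3,5]
```

Notice that neither version mentions an index variable or the length of the list; those details are handled once, inside the definition, rather than every time we want to transform a list.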

And that’s my research in a nutshell: taking mathematical ideas and turning them into new ways of describing computation. In future installments I’ll explain what that really looks like in more detail.


  1. “What about HTML?” I hear you ask. Well, HTML doesn’t really count, because you can’t use it to describe computational processes, only what a web page looks like. It is certainly a language, but not a programming language.

  2. You might wonder, if there are already so many programming languages, why invent more? Don’t we already have enough? To which I would ask a counter question: if there are already so many recipes for delicious food, why invent more? Don’t we already have enough?


9 Responses to What I Do, Part 1: Programming languages

  1. This is the most succinct description of a functional language I’ve ever seen, brilliant!

  2. Nirakar Neo says:

    OK, in the Haskell case, we are utilising two functions. These functions will have code of their own. Won’t this have an effect on the runtime of the program? However, functional programming will have the advantage of portability and easier debugging, right?

    • Brent says:

      There is nothing *inherent* in using functions that would make it run more slowly. It is true that for any higher-level abstraction it is always going to be more difficult to compile it into something that runs efficiently, but there is a lot of really clever compiler technology for functional languages. OCaml and Haskell can produce very fast programs, in many cases close to the speed of C.

  3. Steve says:

    I heard about another expressive yet beautiful programming language, I think it’s called Befunge (http://quadium.net/funge/spec98.html), have you ever heard of it?

  4. I like what you are doing with this series of posts. I struggle with explaining what I do to my family or friends. Typically I just inwardly sigh and mumble something unintelligible. Being able to adequately explain your work to people with various levels of education and familiarity with your field is difficult. So far you seem to be doing a great job. I might steal your approach in the future.

  5. Steve says:

    In the programming language R, you can do:
    lst <- c(1,5,8,2,4)
    lst+1
    Which is slightly shorter.

    I've often thought someone ought to invent a graphical computer language. Rather than try to pack structures with complicated topologies into linear strings of characters with all sorts of weird unintuitive encodings, you could represent the structure directly. Trees and loops and arrays actually *would* be trees and loops and rectangular arrays. Variables would be specified by locations rather than names. (Although you could label it with a name, if you want.) Pointers would point, like a robot arm.

    It might look a bit Heath-Robinson, but watching a program run is sure to be fun.

  6. Nick says:

    I’m confused: Why is LaTeX a programming language and HTML not? To my (admittedly amateurish) eye they do pretty much the same thing, no?

    What is the definition of a programming language? Is LaTeX Turing complete?

    • Brent says:

      Great question! Although LaTeX and HTML are used for superficially similar purposes, they are quite different. LaTeX is, indeed, Turing-complete: you can actually use it to perform arbitrary computations in order to decide what to typeset (and people do!). You cannot do that with HTML.

      I do not know of a good definition of “programming language” that draws a bright line between things that are and things that aren’t (nor do I think there really can be such a definition). For example, “Turing-complete” is certainly not the definition, because there are systems which clearly deserve to be called programming languages which are intentionally not Turing-complete (e.g. Coq and Agda). The boundary is necessarily somewhat fuzzy; but in any event HTML and LaTeX are firmly on opposite sides of the divide.
