List comprehension: an elegant Python feature inspired by mathematical set theory

Even though I have now deeply entered into the fascinating world of statistical machine learning and computational chemistry, my original background is very much in pure mathematics. Having spent some of my intellectually formative years in this highly purified and abstract universe, I still love to think in terms of sets, ordered tuples and well-defined functions whenever I have the luxury of being able to do so. This might be why list comprehension is one of my favourite features in Python.

List comprehension allows you to efficiently map a function over a list using elegant notation inspired by mathematical set theory. Let us first consider a (mathematical) set

A := \{1, 3, 7 \}.

Then

B := \{ a^2 \ \vert \ a \in A  \} = \{1, 9, 49 \}

describes a new set which is obtained by mapping the function f(x) = x^2 over the set A. In Python, you can use the syntax of list comprehension to do something completely analogous with lists!

To see this, let's look at the following Python list:

A = [1, 3, 7]

Assume we want to map the square function f(x) = x^2 over this list. A verbose and, as we will see further down, somewhat slower way to achieve this is to use a for-loop together with the list's "append" method:

B = []

for a in A:
    B.append(a**2)

# B = [1, 9, 49]

However, the exact same result can be generated using the following concise and elegant expression:

B = [a**2 for a in A]  # B = [1, 9, 49]

Such expressions are called list comprehensions in Python. The analogy to the set theory notation from above is obvious.

But wait, there is more. What if we want to filter the original list A in some way before mapping a function over it? For example, what if we only want to consider elements of A which are larger than 2?

Instead of writing

B = []

for a in A:
    if a > 2:
        B.append(a**2)

# B = [9, 49]

we just write

B = [a**2 for a in A if a > 2]  # B = [9, 49]

You get the idea.

Of course you can also use list comprehensions merely for filtering, technically by mapping the identity function over the original list:

C = [a for a in A if a > 2]  # C = [3, 7]
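For readers who know Python's built-in map and filter functions, the following small sketch shows how the comprehensions above line up with them. The variable names B_builtin and C_builtin are only chosen for illustration.

```python
A = [1, 3, 7]

# Filter first, then map: keep elements larger than 2 and square them.
B = [a**2 for a in A if a > 2]

# Filter only: the mapped function is the identity.
C = [a for a in A if a > 2]

# The same results written with the built-ins filter and map.
# Both return lazy iterators, so we wrap them in list().
B_builtin = list(map(lambda a: a**2, filter(lambda a: a > 2, A)))
C_builtin = list(filter(lambda a: a > 2, A))

print(B, B_builtin)  # both are [9, 49]
print(C, C_builtin)  # both are [3, 7]
```

Both styles give the same lists; the comprehension simply keeps the mapping and the filtering condition together in one expression that reads much like the set notation from the beginning.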

Not only are list comprehensions much more compact and elegant than for-loops, they are also often faster. Let's do a simple experiment on my laptop. Let

R = list(range(10**7))

be the list of the first 10^7 natural numbers starting at 0. Evaluating

S = []
for r in R:
    S.append(r**2)

takes 3.56 seconds while evaluating

S = [r**2 for r in R]

only takes 2.33 seconds. So in this case, we get a reduction in computational time of about 35%. Time estimates were obtained by averaging over 10 independent trials.
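If you would like to repeat this kind of measurement on your own machine, here is a minimal sketch using the standard-library timeit module. It is not the exact script behind the numbers above: the helper names (squares_loop, squares_comprehension, average_seconds) are my own choices for illustration, and the list is deliberately much smaller than 10^7 elements so that the script finishes quickly. Your timings will depend on your hardware and Python version.

```python
import timeit


def squares_loop(values):
    """Square every element using a for-loop and append."""
    out = []
    for v in values:
        out.append(v**2)
    return out


def squares_comprehension(values):
    """Square every element using a list comprehension."""
    return [v**2 for v in values]


def average_seconds(func, values, trials=10):
    """Average wall-clock time of func(values) over `trials` runs."""
    total = timeit.timeit(lambda: func(values), number=trials)
    return total / trials


# Assumption: 10**5 elements instead of 10**7, to keep the run short.
R = list(range(10**5))

t_loop = average_seconds(squares_loop, R)
t_comp = average_seconds(squares_comprehension, R)

print(f"for-loop:      {t_loop:.4f} s")
print(f"comprehension: {t_comp:.4f} s")
```

To get closer to the experiment described above, replace 10**5 by 10**7 and expect the script to run for considerably longer.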

Note that list comprehensions are certainly not restricted to lists of numbers, but can also easily be used with lists of strings, lists of lists, or lists of anything, really. There are also set and dictionary comprehensions in Python, which make it possible to directly create Python sets and Python dictionaries from Python lists (or from any other iterable) in a set-theoretic manner.
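As a quick illustration of these variants, here is a short sketch. The example data (the list with a repeated 3, the words, and the small nested list) are made up for demonstration purposes.

```python
A = [1, 3, 7, 3]  # note the repeated 3

# Set comprehension: curly braces, duplicates collapse automatically.
squares_set = {a**2 for a in A}

# Dictionary comprehension: key: value pairs inside curly braces.
squares_dict = {a: a**2 for a in A}

# List comprehension over strings.
words = ["set", "tuple", "function"]
lengths = [len(w) for w in words]

# List comprehension over a list of lists: flatten it row by row.
matrix = [[1, 2], [3, 4]]
flat = [x for row in matrix for x in row]

print(squares_set)   # a set containing 1, 9 and 49
print(squares_dict)  # {1: 1, 3: 9, 7: 49}
print(lengths)       # [3, 5, 8]
print(flat)          # [1, 2, 3, 4]
```

The set comprehension is the closest Python counterpart to the mathematical notation from the beginning of this post, since the result really is a set without repeated elements.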

My appreciation of list comprehensions is shared by the well-known MIT researcher, AI enthusiast and podcaster Lex Fridman, who even dedicated a short YouTube video to this feature. I used his video as a source for this blog entry and encourage you to watch it.

In summary, having the concept of list comprehension at your fingertips when programming in Python will allow you to write code which is more compact, efficient and beautiful. Go play around with it!
