Review: The little book of semaphores by Allen Downey

I’m finally getting round to typing up my thoughts on this book. I read it a little while ago (definitely before the last merry holiday period), so I am partially using this post as a revision aid.
The plan is to first focus on the book itself and then go a little into how it fits into the larger picture of me playing around with multi-threading.


First off, I have to confess I didn’t finish all of the book. I worked through chapters 1 to 5, skipped 6 and 7 and, finally, skimmed 8 and 9. This is not to say that I didn’t enjoy the book. I think this is a truly great book (and actually pretty clever in the way it is organised too)!

In my mind, the book splits pretty neatly into three parts:

  • Chapters 1 to 3 – Introduction to synchronisation and synchronisation patterns

    In this part, the problem space is introduced (chapters 1 and 2): the start is fairly slow, but things are clearly and carefully explained. Chapter 3 is the key chapter of the book. If you only have time for one chapter, read this one. It introduces useful building blocks for tackling the synchronisation exercises in the following chapters. Beyond this, these are also the conceptual building blocks one needs to reason about multi-threading problems. Hopefully, this will be useful when thinking through parallelisation problems in practice. (What I particularly like is that the author does not go on and on about the fact that these are design patterns. Normally this kind of thing is delivered with a little rah rah rah about the theory behind patterns, etc. Instead he just presents them in an easily accessible way as building blocks for the rest of the book.)

  • Chapters 4 to 7 – Synchronisation problems

    In a sense these chapters are now “just practice exercises”. Yes – some new concepts are introduced (e.g. starvation). However, the focus here is definitely on helping readers critically understand and apply the concepts learnt in chapter 3. The problems are pretty imaginative and are the kind where you can do one on a tube journey with pen and paper (which is what I did). After a while, however, they do get a little repetitive and this is why I “quit” after chapter 5.

  • Chapters 8 & 9 – Synchronisation in practice

    These give a little flavour – not much more than that – of how to implement the synchronisation problems discussed in the earlier chapters in Python and C. Since I’m currently reading Williams’ C++ Concurrency in Action, which is more relevant for my work with C++, I only gave these a cursory glance. It seems that Python multithreaded code is fairly close to the pseudo-code used in the rest of the book. For the C code, you’ll need to do a little more legwork. (Oh well….)

In summary, I would really recommend reading this book – in particular – if you are interested in multi-threading concepts and in developing the ability to reason about multi-threaded environments. To enjoy and gain from this book, you do not need a full understanding of the mechanics of a multi-threaded platform. (However, the book also will not give you that understanding.)

How does the book fit into my multi-threading study plan? A summary of things I learnt

The stuff the book covered fits pretty well with the big headline of understanding “multi-threaded concepts in the abstract as well as classical synchronisation problems”. I have a much better understanding of semaphores and their applications:

It’s essentially a mutex with an internal counter which gets decremented every time you call lock/wait and incremented every time you call unlock/signal. A thread blocks if the counter goes below 0. Every time you call unlock/signal while there are sleeping threads, one of them is woken up.
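
To make this concrete, here is a rough Python sketch of a semaphore built on a condition variable, following the negative-counter semantics described above. (This is purely illustrative and my own code – in practice you would just use Python’s built-in `threading.Semaphore`, whose actual implementation keeps the counter non-negative.)

```python
import threading

class Semaphore:
    """Illustrative semaphore with the negative-counter semantics
    described above (a sketch, not production code)."""

    def __init__(self, value=0):
        self._value = value               # the internal counter
        self._wakeups = 0                 # pending wake-up "tokens"
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            self._value -= 1              # decrement on wait/lock
            if self._value < 0:           # below zero: go to sleep ...
                while self._wakeups < 1:  # ... until a signal arrives
                    self._cond.wait()
                self._wakeups -= 1        # consume one wake-up token

    def signal(self):
        with self._cond:
            self._value += 1              # increment on signal/unlock
            if self._value <= 0:          # someone is sleeping:
                self._wakeups += 1        # hand over a wake-up token
                self._cond.notify()       # and wake one thread
```

The explicit `_wakeups` count is there so that a signal wakes exactly one sleeping thread even if the condition variable wakes threads spuriously.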

I also have a better understanding of atomics, race conditions, deadlocks, and starvation and how they might come about:

An atomic operation is an operation which cannot be interrupted. From an abstract point of view, it is hard/impossible to tell which high-level operations are actually un-interruptible at a hardware level. The C++11 standard deals with this conundrum by introducing atomics as a language facility together with a memory model. (This is the chapter I’m tackling next in Williams’ concurrency book, so the last sentence may not actually make any sense!)

Race conditions occur when the order of execution of the threads influences the output. The classical example is that of two threads incrementing a global variable. Depending on the order of reads and writes, the variable may end up only one higher than its starting value instead of the expected two.

A deadlock is the (well-known) situation in which a thread cannot proceed because another thread is blocking access to a shared resource, while that second thread in turn cannot proceed because the first thread is holding a resource the second thread needs before it will release its own. Neither thread can proceed. (Normally this is described with two threads, as above, but it can of course involve more threads, as in the dining philosophers problem.) As far as I can tell, it normally comes about because of the order in which locking is done.
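
A minimal Python sketch of that lock-ordering point (again my own example): the first two workers below grab the same two locks in opposite orders and can deadlock if they run concurrently; the fix is simply to agree on one global acquisition order.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
finished = []   # records which workers completed

def worker_1():
    with lock_a:
        with lock_b:
            finished.append("worker_1")

def worker_2_deadlock_prone():
    with lock_b:            # opposite order to worker_1 - if worker_1
        with lock_a:        # holds lock_a while we hold lock_b, neither
            finished.append("worker_2")  # thread can ever proceed

def worker_2_fixed():
    with lock_a:            # same global order as worker_1 - safe
        with lock_b:
            finished.append("worker_2")
```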

In a starvation scenario, no thread is permanently blocked, so there is no deadlock. However, a particular thread may in practice never proceed because it is always overtaken by other threads in a queue and keeps waiting for them to go first. The book describes this in the context of the readers-writers problem, where a writer may wait forever.

However, some of the other concepts I mentioned in the study plan post (lock-free programming, condition variables) are covered in much more detail by C++ Concurrency in Action. I’ll mention those in the review of that book.

Beyond those items that I originally mentioned in the post, I really learnt to appreciate some of the conceptual building blocks of multithreaded code such as:

  • Rendezvous (ensure ordering of statements in two threads)
  • Mutex (ensure only one thread can enter a critical section at a time)
  • Multiplex (ensure only a maximum number of threads can enter a critical section – this is what a semaphore does by default)
  • Barrier (ensure all threads have executed an initialisation before executing a further statement)
  • Turnstile (a wait statement followed by a signal – ensuring that threads pass but can be stopped in front of the turnstile)
  • Reusable barrier (same as the barrier but locking the turnstile after all threads are through)
  • Queue (make threads of one kind wait to be paired up with threads of another kind, as in the leaders-and-followers problem)
  • FIFO queue (additionally ensure threads proceed in the order in which they arrived)
  • Lightswitch (the first person into a room turns on the light, the last one out turns it off. Once the light is off, a different kind of person ….err… thread may enter. See the readers-writers problem for a better description)
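
As a taster, here is the rendezvous pattern sketched with Python’s `threading.Semaphore` (`release`/`acquire` playing the roles of signal/wait); the other patterns are built from the same ingredients.

```python
import threading

# Rendezvous: neither thread moves past its rendezvous point until
# both have arrived. Two semaphores, both starting at zero.
a_arrived = threading.Semaphore(0)
b_arrived = threading.Semaphore(0)
log = []   # records the order in which statements ran

def thread_a():
    log.append("a1")        # statement before the rendezvous
    a_arrived.release()     # signal: "A has arrived"
    b_arrived.acquire()     # wait until B has arrived
    log.append("a2")        # guaranteed to run after b1

def thread_b():
    log.append("b1")
    b_arrived.release()     # signal: "B has arrived"
    a_arrived.acquire()     # wait until A has arrived
    log.append("b2")        # guaranteed to run after a1
```

Whatever the scheduling, `a2` can only run after `b1`, and `b2` only after `a1` – which is exactly the ordering guarantee the pattern promises.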

Multi-Threading: My study plan

One of the things I’m trying to understand better is multi-threading. This is one of those areas where you can go as deep down the rabbit hole as you have the appetite for.

I have always felt that my multi-threading knowledge is “ok” for my daily needs:
I understand the normal synchronisation problems with multi-threading (deadlocks; race conditions). I have an understanding of the mechanics of locks and mutexes. I have some appreciation of Windows threads – syntax, how they are created/executed within the main program I work on, how to debug them, etc.
However, those things have left me uneasy that I don’t really know what is going on.

For some time, I have therefore wanted to go back to basics and build up a better understanding of multi-threading from the bottom up. My current study plan looks as follows:

  1. Better understand synchronisation concepts and classic synchronisation problems in the abstract

    So you know about locks & mutexes… What about semaphores and condition variables? How do atomics fit into this? What about lock-free programming? So you know about deadlocks and race conditions… What about starvation? How do you solve typical synchronisation problems? (E.g. dining philosophers; readers-writers problem)

    At the moment, my plan is to devour Downey’s Little Book of Semaphores for this step: take a whole lot of notes, solve the exercises in the book and then google all of the concepts I come across I am not familiar with.

  2. Understand language specific threading concepts and implementations

    After understanding threading concepts in the abstract, it’s important to actually get round to doing some implementing (think: “learning by coding” or some similar nonsense). To do that, I first need to understand what concepts the languages offer. For now, I’m planning to start with the language I am most comfortable in (C++). Here, the standard book on the topic (incorporating the C++11 threading facilities) seems to be Williams’ C++ Concurrency in Action. For C#, I was planning on looking through Albahari’s free ebook on threading.

    (While the C++ book comes with lots of positive reviews, I haven’t really found anything on the Albahari ebook. So this is taking a little bit of a chance.)

    For javascript, I was only planning to have a little bit of a google around.

    (I imagine there will be a fair number of differences here depending on what execution environment we are talking about…)

  3. Implement the problems described in Downey’s book

    After that, it’ll be time to actually get my teeth into implementing the problems I have looked at and solved in the abstract.

Given that I am also planning on spending a fair amount of time on software processes, TDD and such Agile rubbish, I think getting through this list may well take till the end of January or so. I’ll add some posts on how I am getting on with this.