I fly a lot, and once in a while I arrive at the airport early enough to be offered a seat on an earlier flight. Usually the check-in kiosk offers the option and lists the flight. Last year, I tried to take advantage of this offer, only to be told, after selecting the earlier flight, that it was no longer available.
Well, not a big deal. I went to the gate, and waited for the later flight. As I tried to board it, my boarding pass in hand, I was told that there was no record of my reservation for this flight.
So what had happened? In this case, I can only speculate. But likely, a race condition occurred. Someone else was being added to the earlier flight at the same moment I was. Before the system had checked that a seat was actually available for me, it removed me from the later flight. Finally, when the change failed, the system “forgot” to place me back on the later flight.
Race conditions are very common in shared applications like web applications, where several users may try to order a limited resource at the same time. But they are not uncommon in other applications either. In accounting, for example, an account may be overdrawn when money is deducted multiple times before the balance has been adjusted.
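To make the accounting case concrete, here is a minimal Python sketch of two withdrawals racing against the same balance. Everything in it (the Account class, unsafe_withdraw, safe_withdraw and the barrier that forces the bad timing) is made up for illustration and not taken from any real system. The point is the gap between checking the balance and deducting from it.

```python
import threading

class Account:
    """Hypothetical account with a single balance field."""
    def __init__(self, balance):
        self.balance = balance

def unsafe_withdraw(account, amount, barrier, update_lock):
    # Check: is there enough money?
    if account.balance >= amount:
        # Illustration only: the barrier makes both threads finish the
        # check before either one deducts, i.e. the "wrong" timing.
        barrier.wait()
        # The update itself is guarded, but the check above is not.
        with update_lock:
            account.balance -= amount

def safe_withdraw(account, amount, lock):
    # Check and deduction happen as one unit under the same lock.
    with lock:
        if account.balance >= amount:
            account.balance -= amount
            return True
        return False

def run_unsafe():
    account = Account(100)
    barrier = threading.Barrier(2)
    update_lock = threading.Lock()
    threads = [
        threading.Thread(target=unsafe_withdraw,
                         args=(account, 100, barrier, update_lock))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance

def run_safe():
    account = Account(100)
    lock = threading.Lock()
    threads = [
        threading.Thread(target=safe_withdraw, args=(account, 100, lock))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance

print(run_unsafe())  # -100: both withdrawals passed the check
print(run_safe())    # 0: the second withdrawal was refused
```

In real code nobody adds the barrier, of course. The same interleaving just happens by chance once in a while, which is what makes it so hard to catch.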
Detecting and fixing race conditions is difficult. They usually escape traditional testing because triggering them requires multiple users to issue requests at just the wrong time, or a single user to send carefully timed requests, sometimes several at exactly the same moment.
This issue is usually best solved during the architecture and design phase. Critical functions that may be subject to race conditions need to be identified. Once identified, they can be isolated: with techniques like proper database transactions, each operation either completes in full or is rolled back completely if it fails at any point.
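As a rough sketch of what that could look like for the flight change, the following uses Python's built-in sqlite3 module. The schema, the flight IDs and the rebook function are assumptions made up for this example; I have no idea how the airline's system is actually built. The idea is that removing the passenger from the later flight and adding them to the earlier one happen in a single transaction, so a failure in the middle undoes everything.

```python
import sqlite3

class NoSeatAvailable(Exception):
    """Raised when the target flight has no seats left (hypothetical)."""

def setup():
    # Hypothetical schema: EARLY is full, alice is booked on LATE.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (id TEXT PRIMARY KEY, seats_left INTEGER NOT NULL)")
    conn.execute("CREATE TABLE bookings (passenger TEXT, flight_id TEXT)")
    conn.execute("INSERT INTO flights VALUES ('EARLY', 0), ('LATE', 5)")
    conn.execute("INSERT INTO bookings VALUES ('alice', 'LATE')")
    conn.commit()
    return conn

def rebook(conn, passenger, old_flight, new_flight):
    # "with conn" commits if the block succeeds and rolls back
    # if an exception escapes from it.
    with conn:
        conn.execute(
            "DELETE FROM bookings WHERE passenger = ? AND flight_id = ?",
            (passenger, old_flight))
        conn.execute(
            "UPDATE flights SET seats_left = seats_left + 1 WHERE id = ?",
            (old_flight,))
        # Check and decrement in one statement: only succeeds if a seat is left.
        cur = conn.execute(
            "UPDATE flights SET seats_left = seats_left - 1 "
            "WHERE id = ? AND seats_left > 0",
            (new_flight,))
        if cur.rowcount == 0:
            raise NoSeatAvailable(new_flight)
        conn.execute("INSERT INTO bookings VALUES (?, ?)",
                     (passenger, new_flight))

def flights_of(conn, passenger):
    rows = conn.execute(
        "SELECT flight_id FROM bookings WHERE passenger = ?", (passenger,))
    return [row[0] for row in rows]

conn = setup()
try:
    rebook(conn, "alice", "LATE", "EARLY")
except NoSeatAvailable:
    pass
print(flights_of(conn, "alice"))  # ['LATE']: the failed change was rolled back
```

Had the rollback been missing, alice would have ended up on no flight at all, which is pretty much what happened to me at the gate.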