
How to deadlock

It is easy to write code that could lead to a deadlock, but less easy to see it in action, given that everything related to threads is in some way out of the developer's control: it is the operating system - or maybe the hardware - that has the last word on when and where a thread is executed.

But by manipulating the code a bit we can magnify the effect of the bad design.

Let's see a class that is (badly) designed to be used in a multithreaded environment:
#include <iostream>
#include <boost/thread.hpp>
#include <boost/noncopyable.hpp>

class Deadlock : boost::noncopyable
{
private:
    int v1_; // 1
    int v2_;
    const int SIZE_;
    boost::mutex m1_;
    boost::mutex m2_;

public:
    Deadlock(int size) : v1_(0), v2_(0), SIZE_(size) {}

    int getV1() { return v1_; }
    int getV2() { return v2_; }

    void f1() // 2
    {
        for(int i = 0; i < SIZE_; ++i)
        {
            boost::lock_guard<boost::mutex> l1(m1_);
            boost::this_thread::yield(); // 3
            boost::lock_guard<boost::mutex> l2(m2_);

            std::cout << '_'; // 4
            ++v1_; // 5
            ++v2_;
        }
    }

    void f2() // 6
    {
        for(int i = 0; i < SIZE_; ++i)
        {
            boost::lock_guard<boost::mutex> l2(m2_);
            boost::this_thread::yield();
            boost::lock_guard<boost::mutex> l1(m1_);

            std::cout << '.';
            ++v1_;
            ++v2_;
        }
    }
};

1. The class is built around a couple of integers, each one protected in its usage by its own mutex.
2. We have a couple of member functions, f1 and f2, that acquire both mutexes and then modify the two private data members a number of times.
3. The yield() call gives a hint to the scheduler that we would be happy to let other threads butt in at this point.
4. Some visual feedback is always welcome.
5. And here is the real job done by the function: incrementing both data members.
6. The other function looks almost the same, but the locks are acquired in reverse order: first on m2_ and then on m1_. This could easily lead to a deadlock, since it tends to put the two threads in a situation where the first one owns the lock on m1_ and the second one the lock on m2_. In that case both of them wait indefinitely for the other, and neither releases its own lock to let the other thread carry on with its job.

Here is a test that usually shows the problem:
TEST(Deadlock, Bad)
{
    const int SIZE = 20;
    Deadlock d(SIZE);

    boost::thread t1(&Deadlock::f1, &d);
    boost::thread t2(&Deadlock::f2, &d);

    boost::this_thread::sleep(boost::posix_time::seconds(1)); // 1
    ASSERT_EQ(SIZE * 2, d.getV1());
    ASSERT_EQ(SIZE * 2, d.getV2());
}

1. It wouldn't make much sense to join() the worker threads, since we expect them to be deadlocked. So we just wait for a long time (an entire second!) and then check how much work they managed to do in the meantime.
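
If the blind sleep feels too crude, a possible variation is to give each thread a bounded amount of time to terminate. This is just a sketch of mine, assuming a Boost.Thread version where the timed_join() overload taking a duration is still available; the test name is made up:
TEST(Deadlock, BadTimedJoin)
{
    const int SIZE = 20;
    Deadlock d(SIZE);

    boost::thread t1(&Deadlock::f1, &d);
    boost::thread t2(&Deadlock::f2, &d);

    // wait at most a couple of seconds for each thread instead of sleeping blindly;
    // timed_join() returns false if the thread has not terminated in time
    ASSERT_TRUE(t1.timed_join(boost::posix_time::seconds(2)));
    ASSERT_TRUE(t2.timed_join(boost::posix_time::seconds(2)));

    ASSERT_EQ(SIZE * 2, d.getV1());
    ASSERT_EQ(SIZE * 2, d.getV2());
}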

If the deadlock happens, the threads never complete their work, so the assertions fail. Maybe on your machine this test could unexpectedly succeed, even more than once. If that is your case you could increase SIZE, or run the test more times (or both), and in the end you should get the expected behavior.
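
As an aside, Google Test itself can rerun the same test many times from the command line; the executable name here is just a placeholder for your actual test binary:
./test --gtest_filter=Deadlock.Bad --gtest_repeat=100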

This code is so simple that it is very easy to make it right. It is enough to follow a common policy in acquiring the locks, so I rewrite the second function:
void f2a()
{
    for(int i = 0; i < SIZE_; ++i)
    {
        boost::lock_guard<boost::mutex> l1(m1_);
        boost::this_thread::yield();
        boost::lock_guard<boost::mutex> l2(m2_);

        std::cout << '.';
        ++v1_;
        ++v2_;
    }
}

And I write another test:
TEST(Deadlock, Good)
{
    const int SIZE = 20;
    Deadlock d(SIZE);

    boost::thread t1(&Deadlock::f1, &d);
    boost::thread t2(&Deadlock::f2a, &d);

    boost::this_thread::sleep(boost::posix_time::seconds(1));
    ASSERT_EQ(SIZE * 2, d.getV1());
    ASSERT_EQ(SIZE * 2, d.getV2());
}

This one should run fine under any circumstances.

This fix is very easy to apply, but it works well only for simple code like this. We'll see in a future post a more robust and general solution.
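
Just to give an idea of the direction, here is a sketch of mine of one possible alternative (not necessarily what the next post will show): letting Boost acquire both mutexes atomically with the free function boost::lock(), and then adopting them in the guards.
void f2b()
{
    for(int i = 0; i < SIZE_; ++i)
    {
        // boost::lock() acquires both mutexes using a deadlock-avoidance algorithm,
        // regardless of the order in which other threads lock them
        boost::lock(m1_, m2_);
        boost::lock_guard<boost::mutex> l1(m1_, boost::adopt_lock);
        boost::lock_guard<boost::mutex> l2(m2_, boost::adopt_lock);

        std::cout << '.';
        ++v1_;
        ++v2_;
    }
}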
