From 2b94e8960cc3f225dec058f27570505351f4bc13 Mon Sep 17 00:00:00 2001 From: Marc Dionne Date: Wed, 17 Dec 2014 07:59:59 -0500 Subject: dm thin: fix crash by initializing thin device's refcount and completion earlier Commit 80e96c5484be ("dm thin: do not allow thin device activation while pool is suspended") delayed the initialization of a new thin device's refcount and completion until after this new thin was added to the pool's active_thins list and the pool lock is released. This opens a race with a worker thread that walks the list and calls thin_get/put, noticing that the refcount goes to 0 and calling complete, freezing up the system and giving the oops below: kernel: BUG: unable to handle kernel NULL pointer dereference at (null) kernel: IP: [] __wake_up_common+0x2b/0x90 kernel: Call Trace: kernel: [] __wake_up_locked+0x13/0x20 kernel: [] complete+0x37/0x50 kernel: [] thin_put+0x20/0x30 [dm_thin_pool] kernel: [] do_worker+0x667/0x870 [dm_thin_pool] kernel: [] ? __schedule+0x3ac/0x9a0 kernel: [] process_one_work+0x14f/0x400 kernel: [] worker_thread+0x6b/0x490 kernel: [] ? rescuer_thread+0x260/0x260 kernel: [] kthread+0xdb/0x100 kernel: [] ? kthread_create_on_node+0x170/0x170 kernel: [] ret_from_fork+0x7c/0xb0 kernel: [] ? kthread_create_on_node+0x170/0x170 Set the thin device's initial refcount and initialize the completion before adding it to the pool's active_thins list in thin_ctr(). Signed-off-by: Marc Dionne Signed-off-by: Mike Snitzer diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 922aa55..4934789 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -3832,6 +3832,8 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) r = -EINVAL; goto bad; } + atomic_set(&tc->refcount, 1); + init_completion(&tc->can_destroy); list_add_tail_rcu(&tc->list, &tc->pool->active_thins); spin_unlock_irqrestore(&tc->pool->lock, flags); /* @@ -3844,9 +3846,6 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) dm_put(pool_md); - atomic_set(&tc->refcount, 1); - init_completion(&tc->can_destroy); - return 0; bad: -- cgit v0.10.2