From 500f5a0bf5f0624dae34307010e240ec090e4cde Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Sun, 13 Dec 2009 22:48:54 +0100 Subject: reiserfs: Fix possible recursive lock While allocating the bitmap using vmalloc, we hold the reiserfs lock, which makes lockdep later reporting a possible deadlock as we may swap out pages to allocate memory and then take the reiserfs lock recursively: inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes: (&REISERFS_SB(s)->lock){+.+.?.}, at: [] reiserfs_write_lock+0x28/0x40 {RECLAIM_FS-ON-W} state was registered at: [] mark_held_locks+0x62/0x90 [] lockdep_trace_alloc+0x9a/0xc0 [] kmem_cache_alloc+0x26/0xf0 [] __get_vm_area_node+0x6c/0xf0 [] __vmalloc_node+0x7e/0xa0 [] vmalloc+0x2b/0x30 [] reiserfs_init_bitmap_cache+0x39/0x70 [] reiserfs_fill_super+0x2e8/0xb90 [] get_sb_bdev+0x145/0x180 [] get_super_block+0x21/0x30 [] vfs_kern_mount+0x40/0xd0 [] do_kern_mount+0x39/0xd0 [] do_mount+0x2c7/0x6b0 [] sys_mount+0x66/0xa0 [] mount_block_root+0xc4/0x245 [] mount_root+0x59/0x5f [] prepare_namespace+0x111/0x14b [] kernel_init+0xcf/0xdb [] kernel_thread_helper+0x7/0x1c This is actually fine for two reasons: we call vmalloc at mount time then it's not in the swapping out path. Also the reiserfs lock can be acquired recursively, but since its implementation depends on a mutex, it's hard and not necessary worth it to teach that to lockdep. The lock is useless at mount time anyway, at least until we replay the journal. But let's remove it from this path later as this needs more thinking and is a sensible change. For now we can just relax the lock around vmalloc, Reported-by: Alexander Beregalov Signed-off-by: Frederic Weisbecker Cc: Chris Mason Cc: Ingo Molnar Cc: Thomas Gleixner diff --git a/fs/reiserfs/bitmap.c b/fs/reiserfs/bitmap.c index 6854957..65c8727 100644 --- a/fs/reiserfs/bitmap.c +++ b/fs/reiserfs/bitmap.c @@ -1277,7 +1277,10 @@ int reiserfs_init_bitmap_cache(struct super_block *sb) struct reiserfs_bitmap_info *bitmap; unsigned int bmap_nr = reiserfs_bmap_count(sb); + /* Avoid lock recursion in fault case */ + reiserfs_write_unlock(sb); bitmap = vmalloc(sizeof(*bitmap) * bmap_nr); + reiserfs_write_lock(sb); if (bitmap == NULL) return -ENOMEM; -- cgit v0.10.2 From cb1c2e51c5a72f093b5af384b11d2f1c2abd6c13 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Sun, 13 Dec 2009 23:32:06 +0100 Subject: reiserfs: Fix reiserfs lock and journal lock inversion dependency When we were using the bkl, we didn't care about dependencies against other locks, but the mutex conversion created new ones, which is why we have reiserfs_mutex_lock_safe(), which unlocks the reiserfs lock before acquiring another mutex. But this trick actually fails if we have acquired the reiserfs lock recursively, as we try to unlock it to acquire the new mutex without inverted dependency, but we eventually only decrease its depth. This happens in the case of a nested inode creation/deletion. Say we have no space left on the device, we create an inode and tak the lock but fail to create its entry, then we release the inode using iput(), which calls reiserfs_delete_inode() that takes the reiserfs lock recursively. The path eventually ends up in journal_begin() where we try to take the journal safely but we fail because of the reiserfs lock recursion: [ INFO: possible circular locking dependency detected ] 2.6.32-06486-g053fe57 #2 ------------------------------------------------------- vi/23454 is trying to acquire lock: (&journal->j_mutex){+.+...}, at: [] do_journal_begin_r+0x64/0x2f0 but task is already holding lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&REISERFS_SB(s)->lock){+.+.+.}: [] validate_chain+0xa23/0xf70 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] mutex_lock_nested+0x5f/0x2b0 [] reiserfs_write_lock+0x28/0x40 [] do_journal_begin_r+0x6b/0x2f0 [] journal_begin+0x7f/0x120 [] reiserfs_remount+0x212/0x4d0 [] do_remount_sb+0x67/0x140 [] do_mount+0x436/0x6b0 [] sys_mount+0x66/0xa0 [] sysenter_do_call+0x12/0x36 -> #0 (&journal->j_mutex){+.+...}: [] validate_chain+0xf68/0xf70 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] mutex_lock_nested+0x5f/0x2b0 [] do_journal_begin_r+0x64/0x2f0 [] journal_begin+0x7f/0x120 [] reiserfs_delete_inode+0x9f/0x140 [] generic_delete_inode+0x9c/0x150 [] generic_drop_inode+0x3d/0x60 [] iput+0x47/0x50 [] reiserfs_create+0x16c/0x1c0 [] vfs_create+0xc1/0x130 [] do_filp_open+0x81c/0x920 [] do_sys_open+0x4f/0x110 [] sys_open+0x29/0x40 [] sysenter_do_call+0x12/0x36 other info that might help us debug this: 2 locks held by vi/23454: #0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [] do_filp_open+0x27e/0x920 #1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28/0x40 stack backtrace: Pid: 23454, comm: vi Not tainted 2.6.32-06486-g053fe57 #2 Call Trace: [] ? printk+0x18/0x1e [] print_circular_bug+0xc0/0xd0 [] validate_chain+0xf68/0xf70 [] ? trace_hardirqs_off+0xb/0x10 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] ? do_journal_begin_r+0x64/0x2f0 [] mutex_lock_nested+0x5f/0x2b0 [] ? do_journal_begin_r+0x64/0x2f0 [] ? do_journal_begin_r+0x64/0x2f0 [] ? delete_one_xattr+0x0/0x1c0 [] do_journal_begin_r+0x64/0x2f0 [] journal_begin+0x7f/0x120 [] ? reiserfs_delete_xattrs+0x15/0x50 [] reiserfs_delete_inode+0x9f/0x140 [] ? generic_delete_inode+0x5f/0x150 [] ? reiserfs_delete_inode+0x0/0x140 [] generic_delete_inode+0x9c/0x150 [] generic_drop_inode+0x3d/0x60 [] iput+0x47/0x50 [] reiserfs_create+0x16c/0x1c0 [] ? inode_permission+0x7d/0xa0 [] vfs_create+0xc1/0x130 [] ? reiserfs_create+0x0/0x1c0 [] do_filp_open+0x81c/0x920 [] ? trace_hardirqs_off+0xb/0x10 [] ? _spin_unlock+0x1d/0x20 [] ? alloc_fd+0xba/0xf0 [] do_sys_open+0x4f/0x110 [] sys_open+0x29/0x40 [] sysenter_do_call+0x12/0x36 To fix this, use reiserfs_lock_once() from reiserfs_delete_inode() which prevents from adding reiserfs lock recursion. Reported-by: Alexander Beregalov Signed-off-by: Frederic Weisbecker Cc: Chris Mason Cc: Ingo Molnar Cc: Thomas Gleixner diff --git a/fs/reiserfs/inode.c b/fs/reiserfs/inode.c index 3a28e77..bd615df 100644 --- a/fs/reiserfs/inode.c +++ b/fs/reiserfs/inode.c @@ -31,11 +31,12 @@ void reiserfs_delete_inode(struct inode *inode) JOURNAL_PER_BALANCE_CNT * 2 + 2 * REISERFS_QUOTA_INIT_BLOCKS(inode->i_sb); struct reiserfs_transaction_handle th; + int depth; int err; truncate_inode_pages(&inode->i_data, 0); - reiserfs_write_lock(inode->i_sb); + depth = reiserfs_write_lock_once(inode->i_sb); /* The = 0 happens when we abort creating a new inode for some reason like lack of space.. */ if (!(inode->i_state & I_NEW) && INODE_PKEY(inode)->k_objectid != 0) { /* also handles bad_inode case */ @@ -74,7 +75,7 @@ void reiserfs_delete_inode(struct inode *inode) out: clear_inode(inode); /* note this must go after the journal_end to prevent deadlock */ inode->i_blocks = 0; - reiserfs_write_unlock(inode->i_sb); + reiserfs_write_unlock_once(inode->i_sb, depth); } static void _make_cpu_key(struct cpu_key *key, int version, __u32 dirid, -- cgit v0.10.2 From 47376ceba54600cec4dd9e7c4fe8b98e4269633a Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 16 Dec 2009 23:25:50 +0100 Subject: reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion The reiserfs lock -> inode mutex dependency gets inverted when we relax the lock while walking to the tree. To fix this, use a specialized version of reiserfs_mutex_lock_safe that takes care of mutex subclasses. Then we can grab the inode mutex with I_MUTEX_XATTR subclass without any reiserfs lock dependency. This fixes the following report: [ INFO: possible circular locking dependency detected ] 2.6.32-06793-gf405425-dirty #2 ------------------------------------------------------- mv/18566 is trying to acquire lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x28= /0x40 but task is already holding lock: (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [] reiserfs_for_each_xattr+0x10c/0x380 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#5/3){+.+.+.}: [] validate_chain+0xa23/0xf70 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] mutex_lock_nested+0x5f/0x2b0 [] reiserfs_for_each_xattr+0x84/0x380 [] reiserfs_delete_xattrs+0x15/0x50 [] reiserfs_delete_inode+0x8f/0x140 [] generic_delete_inode+0x9c/0x150 [] generic_drop_inode+0x3d/0x60 [] iput+0x47/0x50 [] do_unlinkat+0xdb/0x160 [] sys_unlink+0x10/0x20 [] sysenter_do_call+0x12/0x36 -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [] validate_chain+0xf68/0xf70 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] mutex_lock_nested+0x5f/0x2b0 [] reiserfs_write_lock+0x28/0x40 [] search_by_key+0x1f7b/0x21b0 [] search_by_entry_key+0x1f/0x3b0 [] reiserfs_find_entry+0x77/0x400 [] reiserfs_lookup+0x85/0x130 [] __lookup_hash+0xb4/0x110 [] lookup_one_len+0xb3/0x100 [] reiserfs_for_each_xattr+0x120/0x380 [] reiserfs_delete_xattrs+0x15/0x50 [] reiserfs_delete_inode+0x8f/0x140 [] generic_delete_inode+0x9c/0x150 [] generic_drop_inode+0x3d/0x60 [] iput+0x47/0x50 [] dentry_iput+0x6f/0xf0 [] d_kill+0x24/0x50 [] dput+0x5b/0x120 [] sys_renameat+0x1b9/0x230 [] sys_rename+0x28/0x30 [] sysenter_do_call+0x12/0x36 other info that might help us debug this: 2 locks held by mv/18566: #0: (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [] lock_rename+0xcc/0xd0 #1: (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [] reiserfs_for_each_xattr+0x10c/0x380 stack backtrace: Pid: 18566, comm: mv Tainted: G C 2.6.32-06793-gf405425-dirty #2 Call Trace: [] ? printk+0x18/0x1e [] print_circular_bug+0xc0/0xd0 [] validate_chain+0xf68/0xf70 [] ? trace_hardirqs_off+0xb/0x10 [] __lock_acquire+0x4e5/0xa70 [] lock_acquire+0x7a/0xa0 [] ? reiserfs_write_lock+0x28/0x40 [] mutex_lock_nested+0x5f/0x2b0 [] ? reiserfs_write_lock+0x28/0x40 [] ? reiserfs_write_lock+0x28/0x40 [] ? schedule+0x27a/0x440 [] reiserfs_write_lock+0x28/0x40 [] search_by_key+0x1f7b/0x21b0 [] ? __lock_acquire+0x506/0xa70 [] ? lock_release_non_nested+0x1e7/0x340 [] ? reiserfs_write_lock+0x28/0x40 [] ? trace_hardirqs_on_caller+0x124/0x170 [] ? trace_hardirqs_on+0xb/0x10 [] ? T.316+0x15/0x1a0 [] ? sched_clock_cpu+0x9d/0x100 [] search_by_entry_key+0x1f/0x3b0 [] ? __mutex_unlock_slowpath+0x9a/0x120 [] ? trace_hardirqs_on_caller+0x124/0x170 [] reiserfs_find_entry+0x77/0x400 [] reiserfs_lookup+0x85/0x130 [] ? sched_clock_cpu+0x9d/0x100 [] __lookup_hash+0xb4/0x110 [] lookup_one_len+0xb3/0x100 [] reiserfs_for_each_xattr+0x120/0x380 [] ? delete_one_xattr+0x0/0x1c0 [] ? math_error+0x22/0x150 [] ? reiserfs_write_lock+0x28/0x40 [] reiserfs_delete_xattrs+0x15/0x50 [] ? reiserfs_write_lock+0x28/0x40 [] reiserfs_delete_inode+0x8f/0x140 [] ? generic_delete_inode+0x5f/0x150 [] ? reiserfs_delete_inode+0x0/0x140 [] generic_delete_inode+0x9c/0x150 [] generic_drop_inode+0x3d/0x60 [] iput+0x47/0x50 [] dentry_iput+0x6f/0xf0 [] d_kill+0x24/0x50 [] dput+0x5b/0x120 [] sys_renameat+0x1b9/0x230 [] ? sched_clock_cpu+0x9d/0x100 [] ? trace_hardirqs_off+0xb/0x10 [] ? cpu_clock+0x4e/0x60 [] ? do_page_fault+0x155/0x370 [] ? up_read+0x16/0x30 [] ? do_page_fault+0x155/0x370 [] sys_rename+0x28/0x30 [] sysenter_do_call+0x12/0x36 Reported-by: Alexander Beregalov Signed-off-by: Frederic Weisbecker Cc: Chris Mason Cc: Ingo Molnar Cc: Thomas Gleixner diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index 58aa8e7..8891cd8 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -243,7 +243,8 @@ static int reiserfs_for_each_xattr(struct inode *inode, goto out_dir; } - mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_XATTR); + reiserfs_mutex_lock_nested_safe(&dir->d_inode->i_mutex, I_MUTEX_XATTR, + inode->i_sb); buf.xadir = dir; err = reiserfs_readdir_dentry(dir, &buf, fill_with_dentries, &pos); while ((err == 0 || err == -ENOSPC) && buf.count) { diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h index a05b4a2..4351b49 100644 --- a/include/linux/reiserfs_fs.h +++ b/include/linux/reiserfs_fs.h @@ -97,6 +97,15 @@ static inline void reiserfs_mutex_lock_safe(struct mutex *m, reiserfs_write_lock(s); } +static inline void +reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass, + struct super_block *s) +{ + reiserfs_write_unlock(s); + mutex_lock_nested(m, subclass); + reiserfs_write_lock(s); +} + /* * When we schedule, we usually want to also release the write lock, * according to the previous bkl based locking scheme of reiserfs. -- cgit v0.10.2 From 98ea3f50bcc97689cc0e1fa3b6733f03aeb8fef4 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Tue, 29 Dec 2009 21:51:15 +0100 Subject: reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion Commit 500f5a0bf5f0624dae34307010e240ec090e4cde (reiserfs: Fix possible recursive lock) fixed a vmalloc under reiserfs lock that triggered a lockdep warning because of a IN-FS-RECLAIM <-> RECLAIM-FS-ON locking dependency inversion. But this patch has ommitted another vmalloc call in the same path that allocates the journal. Relax the lock for this one too. Reported-by: Alexander Beregalov Signed-off-by: Frederic Weisbecker Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c index 2f8a7e7..a059879 100644 --- a/fs/reiserfs/journal.c +++ b/fs/reiserfs/journal.c @@ -2758,11 +2758,18 @@ int journal_init(struct super_block *sb, const char *j_dev_name, struct reiserfs_journal *journal; struct reiserfs_journal_list *jl; char b[BDEVNAME_SIZE]; + int ret; + /* + * Unlock here to avoid various RECLAIM-FS-ON <-> IN-RECLAIM-FS + * dependency inversion warnings. + */ + reiserfs_write_unlock(sb); journal = SB_JOURNAL(sb) = vmalloc(sizeof(struct reiserfs_journal)); if (!journal) { reiserfs_warning(sb, "journal-1256", "unable to get memory for journal structure"); + reiserfs_write_lock(sb); return 1; } memset(journal, 0, sizeof(struct reiserfs_journal)); @@ -2771,10 +2778,12 @@ int journal_init(struct super_block *sb, const char *j_dev_name, INIT_LIST_HEAD(&journal->j_working_list); INIT_LIST_HEAD(&journal->j_journal_list); journal->j_persistent_trans = 0; - if (reiserfs_allocate_list_bitmaps(sb, - journal->j_list_bitmap, - reiserfs_bmap_count(sb))) + ret = reiserfs_allocate_list_bitmaps(sb, journal->j_list_bitmap, + reiserfs_bmap_count(sb)); + reiserfs_write_lock(sb); + if (ret) goto free_and_return; + allocate_bitmap_nodes(sb); /* reserved for journal area support */ -- cgit v0.10.2 From 0719d3434747889b314a1e8add776418c4148bcf Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 00:39:22 +0100 Subject: reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion i_xattr_sem depends on the reiserfs lock. But after we grab i_xattr_sem, we may relax/relock the reiserfs lock while waiting on a freezed filesystem, creating a dependency inversion between the two locks. In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's create a reiserfs_down_read_safe() that acts like reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing another lock to avoid undesired dependencies induced by the heivyweight reiserfs lock. This fixes the following warning: [ 990.005931] ======================================================= [ 990.012373] [ INFO: possible circular locking dependency detected ] [ 990.013233] 2.6.33-rc1 #1 [ 990.013233] ------------------------------------------------------- [ 990.013233] dbench/1891 is trying to acquire lock: [ 990.013233] (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x35/0x50 [ 990.013233] [ 990.013233] but task is already holding lock: [ 990.013233] (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [ 990.013233] which lock already depends on the new lock. [ 990.013233] [ 990.013233] [ 990.013233] the existing dependency chain (in reverse order) is: [ 990.013233] [ 990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}: [ 990.013233] [] __lock_acquire+0xf9c/0x1560 [ 990.013233] [] lock_acquire+0x8f/0xb0 [ 990.013233] [] down_write+0x44/0x80 [ 990.013233] [] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [] user_set+0x8a/0x90 [ 990.013233] [] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [] vfs_setxattr+0xbc/0xc0 [ 990.013233] [] setxattr+0xc0/0x150 [ 990.013233] [] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [] system_call_fastpath+0x16/0x1b [ 990.013233] [ 990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [ 990.013233] [] __lock_acquire+0x12d0/0x1560 [ 990.013233] [] lock_acquire+0x8f/0xb0 [ 990.013233] [] __mutex_lock_common+0x47/0x3b0 [ 990.013233] [] mutex_lock_nested+0x3e/0x50 [ 990.013233] [] reiserfs_write_lock+0x35/0x50 [ 990.013233] [] reiserfs_prepare_write+0x45/0x180 [ 990.013233] [] reiserfs_xattr_set_handle+0x2a6/0x470 [ 990.013233] [] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [] user_set+0x8a/0x90 [ 990.013233] [] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [] vfs_setxattr+0xbc/0xc0 [ 990.013233] [] setxattr+0xc0/0x150 [ 990.013233] [] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [] system_call_fastpath+0x16/0x1b [ 990.013233] [ 990.013233] other info that might help us debug this: [ 990.013233] [ 990.013233] 2 locks held by dbench/1891: [ 990.013233] #0: (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [] vfs_setxattr+0x78/0xc0 [ 990.013233] #1: (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [ 990.013233] stack backtrace: [ 990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1 [ 990.013233] Call Trace: [ 990.013233] [] print_circular_bug+0xe9/0xf0 [ 990.013233] [] __lock_acquire+0x12d0/0x1560 [ 990.013233] [] ? reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [] lock_acquire+0x8f/0xb0 [ 990.013233] [] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [] ? reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [] __mutex_lock_common+0x47/0x3b0 [ 990.013233] [] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [] ? mark_held_locks+0x72/0xa0 [ 990.013233] [] ? __mutex_unlock_slowpath+0xbd/0x140 [ 990.013233] [] ? trace_hardirqs_on_caller+0x14d/0x1a0 [ 990.013233] [] mutex_lock_nested+0x3e/0x50 [ 990.013233] [] reiserfs_write_lock+0x35/0x50 [ 990.013233] [] reiserfs_prepare_write+0x45/0x180 [ 990.013233] [] reiserfs_xattr_set_handle+0x2a6/0x470 [ 990.013233] [] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [] ? __mutex_lock_common+0x284/0x3b0 [ 990.013233] [] user_set+0x8a/0x90 [ 990.013233] [] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [] vfs_setxattr+0xbc/0xc0 [ 990.013233] [] setxattr+0xc0/0x150 [ 990.013233] [] ? sched_clock_cpu+0xb8/0x100 [ 990.013233] [] ? trace_hardirqs_off+0xd/0x10 [ 990.013233] [] ? cpu_clock+0x43/0x50 [ 990.013233] [] ? fget+0xb0/0x110 [ 990.013233] [] ? fget+0x0/0x110 [ 990.013233] [] ? sysret_check+0x27/0x62 [ 990.013233] [] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [] system_call_fastpath+0x16/0x1b Reported-and-tested-by: Christian Kujau Signed-off-by: Frederic Weisbecker Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index 8891cd8..a0e2e7a 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -484,7 +484,7 @@ reiserfs_xattr_set_handle(struct reiserfs_transaction_handle *th, if (IS_ERR(dentry)) return PTR_ERR(dentry); - down_write(&REISERFS_I(inode)->i_xattr_sem); + reiserfs_down_read_safe(&REISERFS_I(inode)->i_xattr_sem, inode->i_sb); xahash = xattr_hash(buffer, buffer_size); while (buffer_pos < buffer_size || buffer_pos == 0) { diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h index 4351b49..35d3f45 100644 --- a/include/linux/reiserfs_fs.h +++ b/include/linux/reiserfs_fs.h @@ -106,6 +106,14 @@ reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass, reiserfs_write_lock(s); } +static inline void +reiserfs_down_read_safe(struct rw_semaphore *sem, struct super_block *s) +{ + reiserfs_write_unlock(s); + down_read(sem); + reiserfs_write_lock(s); +} + /* * When we schedule, we usually want to also release the write lock, * according to the previous bkl based locking scheme of reiserfs. -- cgit v0.10.2 From c4a62ca362258d98f42efb282cfbf9b61caffdbe Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 03:20:19 +0100 Subject: reiserfs: Warn on lock relax if taken recursively When we relax the reiserfs lock to avoid creating unwanted dependencies against others locks while grabbing these, we want to ensure it has not been taken recursively, otherwise the lock won't be really relaxed. Only its depth will be decreased. The unwanted dependency would then actually happen. To prevent from that, add a reiserfs_lock_check_recursive() call in the places that need it. Signed-off-by: Frederic Weisbecker Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/lock.c b/fs/reiserfs/lock.c index ee2cfc0..b87aa2c 100644 --- a/fs/reiserfs/lock.c +++ b/fs/reiserfs/lock.c @@ -86,3 +86,12 @@ void reiserfs_check_lock_depth(struct super_block *sb, char *caller) reiserfs_panic(sb, "%s called without kernel lock held %d", caller); } + +#ifdef CONFIG_REISERFS_CHECK +void reiserfs_lock_check_recursive(struct super_block *sb) +{ + struct reiserfs_sb_info *sb_i = REISERFS_SB(sb); + + WARN_ONCE((sb_i->lock_depth > 0), "Unwanted recursive reiserfs lock!\n"); +} +#endif diff --git a/include/linux/reiserfs_fs.h b/include/linux/reiserfs_fs.h index 35d3f45..793bf83 100644 --- a/include/linux/reiserfs_fs.h +++ b/include/linux/reiserfs_fs.h @@ -62,6 +62,12 @@ void reiserfs_write_unlock(struct super_block *s); int reiserfs_write_lock_once(struct super_block *s); void reiserfs_write_unlock_once(struct super_block *s, int lock_depth); +#ifdef CONFIG_REISERFS_CHECK +void reiserfs_lock_check_recursive(struct super_block *s); +#else +static inline void reiserfs_lock_check_recursive(struct super_block *s) { } +#endif + /* * Several mutexes depend on the write lock. * However sometimes we want to relax the write lock while we hold @@ -92,6 +98,7 @@ void reiserfs_write_unlock_once(struct super_block *s, int lock_depth); static inline void reiserfs_mutex_lock_safe(struct mutex *m, struct super_block *s) { + reiserfs_lock_check_recursive(s); reiserfs_write_unlock(s); mutex_lock(m); reiserfs_write_lock(s); @@ -101,6 +108,7 @@ static inline void reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass, struct super_block *s) { + reiserfs_lock_check_recursive(s); reiserfs_write_unlock(s); mutex_lock_nested(m, subclass); reiserfs_write_lock(s); @@ -109,6 +117,7 @@ reiserfs_mutex_lock_nested_safe(struct mutex *m, unsigned int subclass, static inline void reiserfs_down_read_safe(struct rw_semaphore *sem, struct super_block *s) { + reiserfs_lock_check_recursive(s); reiserfs_write_unlock(s); down_read(sem); reiserfs_write_lock(s); -- cgit v0.10.2 From 27026a05bb805866a3b9068dda8153b72cb942f4 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 05:06:21 +0100 Subject: reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr While deleting the xattrs of an inode, we hold the reiserfs lock and grab the inode->i_mutex of the targeted inode and the root private xattr directory. Later on, we may relax the reiserfs lock for various reasons, this creates inverted dependencies. We can remove the reiserfs lock -> i_mutex dependency by relaxing the former before calling open_xa_dir(). This is fine because the lookup and creation of xattr private directories done in open_xa_dir() are covered by the targeted inode mutexes. And deeper operations in the tree are still done under the write lock. This fixes the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #173 ------------------------------------------------------- cp/3204 is trying to acquire lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock_once+0x29/0x50 but task is already holding lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] open_xa_dir+0xd8/0x1b0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] open_xa_dir+0x43/0x1b0 [] reiserfs_for_each_xattr+0x62/0x260 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlink+0x10/0x20 [] sysenter_do_call+0x12/0x32 -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [] __lock_acquire+0x18f6/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_lookup+0x62/0x140 [] __lookup_hash+0xef/0x110 [] lookup_one_len+0x8d/0xc0 [] open_xa_dir+0xea/0x1b0 [] xattr_lookup+0x15/0x160 [] reiserfs_xattr_get+0x56/0x2a0 [] reiserfs_get_acl+0xa2/0x360 [] reiserfs_cache_default_acl+0x3a/0x160 [] reiserfs_mkdir+0x6c/0x2c0 [] vfs_mkdir+0xd6/0x180 [] sys_mkdirat+0xc0/0xd0 [] sys_mkdir+0x20/0x30 [] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by cp/3204: #0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [] lookup_create+0x26/0xa0 #1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] open_xa_dir+0xd8/0x1b0 stack backtrace: Pid: 3204, comm: cp Not tainted 2.6.32-atom #173 Call Trace: [] ? printk+0x18/0x1a [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x18f6/0x19e0 [] ? check_usage+0x6a/0x460 [] lock_acquire+0x68/0x90 [] ? reiserfs_write_lock_once+0x29/0x50 [] ? reiserfs_write_lock_once+0x29/0x50 [] mutex_lock_nested+0x5b/0x340 [] ? reiserfs_write_lock_once+0x29/0x50 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_lookup+0x62/0x140 [] ? debug_check_no_locks_freed+0x8a/0x140 [] ? trace_hardirqs_on_caller+0x124/0x170 [] __lookup_hash+0xef/0x110 [] lookup_one_len+0x8d/0xc0 [] open_xa_dir+0xea/0x1b0 [] xattr_lookup+0x15/0x160 [] reiserfs_xattr_get+0x56/0x2a0 [] reiserfs_get_acl+0xa2/0x360 [] ? new_inode+0x27/0xa0 [] reiserfs_cache_default_acl+0x3a/0x160 [] ? _spin_unlock+0x27/0x40 [] reiserfs_mkdir+0x6c/0x2c0 [] ? __d_lookup+0x108/0x190 [] ? mark_held_locks+0x62/0x80 [] ? mutex_lock_nested+0x2bd/0x340 [] ? generic_permission+0x1a/0xa0 [] ? security_inode_permission+0x1e/0x20 [] vfs_mkdir+0xd6/0x180 [] sys_mkdirat+0xc0/0xd0 [] ? up_read+0x16/0x30 [] ? restore_all_notrace+0x0/0x18 [] sys_mkdir+0x20/0x30 [] sysenter_do_call+0x12/0x32 v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still need it later Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index a0e2e7a..c320c77 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -234,17 +234,22 @@ static int reiserfs_for_each_xattr(struct inode *inode, if (IS_PRIVATE(inode) || get_inode_sd_version(inode) == STAT_DATA_V1) return 0; + reiserfs_write_unlock(inode->i_sb); dir = open_xa_dir(inode, XATTR_REPLACE); if (IS_ERR(dir)) { err = PTR_ERR(dir); + reiserfs_write_lock(inode->i_sb); goto out; } else if (!dir->d_inode) { err = 0; + reiserfs_write_lock(inode->i_sb); goto out_dir; } - reiserfs_mutex_lock_nested_safe(&dir->d_inode->i_mutex, I_MUTEX_XATTR, - inode->i_sb); + mutex_lock_nested(&dir->d_inode->i_mutex, I_MUTEX_XATTR); + + reiserfs_write_lock(inode->i_sb); + buf.xadir = dir; err = reiserfs_readdir_dentry(dir, &buf, fill_with_dentries, &pos); while ((err == 0 || err == -ENOSPC) && buf.count) { -- cgit v0.10.2 From 0523676d3f3aa7edeea63cc3a1bc4dc612380a26 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 05:56:08 +0100 Subject: reiserfs: Relax reiserfs lock while freeing the journal Keeping the reiserfs lock while freeing the journal on umount path triggers a lock inversion between bdev->bd_mutex and the reiserfs lock. We don't need the reiserfs lock at this stage. The filesystem is not usable anymore, and there are no more pending commits, everything got flushed (even this operation was done in parallel and didn't required the reiserfs lock from the current process). This fixes the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #172 ------------------------------------------------------- umount/3904 is trying to acquire lock: (&bdev->bd_mutex){+.+.+.}, at: [] __blkdev_put+0x22/0x160 but task is already holding lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x29/0x40 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&REISERFS_SB(s)->lock){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_get_block+0x85/0x1620 [] do_mpage_readpage+0x1f0/0x6d0 [] mpage_readpages+0xc0/0x100 [] reiserfs_readpages+0x19/0x20 [] __do_page_cache_readahead+0x1bc/0x260 [] ra_submit+0x28/0x40 [] filemap_fault+0x40e/0x420 [] __do_fault+0x3d/0x430 [] handle_mm_fault+0x12e/0x790 [] do_page_fault+0x135/0x330 [] error_code+0x6b/0x70 [] load_elf_binary+0x82a/0x1a10 [] search_binary_handler+0x90/0x1d0 [] do_execve+0x1df/0x250 [] sys_execve+0x46/0x70 [] syscall_call+0x7/0xb -> #2 (&mm->mmap_sem){++++++}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] might_fault+0x8b/0xb0 [] copy_to_user+0x32/0x70 [] filldir64+0xa4/0xf0 [] sysfs_readdir+0x116/0x210 [] vfs_readdir+0x8d/0xb0 [] sys_getdents64+0x69/0xb0 [] sysenter_do_call+0x12/0x32 -> #1 (sysfs_mutex){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] sysfs_addrm_start+0x2c/0xb0 [] create_dir+0x40/0x90 [] sysfs_create_dir+0x2b/0x50 [] kobject_add_internal+0xc2/0x1b0 [] kobject_add_varg+0x31/0x50 [] kobject_add+0x2c/0x60 [] device_add+0x94/0x560 [] add_partition+0x18a/0x2a0 [] rescan_partitions+0x33a/0x450 [] __blkdev_get+0x12f/0x2d0 [] blkdev_get+0xa/0x10 [] register_disk+0x108/0x130 [] add_disk+0xd9/0x130 [] sd_probe_async+0x105/0x1d0 [] async_thread+0xcf/0x230 [] kthread+0x74/0x80 [] kernel_thread_helper+0x7/0x3c -> #0 (&bdev->bd_mutex){+.+.+.}: [] __lock_acquire+0x18f6/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] __blkdev_put+0x22/0x160 [] blkdev_put+0xa/0x10 [] free_journal_ram+0xd2/0x130 [] do_journal_release+0x98/0x190 [] journal_release+0xa/0x10 [] reiserfs_put_super+0x36/0x130 [] generic_shutdown_super+0x4f/0xe0 [] kill_block_super+0x25/0x40 [] reiserfs_kill_sb+0x7f/0x90 [] deactivate_super+0x7a/0x90 [] mntput_no_expire+0x98/0xd0 [] sys_umount+0x4c/0x310 [] sys_oldumount+0x19/0x20 [] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by umount/3904: #0: (&type->s_umount_key#30){+++++.}, at: [] deactivate_super+0x75/0x90 #1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x29/0x40 stack backtrace: Pid: 3904, comm: umount Not tainted 2.6.32-atom #172 Call Trace: [] ? printk+0x18/0x1a [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x18f6/0x19e0 [] ? free_pcppages_bulk+0x1f/0x250 [] lock_acquire+0x68/0x90 [] ? __blkdev_put+0x22/0x160 [] ? __blkdev_put+0x22/0x160 [] mutex_lock_nested+0x5b/0x340 [] ? __blkdev_put+0x22/0x160 [] ? mark_held_locks+0x62/0x80 [] ? kfree+0x92/0xd0 [] __blkdev_put+0x22/0x160 [] ? trace_hardirqs_on+0xb/0x10 [] blkdev_put+0xa/0x10 [] free_journal_ram+0xd2/0x130 [] do_journal_release+0x98/0x190 [] journal_release+0xa/0x10 [] reiserfs_put_super+0x36/0x130 [] ? up_write+0x16/0x30 [] generic_shutdown_super+0x4f/0xe0 [] kill_block_super+0x25/0x40 [] ? vfs_quota_off+0x0/0x20 [] reiserfs_kill_sb+0x7f/0x90 [] deactivate_super+0x7a/0x90 [] mntput_no_expire+0x98/0xd0 [] sys_umount+0x4c/0x310 [] sys_oldumount+0x19/0x20 [] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c index a059879..83ac4d3 100644 --- a/fs/reiserfs/journal.c +++ b/fs/reiserfs/journal.c @@ -2009,10 +2009,11 @@ static int do_journal_release(struct reiserfs_transaction_handle *th, destroy_workqueue(commit_wq); commit_wq = NULL; } - reiserfs_write_lock(sb); free_journal_ram(sb); + reiserfs_write_lock(sb); + return 0; } -- cgit v0.10.2 From 3f14fea6bbd3444dd46a2af3a2e219e792616645 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 07:03:53 +0100 Subject: reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() We call xattr_lookup() from reiserfs_xattr_get(). We then hold the reiserfs lock when we grab the i_mutex. But later, we may relax the reiserfs lock, creating dependency inversion between both locks. The lookups and creation jobs ar already protected by the inode mutex, so we can safely relax the reiserfs lock, dropping the unwanted reiserfs lock -> i_mutex dependency, as shown in the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #173 ------------------------------------------------------- cp/3204 is trying to acquire lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock_once+0x29/0x50 but task is already holding lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] open_xa_dir+0xd8/0x1b0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] open_xa_dir+0x43/0x1b0 [] reiserfs_for_each_xattr+0x62/0x260 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlink+0x10/0x20 [] sysenter_do_call+0x12/0x32 -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [] __lock_acquire+0x18f6/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_lookup+0x62/0x140 [] __lookup_hash+0xef/0x110 [] lookup_one_len+0x8d/0xc0 [] open_xa_dir+0xea/0x1b0 [] xattr_lookup+0x15/0x160 [] reiserfs_xattr_get+0x56/0x2a0 [] reiserfs_get_acl+0xa2/0x360 [] reiserfs_cache_default_acl+0x3a/0x160 [] reiserfs_mkdir+0x6c/0x2c0 [] vfs_mkdir+0xd6/0x180 [] sys_mkdirat+0xc0/0xd0 [] sys_mkdir+0x20/0x30 [] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by cp/3204: #0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [] lookup_create+0x26/0xa0 #1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] open_xa_dir+0xd8/0x1b0 stack backtrace: Pid: 3204, comm: cp Not tainted 2.6.32-atom #173 Call Trace: [] ? printk+0x18/0x1a [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x18f6/0x19e0 [] ? check_usage+0x6a/0x460 [] lock_acquire+0x68/0x90 [] ? reiserfs_write_lock_once+0x29/0x50 [] ? reiserfs_write_lock_once+0x29/0x50 [] mutex_lock_nested+0x5b/0x340 [] ? reiserfs_write_lock_once+0x29/0x50 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_lookup+0x62/0x140 [] ? debug_check_no_locks_freed+0x8a/0x140 [] ? trace_hardirqs_on_caller+0x124/0x170 [] __lookup_hash+0xef/0x110 [] lookup_one_len+0x8d/0xc0 [] open_xa_dir+0xea/0x1b0 [] xattr_lookup+0x15/0x160 [] reiserfs_xattr_get+0x56/0x2a0 [] reiserfs_get_acl+0xa2/0x360 [] ? new_inode+0x27/0xa0 [] reiserfs_cache_default_acl+0x3a/0x160 [] ? _spin_unlock+0x27/0x40 [] reiserfs_mkdir+0x6c/0x2c0 [] ? __d_lookup+0x108/0x190 [] ? mark_held_locks+0x62/0x80 [] ? mutex_lock_nested+0x2bd/0x340 [] ? generic_permission+0x1a/0xa0 [] ? security_inode_permission+0x1e/0x20 [] vfs_mkdir+0xd6/0x180 [] sys_mkdirat+0xc0/0xd0 [] ? up_read+0x16/0x30 [] ? restore_all_notrace+0x0/0x18 [] sys_mkdir+0x20/0x30 [] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index c320c77..78a3f24 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -485,11 +485,16 @@ reiserfs_xattr_set_handle(struct reiserfs_transaction_handle *th, if (!buffer) return lookup_and_delete_xattr(inode, name); + reiserfs_write_unlock(inode->i_sb); dentry = xattr_lookup(inode, name, flags); - if (IS_ERR(dentry)) + if (IS_ERR(dentry)) { + reiserfs_write_lock(inode->i_sb); return PTR_ERR(dentry); + } - reiserfs_down_read_safe(&REISERFS_I(inode)->i_xattr_sem, inode->i_sb); + down_read(&REISERFS_I(inode)->i_xattr_sem); + + reiserfs_write_lock(inode->i_sb); xahash = xattr_hash(buffer, buffer_size); while (buffer_pos < buffer_size || buffer_pos == 0) { -- cgit v0.10.2 From c674905ca74ad0ae5b048afb1ef68663a0d7e987 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 07:12:03 +0100 Subject: reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() reiserfs_unlink() may or may not be called under the reiserfs lock. But it also takes the reiserfs lock and can then acquire it recursively which leads to do_journal_begin_r() that fails to relax the reiserfs lock before grabbing the journal mutex, creating an unexpected lock inversion. We need to ensure reiserfs_unlink() won't get the reiserfs lock recursively using reiserfs_write_lock_once(). This fixes the following warning that precedes a lock inversion report (reiserfs lock <-> journal mutex). ------------[ cut here ]------------ WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50() Hardware name: MS-7418 Unwanted recursive reiserfs lock! Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177 Call Trace: [] ? reiserfs_lock_check_recursive+0x3a/0x50 [] ? reiserfs_lock_check_recursive+0x3a/0x50 [] warn_slowpath_common+0x67/0xc0 [] ? reiserfs_lock_check_recursive+0x3a/0x50 [] warn_slowpath_fmt+0x26/0x30 [] reiserfs_lock_check_recursive+0x3a/0x50 [] do_journal_begin_r+0x83/0x360 [] ? __lock_acquire+0x1296/0x19e0 [] ? xattr_unlink+0x57/0xb0 [] journal_begin+0x80/0x130 [] reiserfs_unlink+0x7d/0x2d0 [] ? xattr_unlink+0x57/0xb0 [] ? xattr_unlink+0x57/0xb0 [] ? xattr_unlink+0x57/0xb0 [] xattr_unlink+0x64/0xb0 [] delete_one_xattr+0x29/0x100 [] reiserfs_for_each_xattr+0x10b/0x290 [] ? delete_one_xattr+0x0/0x100 [] ? mutex_lock_nested+0x299/0x340 [] reiserfs_delete_xattrs+0x1a/0x60 [] ? reiserfs_write_lock_once+0x29/0x50 [] reiserfs_delete_inode+0x9f/0x150 [] ? _atomic_dec_and_lock+0x4f/0x70 [] ? reiserfs_delete_inode+0x0/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] ? up_read+0x16/0x30 [] ? do_page_fault+0x187/0x330 [] ? restore_all_notrace+0x0/0x18 [] ? do_page_fault+0x0/0x330 [] ? trace_hardirqs_on_caller+0x124/0x170 [] sys_unlink+0x10/0x20 [] sysenter_do_call+0x12/0x32 ---[ end trace 2e35d71a6cc69d0c ]--- Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c index e296ff7..9d4dcf0 100644 --- a/fs/reiserfs/namei.c +++ b/fs/reiserfs/namei.c @@ -921,6 +921,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry) struct reiserfs_transaction_handle th; int jbegin_count; unsigned long savelink; + int depth; inode = dentry->d_inode; @@ -932,7 +933,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry) JOURNAL_PER_BALANCE_CNT * 2 + 2 + 4 * REISERFS_QUOTA_TRANS_BLOCKS(dir->i_sb); - reiserfs_write_lock(dir->i_sb); + depth = reiserfs_write_lock_once(dir->i_sb); retval = journal_begin(&th, dir->i_sb, jbegin_count); if (retval) goto out_unlink; @@ -993,7 +994,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry) retval = journal_end(&th, dir->i_sb, jbegin_count); reiserfs_check_path(&path); - reiserfs_write_unlock(dir->i_sb); + reiserfs_write_unlock_once(dir->i_sb, depth); return retval; end_unlink: @@ -1003,7 +1004,7 @@ static int reiserfs_unlink(struct inode *dir, struct dentry *dentry) if (err) retval = err; out_unlink: - reiserfs_write_unlock(dir->i_sb); + reiserfs_write_unlock_once(dir->i_sb, depth); return retval; } -- cgit v0.10.2 From 4dd859697f836cf62c8de08bd9a9f4b4f4beaa91 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 07:26:28 +0100 Subject: reiserfs: Fix journal mutex <-> inode mutex lock inversion We need to relax the reiserfs lock before locking the inode mutex from xattr_unlink(), otherwise we'll face the usual bad dependencies: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #178 ------------------------------------------------------- rm/3202 is trying to acquire lock: (&journal->j_mutex){+.+...}, at: [] do_journal_begin_r+0x94/0x360 but task is already holding lock: (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [] xattr_unlink+0x57/0xb0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] xattr_unlink+0x57/0xb0 [] delete_one_xattr+0x29/0x100 [] reiserfs_for_each_xattr+0x10b/0x290 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 -> #1 (&REISERFS_SB(s)->lock){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_write_lock+0x29/0x40 [] do_journal_begin_r+0x9c/0x360 [] journal_begin+0x80/0x130 [] reiserfs_remount+0x223/0x4e0 [] do_remount_sb+0xa6/0x140 [] do_mount+0x560/0x750 [] sys_mount+0x84/0xb0 [] sysenter_do_call+0x12/0x32 -> #0 (&journal->j_mutex){+.+...}: [] __lock_acquire+0x18f6/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] do_journal_begin_r+0x94/0x360 [] journal_begin+0x80/0x130 [] reiserfs_unlink+0x83/0x2e0 [] xattr_unlink+0x64/0xb0 [] delete_one_xattr+0x29/0x100 [] reiserfs_for_each_xattr+0x10b/0x290 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by rm/3202: #0: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] reiserfs_for_each_xattr+0x9b/0x290 #1: (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [] xattr_unlink+0x57/0xb0 stack backtrace: Pid: 3202, comm: rm Not tainted 2.6.32-atom #178 Call Trace: [] ? printk+0x18/0x1a [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x18f6/0x19e0 [] ? xattr_unlink+0x57/0xb0 [] lock_acquire+0x68/0x90 [] ? do_journal_begin_r+0x94/0x360 [] ? do_journal_begin_r+0x94/0x360 [] mutex_lock_nested+0x5b/0x340 [] ? do_journal_begin_r+0x94/0x360 [] do_journal_begin_r+0x94/0x360 [] ? run_timer_softirq+0x1a6/0x220 [] ? __do_softirq+0x50/0x140 [] journal_begin+0x80/0x130 [] ? __do_softirq+0xf2/0x140 [] ? hrtimer_interrupt+0xdf/0x220 [] reiserfs_unlink+0x83/0x2e0 [] ? mark_held_locks+0x62/0x80 [] ? trace_hardirqs_on_thunk+0xc/0x10 [] ? restore_all_notrace+0x0/0x18 [] ? xattr_unlink+0x57/0xb0 [] xattr_unlink+0x64/0xb0 [] delete_one_xattr+0x29/0x100 [] reiserfs_for_each_xattr+0x10b/0x290 [] ? delete_one_xattr+0x0/0x100 [] ? mutex_lock_nested+0x299/0x340 [] reiserfs_delete_xattrs+0x1a/0x60 [] ? reiserfs_write_lock_once+0x29/0x50 [] reiserfs_delete_inode+0x9f/0x150 [] ? _atomic_dec_and_lock+0x4f/0x70 [] ? reiserfs_delete_inode+0x0/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] ? mutex_unlock+0x8/0x10 [] ? vfs_readdir+0x7d/0xb0 [] ? filldir64+0x0/0xf0 [] ? sysenter_exit+0xf/0x16 [] ? trace_hardirqs_on_caller+0x124/0x170 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index 78a3f24..8b9631d 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -82,7 +82,8 @@ static int xattr_unlink(struct inode *dir, struct dentry *dentry) BUG_ON(!mutex_is_locked(&dir->i_mutex)); vfs_dq_init(dir); - mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); + reiserfs_mutex_lock_nested_safe(&dentry->d_inode->i_mutex, + I_MUTEX_CHILD, dir->i_sb); error = dir->i_op->unlink(dir, dentry); mutex_unlock(&dentry->d_inode->i_mutex); -- cgit v0.10.2 From 8b513f56d4e117f11cf0760abcc030eedefc45c3 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 07:28:58 +0100 Subject: reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr Relax the reiserfs lock before taking the inode mutex from reiserfs_for_each_xattr() to avoid the usual bad dependencies: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #179 ------------------------------------------------------- rm/3242 is trying to acquire lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [] reiserfs_for_each_xattr+0x23f/0x290 but task is already holding lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x29/0x40 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&REISERFS_SB(s)->lock){+.+.+.}: [] __lock_acquire+0x11ff/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_write_lock_once+0x29/0x50 [] reiserfs_lookup+0x62/0x140 [] __lookup_hash+0xef/0x110 [] lookup_one_len+0x8d/0xc0 [] open_xa_dir+0xea/0x1b0 [] reiserfs_for_each_xattr+0x70/0x290 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 -> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [] __lock_acquire+0x18f6/0x19e0 [] lock_acquire+0x68/0x90 [] mutex_lock_nested+0x5b/0x340 [] reiserfs_for_each_xattr+0x23f/0x290 [] reiserfs_delete_xattrs+0x1a/0x60 [] reiserfs_delete_inode+0x9f/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 other info that might help us debug this: 1 lock held by rm/3242: #0: (&REISERFS_SB(s)->lock){+.+.+.}, at: [] reiserfs_write_lock+0x29/0x40 stack backtrace: Pid: 3242, comm: rm Not tainted 2.6.32-atom #179 Call Trace: [] ? printk+0x18/0x1a [] print_circular_bug+0xca/0xd0 [] __lock_acquire+0x18f6/0x19e0 [] ? mark_held_locks+0x62/0x80 [] ? trace_hardirqs_on+0xb/0x10 [] ? mutex_unlock+0x8/0x10 [] lock_acquire+0x68/0x90 [] ? reiserfs_for_each_xattr+0x23f/0x290 [] ? reiserfs_for_each_xattr+0x23f/0x290 [] mutex_lock_nested+0x5b/0x340 [] ? reiserfs_for_each_xattr+0x23f/0x290 [] reiserfs_for_each_xattr+0x23f/0x290 [] ? delete_one_xattr+0x0/0x100 [] reiserfs_delete_xattrs+0x1a/0x60 [] ? reiserfs_write_lock_once+0x29/0x50 [] reiserfs_delete_inode+0x9f/0x150 [] ? _atomic_dec_and_lock+0x4f/0x70 [] ? reiserfs_delete_inode+0x0/0x150 [] generic_delete_inode+0xa2/0x170 [] generic_drop_inode+0x4f/0x70 [] iput+0x47/0x50 [] do_unlinkat+0xd5/0x160 [] ? mutex_unlock+0x8/0x10 [] ? vfs_readdir+0x7d/0xb0 [] ? filldir64+0x0/0xf0 [] ? sysenter_exit+0xf/0x16 [] ? trace_hardirqs_on_caller+0x124/0x170 [] sys_unlinkat+0x23/0x40 [] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index 8b9631d..bfdac66 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -289,8 +289,9 @@ static int reiserfs_for_each_xattr(struct inode *inode, err = journal_begin(&th, inode->i_sb, blocks); if (!err) { int jerror; - mutex_lock_nested(&dir->d_parent->d_inode->i_mutex, - I_MUTEX_XATTR); + reiserfs_mutex_lock_nested_safe( + &dir->d_parent->d_inode->i_mutex, + I_MUTEX_XATTR, inode->i_sb); err = action(dir, data); jerror = journal_end(&th, inode->i_sb, blocks); mutex_unlock(&dir->d_parent->d_inode->i_mutex); -- cgit v0.10.2 From 835d5247d98f46e35d007dcfa6215e526ca33360 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Wed, 30 Dec 2009 07:40:39 +0100 Subject: reiserfs: Safely acquire i_mutex from xattr_rmdir Relax the reiserfs lock before taking the inode mutex from xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex bad dependency. Signed-off-by: Frederic Weisbecker Tested-by: Christian Kujau Cc: Alexander Beregalov Cc: Chris Mason Cc: Ingo Molnar diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index bfdac66..9623cfe 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -98,7 +98,8 @@ static int xattr_rmdir(struct inode *dir, struct dentry *dentry) BUG_ON(!mutex_is_locked(&dir->i_mutex)); vfs_dq_init(dir); - mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); + reiserfs_mutex_lock_nested_safe(&dentry->d_inode->i_mutex, + I_MUTEX_CHILD, dir->i_sb); dentry_unhash(dentry); error = dir->i_op->rmdir(dir, dentry); if (!error) -- cgit v0.10.2