summaryrefslogtreecommitdiff
path: root/drivers/md/persistent-data/dm-transaction-manager.c
AgeCommit message (Collapse)Author
2014-11-10dm transaction manager: add support for prefetching blocks of metadataJoe Thornber
Introduce the dm_tm_issue_prefetches interface. If you're using a non-blocking clone the tm will build up a list of requested blocks that weren't in core. dm_tm_issue_prefetches will request those blocks to be prefetched. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2014-03-27dm transaction manager: fix corruption due to non-atomic transaction commitJoe Thornber
The persistent-data library used by dm-thin, dm-cache, etc is transactional. If anything goes wrong, such as an io error when writing new metadata or a power failure, then we roll back to the last transaction. Atomicity when committing a transaction is achieved by: a) Never overwriting data from the previous transaction. b) Writing the superblock last, after all other metadata has hit the disk. This commit and the following commit ("dm: take care to copy the space map roots before locking the superblock") fix a bug associated with (b). When committing it was possible for the superblock to still be written in spite of an io error occurring during the preceeding metadata flush. With these commits we're careful not to take the write lock out on the superblock until after the metadata flush has completed. Change the transaction manager's semantics for dm_tm_commit() to assume all data has been flushed _before_ the single superblock that is passed in. As a prerequisite, split the block manager's block unlocking and flushing by simplifying dm_bm_flush_and_unlock() to dm_bm_flush(). Now the unlocking must be done separately. This issue was discovered by forcing io errors at the crucial time using dm-flakey. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
2013-02-28hlist: drop the node parameter from iteratorsSasha Levin
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-24drivers/md/persistent-data/dm-transaction-manager.c: rename HASH_SIZEAndrew Morton
Fix the warning: drivers/md/persistent-data/dm-transaction-manager.c:28:1: warning: "HASH_SIZE" redefined In file included from include/linux/elevator.h:5, from include/linux/blkdev.h:216, from drivers/md/persistent-data/dm-block-manager.h:11, from drivers/md/persistent-data/dm-transaction-manager.h:10, from drivers/md/persistent-data/dm-transaction-manager.c:6: include/linux/hashtable.h:22:1: warning: this is the location of the previous definition Cc: Alasdair Kergon <agk@redhat.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-27dm persistent data: stop using dm_bm_unlock_move when shadowing blocks in tmJoe Thornber
Stop using dm_bm_unlock_move when shadowing blocks in the transaction manager as an optimisation and remove the function as it is then no longer used. Some code, such as the space maps, keeps using on-disk data structures from the previous transaction. It can do this because blocks won't be reallocated until the subsequent transaction. Using dm_bm_unlock_move to copy blocks sounds like a win, but it forces a synchronous read should the old block be accessed. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm persistent data: tidy transaction manager creation fnsJoe Thornber
Tidy the transaction manager creation functions. They no longer lock the superblock. Superblock locking is pulled out to the caller. Also export dm_bm_write_lock_zero. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-27dm persistent data: remove debug space map checkerJoe Thornber
Remove debug space map checker from dm persistent data. The space map checker is a wrapper for other space maps that double checks the reference counts are correct. It holds all these reference counts in memory rather than on disk, so uses a lot of memory and is thus restricted to small pools. As yet, this checker hasn't found any issues, but has caused a few of its own due to people turning it on by default with larger pools. Removing. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-03dm persistent data: handle space map checker creation failureMike Snitzer
If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and dm_sm_checker_create() fails, dm_tm_create_internal() would still return success even though it cleaned up all resources it was supposed to have created. This will lead to a kernel crash: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC ... RIP: 0010:[<ffffffff81593659>] [<ffffffff81593659>] dm_bufio_get_block_size+0x9/0x20 Call Trace: [<ffffffff81599bae>] dm_bm_block_size+0xe/0x10 [<ffffffff8159b8b8>] sm_ll_init+0x78/0xd0 [<ffffffff8159c1a6>] sm_ll_new_disk+0x16/0xa0 [<ffffffff8159c98e>] dm_sm_disk_create+0xfe/0x160 [<ffffffff815abf6e>] dm_pool_metadata_open+0x16e/0x6a0 [<ffffffff815aa010>] pool_ctr+0x3f0/0x900 [<ffffffff8158d565>] dm_table_add_target+0x195/0x450 [<ffffffff815904c4>] table_load+0xe4/0x330 [<ffffffff815917ea>] ctl_ioctl+0x15a/0x2c0 [<ffffffff81591963>] dm_ctl_ioctl+0x13/0x20 [<ffffffff8116a4f8>] do_vfs_ioctl+0x98/0x560 [<ffffffff8116aa51>] sys_ioctl+0x91/0xa0 [<ffffffff81869f52>] system_call_fastpath+0x16/0x1b Fix the space map checker code to return an appropriate ERR_PTR and have dm_sm_disk_create() and dm_tm_create_internal() check for it with IS_ERR. Reported-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-07-03dm persistent data: fix shadow_info_leak on dm_tm_destroyMike Snitzer
Cleanup the shadow table before destroying the transaction manager. Reference: leak was identified with kmemleak when running test_discard_random_sectors in the thinp-test-suite. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2012-06-02dm thin: provide userspace access to pool metadataJoe Thornber
This patch implements two new messages that can be sent to the thin pool target allowing it to take a snapshot of the _metadata_. This, read-only snapshot can be accessed by userland, concurrently with the live target. Only one metadata snapshot can be held at a time. The pool's status line will give the block location for the current msnap. Since version 0.1.5 of the userland thin provisioning tools, the thin_dump program displays the msnap as follows: thin_dump -m <msnap root> <metadata dev> Available here: https://github.com/jthornber/thin-provisioning-tools Now that userland can access the metadata we can do various things that have traditionally been kernel side tasks: i) Incremental backups. By using metadata snapshots we can work out what blocks have changed over time. Combined with data snapshots we can ensure the data doesn't change while we back it up. A short proof of concept script can be found here: https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb ii) Migration of thin devices from one pool to another. iii) Merging snapshots back into an external origin. iv) Asyncronous replication. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-11-07drivers/md: change module.h -> export.h in persistent-data/dm-*Paul Gortmaker
For the files which are not themselves modular, we can change them to include only the smaller export.h since all they are doing is looking for EXPORT_SYMBOL. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-31dm: add persistent data libraryJoe Thornber
The persistent-data library offers a re-usable framework for the storage and management of on-disk metadata in device-mapper targets. It's used by the thin-provisioning target in the next patch and in an upcoming hierarchical storage target. For further information, please read Documentation/device-mapper/persistent-data.txt Signed-off-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>