summaryrefslogtreecommitdiff
path: root/fs/btrfs/send.c
AgeCommit message (Collapse)Author
2015-02-03btrfs: kill btrfs_inode_*time helpersDavid Sterba
They just opencode taking address of the timespec member. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-11-25Btrfs: ensure send always works on roots without orphansFilipe Manana
Move the logic from the snapshot creation ioctl into send. This avoids doing the transaction commit if send isn't used, and ensures that if a crash/reboot happens after the transaction commit that created the snapshot and before the transaction commit that switched the commit root, send will not get a commit root that differs from the main root (that has orphan items). Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-04Merge branch 'cleanup/misc-for-3.18' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus Signed-off-by: Chris Mason <clm@fb.com> Conflicts: fs/btrfs/extent_io.c
2014-10-03Btrfs: send, don't delay dir move if there's a new parent inodeFilipe Manana
If between two snapshots we rename an existing directory named X to Y and make it a child (direct or not) of a new inode named X, we were delaying the move/rename of the former directory unnecessarily, which would result in attempting to rename the new directory from its orphan name to name X prematurely. Minimal reproducer: $ mkfs.btrfs -f /dev/vdd $ mount /dev/vdd /mnt $ mkdir -p /mnt/merlin/RC/OSD/Source $ btrfs subvolume snapshot -r /mnt /mnt/mysnap1 $ mkdir /mnt/OSD $ mv /mnt/merlin/RC/OSD /mnt/OSD/OSD-Plane_788 $ mv /mnt/OSD /mnt/merlin/RC $ btrfs subvolume snapshot -r /mnt /mnt/mysnap2 $ btrfs send /mnt/mysnap1 -f /tmp/1.snap $ btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap $ mkfs.btrfs -f /dev/vdc $ mount /dev/vdc /mnt2 $ btrfs receive /mnt2 -f /tmp/1.snap $ btrfs receive /mnt2 -f /tmp/2.snap The second receive (from an incremental send) failed with the following error message: "rename o261-7-0 -> merlin/RC/OSD failed". This is a regression introduced in the 3.16 kernel. A test case for xfstests follows. Reported-by: Marc Merlin <marc@merlins.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-02btrfs: hide typecast to definition of BTRFS_SEND_TRANS_STUBDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-09-17Btrfs: send, lower mem requirements for processing xattrsFilipe Manana
Maximum xattr size can be up to nearly the leaf size. For an fs with a leaf size larger than the page size, using kmalloc requires allocating multiple pages that are contiguous, which might not be possible if there's heavy memory fragmentation. Therefore fallback to vmalloc if we fail to allocate with kmalloc. Also start with a smaller buffer size, since xattr values typically are smaller than a page. Reported-by: Chris Murphy <lists@colorremedies.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-09-17Btrfs: fix sparse warningFabian Frederick
Fix the following sparse warning: fs/btrfs/send.c:518:51: warning: incorrect type in argument 2 (different address spaces) fs/btrfs/send.c:518:51: expected char const [noderef] <asn:1>*<noident> fs/btrfs/send.c:518:51: got char * We can safely use (const char __user *) with set_fs(KERNEL_DS) __force added to avoid sparse-all warning: fs/btrfs/send.c:518:40: warning: cast adds address space to expression (<asn:1>) Signed-off-by: Fabian Frederick <fabf@skynet.be> Reviewed-by: Zach Brown <zab@zabbo.net> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, use the right limits for xattr names and valuesFilipe Manana
We were limiting the sum of the xattr name and value lengths to PATH_MAX, which is not correct, specially on filesystems created with btrfs-progs v3.12 or higher, where the default leaf size is max(16384, PAGE_SIZE), or systems with page sizes larger than 4096 bytes. Xattrs have their own specific maximum name and value lengths, which depend on the leaf size, therefore use these limits to be able to send xattrs with sizes larger than PATH_MAX. A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, don't error in the presence of subvols/snapshotsFilipe Manana
If we are doing an incremental send and the base snapshot has a directory with name X that doesn't exist anymore in the second snapshot and a new subvolume/snapshot exists in the second snapshot that has the same name as the directory (name X), the incremental send would fail with -ENOENT error. This is because it attempts to lookup for an inode with a number matching the objectid of a root, which doesn't exist. Steps to reproduce: mkfs.btrfs -f /dev/sdd mount /dev/sdd /mnt mkdir /mnt/testdir btrfs subvolume snapshot -r /mnt /mnt/mysnap1 rmdir /mnt/testdir btrfs subvolume create /mnt/testdir btrfs subvolume snapshot -r /mnt /mnt/mysnap2 btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data A test case for xfstests follows. Reported-by: Robert White <rwhite@pobox.com> Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10btrfs: remove stale newlines from log messagesDavid Sterba
I've noticed an extra line after "use no compression", but search revealed much more in messages of more critical levels and rare errors. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, fix more issues related to directory renamesFilipe Manana
This is a continuation of the previous changes titled: Btrfs: fix incremental send's decision to delay a dir move/rename Btrfs: part 2, fix incremental send's decision to delay a dir move/rename There's a few more cases where a directory rename/move must be delayed which was previously overlooked. If our immediate ancestor has a lower inode number than ours and it doesn't have a delayed rename/move operation associated to it, it doesn't mean there isn't any non-direct ancestor of our current inode that needs to be renamed/moved before our current inode (i.e. with a higher inode number than ours). So we can't stop the search if our immediate ancestor has a lower inode number than ours, we need to navigate the directory hierarchy upwards until we hit the root or: 1) find an ancestor with an higher inode number that was renamed/moved in the send root too (or already has a pending rename/move registered); 2) find an ancestor that is a new directory (higher inode number than ours and exists only in the send root). Reproducer for case 1) $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b $ mkdir -p /mnt/a/c/d $ mkdir /mnt/a/b/e $ mkdir /mnt/a/c/d/f $ mv /mnt/a/b /mnt/a/c/d/2b $ mkdir /mnt/a/x $ mkdir /mnt/a/y $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mv /mnt/a/x /mnt/a/y $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e $ mv /mnt/a/c/d /mnt/a/h/2d $ mv /mnt/a/c /mnt/a/h/2d/2b/2c $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send Simple reproducer for case 2) $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b $ mkdir /mnt/a/c $ mv /mnt/a/b /mnt/a/c/b2 $ mkdir /mnt/a/e $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mv /mnt/a/c/b2 /mnt/a/e/b3 $ mkdir /mnt/a/e/b3/f $ mkdir /mnt/a/h $ mv /mnt/a/c /mnt/a/e/b3/f/c2 $ mv /mnt/a/e /mnt/a/h/e2 $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send Another simple reproducer for case 2) $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b $ mkdir /mnt/a/c $ mkdir /mnt/a/b/d $ mkdir /mnt/a/c/e $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mkdir /mnt/a/b/d/f $ mkdir /mnt/a/b/g $ mv /mnt/a/c/e /mnt/a/b/g/e2 $ mv /mnt/a/c /mnt/a/b/d/f/c2 $ mv /mnt/a/b/d/f /mnt/a/b/g/e2/f2 $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send More complex reproducer for case 2) $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b $ mkdir -p /mnt/a/c/d $ mkdir /mnt/a/b/e $ mkdir /mnt/a/c/d/f $ mv /mnt/a/b /mnt/a/c/d/2b $ mkdir /mnt/a/x $ mkdir /mnt/a/y $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mv /mnt/a/x /mnt/a/y $ mv /mnt/a/c/d/2b/e /mnt/a/c/d/2b/2e $ mv /mnt/a/c/d /mnt/a/h/2d $ mv /mnt/a/c /mnt/a/h/2d/2b/2c $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send For both cases the incremental send would enter an infinite loop when building path strings. While solving these cases, this change also re-implements the code to detect when directory moves/renames should be delayed. Instead of dealing with several specific cases separately, it's now more generic handling all cases with a simple detection algorithm and if when applying a delayed move/rename there's a path loop detected, it further delays the move/rename registering a new ancestor inode as the dependency inode (so our rename happens after that ancestor is renamed). Tests for these cases is being added to xfstests too. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, remove dead code from __get_cur_name_and_parentFilipe Manana
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, account for orphan directories when building path stringsFilipe Manana
If we have directories with a pending move/rename operation, we must take into account any orphan directories that got created before executing the pending move/rename. Those orphan directories are directories with an inode number higher then the current send progress and that don't exist in the parent snapshot, they are created before current progress reaches their inode number, with a generated name of the form oN-M-I and at the root of the filesystem tree, and later when progress matches their inode number, moved/renamed to their final location. Reproducer: $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b/c/d $ mkdir /mnt/a/b/e $ mv /mnt/a/b/c /mnt/a/b/e/CC $ mkdir /mnt/a/b/e/CC/d/f $ mkdir /mnt/a/g $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mkdir /mnt/a/g/h $ mv /mnt/a/b/e /mnt/a/g/h/EE $ mv /mnt/a/g/h/EE/CC/d /mnt/a/g/h/EE/DD $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send The second receive command failed with the following error: ERROR: rename a/b/e/CC/d -> o264-7-0/EE/DD failed. No such file or directory A test case for xfstests follows soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10Btrfs: send, avoid unnecessary inode item lookup in the btreeFilipe Manana
Regardless of whether the caller is interested or not in knowing the inode's generation (dir_gen != NULL), get_first_ref always does a btree lookup to get the inode item. Avoid this useless lookup if dir_gen parameter is NULL (which is in some cases). Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-10btrfs: protect snapshots from deleting during sendDavid Sterba
The patch "Btrfs: fix protection between send and root deletion" (18f687d538449373c37c) does not actually prevent to delete the snapshot and just takes care during background cleaning, but this seems rather user unfriendly, this patch implements the idea presented in http://www.spinics.net/lists/linux-btrfs/msg30813.html - add an internal root_item flag to denote a dead root - check if the send_in_progress is set and refuse to delete, otherwise set the flag and proceed - check the flag in send similar to the btrfs_root_readonly checks, for all involved roots The root lookup in send via btrfs_read_fs_root_no_name will check if the root is really dead or not. If it is, ENOENT, aborted send. If it's alive, it's protected by send_in_progress, send can continue. CC: Miao Xie <miaox@cn.fujitsu.com> CC: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-06Btrfs: send, fix corrupted path strings for long pathsFilipe Manana
If a path has more than 230 characters, we allocate a new buffer to use for the path, but we were forgotting to copy the contents of the previous buffer into the new one, which has random content from the kmalloc call. Test: mkfs.btrfs -f /dev/sdd mount /dev/sdd /mnt TEST_PATH="/mnt/fdmanana/.config/google-chrome-mysetup/Default/Pepper_Data/Shockwave_Flash/WritableRoot/#SharedObjects/JSHJ4ZKN/s.wsj.net/[[IMPORT]]/players.edgesuite.net/flash/plugins/osmf/advanced-streaming-plugin/v2.7/osmf1.6/Ak#" mkdir -p $TEST_PATH echo "hello world" > $TEST_PATH/amaiAdvancedStreamingPlugin.txt btrfs subvolume snapshot -r /mnt /mnt/mysnap1 btrfs send /mnt/mysnap1 -f /tmp/1.snap A test for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Cc: Marc Merlin <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Chris Mason <clm@fb.com>
2014-05-20Btrfs: send, fix incorrect ref access when using extrefsFilipe Manana
When running send, if an inode only has extended reference items associated to it and no regular references, send.c:get_first_ref() was incorrectly assuming the reference it found was of type BTRFS_INODE_REF_KEY due to use of the wrong key variable. This caused weird behaviour when using the found item has a regular reference, such as weird path string, and occasionally (when lucky) a crash: [ 190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [ 190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy [ 190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1 [ 190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 190.602868] task: ffff8800d447c920 ti: ffff8801fa79e000 task.ti: ffff8801fa79e000 [ 190.603030] RIP: 0010:[<ffffffff813266b4>] [<ffffffff813266b4>] memcpy+0x54/0x110 [ 190.603262] RSP: 0018:ffff8801fa79f880 EFLAGS: 00010202 [ 190.603395] RAX: ffff8800d4326e3f RBX: 000000000000036a RCX: ffff880000000000 [ 190.603553] RDX: 000000000000032a RSI: ffe708844042936a RDI: ffff8800d43271a9 [ 190.603710] RBP: ffff8801fa79f8c8 R08: 00000000003a4ef0 R09: 0000000000000000 [ 190.603867] R10: 793a4ef09f000000 R11: 9f0000000053726f R12: ffff8800d43271a9 [ 190.604020] R13: 0000160000000000 R14: ffff8802110134f0 R15: 000000000000036a [ 190.604020] FS: 00007fb423d09b80(0000) GS:ffff880216200000(0000) knlGS:0000000000000000 [ 190.604020] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 190.604020] CR2: 00007fb4229d4b78 CR3: 00000001f5d76000 CR4: 00000000000006e0 [ 190.604020] Stack: [ 190.604020] ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8 [ 190.604020] 00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8 [ 190.604020] 00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928 [ 190.604020] Call Trace: [ 190.604020] [<ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs] [ 190.604020] [<ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs] [ 190.604020] [<ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs] [ 190.604020] [<ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs] [ 190.604020] [<ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0 [ 190.604020] [<ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs] [ 190.604020] [<ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs] (...) Steps to reproduce (either crash or some weirdness like an odd path string): mkfs.btrfs -f -O extref /dev/sdd mount /dev/sdd /mnt mkdir /mnt/testdir touch /mnt/testdir/foobar for i in `seq 1 2550`; do ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i` done ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name rm -f /mnt/testdir/foobar for i in `seq 1 2550`; do rm -f /mnt/testdir/foobar_link_`printf "%04d" $i` done btrfs subvolume snapshot -r /mnt /mnt/mysnap btrfs send /mnt/mysnap -f /tmp/mysnap.send Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
2014-04-26Btrfs: limit the path size in send to PATH_MAXChris Mason
fs_path_ensure_buf is used to make sure our path buffers for send are big enough for the path names as we construct them. The buffer size is limited to 32K by the length field in the struct. But bugs in the path construction can end up trying to build a huge buffer, and we'll do invalid memmmoves when the buffer length field wraps. This patch is step one, preventing the overflows. Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07Btrfs: send, build path string only once in send_holeFilipe Manana
There's no point building the path string in each iteration of the send_hole loop, as it produces always the same string. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07Btrfs: send, fix data corruption due to incorrect hole detectionFilipe Manana
During an incremental send, when we finish processing an inode (corresponding to a regular file) we would assume the gap between the end of the last processed file extent and the file's size corresponded to a file hole, and therefore incorrectly send a bunch of zero bytes to overwrite that region in the file. This affects only kernel 3.14. Reproducer: mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt xfs_io -f -c "falloc -k 0 268435456" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap0 xfs_io -c "pwrite -S 0x01 -b 9216 16190218 9216" /mnt/foo xfs_io -c "pwrite -S 0x02 -b 1121 198720104 1121" /mnt/foo xfs_io -c "pwrite -S 0x05 -b 9216 107887439 9216" /mnt/foo xfs_io -c "pwrite -S 0x06 -b 9216 225520207 9216" /mnt/foo xfs_io -c "pwrite -S 0x07 -b 67584 102138300 67584" /mnt/foo xfs_io -c "pwrite -S 0x08 -b 7000 94897484 7000" /mnt/foo xfs_io -c "pwrite -S 0x09 -b 113664 245083212 113664" /mnt/foo xfs_io -c "pwrite -S 0x10 -b 123 17937788 123" /mnt/foo xfs_io -c "pwrite -S 0x11 -b 39936 229573311 39936" /mnt/foo xfs_io -c "pwrite -S 0x12 -b 67584 174792222 67584" /mnt/foo xfs_io -c "pwrite -S 0x13 -b 9216 249253213 9216" /mnt/foo xfs_io -c "pwrite -S 0x16 -b 67584 150046083 67584" /mnt/foo xfs_io -c "pwrite -S 0x17 -b 39936 118246040 39936" /mnt/foo xfs_io -c "pwrite -S 0x18 -b 67584 215965442 67584" /mnt/foo xfs_io -c "pwrite -S 0x19 -b 33792 97096725 33792" /mnt/foo xfs_io -c "pwrite -S 0x20 -b 125952 166300596 125952" /mnt/foo xfs_io -c "pwrite -S 0x21 -b 123 1078957 123" /mnt/foo xfs_io -c "pwrite -S 0x25 -b 9216 212044492 9216" /mnt/foo xfs_io -c "pwrite -S 0x26 -b 7000 265037146 7000" /mnt/foo xfs_io -c "pwrite -S 0x27 -b 42757 215922685 42757" /mnt/foo xfs_io -c "pwrite -S 0x28 -b 7000 69865411 7000" /mnt/foo xfs_io -c "pwrite -S 0x29 -b 67584 67948958 67584" /mnt/foo xfs_io -c "pwrite -S 0x30 -b 39936 266967019 39936" /mnt/foo xfs_io -c "pwrite -S 0x31 -b 1121 19582453 1121" /mnt/foo xfs_io -c "pwrite -S 0x32 -b 17408 257710255 17408" /mnt/foo xfs_io -c "pwrite -S 0x33 -b 39936 3895518 39936" /mnt/foo xfs_io -c "pwrite -S 0x34 -b 125952 12045847 125952" /mnt/foo xfs_io -c "pwrite -S 0x35 -b 17408 19156379 17408" /mnt/foo xfs_io -c "pwrite -S 0x36 -b 39936 50160066 39936" /mnt/foo xfs_io -c "pwrite -S 0x37 -b 113664 9549793 113664" /mnt/foo xfs_io -c "pwrite -S 0x38 -b 105472 94391506 105472" /mnt/foo xfs_io -c "pwrite -S 0x39 -b 23552 143632863 23552" /mnt/foo xfs_io -c "pwrite -S 0x40 -b 39936 241283845 39936" /mnt/foo xfs_io -c "pwrite -S 0x41 -b 113664 199937606 113664" /mnt/foo xfs_io -c "pwrite -S 0x42 -b 67584 67380093 67584" /mnt/foo xfs_io -c "pwrite -S 0x43 -b 67584 26793129 67584" /mnt/foo xfs_io -c "pwrite -S 0x44 -b 39936 14421913 39936" /mnt/foo xfs_io -c "pwrite -S 0x45 -b 123 253097405 123" /mnt/foo xfs_io -c "pwrite -S 0x46 -b 1121 128233424 1121" /mnt/foo xfs_io -c "pwrite -S 0x47 -b 105472 91577959 105472" /mnt/foo xfs_io -c "pwrite -S 0x48 -b 1121 7245381 1121" /mnt/foo xfs_io -c "pwrite -S 0x49 -b 113664 182414694 113664" /mnt/foo xfs_io -c "pwrite -S 0x50 -b 9216 32750608 9216" /mnt/foo xfs_io -c "pwrite -S 0x51 -b 67584 266546049 67584" /mnt/foo xfs_io -c "pwrite -S 0x52 -b 67584 87969398 67584" /mnt/foo xfs_io -c "pwrite -S 0x53 -b 9216 260848797 9216" /mnt/foo xfs_io -c "pwrite -S 0x54 -b 39936 119461243 39936" /mnt/foo xfs_io -c "pwrite -S 0x55 -b 7000 200178693 7000" /mnt/foo xfs_io -c "pwrite -S 0x56 -b 9216 243316029 9216" /mnt/foo xfs_io -c "pwrite -S 0x57 -b 7000 209658229 7000" /mnt/foo xfs_io -c "pwrite -S 0x58 -b 101376 179745192 101376" /mnt/foo xfs_io -c "pwrite -S 0x59 -b 9216 64012300 9216" /mnt/foo xfs_io -c "pwrite -S 0x60 -b 125952 181705139 125952" /mnt/foo xfs_io -c "pwrite -S 0x61 -b 23552 235737348 23552" /mnt/foo xfs_io -c "pwrite -S 0x62 -b 113664 106021355 113664" /mnt/foo xfs_io -c "pwrite -S 0x63 -b 67584 135753552 67584" /mnt/foo xfs_io -c "pwrite -S 0x64 -b 23552 95730888 23552" /mnt/foo xfs_io -c "pwrite -S 0x65 -b 11 17311415 11" /mnt/foo xfs_io -c "pwrite -S 0x66 -b 33792 120695553 33792" /mnt/foo xfs_io -c "pwrite -S 0x67 -b 9216 17164631 9216" /mnt/foo xfs_io -c "pwrite -S 0x68 -b 9216 136065853 9216" /mnt/foo xfs_io -c "pwrite -S 0x69 -b 67584 37752198 67584" /mnt/foo xfs_io -c "pwrite -S 0x70 -b 101376 189717473 101376" /mnt/foo xfs_io -c "pwrite -S 0x71 -b 7000 227463698 7000" /mnt/foo xfs_io -c "pwrite -S 0x72 -b 9216 12655137 9216" /mnt/foo xfs_io -c "pwrite -S 0x73 -b 7000 7488866 7000" /mnt/foo xfs_io -c "pwrite -S 0x74 -b 113664 87813649 113664" /mnt/foo xfs_io -c "pwrite -S 0x75 -b 33792 25802183 33792" /mnt/foo xfs_io -c "pwrite -S 0x76 -b 39936 93524024 39936" /mnt/foo xfs_io -c "pwrite -S 0x77 -b 33792 113336388 33792" /mnt/foo xfs_io -c "pwrite -S 0x78 -b 105472 184955320 105472" /mnt/foo xfs_io -c "pwrite -S 0x79 -b 101376 225691598 101376" /mnt/foo xfs_io -c "pwrite -S 0x80 -b 23552 77023155 23552" /mnt/foo xfs_io -c "pwrite -S 0x81 -b 11 201888192 11" /mnt/foo xfs_io -c "pwrite -S 0x82 -b 11 115332492 11" /mnt/foo xfs_io -c "pwrite -S 0x83 -b 67584 230278015 67584" /mnt/foo xfs_io -c "pwrite -S 0x84 -b 11 120589073 11" /mnt/foo xfs_io -c "pwrite -S 0x85 -b 125952 202207819 125952" /mnt/foo xfs_io -c "pwrite -S 0x86 -b 113664 86672080 113664" /mnt/foo xfs_io -c "pwrite -S 0x87 -b 17408 208459603 17408" /mnt/foo xfs_io -c "pwrite -S 0x88 -b 7000 73372211 7000" /mnt/foo xfs_io -c "pwrite -S 0x89 -b 7000 42252122 7000" /mnt/foo xfs_io -c "pwrite -S 0x90 -b 23552 46784881 23552" /mnt/foo xfs_io -c "pwrite -S 0x91 -b 101376 63172351 101376" /mnt/foo xfs_io -c "pwrite -S 0x92 -b 23552 59341931 23552" /mnt/foo xfs_io -c "pwrite -S 0x93 -b 39936 239599283 39936" /mnt/foo xfs_io -c "pwrite -S 0x94 -b 67584 175643105 67584" /mnt/foo xfs_io -c "pwrite -S 0x97 -b 23552 105534880 23552" /mnt/foo xfs_io -c "pwrite -S 0x98 -b 113664 8236844 113664" /mnt/foo xfs_io -c "pwrite -S 0x99 -b 125952 144489686 125952" /mnt/foo xfs_io -c "pwrite -S 0xa0 -b 7000 73273112 7000" /mnt/foo xfs_io -c "pwrite -S 0xa1 -b 125952 194580243 125952" /mnt/foo xfs_io -c "pwrite -S 0xa2 -b 123 56296779 123" /mnt/foo xfs_io -c "pwrite -S 0xa3 -b 11 233066845 11" /mnt/foo xfs_io -c "pwrite -S 0xa4 -b 39936 197727090 39936" /mnt/foo xfs_io -c "pwrite -S 0xa5 -b 101376 53579812 101376" /mnt/foo xfs_io -c "pwrite -S 0xa6 -b 9216 85669738 9216" /mnt/foo xfs_io -c "pwrite -S 0xa7 -b 125952 21266322 125952" /mnt/foo xfs_io -c "pwrite -S 0xa8 -b 23552 125726568 23552" /mnt/foo xfs_io -c "pwrite -S 0xa9 -b 9216 18423680 9216" /mnt/foo xfs_io -c "pwrite -S 0xb0 -b 1121 165901483 1121" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap1 xfs_io -c "pwrite -S 0xff -b 10 16190218 10" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap2 md5sum /mnt/foo # returns 79e53f1466bfc09fd82b450689e6119e md5sum /mnt/mysnap2/foo # returns 79e53f1466bfc09fd82b450689e6119e too btrfs send /mnt/mysnap1 -f /tmp/1.snap btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt btrfs receive /mnt -f /tmp/1.snap btrfs receive /mnt -f /tmp/2.snap md5sum /mnt/mysnap2/foo # returns 2bb414c5155767cedccd7063e51beabd !! A testcase for xfstests follows soon too. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07Btrfs: hold the commit_root_sem when getting the commit root during sendJosef Bacik
We currently rely too heavily on roots being read-only to save us from just accessing root->commit_root. We can easily balance blocks out from underneath a read only root, so to save us from getting screwed make sure we only access root->commit_root under the commit root sem. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07Btrfs: remove transaction from sendJosef Bacik
Lets try this again. We can deadlock the box if we send on a box and try to write onto the same fs with the app that is trying to listen to the send pipe. This is because the writer could get stuck waiting for a transaction commit which is being blocked by the send. So fix this by making sure looking at the commit roots is always going to be consistent. We do this by keeping track of which roots need to have their commit roots swapped during commit, and then taking the commit_root_sem and swapping them all at once. Then make sure we take a read lock on the commit_root_sem in cases where we search the commit root to make sure we're always looking at a consistent view of the commit roots. Previously we had problems with this because we would swap a fs tree commit root and then swap the extent tree commit root independently which would cause the backref walking code to screw up sometimes. With this patch we no longer deadlock and pass all the weird send/receive corner cases. Thanks, Reportedy-by: Hugo Mills <hugo@carfax.org.uk> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-04-07Btrfs: don't clear uptodate if the eb is under IOJosef Bacik
So I have an awful exercise script that will run snapshot, balance and send/receive in parallel. This sometimes would crash spectacularly and when it came back up the fs would be completely hosed. Turns out this is because of a bad interaction of balance and send/receive. Send will hold onto its entire path for the whole send, but its blocks could get relocated out from underneath it, and because it doesn't old tree locks theres nothing to keep this from happening. So it will go to read in a slot with an old transid, and we could have re-allocated this block for something else and it could have a completely different transid. But because we think it is invalid we clear uptodate and re-read in the block. If we do this before we actually write out the new block we could write back stale data to the fs, and boom we're screwed. Now we definitely need to fix this disconnect between send and balance, but we really really need to not allow ourselves to accidently read in stale data over new data. So make sure we check if the extent buffer is not under io before clearing uptodate, this will kick back EIO to the caller instead of reading in stale data and keep us from corrupting the fs. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-03-21btrfs: fix uninit variable warningChris Mason
fs/btrfs/send.c:2926: warning: ‘entry’ may be used uninitialized in this function Signed-off-by: Chris Mason <clm@fb.com>
2014-03-21Btrfs: part 2, fix incremental send's decision to delay a dir move/renameFilipe Manana
For an incremental send, fix the process of determining whether the directory inode we're currently processing needs to have its move/rename operation delayed. We were ignoring the fact that if the inode's new immediate ancestor has a higher inode number than ours but wasn't renamed/moved, we might still need to delay our move/rename, because some other ancestor directory higher in the hierarchy might have an inode number higher than ours *and* was renamed/moved too - in this case we have to wait for rename/move of that ancestor to happen before our current directory's rename/move operation. Simple steps to reproduce this issue: $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/x1/x2 $ mkdir /mnt/a/Z $ mkdir -p /mnt/a/x1/x2/x3/x4/x5 $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mv /mnt/a/x1/x2/x3 /mnt/a/Z/X33 $ mv /mnt/a/x1/x2 /mnt/a/Z/X33/x4/x5/X22 $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send The incremental send caused the kernel code to enter an infinite loop when building the path string for directory Z after its references are processed. A more complex scenario: $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir -p /mnt/a/b/c/d $ mkdir /mnt/a/b/c/d/e $ mkdir /mnt/a/b/c/d/f $ mv /mnt/a/b/c/d/e /mnt/a/b/c/d/f/E2 $ mkdir /mmt/a/b/c/g $ mv /mnt/a/b/c/d /mnt/a/b/D2 $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mkdir /mnt/a/o $ mv /mnt/a/b/c/g /mnt/a/b/D2/f/G2 $ mv /mnt/a/b/D2 /mnt/a/b/dd $ mv /mnt/a/b/c /mnt/a/C2 $ mv /mnt/a/b/dd/f /mnt/a/o/FF $ mv /mnt/a/b /mnt/a/o/FF/E2/BB $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-03-21Btrfs: fix incremental send's decision to delay a dir move/renameFilipe Manana
It's possible to change the parent/child relationship between directories in such a way that if a child directory has a higher inode number than its parent, it doesn't necessarily means the child rename/move operation can be performed immediately. The parent migth have its own rename/move operation delayed, therefore in this case the child needs to have its rename/move operation delayed too, and be performed after its new parent's rename/move. Steps to reproduce the issue: $ umount /mnt $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt $ mkdir /mnt/A $ mkdir /mnt/B $ mkdir /mnt/C $ mv /mnt/C /mnt/A $ mv /mnt/B /mnt/A/C $ mkdir /mnt/A/C/D $ btrfs subvolume snapshot -r /mnt /mnt/snap1 $ btrfs send /mnt/snap1 -f /tmp/base.send $ mv /mnt/A/C/D /mnt/A/D2 $ mv /mnt/A/C/B /mnt/A/D2/B2 $ mv /mnt/A/C /mnt/A/D2/B2/C2 $ btrfs subvolume snapshot -r /mnt /mnt/snap2 $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send The incremental send caused the kernel code to enter an infinite loop when building the path string for directory C after its references are processed. The necessary conditions here are that C has an inode number higher than both A and B, and B as an higher inode number higher than A, and D has the highest inode number, that is: inode_number(A) < inode_number(B) < inode_number(C) < inode_number(D) The same issue could happen if after the first snapshot there's any number of intermediary parent directories between A2 and B2, and between B2 and C2. A test case for xfstests follows, covering this simple case and more advanced ones, with files and hard links created inside the directories. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-03-21Btrfs: remove unnecessary inode generation lookup in sendFilipe Manana
No need to search in the send tree for the generation number of the inode, we already have it in the recorded_ref structure passed to us. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-03-10Btrfs: add readahead for send_writeLiu Bo
Btrfs send reads data from disk and then writes to a stream via pipe or a file via flush. Currently we're going to read each page a time, so every page results in a disk read, which is not friendly to disks, esp. HDD. Given that, the performance can be gained by adding readahead for those pages. Here is a quick test: $ btrfs subvolume create send $ xfs_io -f -c "pwrite 0 1G" send/foobar $ btrfs subvolume snap -r send ro $ time "btrfs send ro -f /dev/null" w/o w real 1m37.527s 0m9.097s user 0m0.122s 0m0.086s sys 0m53.191s 0m12.857s Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: share the same code for __record_{new,deleted}_refLiu Bo
This has no functional change, only picks out the same part of two functions, and makes it shared. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: avoid unnecessary utimes update in incremental sendFilipe Manana
When we're finishing processing of an inode, if we're dealing with a directory inode that has a pending move/rename operation, we don't need to send a utimes update instruction to the send stream, as we'll do it later after doing the move/rename operation. Therefore we save some time here building paths and doing btree lookups. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: skip search tree for REG filesLiu Bo
It is really unnecessary to search tree again for @gen, @mode and @rdev in the case of REG inodes' creation, as we've got btrfs_inode_item in sctx, and @gen, @mode and @rdev can easily be fetched. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: simplify allocation code in fs_path_ensure_bufDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: fix old buffer length in fs_path_ensure_bufDavid Sterba
In "btrfs: send: lower memory requirements in common case" the code to save the old_buf_len was incorrectly moved to a wrong place and broke the original logic. Reported-by: Filipe David Manana <fdmanana@gmail.com> Signed-off-by: David Sterba <dsterba@suse.cz> Reviewed-by: Filipe David Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: fix send issuing outdated paths for utimes, chown and chmodFilipe Manana
When doing an incremental send, if we had a directory pending a move/rename operation and none of its parents, except for the immediate parent, were pending a move/rename, after processing the directory's references, we would be issuing utimes, chown and chmod intructions against am outdated path - a path which matched the one in the parent root. This change also simplifies a bit the code that deals with building a path for a directory which has a move/rename operation delayed. Steps to reproduce: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b/c/d/e $ mkdir /mnt/btrfs/a/b/c/f $ chmod 0777 /mnt/btrfs/a/b/c/d/e $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ btrfs send /mnt/btrfs/snap1 -f /tmp/base.send $ mv /mnt/btrfs/a/b/c/f /mnt/btrfs/a/b/f2 $ mv /mnt/btrfs/a/b/c/d/e /mnt/btrfs/a/b/f2/e2 $ mv /mnt/btrfs/a/b/c /mnt/btrfs/a/b/c2 $ mv /mnt/btrfs/a/b/c2/d /mnt/btrfs/a/b/c2/d2 $ chmod 0700 /mnt/btrfs/a/b/f2/e2 $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 -f /tmp/incremental.send $ umount /mnt/btrfs $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ btrfs receive /mnt/btrfs -f /tmp/base.send $ btrfs receive /mnt/btrfs -f /tmp/incremental.send The second btrfs receive command failed with: ERROR: chmod a/b/c/d/e failed. No such file or directory A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: fix send attempting to rmdir non-empty directoriesFilipe Manana
The incremental send algorithm assumed that it was possible to issue a directory remove (rmdir) if the the inode number it was currently processing was greater than (or equal) to any inode that referenced the directory's inode. This wasn't a valid assumption because any such inode might be a child directory that is pending a move/rename operation, because it was moved into a directory that has a higher inode number and was moved/renamed too - in other words, the case the following commit addressed: 9f03740a956d7ac6a1b8f8c455da6fa5cae11c22 (Btrfs: fix infinite path build loops in incremental send) This made an incremental send issue an rmdir operation before the target directory was actually empty, which made btrfs receive fail. Therefore it needs to wait for all pending child directory inodes to be moved/renamed before sending an rmdir operation. Simple steps to reproduce this issue: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b/c/x $ mkdir /mnt/btrfs/a/b/y $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ btrfs send /mnt/btrfs/snap1 -f /tmp/base.send $ mv /mnt/btrfs/a/b/y /mnt/btrfs/a/b/YY $ mv /mnt/btrfs/a/b/c/x /mnt/btrfs/a/b/YY $ rmdir /mnt/btrfs/a/b/c $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 -f /tmp/incremental.send $ umount /mnt/btrfs $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ btrfs receive /mnt/btrfs -f /tmp/base.send $ btrfs receive /mnt/btrfs -f /tmp/incremental.send The second btrfs receive command failed with: ERROR: rmdir o259-6-0 failed. Directory not empty A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: send, don't send rmdir for same target multiple timesFilipe Manana
When doing an incremental send, if we delete a directory that has N > 1 hardlinks for the same file and that file has the highest inode number inside the directory contents, an incremental send would send N times an rmdir operation against the directory. This made the btrfs receive command fail on the second rmdir instruction, as the target directory didn't exist anymore. Steps to reproduce the issue: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b/c $ echo 'ola mundo' > /mnt/btrfs/a/b/c/foo.txt $ ln /mnt/btrfs/a/b/c/foo.txt /mnt/btrfs/a/b/c/bar.txt $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ btrfs send /mnt/btrfs/snap1 -f /tmp/base.send $ rm -f /mnt/btrfs/a/b/c/foo.txt $ rm -f /mnt/btrfs/a/b/c/bar.txt $ rmdir /mnt/btrfs/a/b/c $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 -f /tmp/incremental.send $ umount /mnt/btrfs $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ btrfs receive /mnt/btrfs -f /tmp/base.send $ btrfs receive /mnt/btrfs -f /tmp/incremental.send The second btrfs receive command failed with: ERROR: rmdir o259-6-0 failed. No such file or directory A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: incremental send, fix invalid path after dir renameFilipe Manana
This fixes yet one more case not caught by the commit titled: Btrfs: fix infinite path build loops in incremental send In this case, even before the initial full send, we have a directory which is a child of a directory with a higher inode number. Then we perform the initial send, and after we rename both the child and the parent, without moving them around. After doing these 2 renames, an incremental send sent a rename instruction for the child directory which contained an invalid "from" path (referenced the parent's old name, not the new one), which made the btrfs receive command fail. Steps to reproduce: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b $ mkdir /mnt/btrfs/d $ mkdir /mnt/btrfs/a/b/c $ mv /mnt/btrfs/d /mnt/btrfs/a/b/c $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ btrfs send /mnt/btrfs/snap1 -f /tmp/base.send $ mv /mnt/btrfs/a/b/c /mnt/btrfs/a/b/x $ mv /mnt/btrfs/a/b/x/d /mnt/btrfs/a/b/x/y $ btrfs subvolume snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 -f /tmp/incremental.send $ umout /mnt/btrfs $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ btrfs receive /mnt/btrfs -f /tmp/base.send $ btrfs receive /mnt/btrfs -f /tmp/incremental.send The second btrfs receive command failed with: "ERROR: rename a/b/c/d -> a/b/x/y failed. No such file or directory" A test case for xfstests follows. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Revert "Btrfs: remove transaction from btrfs send"Wang Shilong
This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2. Previously i was thinking we can use readonly root's commit root safely while it is not true, readonly root may be cowed with the following cases. 1.snapshot send root will cow source root. 2.balance,device operations will also cow readonly send root to relocate. So i have two ideas to make us safe to use commit root. -->approach 1: make it protected by transaction and end transaction properly and we research next item from root node(see btrfs_search_slot_for_read()). -->approach 2: add another counter to local root structure to sync snapshot with send. and add a global counter to sync send with exclusive device operations. So with approach 2, send can use commit root safely, because we make sure send root can not be cowed during send. Unfortunately, it make codes *ugly* and more complex to maintain. To make snapshot and send exclusively, device operations and send operation exclusively with each other is a little confusing for common users. So why not drop into previous way. Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: lower memory requirements in common caseDavid Sterba
The fs_path structure uses an inline buffer and falls back to a chain of allocations, but vmalloc is not necessary because PATH_MAX fits into PAGE_SIZE. The size of fs_path has been reduced to 256 bytes from PAGE_SIZE, usually 4k. Experimental measurements show that most paths on a single filesystem do not exceed 200 bytes, and these get stored into the inline buffer directly, which is now 230 bytes. Longer paths are kmalloced when needed. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: make some tree searches in send.c more efficientFilipe David Borba Manana
We have this pattern where we do search for a contiguous group of items in a tree and everytime we find an item, we process it, then we release our path, increment the offset of the search key, do another full tree search and repeat these steps until a tree search can't find more items we're interested in. Instead of doing these full tree searches after processing each item, just process the next item/slot in our leaf and don't release the path. Since all these trees are read only and we always use the commit root for a search and skip node/leaf locks, we're not affecting concurrency on the trees. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: use right extent item position in send when finding extent clonesFilipe David Borba Manana
This was a leftover from the commit: 74dd17fbe3d65829e75d84f00a9525b2ace93998 (Btrfs: fix btrfs send for inline items and compression) Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: remove BUG_ON from name_cache_deleteDavid Sterba
If cleaning the name cache fails, we could try to proceed at the cost of some memory leak. This is not expected to happen often. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: remove BUG from process_all_refsDavid Sterba
There are only 2 static callers, the BUG would normally be never reached, but let's be nice. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: squeeze bitfilelds in fs_pathDavid Sterba
We know that buf_len is at most PATH_MAX, 4k, and can merge it with the reversed member. This saves 3 bytes in favor of inline_buf. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: remove virtual_mem member from fs_pathDavid Sterba
We don't need to keep track of that, it's available via is_vmalloc_addr. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: remove prepared member from fs_pathDavid Sterba
The member is used only to return value back from fs_path_prepare_for_add, we can do it locally and save 8 bytes for the inline_buf path. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10btrfs: send: replace check with an assert in gen_unique_nameDavid Sterba
The buffer passed to snprintf can hold the fully expanded format string, 64 = 3x largest ULL + 3x char + trailing null. I don't think that removing the check entirely is a good idea, hence the ASSERT. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: more send support for parent/child dir relationship inversionFilipe David Borba Manana
The commit titled "Btrfs: fix infinite path build loops in incremental send" didn't cover a particular case where the parent-child relationship inversion of directories doesn't imply a rename of the new parent directory. This was due to a simple logic mistake, a logical and instead of a logical or. Steps to reproduce: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b/bar1/bar2/bar3/bar4 $ btrfs subvol snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ mv /mnt/btrfs/a/b/bar1/bar2/bar3/bar4 /mnt/btrfs/a/b/k44 $ mv /mnt/btrfs/a/b/bar1/bar2/bar3 /mnt/btrfs/a/b/k44 $ mv /mnt/btrfs/a/b/bar1/bar2 /mnt/btrfs/a/b/k44/bar3 $ mv /mnt/btrfs/a/b/bar1 /mnt/btrfs/a/b/k44/bar3/bar2/k11 $ btrfs subvol snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 > /tmp/incremental.send A patch to update the test btrfs/030 from xfstests, so that it covers this case, will be submitted soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: fix send dealing with file renames and directory movesFilipe David Borba Manana
This fixes a case that the commit titled: Btrfs: fix infinite path build loops in incremental send didn't cover. If the parent-child relationship between 2 directories is inverted, both get renamed, and the former parent has a file that got renamed too (but remains a child of that directory), the incremental send operation would use the file's old path after sending an unlink operation for that old path, causing receive to fail on future operations like changing owner, permissions or utimes of the corresponding inode. This is not a regression from the commit mentioned before, as without that commit we would fall into the issues that commit fixed, so it's just one case that wasn't covered before. Simple steps to reproduce this issue are: $ mkfs.btrfs -f /dev/sdb3 $ mount /dev/sdb3 /mnt/btrfs $ mkdir -p /mnt/btrfs/a/b/c/d $ touch /mnt/btrfs/a/b/c/d/file $ mkdir -p /mnt/btrfs/a/b/x $ btrfs subvol snapshot -r /mnt/btrfs /mnt/btrfs/snap1 $ mv /mnt/btrfs/a/b/x /mnt/btrfs/a/b/c/x2 $ mv /mnt/btrfs/a/b/c/d /mnt/btrfs/a/b/c/x2/d2 $ mv /mnt/btrfs/a/b/c/x2/d2/file /mnt/btrfs/a/b/c/x2/d2/file2 $ btrfs subvol snapshot -r /mnt/btrfs /mnt/btrfs/snap2 $ btrfs send -p /mnt/btrfs/snap1 /mnt/btrfs/snap2 > /tmp/incremental.send A patch to update the test btrfs/030 from xfstests, so that it covers this case, will be submitted soon. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>
2014-03-10Btrfs: add missing error check in incremental sendFilipe David Borba Manana
Function wait_for_parent_move() returns negative value if an error happened, 0 if we don't need to wait for the parent's move, and 1 if the wait is needed. Before this change an error return value was being treated like the return value 1, which was not correct. Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Josef Bacik <jbacik@fb.com>