summaryrefslogtreecommitdiff
path: root/fs/ext4/resize.c
AgeCommit message (Collapse)Author
2013-03-02ext4: convert number of blocks to clusters properlyLukas Czerner
We're using macro EXT4_B2C() to convert number of blocks to number of clusters for bigalloc file systems. However, we should be using EXT4_NUM_B2C(). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-02-09ext4: pass context information to jbd2__journal_start()Theodore Ts'o
So we can better understand what bits of ext4 are responsible for long-running jbd2 handles, use jbd2__journal_start() so we can pass context information for logging purposes. The recommended way for finding the longer-running handles is: T=/sys/kernel/debug/tracing EVENT=$T/events/jbd2/jbd2_handle_stats echo "interval > 5" > $EVENT/filter echo 1 > $EVENT/enable ./run-my-fs-benchmark cat $T/trace > /tmp/problem-handles This will list handles that were active for longer than 20ms. Having longer-running handles is bad, because a commit started at the wrong time could stall for those 20+ milliseconds, which could delay an fsync() or an O_SYNC operation. Here is an example line from the trace file describing a handle which lived on for 311 jiffies, or over 1.2 seconds: postmark-2917 [000] .... 196.435786: jbd2_handle_stats: dev 254,32 tid 570 type 2 line_no 2541 interval 311 sync 0 requested_blocks 1 dirtied_blocks 0 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-13ext4: trigger the lazy inode table initialization after resizeTheodore Ts'o
After we have finished extending the file system, we need to trigger a the lazy inode table thread to zero out the inode tables. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-12ext4: use unlikely to improve the efficiency of the kernelWang Shilong
Because the function 'sb_getblk' seldomly fails to return NULL value,it will be better to use 'unlikely' to optimize it. Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-01-12ext4: return ENOMEM if sb_getblk() failsTheodore Ts'o
The only reason for sb_getblk() failing is if it can't allocate the buffer_head. So ENOMEM is more appropriate than EIO. In addition, make sure that the file system is marked as being inconsistent if sb_getblk() fails. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-11-08ext4: remove ext4_handle_release_buffer()Eric Sandeen
ext4_handle_release_buffer() was intended to remove journal write access from a buffer, but it doesn't actually do anything at all other than add a BUFFER_TRACE point, but it's not reliably used for that either. Remove all the associated dead code. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2012-10-22ext4: Checksum the block bitmap properly with bigalloc enabledTao Ma
In mke2fs, we only checksum the whole bitmap block and it is right. While in the kernel, we use EXT4_BLOCKS_PER_GROUP to indicate the size of the checksumed bitmap which is wrong when we enable bigalloc. The right size should be EXT4_CLUSTERS_PER_GROUP and this patch fixes it. Also as every caller of ext4_block_bitmap_csum_set and ext4_block_bitmap_csum_verify pass in EXT4_BLOCKS_PER_GROUP(sb)/8, we'd better removes this parameter and sets it in the function itself. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@vger.kernel.org
2012-09-26ext4: don't call update_backups() multiple times for the same bgTao Ma
When performing an online resize, we add a bunch of groups at one time in ext4_flex_group_add, so in most cases a lot of group descriptors will be in the same group block. But in the end of this function, update_backups will be called for every group descriptor and the same block will be copied and journalled again and again. It is really a waste. Fix things so we only update a particular bg descriptor block once and skip subsequent updates of the same block. Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-26ext4: fix double unlock buffer mess during fs-resizeDmitry Monakhov
bh_submit_read() is responsible for unlock bh on endio. In addition, we need to use bh_uptodate_or_lock() to avoid races. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-20ext4: remove erroneous ext4_superblock_csum_set() in update_backups()Tao Ma
The update_backups() function is used to backup all the metadata blocks, so we should not take it for granted that 'data' is pointed to a super block and use ext4_superblock_csum_set to calculate the checksum there. In case where the data is a group descriptor block, it will corrupt the last group descriptor, and then e2fsck will complain about it it. As all the metadata checksums should already be OK when we do the backup, remove the wrong ext4_superblock_csum_set and it should be just fine. Reported-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-09-19ext4: fix online resizing when the # of block groups is constantTheodore Ts'o
Commit 1c6bd7173d66b3 introduced a regression where an online resize operation which did not change the number of block groups would fail, i.e: mke2fs -t /dev/vdc 60000 mount /dev/vdc resize2fs /dev/vdc 60001 This was due to a bug in the logic regarding when to try converting the filesystem to use meta_bg. Also fix up a number of other minor issues with the online resizing code: (a) Fix a sparse warning; (b) only check to make sure the device is large enough once, instead of multiple times through the resize loop. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-13ext4: log a resize update to the console every 10 secondsTheodore Ts'o
For very long online resizes, a periodic update to the console log is helpful for debugging and for progress reporting. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-13ext4: convert file system to meta_bg if needed during resizingTheodore Ts'o
If we have run out of reserved gdt blocks, then clear the resize_inode feature and enable the meta_bg feature, so that we can continue resizing the file system seamlessly. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-12ext4: set bg_itable_unused when resizingTheodore Ts'o
Set bg_itable_unused for file systems that have uninit_bg enabled. This will speed up the first e2fsck run after the file system is resized. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-05ext4: add online resizing support for meta_bg and 64-bit file systemsYongqiang Yang
This patch adds support for resizing file systems with the meta_bg and 64bit features. [ Added a fix by tytso to fix a divide by zero when resizing a filesystem from 14 TB to 18TB. Also fixed overhead accounting for meta_bg file systems.] Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-05ext4: grow the s_group_info array as neededTheodore Ts'o
Previously we allocated the s_group_info array with enough space for any future possible growth of the file system via online resize. This is unfortunate because it wastes memory, and it doesn't work for the meta_bg scheme, since there is no limit based on the number of reserved gdt blocks. So add the code to grow the s_group_info array as needed. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-05ext4: grow the s_flex_groups array as needed when resizingTheodore Ts'o
Previously, we allocated the s_flex_groups array to the maximum size that the file system could be resized. There was two problems with this approach. First, it wasted memory in the common case where the file system was not resized. Secondly, once we start allowing online resizing using the meta_bg scheme, there is no maximum size that the file system can be resized. So instead, we need to grow the s_flex_groups at inline resize time. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-05ext4: avoid duplicate writes of the backup bg descriptor blocksYongqiang Yang
The resize code was needlessly writing the backup block group descriptor blocks multiple times (once per block group) during an online resize. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-09-05ext4: don't copy non-existent gdt blocks when resizingYongqiang Yang
The resize code was copying blocks at the beginning of each block group in order to copy the superblock and block group descriptor table (gdt) blocks. This was, unfortunately, being done even for block groups that did not have super blocks or gdt blocks. This is a complete waste of perfectly good I/O bandwidth, to skip writing those blocks for sparse bg's. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-09-05ext4: report the original old blocks count in a debug message when resizingYongqiang Yang
Avoid changing o_blocks_count, since it is used later when reporting old blocks count in debug mode. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-09-05ext4: ignore last group w/o enough space when resizing instead of BUG'ingYongqiang Yang
If the last group does not have enough space for group tables, ignore it instead of calling BUG_ON(). Reported-by: Daniel Drake <dsd@laptop.org> Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2012-07-23ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()Artem Bityutskiy
The '__ext4_handle_dirty_metadata()' does not need the 'now' argument anymore and we can kill it. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
2012-07-23ext4: remove unused variable in ext4_update_super()Theodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-07-09ext4: fix overhead calculation used by ext4_statfs()Theodore Ts'o
Commit f975d6bcc7a introduced bug which caused ext4_statfs() to miscalculate the number of file system overhead blocks. This causes the f_blocks field in the statfs structure to be larger than it should be. This would in turn cause the "df" output to show the number of data blocks in the file system and the number of data blocks used to be larger than they should be. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2012-05-28ext4: remove redundundant "(char *) bh->b_data" castsTheodore Ts'o
The b_data field of the buffer_head is already a char *, so there's no point casting it to a char *. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-05-28ext4: fix potential integer overflow in alloc_flex_gd()Haogang Chen
In alloc_flex_gd(), when flexbg_size is large, kmalloc size would overflow and flex_gd->groups would point to a buffer smaller than expected, causing OOB accesses when it is used. Note that in ext4_resize_fs(), flexbg_size is calculated using sbi->s_log_groups_per_flex, which is read from the disk and only bounded to [1, 31]. The patch returns NULL for too large flexbg_size. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Haogang Chen <haogangchen@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2012-04-29ext4: make block group checksums use metadata_csum algorithmDarrick J. Wong
metadata_csum supersedes uninit_bg. Convert the ROCOMPAT uninit_bg flag check to a helper function that covers both, and make the checksum calculation algorithm use either crc16 or the metadata_csum chosen algorithm depending on which flag is set. Print a warning if we try to mount a filesystem with both feature flags set. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-04-29ext4: calculate and verify block bitmap checksumDarrick J. Wong
Compute and verify the checksum of the block bitmap; this checksum is stored in the block group descriptor. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-04-29ext4: calculate and verify checksums for inode bitmapsDarrick J. Wong
Compute and verify the checksum of the inode bitmap; the checkum is stored in the block group descriptor. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-04-29ext4: calculate and verify superblock checksumDarrick J. Wong
Calculate and verify the superblock checksum. Since the UUID and block group number are embedded in each copy of the superblock, we need only checksum the entire block. Refactor some of the code to eliminate open-coding of the checksum update call. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-20ext4: update s_free_{inodes,blocks}_count during online resizeDarrick J. Wong
When we're doing an online resize of an ext4 filesystem, we need to update the free inode and block counts in the superblock so that fsck doesn't complain. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-20ext4: change some printk() calls to use ext4_msg() insteadTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-02-21ext4: fix resize when resizing within single groupLukas Czerner
When resizing file system in the way that the new size of the file system is still in the same group (no new groups are added), then we can hit a BUG_ON in ext4_alloc_group_tables() BUG_ON(flex_gd->count == 0 || group_data == NULL); because flex_gd->count is zero. The reason is the missing check for such case, so the code always extend the last group fully and then attempt to add more groups, but at that time n_blocks_count is actually smaller than o_blocks_count. It can be easily reproduced like this: mkfs.ext4 -b 4096 /dev/sda 30M mount /dev/sda /mnt/test resize2fs /dev/sda 50M Fix this by checking whether the resize happens within the singe group and only add that many blocks into the last group to satisfy user request. Then o_blocks_count == n_blocks_count and the resize will exit successfully without and attempt to add more groups into the fs. Also fix mixing together block number and blocks count which might be confusing and can easily lead to off-by-one errors (but it is actually not the case here since the two occurrence of this mix-up will cancel each other). Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Milan Broz <mbroz@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: let ext4_group_add() use common codeYongqiang Yang
This patch lets ext4_group_add() call ext4_flex_group_add(). Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: let ext4_group_extend() use common codeYongqiang Yang
ext4_group_extend_no_check() is moved out from ext4_group_extend(), this patch lets ext4_group_extend() call ext4_group_extentd_no_check() instead. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add new online resize interfaceYongqiang Yang
This patch adds new online resize interface, whose input argument is a 64-bit integer indicating how many blocks there are in the resized fs. In new resize impelmentation, all work like allocating group tables are done by kernel side, so the new resize interface can support flex_bg feature and prepares ground for suppoting resize with features like bigalloc and exclude bitmap. Besides these, user-space tools just passes in the new number of blocks. We delay initializing the bitmaps and inode tables of added groups if possible and add multi groups (a flex groups) each time, so new resize is very fast like mkfs. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a new function which adds a flex group to a fsYongqiang Yang
This patch adds a new function named ext4_flex_group_add() which adds a flex group to a fs. The function is used by 64bit-resize interface. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a new function which allocates bitmaps and inode tablesYongqiang Yang
This patch adds a new function named ext4_allocates_group_table() which allocates block bitmaps, inode bitmaps and inode tables for a flex groups and is used by resize code. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: pass verify_reserved_gdb() the number of group decriptorsYongqiang Yang
The 64bit resizer adds a flex group each time, so verify_reserved_gdb can not use s_groups_count directly, it should use the number of group decriptors before the added group. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a function which updates the super block during online resizingYongqiang Yang
This patch adds a function named ext4_update_super() which updates super block so the newly created block groups are visible to the file system. This code is copied from ext4_group_add(). The function will be used by new resize implementation. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a function which sets up a block group descriptors of a flex bgYongqiang Yang
This patch adds a function named ext4_setup_new_descs which sets up the block group descriptors of a flex bg. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a function which sets up group blocks of a flex bgYongqiang Yang
This patch adds a function named setup_new_flex_group_blocks() which sets up group blocks of a flex bg. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a structure which will be used by 64bit-resize interfaceYongqiang Yang
This patch adds a structure which will be used by 64bit-resize interface. Two functions which allocate and destroy the structure respectively are added. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a function which adds a new group descriptors to a fsYongqiang Yang
This patch adds a function named ext4_add_new_descs() which adds one or more new group descriptors to a fs and whose code is copied from ext4_group_add(). The function will be used by new resize implementation. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-01-04ext4: add a function which extends a group without checking parametersYongqiang Yang
This patch added a function named ext4_group_extend_no_check() whose code is copied from ext4_group_extend(). ext4_group_extend_no_check() assumes the parameter is valid and has been checked by caller. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-09ext4: Rename ext4_free_blks_{count,set}() to refer to clustersTheodore Ts'o
The field bg_free_blocks_count_{lo,high} in the block group descriptor has been repurposed to hold the number of free clusters for bigalloc functions. So rename the functions so it makes it easier to read and audit the block allocation and block freeing code. Note: at this point in bigalloc development we doesn't support online resize, so this also makes it really obvious all of the places we need to fix up to add support for online resize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-09ext4: convert the free_blocks field in s_flex_groups to be free_clustersTheodore Ts'o
Convert the free_blocks to be free_clusters to make the final revised bigalloc changes easier to read/understand. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-09-09ext4: convert s_{dirty,free}blocks_counter to s_{dirty,free}clusters_counterTheodore Ts'o
Convert the percpu counters s_dirtyblocks_counter and s_freeblocks_counter in struct ext4_super_info to be s_dirtyclusters_counter and s_freeclusters_counter. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-01ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_infoTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-07-30ext4: add missing kfree() on error return path in add_new_gdb()Dan Carpenter
We added some more error handling in b40971426a "ext4: add error checking to calls to ext4_handle_dirty_metadata()". But we need to call kfree() as well to avoid a memory leak. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>