summaryrefslogtreecommitdiff
path: root/fs/ext4
AgeCommit message (Collapse)Author
2008-12-17ext4: Widen type of ext4_sb_info.s_mb_maxs[]Yasunori Goto
I chased the cause of following ext4 oops report which is tested on ia64 box. http://bugzilla.kernel.org/show_bug.cgi?id=12018 The cause is the size of s_mb_maxs array that is defined as "unsigned short" in ext4_sb_info structure. If the file system's block size is 8k or greater, an unsigned short is not wide enough to contain the value fs->blocksize << 3. Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: stable@kernel.org
2008-11-27ext4: When resizing set the EXT4_BG_INODE_ZEROED flag for new block groupsSolofo.Ramangalahy@bull.net
The inode table has been zeroed in setup_new_group_blocks(). Mark it as such in ext4_group_add(). Since we are currently clearing inode table for the new block group, we should set the EXT4_BG_INODE_ZEROED flag. If at some point in the future we don't immediately zero out the inode table as part of the resize operation, then obviously we shouldn't do this. Signed-off-by: Solofo.Ramangalahy@bull.net Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-26ext4: Use simple_strtol() instead of simple_strtoul() in ext4_ui_proc_openRoel Kluin
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-25ext4: fix build warningWu Fengguang
Replace `if' with `goto' to assure gcc that ix has been initialized. Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
2009-01-06ext4: avoid ext4_error when mounting a fs with a single bgAneesh Kumar K.V
Remove some completely unneeded code which which caused an ext4_error to be generated when mounting a file system with only a single block group. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2009-01-06ext4: Fix the delalloc writepages to allocate blocks at the right offset.Aneesh Kumar K.V
When iterating through the pages which have mapped buffer_heads, we failed to update the b_state value. This results in allocating blocks at logical offset 0. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org
2008-11-05ext4: tone down ext4_da_writepages warningsTheodore Ts'o
If the filesystem has errors, ext4_da_writepages() will return a *lot* of errors, including lots and lots of stack dumps. While it's true that we are dropping user data on the floor, which is unfortunate, the stack dumps aren't helpful, and they tend to obscure the true original root cause of the problem. So in the case where the filesystem has aborted, return an EROFS right away. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-12-12ext4: remove do_blk_alloc()Theodore Ts'o
The convenience function do_blk_alloc() is a static function with only one caller, so fold it into ext4_new_meta_blocks() to simplify the code and to make it easier to understand. To save more stack space, if count is a null pointer in ext4_new_meta_blocks() assume that caller wanted a single block (and if there is an error, no blocks were allocated). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-12-07ext4: remove ext4_new_meta_block()Theodore Ts'o
There were only two one callers of the function ext4_new_meta_block(), which just a very simpler wrapper function around ext4_new_meta_blocks(). Change those two functions to call ext4_new_meta_blocks() directly, to save code and stack space usage. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-01-02ext4: remove ext4_new_blocks() and call ext4_mb_new_blocks() directlyTheodore Ts'o
There was only one caller of the compatibility function ext4_new_blocks(), in balloc.c's ext4_alloc_blocks(). Change it to call ext4_mb_new_blocks() directly, and remove ext4_new_blocks() altogether. This cleans up the code, by removing two extra functions from the call chain, and hopefully saving some stack usage. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-12-06ext3/4: Fix loop index in do_split() so it is signedTheodore Ts'o
This fixes a gcc warning but it doesn't appear able to result in a failure, since the primary way the loop is exited is the first conditional in the for loop, and at least for a consistent filesystem, the signed/unsigned should in practice never be exposed. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-03ext4: wait on all pending commits in ext4_sync_fs()Theodore Ts'o
In ext4_sync_fs, we only wait for a commit to finish if we started it, but there may be one already in progress which will not be synced. In the case of a data=ordered umount with pending long symlinks which are delayed due to a long list of other I/O on the backing block device, this causes the buffer associated with the long symlinks to not be moved to the inode dirty list in the second phase of fsync_super. Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT flag and the dirty pages are never written to the backing block device, causing long symlink corruption and exposing new or previously freed block data to userspace. To ensure all commits are synced, we flush all journal commits now when sync_fs'ing ext4. Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org>
2008-11-04ext4: Convert to host order before using the values.Aneesh Kumar K.V
Use le16_to_cpu to read the s_reserved_gdt_blocks values from super block. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-11-04ext4: fix missing ext4_unlock_group in error pathAneesh Kumar K.V
If we try to free a block which is already freed, the code was returning without first unlocking the group. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-28ext4: Add support for non-native signed/unsigned htree hash algorithmsTheodore Ts'o
The original ext3 hash algorithms assumed that variables of type char were signed, as God and K&R intended. Unfortunately, this assumption is not true on some architectures. Userspace support for marking filesystems with non-native signed/unsigned chars was added two years ago, but the kernel-side support was never added (until now). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-29ext4: fix printk format warningAlexander Beregalov
fs/ext4/balloc.c:607: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' fs/ext4/inode.c:1822: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' fs/ext4/inode.c:1824: warning: format '%lld' expects type 'long long int', but argument 2 has type 's64' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-28delay capable() check in ext4_has_free_blocks()Eric Sandeen
As reported by Eric Paris, the capable() check in ext4_has_free_blocks() sometimes causes SELinux denials. We can rearrange the logic so that we only try to use the root-reserved blocks when necessary, and even then we can move the capable() test to last, to avoid the check most of the time. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-28merge ext4_claim_free_blocks & ext4_has_free_blocksEric Sandeen
Mingming pointed out that ext4_claim_free_blocks & ext4_has_free_blocks are largely cut & pasted; they can be collapsed/merged as follows. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-28ext4: fix a bug accessing freed memory in ext4_abortHidehiro Kawai
Vegard Nossum reported a bug which accesses freed memory (found via kmemcheck). When journal has been aborted, ext4_put_super() calls ext4_abort() after freeing the journal_t object, and then ext4_abort() accesses it. This patch fix it. Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-26ext4: Fix duplicate entries returned from getdents() system callTheodore Ts'o
Fix a regression caused by commit d0156417, "ext4: fix ext4_dx_readdir hash collision handling", where deleting files in a large directory (requiring more than one getdents system call), results in some filenames being returned twice. This was caused by a failure to update info->curr_hash and info->curr_minor_hash, so that if the directory had gotten modified since the last getdents() system call (as would be the case if the user is running "rm -r" or "git clean"), a directory entry would get returned twice to the userspace. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> This patch fixes the bug reported by Markus Trippelsdorf at: http://bugzilla.kernel.org/show_bug.cgi?id=11844 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
2008-10-23ext4: remove unused variable in ext4_get_parentChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> [ All users removed in "switch all filesystems over to d_obtain_alias", aka commit 440037287c5ebb07033ab927ca16bb68c291d309 ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdevLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/viro/bdev: (66 commits) [PATCH] kill the rest of struct file propagation in block ioctls [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET [PATCH] get rid of blkdev_locked_ioctl() [PATCH] get rid of blkdev_driver_ioctl() [PATCH] sanitize blkdev_get() and friends [PATCH] remember mode of reiserfs journal [PATCH] propagate mode through swsusp_close() [PATCH] propagate mode through open_bdev_excl/close_bdev_excl [PATCH] pass fmode_t to blkdev_put() [PATCH] kill the unused bsize on the send side of /dev/loop [PATCH] trim file propagation in block/compat_ioctl.c [PATCH] end of methods switch: remove the old ones [PATCH] switch sr [PATCH] switch sd [PATCH] switch ide-scsi [PATCH] switch tape_block [PATCH] switch dcssblk [PATCH] switch dasd [PATCH] switch mtd_blkdevs [PATCH] switch mmc ...
2008-10-23[PATCH] switch all filesystems over to d_obtain_aliasChristoph Hellwig
Switch all users of d_alloc_anon to d_obtain_alias. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-23[PATCH] switch quota_on-related stuff to kern_path()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-21[PATCH] pass fmode_t to blkdev_put()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-10-20fs/Kconfig: move ext2, ext3, ext4, JBD, JBD2 outAlexey Dobriyan
Use fs/*/Kconfig more, which is good because everything related to one filesystem is in one place and fs/Kconfig is quite fat. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-17ext4: Remove automatic enabling of the HUGE_FILE feature flagTheodore Ts'o
If the HUGE_FILE feature flag is not set, don't allow the creation of large files, instead of automatically enabling the feature flag. Recent versions of mke2fs will set the HUGE_FILE flag automatically anyway for ext4 filesystems. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-17ext4: Replace hackish ext4_mb_poll_new_transaction with commit callbackTheodore Ts'o
The multiblock allocator needs to be able to release blocks (and issue a blkdev discard request) when the transaction which freed those blocks is committed. Previously this was done via a polling mechanism when blocks are allocated or freed. A much better way of doing things is to create a jbd2 callback function and attaching the list of blocks to be freed directly to the transaction structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-17ext4: Remove unused mount options: nomballoc, mballoc, nocheckTheodore Ts'o
These mount options don't actually do anything any more, so remove them. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-17ext4: Remove compile warnings when building w/o CONFIG_PROC_FSManish Katiyar
Signed-off-by: Manish Katiyar <mkatiyar@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-17ext4: Add missing newlines to printk messagesEric Sesterhenn
There are some newlines missing in ext4_check_descriptors, which cause the printk level to be printed out when the next printk call is made: [ 778.847265] EXT4-fs: ext4_check_descriptors: Block bitmap for group 0 not in group (block 1509949442)!<3>EXT4-fs: group descriptors corrupted! [ 802.646630] EXT4-fs: ext4_check_descriptors: Inode bitmap for group 0 not in group (block 9043971)!<3>EXT4-fs: group descriptors corrupted! Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-16ext4: Fix file fragmentation during large file write.Aneesh Kumar K.V
The range_cyclic writeback mode uses the address_space writeback_index as the start index for writeback. With delayed allocation we were updating writeback_index wrongly resulting in highly fragmented file. This patch reduces the number of extents reduced from 4000 to 27 for a 3GB file. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-14ext4: Use tag dirty lookup during mpage_da_submit_ioAneesh Kumar K.V
This enables us to drop the range_cont writeback mode use from ext4_da_writepages. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2008-10-16ext4: let the block device know when unused blocks can be discardedTheodore Ts'o
Let the block device know when unused blocks can be discarded, using the new sb_issue_discard() interface. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-11ext4: Don't reuse released data blocks until transaction commitsAneesh Kumar K.V
We need to make sure we don't reuse the data blocks released during the transaction untill the transaction commits. We force this mode only for ordered and journalled mode. Writeback mode already don't provided data consistency. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-16ext4: Use an rbtree for tracking blocks freed during transaction.Aneesh Kumar K.V
With this patch we track the block freed during a transaction using red-black tree. We also make sure contiguous blocks freed are collected in one node in the tree. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-11ext4: Do mballoc init before doing filesystem recoveryAneesh Kumar K.V
During filesystem recovery we may be doing a truncate which expects some of the mballoc data structures to be initialized. So do ext4_mb_init before recovery. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-13ext4: Free ext4_prealloc_space using kmem_cache_freeAneesh Kumar K.V
We should use kmem_cache_free to free memory allocated via kmem_cache_alloc Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-13vfs: Use const for kernel parser tableSteven Whitehouse
This is a much better version of a previous patch to make the parser tables constant. Rather than changing the typedef, we put the "const" in all the various places where its required, allowing the __initconst exception for nfsroot which was the cause of the previous trouble. This was posted for review some time ago and I believe its been in -mm since then. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-12ext4: fix build failure without procfsAlexander Beregalov
fs/ext4/super.c: In function 'ext4_fill_super': fs/ext4/super.c:2226: error: 'ext4_ui_proc_fops' undeclared (first use in this function) fs/ext4/super.c:2226: error: (Each undeclared identifier is reported only once fs/ext4/super.c:2226: error: for each function it appears in.) Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-11ext4: add an option to control error handling on file dataHidehiro Kawai
If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of applications and commands do buffered writes without fsync(), they don't notice the IO error. It's scary for mission critical systems. On the other hand, if the journal aborts whenever it gets an IO error in file data blocks, the system will easily become inoperable. So this patch introduces a filesystem option to determine whether it aborts the journal or just call printk() when it gets an IO error in file data. If you mount an ext4 fs with data_err=abort option, it aborts on file data write error. If you mount it with data_err=ignore, it doesn't abort, just call printk(). data_err=ignore is the default. Here is the corresponding patch of the ext3 version: http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374 Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-11ext4: add checks for errors from jbd2Hidehiro Kawai
If the journal has aborted due to a checkpointing failure, we have to keep the contents of the journal space. Otherwise, the filesystem will lose uncheckpointed metadata completely and become inconsistent. To avoid this, we need to keep needs_recovery flag if checkpoint has failed. With this patch, ext4_put_super() detects a checkpointing failure from the return value of journal_destroy(), then it invokes ext4_abort() to make the filesystem read only and keep needs_recovery flag. Errors from jbd2_journal_flush() are also handled by this patch in some places. Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2008-10-11ext4: Rename ext4dev to ext4Theodore Ts'o
The ext4 filesystem is getting stable enough that it's time to drop the "dev" prefix. Also remove the requirement for the TEST_FILESYS flag. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-07ext4: Avoid double dirtying of super block in ext4_put_super()Andi Kleen
While reading code I noticed that ext4_put_super() dirties the superblock bh twice. It is always done in ext4_commit_super() too. Remove the redundant dirty operation. Should be a nop semantically. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-10-07Hook ext4 to the vfs fiemap interface.Eric Sandeen
ext4_ext_walk_space() was reinstated to be used for iterating over file extents with a callback; it is used by the ext4 fiemap implementation. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org
2008-10-09ext4: fix xattr deadlockKalpak Shah
ext4_xattr_set_handle() eventually ends up calling ext4_mark_inode_dirty() which tries to expand the inode by shifting the EAs. This leads to the xattr_sem being downed again and leading to a deadlock. This patch makes sure that if ext4_xattr_set_handle() is in the call-chain, ext4_mark_inode_dirty() will not expand the inode. Signed-off-by: Kalpak Shah <kalpak.shah@sun.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-06ext4: Add debugging markers that can be used by systemtapTheodore Ts'o
This debugging markers are designed to debug problems such as the random filesystem latency problems reported by Arjan. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10ext4: fix initialization of UNINIT bitmap blocksFrederic Bohe
This fixes a bug which caused on-line resizing of filesystems with a 1k blocksize to fail. The root cause of this bug was the fact that if an uninitalized bitmap block gets read in by userspace (which e2fsprogs does try to avoid, but can happen when the blocksize is less than the pagesize and an adjacent blocks is read into memory) ext4_read_block_bitmap() was erroneously depending on the buffer uptodate flag to decide whether it needed to initialize the bitmap block in memory --- i.e., to set the standard set of blocks in use by a block group (superblock, bitmaps, inode table, etc.). Essentially, ext4_read_block_bitmap() assumed it was the only routine that might try to read a block containing a block bitmap, which is simply not true. To fix this, ext4_read_block_bitmap() and ext4_read_inode_bitmap() must always initialize uninitialized bitmap blocks. Once a block or inode is allocated out of that bitmap, it will be marked as initialized in the block group descriptor, so in general this won't result any extra unnecessary work. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10ext4: Remove old legacy block allocatorTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2008-10-10ext4: Use readahead when reading an inode from the inode tableTheodore Ts'o
With modern hard drives, reading 64k takes roughly the same time as reading a 4k block. So request readahead for adjacent inode table blocks to reduce the time it takes when iterating over directories (especially when doing this in htree sort order) in a cold cache case. With this patch, the time it takes to run "git status" on a kernel tree after flushing the caches via "echo 3 > /proc/sys/vm/drop_caches" is reduced by 21%. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>