summaryrefslogtreecommitdiff
path: root/fs/gfs2/inode.c
AgeCommit message (Collapse)Author
2008-03-31[GFS2] Fix debug inode printingBob Peterson
I noticed that the latest change to i_height got rid of the value from the inode dump. This patch adds it back. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31[GFS2] Streamline indirect pointer tree height calculationSteven Whitehouse
This patch improves the calculation of the tree height in order to reduce the number of operations which are carried out on each call to gfs2_block_map. In the common case, we now make a single comparison, rather than calculating the required tree height from scratch each time. Also in the case that the tree does need some extra height, we start from the current height rather from zero when we work out what the new height ought to be. In addition the di_height field is moved into the inode proper and reduced in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-02-07iget: use iget_failed() in GFS2David Howells
Use iget_failed() in GFS2 to kill a failed inode. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-01-25[GFS2] Lockup on errorBob Peterson
I spotted this bug while I was digging around. Looks like it could cause a lockup in some rare error condition. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Reduce inode size by moving i_alloc out of lineSteven Whitehouse
It is possible to reduce the size of GFS2 inodes by taking the i_alloc structure out of the gfs2_inode. This patch allocates the i_alloc structure whenever its needed, and frees it afterward. This decreases the amount of low memory we use at the expense of requiring a memory allocation for each page or partial page that we write. A quick test with postmark shows that the overhead is not measurable and I also note that OCFS2 use the same approach. In the future I'd like to solve the problem by shrinking down the size of the members of the i_alloc structure, but for now, this reduces the immediate problem of using too much low-memory on x86 and doesn't add too much overhead. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Remove lock methods for lock_nolock protocolWendy Cheng
GFS2 supports two modes of locking - lock_nolock for single node filesystem and lock_dlm for cluster mode locking. The gfs2 lock methods are removed from file operation table for lock_nolock protocol. This would allow VFS to handle posix lock and flock logics just like other in-tree filesystems without duplication. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Don't add glocks to the journalSteven Whitehouse
The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow this check to be made in a simpler way. This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64) and means that we can avoid extra work during the journal flush. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Introduce gfs2_set_aops()Steven Whitehouse
Just like ext3 we now have three sets of address space operations to cover the cases of writeback, ordered and journalled data writes. This means that the individual operations can now become less complicated as we are able to remove some of the tests for file data mode from the code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Remove useless i_cache from inodesSteven Whitehouse
The i_cache was designed to keep references to the indirect blocks used during block mapping so that they didn't have to be looked up continually. The idea failed because there are too many places where the i_cache needs to be freed, and this has in the past been the cause of many bugs. In addition there was no performance benefit being gained since the disk blocks in question were cached anyway. So this patch removes it in order to simplify the code to prepare for other changes which would otherwise have had to add further support for this feature. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25[GFS2] Clean up internal read functionSteven Whitehouse
As requested by Christoph, this patch cleans up GFS2's internal read function so that it no longer uses the do_generic_mapping_read function. This function is obsolete and GFS2 is the last user of it. As a side effect the internal read code gets smaller and easier to read and gfs2_readpage is split into two. One function has the locking and the other function has the rest of the logic. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>
2007-10-10[GFS2] Alternate gfs2_iget to avoid looking up inodes being freedBenjamin Marzinski
There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. process A does an iput() on inode X, and it's i_count drops to 0. This causes iput_final() to be called, which puts an inode into state I_FREEING at generic_delete_inode(). There no point between when iput_final() is called, and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is set, no other process on that node can successfully look up that inode until the delete finishes. process B locks the the resource group for the same inode in get_local_rgrp(), which is called by gfs2_inplace_reserve_i() process A tries to lock the resource group for the inode in gfs2_dinode_dealloc(), but it's already locked by process B process B waits in find_inode for the inode to have the I_FREEING state cleared. Deadlock. This patch solves the problem by adding an alternative to gfs2_iget(), gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING state.o The alternate test function is just like the original one, except that it fails if the inode is being freed, and sets a skipped flag. The alternate set function is just like the original, except that it fails if the skipped flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of gfs2_iget(). Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-10-10[GFS2] fix inode meta data corruptionWendy Cheng
Fix a nasty inode meta data corruption issue by keeping the buffer head in icache array. This buffer needs to stay in memory until journal flush occurs Otherwise, gfs2_meta_inode_buffer could do a disk read before the inode hits disk. It ends up with meta data corruptions. The buffer will be released as part of the existing journal flush logic. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Remove i_mode passing from NFS File HandleWendy Cheng
GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Obtaining no_formal_ino from directory entryWendy Cheng
GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Fix deallocation issuesAbhijith Das
There were two issues during deallocation of unlinked inodes. The first was relating to the use of a "try" lock which in the case of the inode lock wasn't trying hard enough to deallocate in all circumstances (now changed to a normal glock) and in the case of the iopen lock didn't wait for the demotion of the shared lock before attempting to get the exclusive lock, and thereby sometimes (timing dependent) not completing the deallocation when it should have done. The second issue related to the lack of a way to invalidate dcache entries on remote nodes (now fixed by this patch) which meant that unlinks were taking a long time to return disk space to the fs. By adding some code to invalidate the dcache entries across the cluster for unlinked inodes, that is now fixed. This patch was written jointly by Abhijith Das and Steven Whitehouse. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] gfs2_lookupi() uninitialised var fixakpm@linux-foundation.org
fs/gfs2/inode.c: In function 'gfs2_lookupi': fs/gfs2/inode.c:392: warning: 'error' may be used uninitialized in this function Looks like a real bug to me. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Recovery for lost unlinked inodesSteven Whitehouse
Under certain circumstances its possible (though rather unlikely) that inodes which were unlinked by one node while still open on another might get "lost" in the sense that they don't get deallocated if the node which held the inode open crashed before it was unlinked. This patch adds the recovery code which allows automatic deallocation of the inode if its found during block allocation (the sensible time to look for such inodes since we are scanning the rgrp's bitmaps anyway at this time, so it adds no overhead to do this). Since the inode will have had its i_nlink set to zero, all we need to trigger recovery is a lookup and an iput(), and the normal deallocation code takes care of the rest. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Fix bug in error path of inodeSteven Whitehouse
This fixes a bug in the ordering of operations in the error path of createi. Its not valid to do an iput() when holding the inode's glock since the iput() will (in this case) result in delete_inode() being called which needs to grab the lock itself. This was causing the recursive lock checking code to trigger. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Add nanosecond timestamp featureSteven Whitehouse
This adds a nanosecond timestamp feature to the GFS2 filesystem. Due to the way that the on-disk format works, older filesystems will just appear to have this field set to zero. When mounted by an older version of GFS2, the filesystem will simply ignore the extra fields so that it will again appear to have whole second resolution, so that its trivially backward compatible. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Fix sign problem in quota/statfs and cleanup _host structuresSteven Whitehouse
This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-07-09[GFS2] Clean up inode number handlingSteven Whitehouse
This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07[GFS2] Fix bz 229831, lookup returns wrong inodeSteven Whitehouse
The following patch fixes Red Hat bz 229831. Without this patch its possible for the wrong inode to be returned in certain cases. It is a pretty unusual event, so that its taken some time to track down. Thanks and due to Josef Whiter who did a lot of the testing required to thrack this down and fix it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-03-07[GFS2] pass formal ino in do_filldir_mainWendy Cheng
ok, the following is the minimum changes to get NFSD going before we settle down this issue .. would appreciate this in the tree so other NFS related works can get done in parallel. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Fix unlink deadlocksRussell Cattelan
Move the glock acquisition to outside of the transactions. Lock odering must be preserved in order to prevent ABBA deadlocks. The current gfs2_change_nlink code would tries to grab the glock after having started a transaction and thus is holding the log lock. This is inconsistent with other code paths in gfs that grab the resource group glock prior to staring a tranactions. One problem with this fix is that the resource group lock is always grabbed now even if the inode still has ref count and can not be marked for unlink. Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Fix recursive locking attempt with NFSSteven Whitehouse
In certain cases, its possible for NFS to call the lookup code while holding the glock (when doing a readdirplus operation) so we need to check for that and not try and lock the glock twice. This also fixes a typo in a previous NFS related GFS2 patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2Eric Sandeen
I was looking something else up and came across this... I don't honestly have a good reason to change it other than to make it like every other Linux filesystem in this regard. ;-) It doesn't functionally change anything, but makes some lines shorter. :) I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but not in timespec_t format, and only stores second resolutions? Seems like you're halfway to sub-second resolutions already. I suppose if that gets implemented then all of the below should instead be CURRENT_TIME not CURRENT_TIME_SEC. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] make gfs2_change_nlink_i() staticAdrian Bunk
On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly globlal gfs2_change_nlink_i() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Fix gfs2_rename deadlockS. Wendy Cheng
Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] BZ 217008 fsfuzzer fix.Russell Cattelan
Update the quilt header comments to match the code changes. Change gfs2_lookup_simple to return an error in the case of a NULL inode. The callers of gfs2_lookup_simple do not check for NULL in the no entry case and such would end up dereferencing a NULL ptr. This fixes: http://projects.info-pull.com/mokb/MOKB-15-11-2006.html Signed-off-by: Russell Cattelan <cattelan@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05[GFS2] Fix change nlink deadlockS. Wendy Cheng
Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Fix glock ordering on inode creationSteven Whitehouse
The lock order here should be parent -> child rather than numeric order. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Remove unused function from inode.cSteven Whitehouse
The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Remove gfs2_inode_attr_inSteven Whitehouse
This function wasn't really doing the right thing. There was no need to update the inode size at this point and the updating of the i_blocks field has now been moved to the places where di_blocks is updated. A result of this patch and some those preceeding it is that unlocking a glock is now a much more efficient process, since there is no longer any requirement to copy data from the gfs2 inode into the vfs inode at this point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Inode number is constantSteven Whitehouse
Since the inode number is constant, we don't need to keep updating it everytime we refresh the other inode fields. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Only set inode flags when requiredSteven Whitehouse
We were setting the inode flags from GFS2's flags far too often, even when they couldn't possibly have changed. This patch reduces the amount of flag setting going on so that we do it only when the inode is read in or when the flags have changed. The create case is covered by the "when the inode is read in" case. This also fixes a bug where we didn't set S_SYNC correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Tidy up 0 initialisations in inode.cSteven Whitehouse
We don't need to use endian conversions for 0 initialisations when creating a new on-disk inode. Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (8) - i_vnSteven Whitehouse
This shrinks the size of the gfs2_inode by 8 bytes by replacing the version counter with a one bit valid/invalid flag. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (7) - di_payload_formatSteven Whitehouse
This is almost never used. Its there for backward compatibility with GFS1. It doesn't need its own field since it can always be calculated from the inode mode & flags. This saves a bit more space in the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctimeSteven Whitehouse
Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (5) - di_nlinkSteven Whitehouse
Remove the di_nlink field in favour of inode->i_nlink and update the nlink handling to use the proper macros. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (4) - di_uid/di_gidSteven Whitehouse
Remove duplicate di_uid/di_gid fields in favour of using inode->i_uid/inode->i_gid instead. This saves 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (3) - di_modeSteven Whitehouse
This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (2) - di_major/di_minorSteven Whitehouse
This removes the device numbers from this structure by using inode->i_rdev instead. It also cleans up the code in gfs2_mknod. It results in shrinking the gfs2_inode by 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Shrink gfs2_inode (1) - di_header/di_numSteven Whitehouse
The metadata header doesn't need to be stored in the incore struct gfs2_inode since its constant, and this patch removes it. Also, there is already a field for the inode's number in the struct gfs2_inode, so we don't need one in struct gfs2_dinode_host as well. This saves 28 bytes of space in the struct gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Change argument to gfs2_dinode_printSteven Whitehouse
Change argument for gfs2_dinode_print in order to prepare for removal of duplicate fields between struct inode and struct gfs2_dinode_host. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Move gfs2_dinode_in to inode.cSteven Whitehouse
gfs2_dinode_in() is only ever called from one place, so move it to that place (in inode.c) and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Change argument to gfs2_dinode_inSteven Whitehouse
This is a preliminary patch to enable the removal of fields in gfs2_dinode_host which are duplicated in struct inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] Change argument of gfs2_dinode_outSteven Whitehouse
Everywhere this was called, a struct gfs2_inode was available, but despite that, it was always called with a struct gfs2_dinode as an argument. By making this change it paves the way to start eliminating fields duplicated between the kernel's struct inode and the struct gfs2_dinode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] gfs2 misc endianness annotationsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-11-30[GFS2] split and annotate gfs2_inumAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>