summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2007-02-12[PATCH] FS: speed up rw_verify_area()Eric Dumazet
oprofile hunting showed a stall in rw_verify_area(), because of triple indirection and potential cache misses. (file->f_path.dentry->d_inode->i_flock) By moving initialization of 'struct inode' pointer before the pos/count sanity tests, we allow the compiler and processor to perform two loads by anticipation, reducing stall, without prefetch() hints. Even x86 arch has enough registers to not use temporary variables and not increase text size. I validated this patch running a bench and studied oprofile changes, and absolute perf of the test program. Results of my epoll_pipe_bench (source available on request) on a Pentium-M 1.6 GHz machine Before : # ./epoll_pipe_bench -l 30 -t 20 Avg: 436089 evts/sec read_count=8843037 write_count=8843040 21.218390 samples per call (best value out of 10 runs) After : # ./epoll_pipe_bench -l 30 -t 20 Avg: 470980 evts/sec read_count=9549871 write_count=9549894 21.216694 samples per call (best value out of 10 runs) oprofile CPU_CLK_UNHALTED events gave a reduction from 5.3401 % to 2.5851 % for the rw_verify_area() function. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] warning fix: unsigned->signedTomasz Kvarsin
While compiling my code with -Wconversion using gcc-trunk, I always get a bunch of warrning from headers, here is fix for them: __getblk is alawys called with unsigned argument, but it takes signed, the same story with __bread,__breadahead and so on. Signed-off-by: Tomasz Kvarsin Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] reiserfs: Use ARRAY_SIZE macro when appropriateAhmed S. Darwish
Use ARRAY_SIZE macro already defined in kernel.h Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] inotify: read return val fixNick Piggin
Fix for inotify read bug (bugzilla.kernel.org #6999) Problem Description: When reading from an inotify device with an insufficient sized buffer, read(2) will return 0 with no errno set. This is because of an logically incorrect action from the user program thus should return an more logical value. My suggestion is return -EINVAL as for bind(2). This patch is based on the proposal from Ryan <wolf0403@hotmail.com>, and feedback from John McCutchan <john@johnmccutchan.com>. Return -EINVAL if we have not passed in enough buffer space to read a single inotify event, rather than 0 which indicates that there is nothing to read. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: "John McCutchan" <john@johnmccutchan.com> Cc: Ryan <wolf0403@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] remove sb->s_files and file_list_lock usage in dquot.cChristoph Hellwig
Iterate over sb->s_inodes instead of sb->s_files in add_dquot_ref. This reduces list search and lock hold time aswell as getting rid of one of the few uses of file_list_lock which Ingo identified as a scalability problem. Previously we called dq_op->initialize for every inode handing of a writeable file that wasn't initialized before. Now we're calling it for every inode that has a non-zero i_writecount, aka a writeable file descriptor refering to it. Thanks a lot to Jan Kara for running this patch through his quota test harness. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] move remove_dquot_ref to dqout.cChristoph Hellwig
Remove_dquot_ref can move to dqout.c instead of beeing in inode.c under #ifdef CONFIG_QUOTA. Also clean the resulting code up a tiny little bit by testing sb->dq_op earlier - it's constant over a filesystems lifetime. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] Fix d_path for lazy unmountsAndreas Gruenbacher
Here is a bugfix to d_path. First, when d_path() hits a lazily unmounted mount point, it tries to prepend the name of the lazily unmounted dentry to the path name. It gets this wrong, and also overwrites the slash that separates the name from the following pathname component. This is demonstrated by the attached test case, which prints "getcwd returned d_path-bugsubdir" with the bug. The correct result would be "getcwd returned d_path-bug/subdir". It could be argued that the name of the root dentry should not be part of the result of d_path in the first place. On the other hand, what the unconnected namespace was once reachable as may provide some useful hints to users, and so that seems okay. Second, it isn't always possible to tell from the __d_path result whether the specified root and rootmnt (i.e., the chroot) was reached: lazy unmounts of bind mounts will produce a path that does start with a non-slash so we can tell from that, but other lazy unmounts will produce a path that starts with a slash, just like "ordinary" paths. The attached patch cleans up __d_path() to fix the bug with overlapping pathname components. It also adds a @fail_deleted argument, which allows to get rid of some of the mess in sys_getcwd(). Grabbing the dcache_lock can then also be moved into __d_path(). The patch also makes sure that paths will only start with a slash for paths which are connected to the root and rootmnt. The @fail_deleted argument could be added to d_path() as well: this would allow callers to recognize deleted files, without having to resort to the ambiguous check for the " (deleted)" string at the end of the pathnames. This is not currently done, but it might be worthwhile. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Cc: Neil Brown <neilb@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] NTFS: rename incorrect check of NTFS_DEBUG with just DEBUGRobert P. J. Day
Replace the incorrect debugging check of "#ifdef NTFS_DEBUG" with just "#ifdef DEBUG". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] register_chrdev_region() don't hand out the LOCAL/EXPERIMENTAL majorsAndrew Morton
As pointed out in http://bugzilla.kernel.org/show_bug.cgi?id=7922, dynamic chardev major allocation can hand out majors which LANANA has defined as being for local/experimental use. Cc: Torben Mathiasen <device@lanana.org> Cc: Greg KH <greg@kroah.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Tomas Klas <tomas.klas@mepatek.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] Make XFS use BH_Unwritten and BH_Delay correctlyDavid Chinner
Don't hide buffer_unwritten behind buffer_delay() and remove the hack that clears unexpected buffer_unwritten() states now that it can't happen. Signed-off-by: Dave Chinner <dgc@sgi.com> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Timothy Shimmin <tes@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12[PATCH] Make BH_Unwritten a first class bufferhead flag V2David Chinner
Currently, XFS uses BH_PrivateStart for flagging unwritten extent state in a bufferhead. Recently, I found the long standing mmap/unwritten extent conversion bug, and it was to do with partial page invalidation not clearing the unwritten flag from bufferheads attached to the page but beyond EOF. See here for a full explaination: http://oss.sgi.com/archives/xfs/2006-12/msg00196.html The solution I have checked into the XFS dev tree involves duplicating code from block_invalidatepage to clear the unwritten flag from the bufferhead(s), and then calling block_invalidatepage() to do the rest. Christoph suggested that this would be better solved by pushing the unwritten flag into the common buffer head flags and just adding the call to discard_buffer(): http://oss.sgi.com/archives/xfs/2006-12/msg00239.html The following patch makes BH_Unwritten a first class citizen. Signed-off-by: Dave Chinner <dgc@sgi.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11Merge git://oss.sgi.com:8090/xfs/xfs-2.6Linus Torvalds
* git://oss.sgi.com:8090/xfs/xfs-2.6: (33 commits) [XFS] Don't use kmap in xfs_iozero. [XFS] Remove a bunch of unused functions from XFS. [XFS] Remove unused arguments from the XFS_BTREE_*_ADDR macros. [XFS] Remove unused header files for MAC and CAP checking functionality. [XFS] Make freeze code a little cleaner. [XFS] Remove unused argument to xfs_bmap_finish [XFS] Clean up use of VFS attr flags [XFS] Remove useless memory barrier [XFS] XFS sysctl cleanups [XFS] Fix assertion in xfs_attr_shortform_remove(). [XFS] Fix callers of xfs_iozero() to zero the correct range. [XFS] Ensure a frozen filesystem has a clean log before writing the dummy [XFS] Fix sub-block zeroing for buffered writes into unwritten extents. [XFS] Re-initialize the per-cpu superblock counters after recovery. [XFS] Fix block reservation changes for non-SMP systems. [XFS] Fix block reservation mechanism. [XFS] Make growfs work for amounts greater than 2TB [XFS] Fix inode log item use-after-free on forced shutdown [XFS] Fix attr2 corruption with btree data extents [XFS] Workaround log space issue by increasing XFS_TRANS_PUSH_AIL_RESTARTS ...
2007-02-11Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: Update defconfig. [SPARC64]: Add PCI MSI support on Niagara. [SPARC64] IRQ: Use irq_desc->chip_data instead of irq_desc->handler_data [SPARC64]: Add obppath sysfs attribute for SBUS and PCI devices. [PARTITION]: Add whole_disk attribute.
2007-02-11[PATCH] ifdef ->rchar, ->wchar, ->syscr, ->syscw from task_structAlexey Dobriyan
They are fat: 4x8 bytes in task_struct. They are uncoditionally updated in every fork, read, write and sendfile. They are used only if you have some "extended acct fields feature". And please, please, please, read(2) knows about bytes, not characters, why it is called "rchar"? Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] jbd layer function called instead of fs specific oneDmitriy Monakhov
jbd function called instead of fs specific one. Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Remove unused kernel config option ZISOFS_FSRobert P. J. Day
Remove the kernel config option ZISOFS_FS, since it appears that the actual option is simply ZISOFS. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] buffer: memorder fixNick Piggin
unlock_buffer(), like unlock_page(), must not clear the lock without ensuring that the critical section is closed. Mingming later sent the same patch, saying: We are running SDET benchmark and saw double free issue for ext3 extended attributes block, which complains the same xattr block already being freed (in ext3_xattr_release_block()). The problem could also been triggered by multiple threads loop untar/rm a kernel tree. The race is caused by missing a memory barrier at unlock_buffer() before the lock bit being cleared, resulting in possible concurrent h_refcounter update. That causes a reference counter leak, then later leads to the double free that we have seen. Inside unlock_buffer(), there is a memory barrier is placed *after* the lock bit is being cleared, however, there is no memory barrier *before* the bit is cleared. On some arch the h_refcount update instruction and the clear bit instruction could be reordered, thus leave the critical section re-entered. The race is like this: For example, if the h_refcount is initialized as 1, cpu 0: cpu1 -------------------------------------- ----------------------------------- lock_buffer() /* test_and_set_bit */ clear_buffer_locked(bh); lock_buffer() /* test_and_set_bit */ h_refcount = h_refcount+1; /* = 2*/ h_refcount = h_refcount + 1; /*= 2 */ clear_buffer_locked(bh); .... ...... We lost a h_refcount here. We need a memory barrier before the buffer head lock bit being cleared to force the order of the two writes. Please apply. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] extend the set of "__attribute__" shortcut macrosRobert P. J. Day
Extend the set of "__attribute__" shortcut macros, and remove identical (and now superfluous) definitions from a couple of source files. based on a page at robert love's blog: http://rlove.org/log/2005102601 extend the set of shortcut macros defined in compiler-gcc.h with the following: #define __packed __attribute__((packed)) #define __weak __attribute__((weak)) #define __naked __attribute__((naked)) #define __noreturn __attribute__((noreturn)) #define __pure __attribute__((pure)) #define __aligned(x) __attribute__((aligned(x))) #define __printf(a,b) __attribute__((format(printf,a,b))) Once these are in place, it's up to subsystem maintainers to decide if they want to take advantage of them. there is already a strong precedent for using shortcuts like this in the source tree. The ones that might give people pause are "__aligned" and "__printf", but shortcuts for both of those are already in use, and in some ways very confusingly. note the two very different definitions for a macro named "ALIGNED": drivers/net/sgiseeq.c:#define ALIGNED(x) ((((unsigned long)(x)) + 0xf) & ~(0xf)) drivers/scsi/ultrastor.c:#define ALIGNED(x) __attribute__((aligned(x))) also: include/acpi/platform/acgcc.h: #define ACPI_PRINTF_LIKE(c) __attribute__ ((__format__ (__printf__, c, c+1))) Given the precedent, then, it seems logical to at least standardize on a consistent set of these macros. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] remove ext[34]_inc_count and _dec_countEric Sandeen
- Naming is confusing, ext3_inc_count manipulates i_nlink not i_count - handle argument passed in is not used - ext3 and ext4 already call inc_nlink and dec_nlink directly in other places Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] return ENOENT from ext3_link when racing with unlinkEric Sandeen
Return -ENOENT from ext[34]_link if we've raced with unlink and i_nlink is 0. Doing otherwise has the potential to corrupt the orphan inode list, because we'd wind up with an inode with a non-zero link count on the list, and it will never get properly cleaned up & removed from the orphan list before it is freed. [akpm@osdl.org: build fix] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] fix umask when noACL kernel meets extN tuned for ACLsHugh Dickins
Fix insecure default behaviour reported by Tigran Aivazian: if an ext2 or ext3 or ext4 filesystem is tuned to mount with "acl", but mounted by a kernel built without ACL support, then umask was ignored when creating inodes - though root or user has umask 022, touch creates files as 0666, and mkdir creates directories as 0777. This appears to have worked right until 2.6.11, when a fix to the default mode on symlinks (always 0777) assumed VFS applies umask: which it does, unless the mount is marked for ACLs; but ext[234] set MS_POSIXACL in s_flags according to s_mount_opt set according to def_mount_opts. We could revert to the 2.6.10 ext[234]_init_acl (adding an S_ISLNK test); but other filesystems only set MS_POSIXACL when ACLs are configured. We could fix this at another level; but it seems most robust to avoid setting the s_mount_opt flag in the first place (at the expense of more ifdefs). Likewise don't set the XATTR_USER flag when built without XATTR support. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: <linux-ext4@vger.kernel.org> Cc: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] seq_file conversion: codaAlexey Dobriyan
Compile-tested. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] ext4: refuse ro to rw remount of fs with orphan inodesEric Sandeen
In the rare case where we have skipped orphan inode processing due to a readonly block device, and the block device subsequently changes back to read-write, disallow a remount,rw transition of the filesystem when we have an unprocessed orphan inodes as this would corrupt the list. Ideally we should process the orphan inode list during the remount, but that's trickier, and this plugs the hole for now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] ext3: refuse ro to rw remount of fs with orphan inodesEric Sandeen
In the rare case where we have skipped orphan inode processing due to a readonly block device, and the block device subsequently changes back to read-write, disallow a remount,rw transition of the filesystem when we have an unprocessed orphan inodes as this would corrupt the list. Ideally we should process the orphan inode list during the remount, but that's trickier, and this plugs the hole for now. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] proc_misc warning fixAndrew Morton
fs/proc/proc_misc.c: In function 'proc_misc_init': fs/proc/proc_misc.c:764: warning: unused variable 'entry' Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] msdos partitions: fix logic error in AIX detectionOlaf Hering
Correct the AIX magic check to let 'echo > /dev/sdb' actually work. Signed-off-by: Olaf Hering <olh@suse.de> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] relax check for AIX in msdos partition tableOlaf Hering
The patch to identify AIX disks and ignore them has caused at least one machine to fail to find the root partition on 2.6.19. The patch is: http://lkml.org/lkml/2006/7/31/117 The problem is some disk formatters do not blow away the first 4 bytes of the disk. If the disk we are installing to used to have AIX on it, then the first 4 bytes will still have IBMA in EBCDIC. The install in question was debian etch. Im not sure what the best fix is, perhaps the AIX detection code could check more than the first 4 bytes. The whole partition info for primary partitions is in this block: dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be )) All other data do not matter, beside the 0x55aa marker at the end of the first block. Signed-off-by: Olaf Hering <olh@suse.de> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] remove invalidate_inode_pages()Andrew Morton
Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] ext2: skip pages past number of blocks in ext2_find_entryEric Sandeen
This one was pointed out on the MOKB site: http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html If a directory's i_size is corrupted, ext2_find_entry() will keep processing pages until the i_size is reached, even if there are no more blocks associated with the directory inode. This patch puts in some minimal sanity-checking so that we don't keep checking pages (and issuing errors) if we know there can be no more data to read, based on the block count of the directory inode. This is somewhat similar in approach to the ext3 patch I sent earlier this year. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().Robert P. J. Day
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] igrab() should check for I_CLEARJan Blunck
When igrab() is calling __iget() on an inode it should check if clear_inode() has been called on the inode already. Otherwise there is a race window between clear_inode() and destroy_inode() where igrab() calls __iget() which leads to already free inodes on the inode lists. Signed-off-by: Vandana Rungta <vandana@novell.com> Signed-off-by: Jan Blunck <jblunck@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] avoid one conditional branch in touch_atime()Eric Dumazet
I added IS_NOATIME(inode) macro definition in include/linux/fs.h, true if the inode superblock is marked readonly or noatime. This new macro is then used in touch_atime() instead of separatly testing MS_RDONLY and MS_NOATIME Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] convert ramfs to use __set_page_dirty_no_writebackKen Chen
As pointed out by Hugh, ramfs would also benefit from using the new set_page_dirty aop method for memory backed file systems. Signed-off-by: Ken Chen <kenchen@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Drop get_zone_counts()Christoph Lameter
Values are available via ZVC sums. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PATCH] Remove final references to deprecated "MAP_ANON" page protection flagRobert P. J. Day
Remove the last vestiges of the long-deprecated "MAP_ANON" page protection flag: use "MAP_ANONYMOUS" instead. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11[PARTITION]: Add whole_disk attribute.Fabio Massimo Di Nitto
Some partitioning systems create special partitions that span the entire disk. One example are Sun partitions, and this whole-disk partition exists to tell the firmware the extent of the entire device so it can load the boot block and do other things. Such partitions should not be treated as normal partitions, because all the other partitions overlap this whole-disk one. So we'd see multiple instances of the same UUID etc. which we do not want. udev and friends can thus search for this 'whole_disk' attribute and use it to decide to ignore the partition. Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2007-02-10[XFS] Don't use kmap in xfs_iozero.David Chinner
kmap() is inefficient and does not scale well. kmap_atomic() is a better choice. Use the generic wrapper function instead of open coding the kmap-memset-dcache flush-kunmap stuff. SGI-PV: 960904 SGI-Modid: xfs-linux-melb:xfs-kern:28041a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Remove a bunch of unused functions from XFS.Eric Sandeen
Patch provided by Eric Sandeen (sandeen@sandeen.net). SGI-PV: 960897 SGI-Modid: xfs-linux-melb:xfs-kern:28038a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Remove unused arguments from the XFS_BTREE_*_ADDR macros.Eric Sandeen
It makes it incrementally clearer to read the code when the top of a macro spaghetti-pile only receives the 3 arguments it uses, rather than 2 extra ones which are not used. Also when you start pulling this thread out of the sweater (i.e. remove unused args from XFS_BTREE_*_ADDR), a couple other third arms etc fall off too. If they're not used in the macro, then they sometimes don't need to be passed to the function calling the macro either, etc.... Patch provided by Eric Sandeen (sandeen@sandeen.net). SGI-PV: 960197 SGI-Modid: xfs-linux-melb:xfs-kern:28037a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Remove unused header files for MAC and CAP checking functionality.Eric Sandeen
xfs_mac.h and xfs_cap.h provide definitions and macros that aren't used anywhere in XFS at all. They are left-overs from "to be implement at some point in the future" functionality that Irix XFS has. If this functionality ever goes into Linux, it will be provided at a different layer, most likely through the security hooks in the kernel so we will never need this functionality in XFS. Patch provided by Eric Sandeen (sandeen@sandeen.net). SGI-PV: 960895 SGI-Modid: xfs-linux-melb:xfs-kern:28036a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Make freeze code a little cleaner.David Chinner
Fixes a few small issues (mostly cosmetic) that were picked up during the review cycle for the last set of freeze path changes. SGI-PV: 959267 SGI-Modid: xfs-linux-melb:xfs-kern:28035a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Remove unused argument to xfs_bmap_finishEric Sandeen
The firstblock argument to xfs_bmap_finish is not used by that function. Remove it and cleanup the code a bit. Patch provided by Eric Sandeen. SGI-PV: 960196 SGI-Modid: xfs-linux-melb:xfs-kern:28034a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Clean up use of VFS attr flagsEric Sandeen
Use the the generic VFS attr flags where appropriate instead of open coding them to the same values. Patch provided by Eric Sandeen. SGI-PV: 960868 SGI-Modid: xfs-linux-melb:xfs-kern:28033a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Remove useless memory barrierRalf Baechle
wake_up's implementation does an implicit memory barrier so the explicit memory barrier is not needed in vfs_sync_worker. Patch provided by Ralf Baechle. SGI-PV: 960867 SGI-Modid: xfs-linux-melb:xfs-kern:28032a Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] XFS sysctl cleanupsEric W. Biederman
Removes unneeded sysctl insert at head behaviour. Cleans up sysctl definitions to use C99 initialisers. Patch provided by Eric W. Biederman. SGI-PV: 960192 SGI-Modid: xfs-linux-melb:xfs-kern:28031a Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Fix assertion in xfs_attr_shortform_remove().Lachlan McIlroy
SGI-PV: 960791 SGI-Modid: xfs-linux-melb:xfs-kern:28021a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Fix callers of xfs_iozero() to zero the correct range.Lachlan McIlroy
The problem is the two callers of xfs_iozero() are rounding out the range to be zeroed to the end of a fsb and in some cases this extends past the new eof. The call to commit_write() in xfs_iozero() will cause the Linux inode's file size to be set too high. SGI-PV: 960788 SGI-Modid: xfs-linux-melb:xfs-kern:28013a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Ensure a frozen filesystem has a clean log before writing the dummyDavid Chinner
record. The current Linux XFS freeze code is a mess. We flush the metadata buffers out while we are still allowing new transactions to start and then fail to flush the dirty buffers back out before writing the unmount and dummy records to the log. This leads to problems when the frozen filesystem is used for snapshots - we do log recovery on a readonly image and often it appears that the log image in the snapshot is not correct. Hence we end up with hangs, oops and mount failures when trying to mount a snapshot image that has been created when the filesystem has not been correctly frozen. To fix this, we need to move th metadata flush to after we wait for all current transactions to complete in teh second stage of the freeze. This means that when we write the final log records, the log should be clean and recovery should never occur on a snapshot image created from a frozen filesystem. SGI-PV: 959267 SGI-Modid: xfs-linux-melb:xfs-kern:28010a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Fix sub-block zeroing for buffered writes into unwritten extents.David Chinner
When writing less than a filesystem block of data into an unwritten extent via buffered I/O, __xfs_get_blocks fails to set the buffer new flag. As a result, the generic code will not zero either edge of the block resulting in garbage being written to disk either side of the real data. Set the buffer new state on bufferd writes to unwritten extents to ensure that zeroing occurs. SGI-PV: 960328 SGI-Modid: xfs-linux-melb:xfs-kern:28000a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-02-10[XFS] Re-initialize the per-cpu superblock counters after recovery.Lachlan McIlroy
After filesystem recovery the superblock is re-read to bring in any changes. If the per-cpu superblock counters are not re-initialized from the superblock then the next time the per-cpu counters are disabled they might overwrite the global counter with a bogus value. SGI-PV: 957348 SGI-Modid: xfs-linux-melb:xfs-kern:27999a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>