summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2011-10-28Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-clientLinus Torvalds
* 'for-linus' of git://ceph.newdream.net/git/ceph-client: libceph: fix double-free of page vector ceph: fix 32-bit ino numbers libceph: force resend of osd requests if we skip an osdmap ceph: use kernel DNS resolver ceph: fix ceph_monc_init memory leak ceph: let the set_layout ioctl set single traits Revert "ceph: don't truncate dirty pages in invalidate work thread" ceph: replace leading spaces with tabs libceph: warn on msg allocation failures libceph: don't complain on msgpool alloc failures libceph: always preallocate mon connection libceph: create messenger with client ceph: document ioctls ceph: implement (optional) max read size ceph: rename rsize -> rasize ceph: make readpages fully async
2011-10-28Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits) leases: fix write-open/read-lease race nfs: drop unnecessary locking in llseek ext4: replace cut'n'pasted llseek code with generic_file_llseek_size vfs: add generic_file_llseek_size vfs: do (nearly) lockless generic_file_llseek direct-io: merge direct_io_walker into __blockdev_direct_IO direct-io: inline the complete submission path direct-io: separate map_bh from dio direct-io: use a slab cache for struct dio direct-io: rearrange fields in dio/dio_submit to avoid holes direct-io: fix a wrong comment direct-io: separate fields only used in the submission path from struct dio vfs: fix spinning prevention in prune_icache_sb vfs: add a comment to inode_permission() vfs: pass all mask flags check_acl and posix_acl_permission vfs: add hex format for MAY_* flag values vfs: indicate that the permission functions take all the MAY_* flags compat: sync compat_stats with statfs. vfs: add "device" tag to /proc/self/mountstats cleanup: vfs: small comment fix for block_invalidatepage ... Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
2011-10-28Merge http://sucs.org/~rohan/git/gfs2-3.0-nmwLinus Torvalds
* http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits) GFS2: Move readahead of metadata during deallocation into its own function GFS2: Remove two unused variables GFS2: Misc fixes GFS2: rewrite fallocate code to write blocks directly GFS2: speed up delete/unlink performance for large files GFS2: Fix off-by-one in gfs2_blk2rgrpd GFS2: Clean up ->page_mkwrite GFS2: Correctly set goal block after allocation GFS2: Fix AIL flush issue during fsync GFS2: Use cached rgrp in gfs2_rlist_add() GFS2: Call do_strip() directly from recursive_scan() GFS2: Remove obsolete assert GFS2: Cache the most recently used resource group in the inode GFS2: Make resource groups "append only" during life of fs GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added GFS2: Clean up gfs2_create GFS2: Use ->dirty_inode() GFS2: Fix bug trap and journaled data fsync GFS2: Fix inode allocation error path ...
2011-10-28Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
* '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits) Fix build break when freezer not configured Add definition for share encryption CIFS: Make cifs_push_locks send as many locks at once as possible CIFS: Send as many mandatory unlock ranges at once as possible CIFS: Implement caching mechanism for posix brlocks CIFS: Implement caching mechanism for mandatory brlocks CIFS: Fix DFS handling in cifs_get_file_info CIFS: Fix error handling in cifs_readv_complete [CIFS] Fixup trivial checkpatch warning [CIFS] Show nostrictsync and noperm mount options in /proc/mounts cifs, freezer: add wait_event_freezekillable and have cifs use it cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters cifs: tune bdi.ra_pages in accordance with the rsize cifs: allow for larger rsize= options and change defaults cifs: convert cifs_readpages to use async reads cifs: add cifs_async_readv cifs: fix protocol definition for READ_RSP cifs: add a callback function to receive the rest of the frame cifs: break out 3rd receive phase into separate function cifs: find mid earlier in receive codepath ...
2011-10-28Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits) xfs: add AIL pushing tracepoints xfs: put in missed fix for merge problem xfs: do not flush data workqueues in xfs_flush_buftarg xfs: remove XFS_bflush xfs: remove xfs_buf_target_name xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks xfs: clean up xfs_ioerror_alert xfs: clean up buffer allocation xfs: remove buffers from the delwri list in xfs_buf_stale xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF xfs: remove XFS_BUF_FINISH_IOWAIT xfs: remove xfs_get_buftarg_list xfs: fix buffer flushing during unmount xfs: optimize fsync on directories xfs: reduce the number of log forces from tail pushing xfs: Don't allocate new buffers on every call to _xfs_buf_find xfs: simplify xfs_trans_ijoin* again xfs: unlock the inode before log force in xfs_change_file_space xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata ...
2011-10-28leases: fix write-open/read-lease raceJ. Bruce Fields
In setlease, we use i_writecount to decide whether we can give out a read lease. In open, we break leases before incrementing i_writecount. There is therefore a window between the break lease and the i_writecount increment when setlease could add a new read lease. This would leave us with a simultaneous write open and read lease, which shouldn't happen. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28nfs: drop unnecessary locking in llseekAndi Kleen
This makes NFS follow the standard generic_file_llseek locking scheme. Cc: Trond.Myklebust@netapp.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28ext4: replace cut'n'pasted llseek code with generic_file_llseek_sizeAndi Kleen
This gives ext4 the benefits of unlocked llseek. Cc: tytso@mit.edu Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: add generic_file_llseek_sizeAndi Kleen
Add a generic_file_llseek variant to the VFS that allows passing in the maximum file size of the file system, instead of always using maxbytes from the superblock. This can be used to eliminate some cut'n'paste seek code in ext4. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: do (nearly) lockless generic_file_llseekAndi Kleen
The i_mutex lock use of generic _file_llseek hurts. Independent processes accessing the same file synchronize over a single lock, even though they have no need for synchronization at all. Under high utilization this can cause llseek to scale very poorly on larger systems. This patch does some rethinking of the llseek locking model: First the 64bit f_pos is not necessarily atomic without locks on 32bit systems. This can already cause races with read() today. This was discussed on linux-kernel in the past and deemed acceptable. The patch does not change that. Let's look at the different seek variants: SEEK_SET: Doesn't really need any locking. If there's a race one writer wins, the other loses. For 32bit the non atomic update races against read() stay the same. Without a lock they can also happen against write() now. The read() race was deemed acceptable in past discussions, and I think if it's ok for read it's ok for write too. => Don't need a lock. SEEK_END: This behaves like SEEK_SET plus it reads the maximum size too. Reading the maximum size would have the 32bit atomic problem. But luckily we already have a way to read the maximum size without locking (i_size_read), so we can just use that instead. Without i_mutex there is no synchronization with write() anymore, however since the write() update is atomic on 64bit it just behaves like another racy SEEK_SET. On non atomic 32bit it's the same as SEEK_SET. => Don't need a lock, but need to use i_size_read() SEEK_CUR: This has a read-modify-write race window on the same file. One could argue that any application doing unsynchronized seeks on the same file is already broken. But for the sake of not adding a regression here I'm using the file->f_lock to synchronize this. Using this lock is much better than the inode mutex because it doesn't synchronize between processes. => So still need a lock, but can use a f_lock. This patch implements this new scheme in generic_file_llseek. I dropped generic_file_llseek_unlocked and changed all callers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: merge direct_io_walker into __blockdev_direct_IOAndi Kleen
This doesn't change anything for the compiler, but hch thought it would make the code clearer. I moved the reference counting into its own little inline. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: inline the complete submission pathAndi Kleen
Add inlines to all the submission path functions. While this increases code size it also gives gcc a lot of optimization opportunities in this critical hotpath. In particular -- together with some other changes -- this allows gcc to get rid of the unnecessary clearing of sdio at the beginning and optimize the messy parameter passing. Any non inlining of a function which takes a sdio parameter would break this optimization because they cannot be done if the address of a structure is taken. Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off. This gives about 2.2% improvement on a large database benchmark with a high IOPS rate. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: separate map_bh from dioAndi Kleen
Only a single b_private field in the map_bh buffer head is needed after the submission path. Move map_bh separately to avoid storing this information in the long term slab. This avoids the weird 104 byte hole in struct dio_submit which also needed to be memseted early. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: use a slab cache for struct dioAndi Kleen
A direct slab call is slightly faster than kmalloc and can be better cached per CPU. It also avoids rounding to the next kmalloc slab. In addition this enforces cache line alignment for struct dio to avoid any false sharing. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: rearrange fields in dio/dio_submit to avoid holesAndi Kleen
Fix most problems reported by pahole. There is still a weird 104 byte hole after map_bh. I'm not sure what causes this. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: fix a wrong commentAndi Kleen
There's nothing on the stack, even before my changes. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28direct-io: separate fields only used in the submission path from struct dioAndi Kleen
This large, but largely mechanic, patch moves all fields in struct dio that are only used in the submission path into a separate on stack data structure. This has the advantage that the memory is very likely cache hot, which is not guaranteed for memory fresh out of kmalloc. This also gives gcc more optimization potential because it can easier determine that there are no external aliases for these variables. The sdio initialization is a initialization now instead of memset. This allows gcc to break sdio into individual fields and optimize away unnecessary zeroing (after all the functions are inlined) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: fix spinning prevention in prune_icache_sbChristoph Hellwig
We need to move the inode to the end of the list to actually make the spinning prevention explained in the comment above it work. With a plain list_move it will simply stay in place as we're always reclaiming from the head of the list. Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: add a comment to inode_permission()Andreas Gruenbacher
Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: pass all mask flags check_acl and posix_acl_permissionAndreas Gruenbacher
Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: indicate that the permission functions take all the MAY_* flagsAndreas Gruenbacher
Acked-by: J. Bruce Fields <bfields@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andreas Gruenbacher <agruen@kernel.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28compat: sync compat_stats with statfs.Eric W. Biederman
This was found by inspection while tracking a similar bug in compat_statfs64, that has been fixed in mainline since decemeber. - This fixes a bug where not all of the f_spare fields were cleared on mips and s390. - Add the f_flags field to struct compat_statfs - Copy f_flags to userspace in case someone cares. - Use __clear_user to copy the f_spare field to userspace to ensure that all of the elements of f_spare are cleared. On some architectures f_spare is has 5 ints and on some architectures f_spare only has 4 ints. Which makes the previous technique of clearing each int individually broken. I don't expect anyone actually uses the old statfs system call anymore but if they do let them benefit from having the compat and the native version working the same. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28vfs: add "device" tag to /proc/self/mountstatsBryan Schumaker
nfsiostat was failing to find mounted filesystems on kernels after 2.6.38 because of changes to show_vfsstat() by commit c7f404b40a3665d9f4e9a927cc5c1ee0479ed8f9. This patch adds back the "device" tag before the nfs server entry so scripts can parse the mountstats file correctly. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> CC: stable@kernel.org [>=2.6.39] Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-28cleanup: vfs: small comment fix for block_invalidatepageWang Sheng-Hui
The patch is aganist 3.1-rc3. Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2011-10-27Add definition for share encryptionSteve French
Samba supports a setfs info level to negotiate encrypted shares. This patch adds the defines so we recognize this info level. Later patches will add the enablement for it. Acked-by: Jeremy Allison <jra@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>
2011-10-27fs/Makefile: Stupid typo breakage of exofs inclusionBoaz Harrosh
In my last patch I did a stupid mistake and broke the exofs compilation completely. Fix it ASAP. Instead of obj-y I did obj-$(y) Really Really sorry. Me totally blushing :-{| Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-26Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds
* 'for-linus' of git://git.open-osd.org/linux-open-osd: (21 commits) ore: Enable RAID5 mounts exofs: Support for RAID5 read-4-write interface. ore: RAID5 Write ore: RAID5 read fs/Makefile: Always inspect exofs/ ore: Make ore_calc_stripe_info EXPORT_SYMBOL ore/exofs: Change ore_check_io API ore/exofs: Define new ore_verify_layout ore: Support for partial component table ore: Support for short read/writes exofs: Support for short read/writes ore: Remove check for ios->kern_buff in _prepare_for_striping to later ore: cleanup: Embed an ore_striping_info inside ore_io_state ore: Only IO one group at a time (API change) ore/exofs: Change the type of the devices array (API change) ore: Make ore_striping_info and ore_calc_stripe_info public exofs: Remove unused data_map member from exofs_sb_info exofs: Rename struct ore_components comps => oc exofs/super.c: local functions should be static exofs/ore.c: local functions should be static ...
2011-10-26Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) time, s390: Get rid of compile warning dw_apb_timer: constify clocksource name time: Cleanup old CONFIG_GENERIC_TIME references that snuck in time: Change jiffies_to_clock_t() argument type to unsigned long alarmtimers: Fix error handling clocksource: Make watchdog reset lockless posix-cpu-timers: Cure SMP accounting oddities s390: Use direct ktime path for s390 clockevent device clockevents: Add direct ktime programming function clockevents: Make minimum delay adjustments configurable nohz: Remove "Switched to NOHz mode" debugging messages proc: Consider NO_HZ when printing idle and iowait times nohz: Make idle/iowait counter update conditional nohz: Fix update_ts_time_stat idle accounting cputime: Clean up cputime_to_usecs and usecs_to_cputime macros alarmtimers: Rework RTC device selection using class interface alarmtimers: Add try_to_cancel functionality alarmtimers: Add more refined alarm state tracking alarmtimers: Remove period from alarm structure alarmtimers: Remove interval cap limit hack ...
2011-10-26Merge branch 'for-linus' of git://github.com/ericvh/linuxLinus Torvalds
* 'for-linus' of git://github.com/ericvh/linux: 9p: fix 9p.txt to advertise msize instead of maxdata net/9p: Convert net/9p protocol dumps to tracepoints fs/9p: change an int to unsigned int fs/9p: Cleanup option parsing in 9p 9p: move dereference after NULL check fs/9p: inode file operation is properly initialized init_special_inode fs/9p: Update zero-copy implementation in 9p
2011-10-25libceph: fix double-free of page vectorSage Weil
ceph_release_page_vector() kfrees the vector; we shouldn't do it here too. Reported-by: Jeff Wu <cpwu@tnsoft.com.cn> Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: fix 32-bit ino numbersAmon Ott
Fix 32-bit ino generation to not always be 1. Signed-off-by: Amon Ott <a.ott@m-privacy.de>
2011-10-25ceph: let the set_layout ioctl set single traitsGreg Farnum
Previously we were validating the passed-in stripe unit, object size, and stripe count against each other (and not testing most other stuff). Instead, make sure that the composed previous layout and new values are valid, and only send the new values to the MDS. This lets users change the pool without setting the whole layout, for instance. Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
2011-10-25Revert "ceph: don't truncate dirty pages in invalidate work thread"Sage Weil
This reverts commit c9af9fb68e01eb2c2165e1bc45cfeeed510c64e6. We need to block and truncate all pages in order to reliably invalidate them. Otherwise, we could: - have some uptodate pages in the cache - queue an invalidate - write(2) locks some pages - invalidate_work skips them - write(2) only overwrites part of the page - page now dirty and uptodate -> partial leakage of invalidated data It's not entirely clear why we started skipping locked pages in the first place. I just ran this through fsx and didn't see any problems. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: replace leading spaces with tabsNoah Watkins
Trivial formatting fix. Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25libceph: don't complain on msgpool alloc failuresSage Weil
The pool allocation failures are masked by the pool; there is no need to spam the console about them. (That's the whole point of having the pool in the first place.) Mark msg allocations whose failure is safely handled as such. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25libceph: create messenger with clientSage Weil
This simplifies the init/shutdown paths, and makes client->msgr available during the rest of the setup process. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: document ioctlsSage Weil
...after some prodding by Christoph. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: implement (optional) max read sizeSage Weil
The 'rsize' mount option limits the maximum size of an individual read(ahead) operation that is sent off to an OSD. This is distinct from 'rasize', which controls the size of the readahead window. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: rename rsize -> rasizeSage Weil
It controls readahead. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25ceph: make readpages fully asyncSage Weil
When we get a ->readpages() aop, submit async reads for all page ranges in the provided page list. Lock the pages immediately, so that VFS/MM will block until the reads complete. Signed-off-by: Sage Weil <sage@newdream.net>
2011-10-25Merge branch 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
* 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits) Check validity of cl_rpcclient in nfs_server_list_show NFS: Get rid of the nfs_rdata_mempool NFS: Don't rely on PageError in nfs_readpage_release_partial NFS: Get rid of unnecessary calls to ClearPageError() in read code NFS: Get rid of nfs_restart_rpc() NFS: Get rid of the unused nfs_write_data->flags field NFS: Get rid of the unused nfs_read_data->flags field NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup() NFS: Use the inode->i_version to cache NFSv4 change attribute information SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr SUNRPC: Fix rpc_sockaddr2uaddr nfs/super.c: local functions should be static pnfsblock: fix writeback deadlock pnfsblock: fix NULL pointer dereference pnfs: recoalesce when ld read pagelist fails pnfs: recoalesce when ld write pagelist fails pnfs: make _set_lo_fail generic pnfsblock: add missing rpc_put_mount and path_put SUNRPC/NFS: make rpc pipe upcall generic ...
2011-10-25Merge branch 'for-3.2' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
* 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits) nfs41: implement DESTROY_CLIENTID operation nfsd4: typo logical vs bitwise negate for want_mask nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED nfsd4: seq->status_flags may be used unitialized nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid nfsd4: implement new 4.1 open reclaim types nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround nfsd4: warn on open failure after create nfsd4: preallocate open stateid in process_open1() nfsd4: do idr preallocation with stateid allocation nfsd4: preallocate nfs4_file in process_open1() nfsd4: clean up open owners on OPEN failure nfsd4: simplify process_open1 logic nfsd4: make is_open_owner boolean nfsd4: centralize renew_client() calls nfsd4: typo logical vs bitwise negate nfs: fix bug about IPv6 address scope checking nfsd4: more robust ignoring of WANT bits in OPEN nfsd4: move name-length checks to xdr nfsd4: move access/deny validity checks to xdr code ...
2011-10-25sysfs: Remove support for tagged directories with untagged members (again)Eric W. Biederman
In commit 8a9ea3237e7e ("Merge git://.../davem/net-next") where my sysfs changes from the net tree merged with the sysfs rbtree changes from Mickulas Patocka the conflict resolution failed to preserve the simplified property that was the point of my changes. That is sysfs_find_dirent can now say something is a match if and only s_name and s_ns match what we are looking for, and sysfs_readdir can simply return all of the directory entries where s_ns matches the directory that we should be returning. Now that we are back to exact matches we can tweak sysfs_find_dirent and the name rb_tree to order sysfs_dirents by s_ns s_name and remove the second loop in sysfs_find_dirent. However that change seems a bit much for a conflict resolution so it can come later. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits) dp83640: free packet queues on remove dp83640: use proper function to free transmit time stamping packets ipv6: Do not use routes from locally generated RAs |PATCH net-next] tg3: add tx_dropped counter be2net: don't create multiple RX/TX rings in multi channel mode be2net: don't create multiple TXQs in BE2 be2net: refactor VF setup/teardown code into be_vf_setup/clear() be2net: add vlan/rx-mode/flow-control config to be_setup() net_sched: cls_flow: use skb_header_pointer() ipv4: avoid useless call of the function check_peer_pmtu TCP: remove TCP_DEBUG net: Fix driver name for mdio-gpio.c ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces ipv4: fix ipsec forward performance regression jme: fix irq storm after suspend/resume route: fix ICMP redirect validation net: hold sock reference while processing tx timestamps tcp: md5: add more const attributes Add ethtool -g support to virtio_net ... Fix up conflicts in: - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead. - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
2011-10-25Merge branch 'driver-core-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits) mm: memory hotplug: Check if pages are correctly reserved on a per-section basis Revert "memory hotplug: Correct page reservation checking" Update email address for stable patch submission dynamic_debug: fix undefined reference to `__netdev_printk' dynamic_debug: use a single printk() to emit messages dynamic_debug: remove num_enabled accounting dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions uio: Support physical addresses >32 bits on 32-bit systems sysfs: add unsigned long cast to prevent compile warning drivers: base: print rejected matches with DEBUG_DRIVER memory hotplug: Correct page reservation checking memory hotplug: Refuse to add unaligned memory regions remove the messy code file Documentation/zh_CN/SubmitChecklist ARM: mxc: convert device creation to use platform_device_register_full new helper to create platform devices with dma mask docs/driver-model: Update device class docs docs/driver-model: Document device.groups kobj_uevent: Ignore if some listeners cannot handle message dynamic_debug: make netif_dbg() call __netdev_printk() dynamic_debug: make netdev_dbg() call __netdev_printk() ...
2011-10-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (59 commits) MAINTAINERS: linux-m32r is moderated for non-subscribers linux@lists.openrisc.net is moderated for non-subscribers Drop default from "DM365 codec select" choice parisc: Kconfig: cleanup Kernel page size default Kconfig: remove redundant CONFIG_ prefix on two symbols cris: remove arch/cris/arch-v32/lib/nand_init.S microblaze: add missing CONFIG_ prefixes h8300: drop puzzling Kconfig dependencies MAINTAINERS: microblaze-uclinux@itee.uq.edu.au is moderated for non-subscribers tty: drop superfluous dependency in Kconfig ARM: mxc: fix Kconfig typo 'i.MX51' Fix file references in Kconfig files aic7xxx: fix Kconfig references to READMEs Fix file references in drivers/ide/ thinkpad_acpi: Fix printk typo 'bluestooth' bcmring: drop commented out line in Kconfig btmrvl_sdio: fix typo 'btmrvl_sdio_sd6888' doc: raw1394: Trivial typo fix CIFS: Don't free volume_info->UNC until we are entirely done with it. treewide: Correct spelling of successfully in comments ...
2011-10-25Merge branch 'next' of git://selinuxproject.org/~jmorris/linux-securityLinus Torvalds
* 'next' of git://selinuxproject.org/~jmorris/linux-security: (95 commits) TOMOYO: Fix incomplete read after seek. Smack: allow to access /smack/access as normal user TOMOYO: Fix unused kernel config option. Smack: fix: invalid length set for the result of /smack/access Smack: compilation fix Smack: fix for /smack/access output, use string instead of byte Smack: domain transition protections (v3) Smack: Provide information for UDS getsockopt(SO_PEERCRED) Smack: Clean up comments Smack: Repair processing of fcntl Smack: Rule list lookup performance Smack: check permissions from user space (v2) TOMOYO: Fix quota and garbage collector. TOMOYO: Remove redundant tasklist_lock. TOMOYO: Fix domain transition failure warning. TOMOYO: Remove tomoyo_policy_memory_lock spinlock. TOMOYO: Simplify garbage collector. TOMOYO: Fix make namespacecheck warnings. target: check hex2bin result encrypted-keys: check hex2bin result ...
2011-10-25ore: Enable RAID5 mountsBoaz Harrosh
Now that we support raid5 Enable it at mount. Raid6 will come next raid4 is not demanded for so it will probably not be enabled. (Until some one wants it) NOTE: That mkfs.exofs had support for raid5/6 since long time ago. (Making an empty raidX FS is just as easy as raid0 ;-} ) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-25exofs: Support for RAID5 read-4-write interface.Boaz Harrosh
The ore need suplied a r4w_get_page/r4w_put_page API from Filesystem so it can get cache pages to read-into when writing parial stripes. Also I commented out and NULLed the .writepage (singular) vector. Because it gives terrible write pattern to raid and is apparently not needed. Even in OOM conditions the system copes (even better) with out it. TODO: How to specify to write_cache_pages() to start or include a certain page? Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2011-10-25ore: RAID5 WriteBoaz Harrosh
This is finally the RAID5 Write support. The bigger part of this patch is not the XOR engine itself, But the read4write logic, which is a complete mini prepare_for_striping reading engine that can read scattered pages of a stripe into cache so it can be used for XOR calculation. That is, if the write was not stripe aligned. The main algorithm behind the XOR engine is the 2 dimensional array: struct __stripe_pages_2d. A drawing might save 1000 words --- __stripe_pages_2d | n = pages_in_stripe_unit; w = group_width - parity; | pages array presented to the XOR lib | | V | __1_page_stripe[0].pages --> [c0][c1]..[cw][c_par] <---| | | __1_page_stripe[1].pages --> [c0][c1]..[cw][c_par] <--- | ... | ... | __1_page_stripe[n].pages --> [c0][c1]..[cw][c_par] ^ | data added columns first then row --- The pages are put on this array columns first. .i.e: p0-of-c0, p1-of-c0, ... pn-of-c0, p0-of-c1, ... So we are doing a corner turn of the pages. Note that pages will zigzag down and left. but are put sequentially in growing order. So when the time comes to XOR the stripe, only the beginning and end of the array need be checked. We scan the array and any NULL spot will be field by pages-to-be-read. The FS that wants to support RAID5 needs to supply an operations-vector that searches a given page in cache, and specifies if the page is uptodate or need reading. All these pages to be read are put on a slave ore_io_state and synchronously read. All the pages of a stripe are read in one IO, using the scatter gather mechanism. In write we constrain our IO to only be incomplete on a single stripe. Meaning either the complete IO is within a single stripe so we might have pages to read from both beginning or end of the strip. Or we have some reading to do at beginning but end at strip boundary. The left over pages are pushed to the next IO by the API already established by previous work, where an IO offset/length combination presented to the ORE might get the length truncated and the user must re-submit the leftover pages. (Both exofs and NFS support this) But any ORE user should make it's best effort to align it's IO before hand and avoid complications. A cached ore_layout->stripe_size member can be used for that calculation. (NOTE: that ORE demands that stripe_size may not be bigger then 32bit) What else? Well read it and tell me. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>