path: root/fs
Age | Commit message | Author
2010-08-10 | vfs: fix warning: 'dirent' is used uninitialized in this function | Kevin Winchester
Using: gcc (GCC) 4.5.0 20100610 (prerelease)
The following warnings appear:
fs/readdir.c: In function `filldir64':
fs/readdir.c:240:15: warning: `dirent' is used uninitialized in this function
fs/readdir.c: In function `filldir':
fs/readdir.c:155:15: warning: `dirent' is used uninitialized in this function
fs/compat.c: In function `compat_filldir64':
fs/compat.c:1071:11: warning: `dirent' is used uninitialized in this function
fs/compat.c: In function `compat_filldir':
fs/compat.c:984:15: warning: `dirent' is used uninitialized in this function
The warnings are related to the use of the NAME_OFFSET() macro. Luckily, it appears as though the standard offsetof() macro is what is being implemented by NAME_OFFSET(), thus we can fix the warning and use a more standard code construct at the same time.
Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
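The equivalence this commit relies on can be illustrated with a small user-space sketch. The struct `demo_dirent` and the macro `DEMO_NAME_OFFSET` are hypothetical stand-ins, not the kernel's definitions; the sketch only assumes that the old macro computed the offset by subtracting pointers, which is why it needed a pointer expression in the first place.

```c
#include <stddef.h>

/* Hypothetical directory-entry layout, for illustration only. */
struct demo_dirent {
    unsigned long d_ino;
    unsigned long d_off;
    unsigned short d_reclen;
    char d_name[1];
};

/* Pointer-subtraction style: the argument must be a pointer expression,
 * so a compiler may complain if that pointer was never assigned. */
#define DEMO_NAME_OFFSET(de) ((int)((de)->d_name - (char *)(de)))

/* Offset of d_name computed from a pointer to a real object. */
int name_offset_via_pointer(void)
{
    struct demo_dirent entry;
    struct demo_dirent *de = &entry;

    return DEMO_NAME_OFFSET(de);
}

/* The same offset from the standard macro; no object or pointer needed. */
int name_offset_via_offsetof(void)
{
    return (int)offsetof(struct demo_dirent, d_name);
}
```

Both functions return the same value; the offsetof() form works purely on the type, so there is no variable for the compiler to flag.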
2010-08-10 | mm: avoid resetting wb_start after each writeback round | Jan Kara
WB_SYNC_NONE writeback is done in rounds of 1024 pages so that we don't write out some huge inode for too long while starving writeout of other inodes. To avoid livelocks, we record time we started writeback in wbc->wb_start and do not write out inodes which were dirtied after this time. But currently, writeback_inodes_wb() resets wb_start each time it is called thus effectively invalidating this logic and making any WB_SYNC_NONE writeback prone to livelocks. This patch makes sure wb_start is set only once when we start writeback. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
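A minimal sketch of the idea, under stated assumptions: `struct demo_wbc` and the functions below are hypothetical, a `wb_start` of 0 is taken to mean "writeback has not started yet", and timestamp wraparound is ignored. It is not the kernel code, only an illustration of why recording the start time once keeps the cutoff stable across rounds.

```c
#include <stdbool.h>

/* Hypothetical, simplified writeback control state. */
struct demo_wbc {
    unsigned long wb_start; /* 0 means "not started yet" in this sketch */
};

/* Problematic pattern: every round moves the cutoff forward. */
void demo_round_resetting(struct demo_wbc *wbc, unsigned long now)
{
    wbc->wb_start = now;
}

/* Pattern the commit message describes: record the start time only once. */
void demo_round_set_once(struct demo_wbc *wbc, unsigned long now)
{
    if (!wbc->wb_start)
        wbc->wb_start = now;
}

/* Inodes dirtied after writeback started are skipped. */
bool demo_should_write(const struct demo_wbc *wbc, unsigned long dirtied_when)
{
    return dirtied_when <= wbc->wb_start;
}
```

With the resetting variant, an inode dirtied between two rounds becomes eligible again in the next round, so a task that keeps dirtying data can keep writeback busy indefinitely.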
2010-08-10 | oom: deprecate oom_adj tunable | David Rientjes
/proc/pid/oom_adj is now deprecated so that it may eventually be removed. The target date for removal is August 2012. A warning will be printed to the kernel log if a task attempts to use this interface. Future warnings will be suppressed until the kernel is rebooted to prevent spamming the kernel log. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-10 | oom: badness heuristic rewrite | David Rientjes
This is a complete rewrite of the oom killer's badness() heuristic, which is used to determine which task to kill in oom conditions. The goal is to make it as simple and predictable as possible so the results are better understood and we end up killing the task which will lead to the most memory freeing while still respecting the fine-tuning from userspace. Instead of basing the heuristic on mm->total_vm for each task, the task's rss and swap space are used. This is a better indication of the amount of memory that will be freeable if the oom killed task is chosen and subsequently exits. This helps specifically in cases where KDE or GNOME is chosen for oom kill on desktop systems instead of a memory hogging task. The baseline for the heuristic is a proportion of memory that each task is currently using in memory plus swap compared to the amount of "allowable" memory. "Allowable," in this sense, means the system-wide resources for unconstrained oom conditions, the set of mempolicy nodes, the mems attached to current's cpuset, or a memory controller's limit. The proportion is given on a scale of 0 (never kill) to 1000 (always kill), roughly meaning that if a task has a badness() score of 500, the task consumes approximately 50% of allowable memory resident in RAM or in swap space. The proportion is always relative to the amount of "allowable" memory and not the total amount of RAM systemwide so that mempolicies and cpusets may operate in isolation; they shall not need to know the true size of the machine on which they are running if they are bound to a specific set of nodes or mems, respectively. Root tasks are given 3% extra memory just like __vm_enough_memory() provides in LSMs. In the event of two tasks consuming similar amounts of memory, it is generally better to save root's task. Because of the change in the badness() heuristic's baseline, it is also necessary to introduce a new user interface to tune it. 
It's not possible to redefine the meaning of /proc/pid/oom_adj with a new scale since the ABI cannot be changed for backward compatibility. Instead, a new tunable, /proc/pid/oom_score_adj, is added that ranges from -1000 to +1000. It may be used to polarize the heuristic such that certain tasks are never considered for oom kill while others may always be considered. The value is added directly into the badness() score so a value of -500, for example, means to discount 50% of its memory consumption in comparison to other tasks either on the system, bound to the mempolicy, in the cpuset, or sharing the same memory controller. /proc/pid/oom_adj is changed so that its meaning is rescaled into the units used by /proc/pid/oom_score_adj, and vice versa. Changing one of these per-task tunables will rescale the value of the other to an equivalent meaning. Although /proc/pid/oom_adj was originally defined as a bitshift on the badness score, it now shares the same linear growth as /proc/pid/oom_score_adj but with different granularity. This is required so the ABI is not broken with userspace applications and allows oom_adj to be deprecated for future removal. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Nick Piggin <npiggin@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
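The arithmetic described in this commit message can be sketched roughly as follows. This is a user-space illustration, not the kernel's badness() function: `demo_badness` and its parameters are hypothetical, the `is_root` flag stands in for whatever privilege check the kernel performs, and the 3% root bonus is modelled as 30 points on the 0..1000 scale.

```c
/* Hypothetical sketch of the proportional score described above.
 * All page counts are in the same unit; allowable_pages is the amount of
 * "allowable" memory for the oom context (system, cpuset, mempolicy, ...). */
long demo_badness(unsigned long rss_pages, unsigned long swap_pages,
                  unsigned long allowable_pages, int is_root,
                  int oom_score_adj)
{
    long points;

    if (allowable_pages == 0)
        return 0; /* avoid dividing by zero in this sketch */

    /* Proportion of allowable memory in use, on a 0..1000 scale. */
    points = (long)((rss_pages + swap_pages) * 1000 / allowable_pages);

    /* Root tasks get 3% extra memory, i.e. 30 points off. */
    if (is_root)
        points -= 30;

    /* The userspace tunable (-1000..+1000) is added directly. */
    points += oom_score_adj;

    if (points < 0)
        return 0;
    return points > 1000 ? 1000 : points;
}
```

For example, a task using half of the allowable memory scores 500; with an oom_score_adj of -500 the same task scores 0, which matches the "discount 50% of its memory consumption" reading in the text.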
2010-08-10 | oom: move badness() declaration into oom.h | Andrew Morton
Cc: Minchan Kim <minchan.kim@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-10 | oom: /proc/<pid>/oom_score treat kernel thread honestly | KOSAKI Motohiro
If a kernel thread is using use_mm(), badness() returns a positive value. This is not a big issue because callers take care of it correctly. But there is one exception: /proc/<pid>/oom_score calls badness() directly and doesn't check whether the task is a regular process. Another example: /proc/1/oom_score returns a non-zero value even though init is unkillable. This incorrectness makes administration a little confusing. This patch fixes it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-08 | Merge branch 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing | Linus Torvalds
* 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
  do_coredump: Do not take BKL
  init: Remove the BKL from startup code
2010-08-07 | Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits)
  nfsd4: fix file open accounting for RDWR opens
  nfsd: don't allow setting maxblksize after svc created
  nfsd: initialize nfsd versions before creating svc
  net: sunrpc: removed duplicated #include
  nfsd41: Fix a crash when a callback is retried
  nfsd: fix startup/shutdown order bug
  nfsd: minor nfsd read api cleanup
  gcc-4.6: nfsd: fix initialized but not read warnings
  nfsd4: share file descriptors between stateid's
  nfsd4: fix openmode checking on IO using lock stateid
  nfsd4: miscellaneous process_open2 cleanup
  nfsd4: don't pretend to support write delegations
  nfsd: bypass readahead cache when have struct file
  nfsd: minor nfsd_svc() cleanup
  nfsd: move more into nfsd_startup()
  nfsd: just keep single lockd reference for nfsd
  nfsd: clean up nfsd_create_serv error handling
  nfsd: fix error handling in __write_ports_addxprt
  nfsd: fix error handling when starting nfsd with rpcbind down
  nfsd4: fix v4 state shutdown error paths
  ...
2010-08-07 | AFS: Fix the module init error handling | David Howells
Fix the module init error handling. There are a bunch of goto labels for aborting the init procedure at different points and just undoing what needs undoing - they aren't all in the right places, however.
This can lead to an oops like the following:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff81042a31>] destroy_workqueue+0x17/0xc0
...
Modules linked in: kafs(+) dns_resolver rxkad af_rxrpc fscache
Pid: 2171, comm: insmod Not tainted 2.6.35-cachefs+ #319 DG965RY/
...
Process insmod (pid: 2171, threadinfo ffff88003ca6a000, task ffff88003dcc3050)
...
Call Trace:
 [<ffffffffa0055994>] afs_callback_update_kill+0x10/0x12 [kafs]
 [<ffffffffa007d1c5>] afs_init+0x190/0x1ce [kafs]
 [<ffffffffa007d035>] ? afs_init+0x0/0x1ce [kafs]
 [<ffffffff810001ef>] do_one_initcall+0x59/0x14e
 [<ffffffff8105f7ee>] sys_init_module+0x9c/0x1de
 [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-07 | Merge branch 'nfs-for-2.6.36' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 | Linus Torvalds
* 'nfs-for-2.6.36' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (42 commits)
  NFS: NFSv4.1 is no longer a "developer only" feature
  NFS: NFS_V4 is no longer an EXPERIMENTAL feature
  NFS: Fix /proc/mount for legacy binary interface
  NFS: Fix the locking in nfs4_callback_getattr
  SUNRPC: Defer deleting the security context until gss_do_free_ctx()
  SUNRPC: prevent task_cleanup running on freed xprt
  SUNRPC: Reduce asynchronous RPC task stack usage
  SUNRPC: Move the bound cred to struct rpc_rqst
  SUNRPC: Clean up of rpc_bindcred()
  SUNRPC: Move remaining RPC client related task initialisation into clnt.c
  SUNRPC: Ensure that rpc_exit() always wakes up a sleeping task
  SUNRPC: Make the credential cache hashtable size configurable
  SUNRPC: Store the hashtable size in struct rpc_cred_cache
  NFS: Ensure the AUTH_UNIX credcache is allocated dynamically
  NFS: Fix the NFS users of rpc_restart_call()
  SUNRPC: The function rpc_restart_call() should return success/failure
  NFSv4: Get rid of the bogus RPC_ASSASSINATED(task) checks
  NFSv4: Clean up the process of renewing the NFSv4 lease
  NFSv4.1: Handle NFS4ERR_DELAY on SEQUENCE correctly
  NFS: nfs_rename() should not have to flush out writebacks
  ...
2010-08-07 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: add retrieve request
  fuse: add store request
  fuse: don't use atomic kmap
2010-08-07 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (45 commits)
  nilfs2: reject filesystem with unsupported block size
  nilfs2: avoid rec_len overflow with 64KB block size
  nilfs2: simplify nilfs_get_page function
  nilfs2: reject incompatible filesystem
  nilfs2: add feature set fields to super block
  nilfs2: clarify byte offset in super block format
  nilfs2: apply read-ahead for nilfs_btree_lookup_contig
  nilfs2: introduce check flag to btree node buffer
  nilfs2: add btree get block function with readahead option
  nilfs2: add read ahead mode to nilfs_btnode_submit_block
  nilfs2: fix buffer head leak in nilfs_btnode_submit_block
  nilfs2: eliminate inline keywords in btree implementation
  nilfs2: get maximum number of child nodes from bmap object
  nilfs2: reduce repetitive calculation of max number of child nodes
  nilfs2: optimize calculation of min/max number of btree node children
  nilfs2: remove redundant pointer checks in bmap lookup functions
  nilfs2: get rid of nilfs_bmap_union
  nilfs2: unify bmap set_target_v operations
  nilfs2: get rid of nilfs_btree uses
  nilfs2: get rid of nilfs_direct uses
  ...
2010-08-07 | Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits)
  ext4: Adding error check after calling ext4_mb_regular_allocator()
  ext4: Fix dirtying of journalled buffers in data=journal mode
  ext4: re-inline ext4_rec_len_(to|from)_disk functions
  jbd2: Remove t_handle_lock from start_this_handle()
  jbd2: Change j_state_lock to be a rwlock_t
  jbd2: Use atomic variables to avoid taking t_handle_lock in jbd2_journal_stop
  ext4: Add mount options in superblock
  ext4: force block allocation on quota_off
  ext4: fix freeze deadlock under IO
  ext4: drop inode from orphan list if ext4_delete_inode() fails
  ext4: check to make make sure bd_dev is set before dereferencing it
  jbd2: Make barrier messages less scary
  ext4: don't print scary messages for allocation failures post-abort
  ext4: fix EFBIG edge case when writing to large non-extent file
  ext4: fix ext4_get_blocks references
  ext4: Always journal quota file modifications
  ext4: Fix potential memory leak in ext4_fill_super
  ext4: Don't error out the fs if the user tries to make a file too big
  ext4: allocate stripe-multiple IOs on stripe boundaries
  ext4: move aio completion after unwritten extent conversion
  ...
Fix up conflicts in fs/ext4/inode.c as per Ted.
Fix up xfs conflicts as per earlier xfs merge.
2010-08-07 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 | Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
  ext3: Fix dirtying of journalled buffers in data=journal mode
  ext3: default to ordered mode
  quota: Use mark_inode_dirty_sync instead of mark_inode_dirty
  quota: Change quota error message to print out disk and function name
  MAINTAINERS: Update entries of ext2 and ext3
  MAINTAINERS: Update address of Andreas Dilger
  ext3: Avoid filesystem corruption after a crash under heavy delete load
  ext3: remove vestiges of nobh support
  ext3: Fix set but unused variables
  quota: clean up quota active checks
  quota: Clean up the namespace in dqblk_xfs.h
  quota: check quota reservation on remove_dquot_ref
2010-08-07 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  fs/dlm: Drop unnecessary null test
  dlm: use genl_register_family_with_ops()
2010-08-07 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 | Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
  udf: super.c Fix warning: variable 'sbi' set but not used
  udf: remove duplicated #include
2010-08-07 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [DNS RESOLVER] Minor typo correction
  DNS: Fixes for the DNS query module
  cifs: Include linux/err.h for IS_ERR and PTR_ERR
  DNS: Make AFS go to the DNS for AFSDB records for unknown cells
  DNS: Separate out CIFS DNS Resolver code
  cifs: account for new creduid=0x%x parameter in spnego upcall string
  cifs: reduce false positives with inode aliasing serverino autodisable
  CIFS: Make cifs_convert_address() take a const src pointer and a length
  cifs: show features compiled in as part of DebugData
  cifs: update README
Fix up trivial conflicts in fs/cifs/cifsfs.c due to workqueue changes
2010-08-07 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits)
  workqueue: mark init_workqueues() as early_initcall()
  workqueue: explain for_each_*cwq_cpu() iterators
  fscache: fix build on !CONFIG_SYSCTL
  slow-work: kill it
  gfs2: use workqueue instead of slow-work
  drm: use workqueue instead of slow-work
  cifs: use workqueue instead of slow-work
  fscache: drop references to slow-work
  fscache: convert operation to use workqueue instead of slow-work
  fscache: convert object to use workqueue instead of slow-work
  workqueue: fix how cpu number is stored in work->data
  workqueue: fix mayday_mask handling on UP
  workqueue: fix build problem on !CONFIG_SMP
  workqueue: fix locking in retry path of maybe_create_worker()
  async: use workqueue for worker pool
  workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead
  workqueue: implement unbound workqueue
  workqueue: prepare for WQ_UNBOUND implementation
  libata: take advantage of cmwq and remove concurrency limitations
  workqueue: fix worker management invocation without pending works
  ...
Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c
2010-08-07 | nfsd4: fix file open accounting for RDWR opens | J. Bruce Fields
Commit f9d7562fdb9dc0ada3a7aba5dbbe9d965e2a105d "nfsd4: share file descriptors between stateid's" didn't correctly account for O_RDWR opens. Symptoms include leaked files, resulting in failures to unmount and/or warnings about orphaned inodes on reboot. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 | nfsd: don't allow setting maxblksize after svc created | J. Bruce Fields
It's harmless to set this after the server is created, but also ineffective, since the value is only used at the time of svc_create_pooled(). So fail the attempt, in keeping with the pattern set by write_versions, write_{lease,grace}time and write_recoverydir. (This could break userspace that tried to write to nfsd/max_block_size between setting up sockets and starting the server. However, such code wouldn't have worked anyway, and I don't know of any examples--rpc.nfsd in nfs-utils, probably the only user of the interface, doesn't do that.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 | nfsd: initialize nfsd versions before creating svc | J. Bruce Fields
Commit 59db4a0c102e0de226a3395dbf25ea51bf845937 "nfsd: move more into nfsd_startup()" inadvertently moved nfsd_versions after nfsd_create_svc(). On older distributions using an rpc.nfsd that does not explicitly set the list of nfsd versions, this results in svc_create_pooled() being called with an empty versions array. The resulting incomplete initialization leads to a NULL dereference in svc_process_common() the first time a client accesses the server. Move nfsd_reset_versions() back before the svc_create_pooled(); this time, put it closer to the svc_create_pooled() call, to make this mistake more difficult in the future. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 | nfsd41: Fix a crash when a callback is retried | Boaz Harrosh
If a callback is retried at nfsd4_cb_recall_done() due to some error, the returned rpc reply crashes here:
@@ -514,6 +514,7 @@ decode_cb_sequence(struct xdr_stream *xdr, struct nfsd4_cb_sequence *res,
 	u32 dummy;
 	__be32 *p;
+	BUG_ON(!res);
 	if (res->cbs_minorversion == 0)
 		return 0;
[BUG_ON added for demonstration]
This is because nfsd4_cb_done_sequence() has NULLed out the task->tk_msg.rpc_resp pointer. Also, the rpc would eventually use the new slot without making sure it is free by calling nfsd41_cb_setup_sequence().
This problem was introduced by a 4.1 protocol addition patch:
[0421b5c5] nfsd41: Backchannel: Implement cb_recall over NFSv4.1
which overlooked the possibility of RPC callback retries. For the non-4.1 case, redoing the _prepare is harmless.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 | nfsd: fix startup/shutdown order bug | J. Bruce Fields
We must create the server before we can call init_socks or check the number of threads. Symptoms were a NULL pointer dereference in nfsd_svc(). Problem identified by Jeff Layton. Also fix a minor cleanup-on-error case in nfsd_startup(). Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-08-06 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (28 commits)
  driver core: device_rename's new_name can be const
  sysfs: Remove owner field from sysfs struct attribute
  powerpc/pci: Remove owner field from attribute initialization in PCI bridge init
  regulator: Remove owner field from attribute initialization in regulator core driver
  leds: Remove owner field from attribute initialization in bd2802 driver
  scsi: Remove owner field from attribute initialization in ARCMSR driver
  scsi: Remove owner field from attribute initialization in LPFC driver
  cgroupfs: create /sys/fs/cgroup to mount cgroupfs on
  Driver core: Add BUS_NOTIFY_BIND_DRIVER
  driver core: fix memory leak on one error path in bus_register()
  debugfs: no longer needs to depend on SYSFS
  sysfs: Fix one more signature discrepancy between sysfs implementation and docs.
  sysfs: fix discrepancies between implementation and documentation
  dcdbas: remove a redundant smi_data_buf_free in dcdbas_exit
  dmi-id: fix a memory leak in dmi_id_init error path
  sysfs: sysfs_chmod_file's attr can be const
  firmware: Update hotplug script
  Driver core: move platform device creation helpers to .init.text (if MODULE=n)
  Driver core: reduce duplicated code for platform_device creation
  Driver core: use kmemdup in platform_device_add_resources
  ...
2010-08-06 | NFS: NFSv4.1 is no longer a "developer only" feature | Trond Myklebust
Mark it as 'experimental' instead, since in practice, NFSv4.1 should now be relatively stable. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-06 | NFS: NFS_V4 is no longer an EXPERIMENTAL feature | Trond Myklebust
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-06 | NFS: Fix /proc/mount for legacy binary interface | Bryan Schumaker
Add a flag so we know if we mounted the NFS server using the legacy binary interface. If we used the legacy interface, then we should not show the mountd options. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-06 | NFS: Fix the locking in nfs4_callback_getattr | Trond Myklebust
The delegation is protected by RCU now, so we need to replace the nfsi->rwsem protection with an rcu protected section. Reported-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-06 | Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...
Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 | Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip | Linus Torvalds
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "net: Make accesses to ->br_port safe for sparse RCU"
  mce: convert to rcu_dereference_index_check()
  net: Make accesses to ->br_port safe for sparse RCU
  vfs: add fs.h to define struct file
  lockdep: Add an in_workqueue_context() lockdep-based test function
  rcu: add __rcu API for later sparse checking
  rcu: add an rcu_dereference_index_check()
  tree/tiny rcu: Add debug RCU head objects
  mm: remove all rcu head initializations
  fs: remove all rcu head initializations, except on_stack initializations
  powerpc: remove all rcu head initializations
2010-08-06 | Fix init ordering of /dev/console vs callers of modprobe | David Howells
Make /dev/console get initialised before any initialisation routine that invokes modprobe because if modprobe fails, it's going to want to open /dev/console, presumably to write an error message to.
The problem with that is that if the /dev/console driver is not yet initialised, the chardev handler will call request_module() to invoke modprobe, which will fail, because we never compile /dev/console as a module. This will lead to a modprobe loop, showing the following in the kernel log:
request_module: runaway loop modprobe char-major-5-1
request_module: runaway loop modprobe char-major-5-1
request_module: runaway loop modprobe char-major-5-1
request_module: runaway loop modprobe char-major-5-1
request_module: runaway loop modprobe char-major-5-1
This can happen, for example, when the built in md5 module can't find the built in cryptomgr module (because the latter fails to initialise). The md5 module comes before the call to tty_init(), presumably because 'crypto' comes before 'drivers' alphabetically.
Fix this by calling tty_init() from chrdev_init().
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-05 | sysfs: sysfs_chmod_file's attr can be const | Jean Delvare
sysfs_chmod_file doesn't change the attribute it operates on, so this attribute can be marked const. Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-05 | ext4: Adding error check after calling ext4_mb_regular_allocator() | Aditya Kali
If the bitmap block on disk is bad, ext4_mb_load_buddy() returns an error. This error is returned to the caller, ext4_mb_regular_allocator() and then to ext4_mb_new_blocks(). But ext4_mb_new_blocks() did not check for the return value of ext4_mb_regular_allocator() and would repeatedly try to load the bitmap block. The fix simply catches the return value and exits out of the 'repeat' loop after cleanup. We also take the opportunity to clean up the error handling in ext4_mb_new_blocks(). Google-Bug-Id: 2853530 Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
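The control-flow change described above can be sketched in user space. Everything here is hypothetical (`struct demo_alloc`, `demo_regular_allocator`, `demo_new_blocks`, the retry limit of 8); it only illustrates the pattern of checking the allocator's return value and leaving the `repeat` loop instead of retrying on a hard error.

```c
#include <errno.h>

/* Hypothetical allocation context for the sketch. */
struct demo_alloc {
    int attempts;    /* how many times the allocator has been called */
    int fail_with;   /* non-zero: allocator reports this error each time */
    int found_after; /* allocator finds blocks on this attempt number */
};

/* Stand-in allocator: returns 0 or a negative errno, sets *found. */
static int demo_regular_allocator(struct demo_alloc *ac, int *found)
{
    ac->attempts++;
    if (ac->fail_with)
        return ac->fail_with;
    *found = (ac->attempts >= ac->found_after);
    return 0;
}

int demo_new_blocks(struct demo_alloc *ac)
{
    int found = 0;
    int err;

repeat:
    err = demo_regular_allocator(ac, &found);
    if (err)
        goto out; /* the check the commit adds: stop on a hard error */
    if (!found && ac->attempts < 8)
        goto repeat; /* nothing found yet, try again */
    if (!found)
        err = -ENOSPC;
out:
    /* common cleanup would go here */
    return err;
}
```

Without the `if (err) goto out;` line, a persistent error such as a bad bitmap block would be indistinguishable from "nothing found yet", and the loop would keep calling the allocator.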
2010-08-05 | aio: fix wrong subsystem comments | Satoru Takeuchi
- sys_io_destroy(): actually return -EINVAL if the context pointed to is invalid
- sys_io_getevents(): An argument specifying timeout is not `when', but `timeout'.
- sys_io_getevents(): Should describe what is returned if this syscall succeeds.
Index: linux-2.6.33-rc4/fs/aio.c
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-05 | ext3: Fix dirtying of journalled buffers in data=journal mode | Jan Kara
In data=journal mode, we still use block_write_begin() to prepare the page for writing. This function can occasionally mark a buffer dirty, which violates journalling assumptions - when a buffer is part of a transaction, it should not be dirty, and a buffer can already be part of a forget list of some transaction when block_write_begin() gets called. This violation of journalling assumptions then results in "JBD: Spotted dirty metadata buffer..." warnings. In fact, temporarily dirtying the buffer while the page is still locked does not really cause problems for the journalling because we won't write the buffer until the page gets unlocked. So we just have to make sure to clear dirty bits before unlocking the page. Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>
2010-08-05 | fs/dlm: Drop unnecessary null test | Julia Lawall
hlist_for_each_entry binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous.
The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
iterator I;
expression x,E,E1,E2;
statement S,S1,S2;
@@
I(x,...) { <...
- (x != NULL) &&
  E
  ...> }
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David Teigland <teigland@redhat.com>
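The reasoning can be illustrated with a simplified list iterator. The macro `demo_for_each` and `struct demo_node` below are hypothetical and much simpler than the kernel's hlist_for_each_entry; the sketch only assumes the property the commit message states, namely that the loop body never runs with a NULL cursor.

```c
#include <stddef.h>

/* Hypothetical singly linked node, not the kernel's hlist types. */
struct demo_node {
    int value;
    struct demo_node *next;
};

/* The loop condition already guarantees pos is non-NULL in the body. */
#define demo_for_each(pos, head) \
    for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

/* Before: the extra NULL test can never be false inside the loop. */
int demo_count_positive_checked(struct demo_node *head)
{
    struct demo_node *n;
    int count = 0;

    demo_for_each(n, head) {
        if (n != NULL && n->value > 0)
            count++;
    }
    return count;
}

/* After: same behaviour with the redundant test dropped. */
int demo_count_positive(struct demo_node *head)
{
    struct demo_node *n;
    int count = 0;

    demo_for_each(n, head) {
        if (n->value > 0)
            count++;
    }
    return count;
}
```

Both functions return the same result for any list, including an empty one, which is the sense in which the test is superfluous.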
2010-08-05 | dlm: use genl_register_family_with_ops() | Changli Gao
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com>
2010-08-05 | ext4: Fix dirtying of journalled buffers in data=journal mode | Jan Kara
In data=journal mode, we still use block_write_begin() to prepare the page for writing. This function can occasionally mark a buffer dirty, which violates journalling assumptions - when a buffer is part of a transaction, it should not be dirty, and a buffer can already be part of a forget list of some transaction when block_write_begin() gets called. This violation of journalling assumptions then results in "JBD: Spotted dirty metadata buffer..." warnings. In fact, temporarily dirtying the buffer while the page is still locked does not really cause problems for the journalling because we won't write the buffer until the page gets unlocked. So we just have to make sure to clear dirty bits before unlocking the page. Signed-off-by: Jan Kara <jack@suse.cz>
2010-08-05 | DNS: Make AFS go to the DNS for AFSDB records for unknown cells | Wang Lei
Add DNS query support for AFS so that it can get the IP addresses of Volume Location servers from the DNS using an AFSDB record. This requires userspace support. /etc/request-key.conf must be configured to invoke a helper for dns_resolver type keys with a subtype of "afsdb:" in the description. Signed-off-by: Wang Lei <wang840925@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-08-05 | DNS: Separate out CIFS DNS Resolver code | Wang Lei
Separate out the DNS resolver key type from the CIFS filesystem into its own module so that it can be made available for general use, including the AFS filesystem module. This facility makes it possible for the kernel to upcall to userspace to have it issue DNS requests, package up the replies and present them to the kernel in a useful form. The kernel is then able to cache the DNS replies as keys can be retained in keyrings. Resolver keys are of type "dns_resolver" and have a case-insensitive description that is of the form "[<type>:]<domain_name>". The optional <type> indicates the particular DNS lookup and packaging that's required. The <domain_name> is the query to be made. If <type> isn't given, a basic hostname to IP address lookup is made, and the result is stored in the key in the form of a printable string consisting of a comma-separated list of IPv4 and IPv6 addresses. This key type is supported by userspace helpers driven from /sbin/request-key and configured through /etc/request-key.conf. The cifs.upcall utility is invoked for UNC path server name to IP address resolution. The CIFS functionality is encapsulated by the dns_resolve_unc_to_ip() function, which is used to resolve a UNC path to an IP address for CIFS filesystem. This part remains in the CIFS module for now. See the added Documentation/networking/dns_resolver.txt for more information. Signed-off-by: Wang Lei <wang840925@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
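The description format "[<type>:]<domain_name>" can be illustrated with a small parser. This is a user-space sketch with a hypothetical helper name, `demo_split_description`; it is not the kernel's key-type code, it does not handle the case-insensitivity mentioned above, and it assumes the first colon separates the optional type from the domain name.

```c
#include <stddef.h>
#include <string.h>

/* Split "[<type>:]<domain_name>" into its two parts.
 * Copies the type into type_buf (empty string if no type is given) and
 * returns a pointer to the domain name inside desc, or NULL if type_buf
 * is too small to hold the type. */
const char *demo_split_description(const char *desc, char *type_buf,
                                   size_t type_buf_len)
{
    const char *colon = strchr(desc, ':');
    size_t len;

    if (type_buf_len == 0)
        return NULL;

    if (colon == NULL) {
        /* No type: a plain hostname lookup in the scheme described. */
        type_buf[0] = '\0';
        return desc;
    }

    len = (size_t)(colon - desc);
    if (len >= type_buf_len)
        return NULL;

    memcpy(type_buf, desc, len);
    type_buf[len] = '\0';
    return colon + 1;
}
```

With this helper, a description such as "afsdb:example.org" yields the type "afsdb" and the name "example.org", while "server.example.org" yields an empty type and the whole string as the name.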
2010-08-05cifs: account for new creduid=0x%x parameter in spnego upcall stringJeff Layton
The commit that added the creduid=0x%x parameter failed to increase the buffer allocation to account for it. Reported-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
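The class of bug described here (a format string gains a field but the buffer size does not) can be avoided in userspace by letting snprintf() measure the output first. The sketch below is only an illustration of that alternative technique, not the kernel fix, which per the commit text simply increases the allocation. `build_upcall_desc()` and every field in the format string other than creduid=0x%x are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical userspace sketch: size the buffer from the format string
 * itself, so adding a field such as creduid=0x%x cannot outgrow it.
 * Returns a malloc()ed string the caller must free(), or NULL on failure.
 */
static char *build_upcall_desc(const char *host, unsigned uid,
			       unsigned creduid)
{
	static const char fmt[] = "host=%s;uid=0x%x;creduid=0x%x";
	int need = snprintf(NULL, 0, fmt, host, uid, creduid);
	char *buf;

	if (need < 0)
		return NULL;

	buf = malloc((size_t)need + 1);	/* +1 for the terminating NUL */
	if (!buf)
		return NULL;

	snprintf(buf, (size_t)need + 1, fmt, host, uid, creduid);
	return buf;
}
```

Kernel code typically cannot take this exact route and instead adds the length of each new key and value to the computed size by hand, which is the step that was missed.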
2010-08-05cifs: reduce false positives with inode aliasing serverino autodisableJeff Layton
It turns out that not all directory inodes with dentries on the i_dentry list are unusable here. We only consider them unusable if they are still hashed or if they have a root dentry attached. Full disclosure -- this check is inherently racy. There's nothing that stops someone from slapping a new dentry onto this inode just after this check, or hashing an existing one that's already attached. So, this is really a "best effort" thing to work around misbehaving servers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-08-05CIFS: Make cifs_convert_address() take a const src pointer and a lengthDavid Howells
Make cifs_convert_address() take a const src pointer and a length so that all the strlen() calls in there can be cut out and to make it unnecessary to modify the src string. Also return the data length from dns_resolve_server_name_to_ip() so that a strlen() can be cut out of cifs_compose_mount_options() too. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
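The interface change (const source plus explicit length) can be sketched in userspace as follows. `convert_ipv4()` is a hypothetical analogue, not cifs_convert_address() itself; because inet_pton() needs a NUL-terminated string, the sketch copies the bounded span into a small local buffer rather than writing a terminator into the caller's string.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical userspace analogue: parse the first len bytes of src as an
 * IPv4 address. src is never modified and strlen() is never called on it.
 * Returns 1 on success, 0 otherwise.
 */
static int convert_ipv4(struct in_addr *dst, const char *src, size_t len)
{
	char tmp[INET_ADDRSTRLEN];

	if (len == 0 || len >= sizeof(tmp))
		return 0;

	memcpy(tmp, src, len);
	tmp[len] = '\0';
	return inet_pton(AF_INET, tmp, dst) == 1;
}
```

A caller that already knows where the server name ends inside a UNC path can pass that span directly, for example the first 9 bytes of "192.0.2.7\share".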
2010-08-05cifs: show features compiled in as part of DebugDataSuresh Jayaraman
Fixed the nit pointed out by Jeff. From: Suresh Jayaraman <sjayaraman@suse.de> Subject: [PATCH 1/2] cifs: show features compiled in as part of DebugData This patch adds the features that are compiled in to the CIFS debugging data as shown below:

    $ cat /proc/fs/cifs/DebugData
    Display Internal CIFS Data Structures for Debugging
    ---------------------------------------------------
    CIFS Version 1.64
    Features: dfs fscache posix spnego xattr
    Active VFS Requests: 0
    ...

This patch provides a definitive way to tell what features are currently enabled in the running kernel. This could also help debugging. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Cc: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-08-05cifs: update READMESuresh Jayaraman
Update the README file to reflect that now DebugData shows all the features enabled. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Cc: Jeff Layton <jlayton@redhat.com> -- fs/cifs/README | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) Signed-off-by: Steve French <sfrench@us.ibm.com>
2010-08-05ext4: re-inline ext4_rec_len_(to|from)_disk functionsEric Sandeen
commit 3d0518f4, "ext4: New rec_len encoding for very large blocksizes" made several changes to this path, but from a perf perspective, un-inlining ext4_rec_len_from_disk() seems most significant. This function is called from ext4_check_dir_entry(), which on a file-creation workload is called extremely often. I tested this with bonnie: # bonnie++ -u root -s 0 -f -x 200 -d /mnt/test -n 32 (this does 200 iterations) and got this for the file creations: ext4 stock: Average = 21206.8 files/s ext4 inlined: Average = 22346.7 files/s (+5%) Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
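For readers unfamiliar with the helpers being re-inlined, the following userspace sketch shows the general shape of a rec_len encoding for very large block sizes, written as static inline functions so the compiler can fold them into hot callers. It is a model under stated assumptions, not the ext4 source: the names, the `MAX_REC_LEN` constant, the host-endian `uint16_t` on-disk value and the use of assert() in place of a kernel BUG() are all choices made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_REC_LEN ((1u << 16) - 1)	/* assumed sentinel for a 64KiB record */

/* Decode a 16-bit on-disk record length (host byte order in this model). */
static inline unsigned rec_len_from_disk(uint16_t dlen, unsigned blocksize)
{
	unsigned len = dlen;

	if (len == MAX_REC_LEN || len == 0)
		return blocksize;	/* record spans the whole block */
	/* The low two bits carry bits 16..17 of the real length. */
	return (len & 65532) | ((len & 3) << 16);
}

/* Encode a record length; lengths are multiples of 4, at most 256KiB. */
static inline uint16_t rec_len_to_disk(unsigned len, unsigned blocksize)
{
	assert(len <= blocksize && blocksize <= (1u << 18) && (len & 3) == 0);

	if (len < 65536)
		return (uint16_t)len;
	if (len == blocksize)
		return blocksize == 65536 ? MAX_REC_LEN : 0;
	return (uint16_t)((len & 65532) | ((len >> 16) & 3));
}
```

Because record lengths are always multiples of four, the two low bits are free to hold the high bits, and the decode path is a couple of comparisons and bit operations, which is why a function-call overhead on every directory entry check is measurable.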
2010-08-04Merge branch 'for-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req conflict (registration change vs marking it static)
2010-08-04block_dev: always serialize exclusive open attemptsTejun Heo
bd_prepare_to_claim() incorrectly allowed multiple attempts for exclusive open to progress in parallel if the attempting holders are identical. This triggered BUG_ON() as reported in the following bug. https://bugzilla.kernel.org/show_bug.cgi?id=16393 __bd_abort_claiming() is used to finish claiming blocks and doesn't work if multiple openers are inside a claiming block. Allowing multiple parallel open attempts to continue doesn't gain anything as those are serialized down in the call chain anyway. Fix it by always allowing only a single open attempt in a claiming block. This problem can easily be reproduced by adding a delay after bd_prepare_to_claim() and attempting to mount two partitions of a disk. stable: only applicable to v2.6.35 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
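The difference between the old and new behaviour can be captured in a toy model of the "must this opener wait?" decision. Everything here is hypothetical: `struct whole_dev`, its `claiming` field and both predicates are simplified stand-ins invented for this sketch and do not reproduce the block layer's actual code or locking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a whole disk; not the kernel's struct block_device. */
struct whole_dev {
	const void *claiming;	/* holder inside a claiming block, or NULL */
};

/* Behaviour described as buggy: an identical holder is let through. */
static bool must_wait_old(const struct whole_dev *w, const void *holder)
{
	return w->claiming != NULL && w->claiming != holder;
}

/* Behaviour after the fix: any claim in progress makes the caller wait. */
static bool must_wait_fixed(const struct whole_dev *w, const void *holder)
{
	(void)holder;		/* the holder's identity no longer matters */
	return w->claiming != NULL;
}
```

In this model, two mounts of different partitions that pass the same holder value (as the commit's reproduction with two partitions of one disk suggests) both proceed under the old predicate, whereas the fixed predicate admits only one at a time.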
2010-08-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (90 commits)
  AppArmor: fix build warnings for non-const use of get_task_cred
  selinux: convert the policy type_attr_map to flex_array
  AppArmor: Enable configuring and building of the AppArmor security module
  TOMOYO: Use pathname specified by policy rather than execve()
  AppArmor: update path_truncate method to latest version
  AppArmor: core policy routines
  AppArmor: policy routines for loading and unpacking policy
  AppArmor: mediation of non file objects
  AppArmor: LSM interface, and security module initialization
  AppArmor: Enable configuring and building of the AppArmor security module
  AppArmor: update Maintainer and Documentation
  AppArmor: functions for domain transitions
  AppArmor: file enforcement routines
  AppArmor: userspace interfaces
  AppArmor: dfa match engine
  AppArmor: contexts used in attaching policy to system objects
  AppArmor: basic auditing infrastructure.
  AppArmor: misc. base functions and defines
  TOMOYO: Update version to 2.3.0
  TOMOYO: Fix quota check.
  ...