summaryrefslogtreecommitdiff
path: root/init
AgeCommit message (Collapse)Author
2009-09-04Merge branch 'linus' into core/rcuIngo Molnar
Merge reason: Avoid fuzz in init/main.c and update from rc6 to rc8. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-28SFI: add platform-independent core supportFeng Tang
drivers/sfi/sfi_core.c contains the generic SFI implementation. It has a private header, sfi_core.h, for its own use and the private use of future files in drivers/sfi/ Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2009-08-27init: Move sched_clock_init after late_time_initThomas Gleixner
Some architectures initialize clocks and timers in late_time_init and x86 wants to do the same to avoid FIXMAP hackery for calibrating the TSC. That would result in undefined sched_clock readout and wreckaged printk timestamps again. We probably have those already on archs which do all their time/clock setup in late_time_init. There is no harm to move that after late_time_init except that a few more boot timestamps are stale. The scheduler is not active at that point so no real wreckage is expected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <new-submission> Cc: linux-arch@vger.kernel.org
2009-08-25Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix too large stack usage in do_one_initcall() tracing: handle broken names in ftrace filter ftrace: Unify effect of writing to trace_options and option/*
2009-08-23rcu: Remove CONFIG_PREEMPT_RCUPaul E. McKenney
Now that CONFIG_TREE_PREEMPT_RCU is in place, there is no further need for CONFIG_PREEMPT_RCU. Remove it, along with whatever subtle bugs it may (or may not) contain. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <125097461396-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-23rcu: Merge preemptable-RCU functionality into hierarchical RCUPaul E. McKenney
Create a kernel/rcutree_plugin.h file that contains definitions for preemptable RCU (or, under the #else branch of the #ifdef, empty definitions for the classic non-preemptable semantics). These definitions fit into plugins defined in kernel/rcutree.c for this purpose. This variant of preemptable RCU uses a new algorithm whose read-side expense is roughly that of classic hierarchical RCU under CONFIG_PREEMPT. This new algorithm's update-side expense is similar to that of classic hierarchical RCU, and, in absence of read-side preemption or blocking, is exactly that of classic hierarchical RCU. Perhaps more important, this new algorithm has a much simpler implementation, saving well over 1,000 lines of code compared to mainline's implementation of preemptable RCU, which will hopefully be retired in favor of this new algorithm. The simplifications are obtained by maintaining per-task nesting state for running tasks, and using a simple lock-protected algorithm to handle accounting when tasks block within RCU read-side critical sections, making use of lessons learned while creating numerous user-level RCU implementations over the past 18 months. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <12509746134003-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-21tracing: Fix too large stack usage in do_one_initcall()Ingo Molnar
One of my testboxes triggered this nasty stack overflow crash during SCSI probing: [ 5.874004] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5.875004] device: 'sda': device_add [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 5.878004] last sysfs file: [ 5.878004] [ 5.878004] Pid: 1, comm: swapper Not tainted (2.6.31-rc6-tip-01272-g9919e28-dirty #5685) [ 5.878004] EIP: 0060:[<b1008321>] EFLAGS: 00010083 CPU: 0 [ 5.878004] EIP is at print_context_stack+0x81/0x110 [ 5.878004] EAX: cf8a3000 EBX: cf8a3fe4 ECX: 00000049 EDX: 00000000 [ 5.878004] ESI: b1cfce84 EDI: 00000000 EBP: cf8a3018 ESP: cf8a2ff4 [ 5.878004] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 5.878004] Process swapper (pid: 1, ti=cf8a2000 task=cf8a8000 task.ti=cf8a3000) [ 5.878004] Stack: [ 5.878004] b1004867 fffff000 cf8a3ffc [ 5.878004] Call Trace: [ 5.878004] [<b1004867>] ? kernel_thread_helper+0x7/0x10 [ 5.878004] BUG: unable to handle kernel NULL pointer dereference at 00000a0c [ 5.878004] IP: [<b1008321>] print_context_stack+0x81/0x110 [ 5.878004] *pde = 00000000 [ 5.878004] Thread overran stack, or stack corrupted [ 5.878004] Oops: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC The oops did not reveal any more details about the real stack that we have and the system got into an infinite loop of recursive pagefaults. So i booted with CONFIG_STACK_TRACER=y and the 'stacktrace' boot parameter. The box did not crash (timings/conditions probably changed a tiny bit to trigger the catastrophic crash), but the /debug/tracing/stack_trace file was rather revealing: Depth Size Location (72 entries) ----- ---- -------- 0) 3704 52 __change_page_attr+0xb8/0x290 1) 3652 24 __change_page_attr_set_clr+0x43/0x90 2) 3628 60 kernel_map_pages+0x108/0x120 3) 3568 40 prep_new_page+0x7d/0x130 4) 3528 84 get_page_from_freelist+0x106/0x420 5) 3444 116 __alloc_pages_nodemask+0xd7/0x550 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 14) 2992 20 scsi_host_alloc_command+0x22/0x70 15) 2972 24 __scsi_get_command+0x1b/0x90 16) 2948 28 scsi_get_command+0x35/0x90 17) 2920 24 scsi_setup_blk_pc_cmnd+0xd4/0x100 18) 2896 128 sd_prep_fn+0x332/0xa70 19) 2768 36 blk_peek_request+0xe7/0x1d0 20) 2732 56 scsi_request_fn+0x54/0x520 21) 2676 12 __generic_unplug_device+0x2b/0x40 22) 2664 24 blk_execute_rq_nowait+0x59/0x80 23) 2640 172 blk_execute_rq+0x6b/0xb0 24) 2468 32 scsi_execute+0xe0/0x140 25) 2436 64 scsi_execute_req+0x152/0x160 26) 2372 60 scsi_vpd_inquiry+0x6c/0x90 27) 2312 44 scsi_get_vpd_page+0x112/0x160 28) 2268 52 sd_revalidate_disk+0x1df/0x320 29) 2216 92 rescan_partitions+0x98/0x330 30) 2124 52 __blkdev_get+0x309/0x350 31) 2072 8 blkdev_get+0xf/0x20 32) 2064 44 register_disk+0xff/0x120 33) 2020 36 add_disk+0x6e/0xb0 34) 1984 44 sd_probe_async+0xfb/0x1d0 35) 1940 44 __async_schedule+0xf4/0x1b0 36) 1896 8 async_schedule+0x12/0x20 37) 1888 60 sd_probe+0x305/0x360 38) 1828 44 really_probe+0x63/0x170 39) 1784 36 driver_probe_device+0x5d/0x60 40) 1748 16 __device_attach+0x49/0x50 41) 1732 32 bus_for_each_drv+0x5b/0x80 42) 1700 24 device_attach+0x6b/0x70 43) 1676 16 bus_attach_device+0x47/0x60 44) 1660 76 device_add+0x33d/0x400 45) 1584 52 scsi_sysfs_add_sdev+0x6a/0x2c0 46) 1532 108 scsi_add_lun+0x44b/0x460 47) 1424 116 scsi_probe_and_add_lun+0x182/0x4e0 48) 1308 36 __scsi_add_device+0xd9/0xe0 49) 1272 44 ata_scsi_scan_host+0x10b/0x190 50) 1228 24 async_port_probe+0x96/0xd0 51) 1204 44 __async_schedule+0xf4/0x1b0 52) 1160 8 async_schedule+0x12/0x20 53) 1152 48 ata_host_register+0x171/0x1d0 54) 1104 60 ata_pci_sff_activate_host+0xf3/0x230 55) 1044 44 ata_pci_sff_init_one+0xea/0x100 56) 1000 48 amd_init_one+0xb2/0x190 57) 952 8 local_pci_probe+0x13/0x20 58) 944 32 pci_device_probe+0x68/0x90 59) 912 44 really_probe+0x63/0x170 60) 868 36 driver_probe_device+0x5d/0x60 61) 832 20 __driver_attach+0x89/0xa0 62) 812 32 bus_for_each_dev+0x5b/0x80 63) 780 12 driver_attach+0x1e/0x20 64) 768 72 bus_add_driver+0x14b/0x2d0 65) 696 36 driver_register+0x6e/0x150 66) 660 20 __pci_register_driver+0x53/0xc0 67) 640 8 amd_init+0x14/0x16 68) 632 572 do_one_initcall+0x2b/0x1d0 69) 60 12 do_basic_setup+0x56/0x6a 70) 48 20 kernel_init+0x84/0xce 71) 28 28 kernel_thread_helper+0x7/0x10 There's a lot of fat functions on that stack trace, but the largest of all is do_one_initcall(). This is due to the boot trace entry variables being on the stack. Fixing this is relatively easy, initcalls are fundamentally serialized, so we can move the local variables to file scope. Note that this large stack footprint was present for a couple of months already - what pushed my system over the edge was the addition of kmemleak to the call-chain: 6) 3328 36 allocate_slab+0xb1/0x100 7) 3292 36 new_slab+0x1c/0x160 8) 3256 36 __slab_alloc+0x133/0x2b0 9) 3220 4 kmem_cache_alloc+0x1bb/0x1d0 10) 3216 108 create_object+0x28/0x250 11) 3108 40 kmemleak_alloc+0x81/0xc0 12) 3068 24 kmem_cache_alloc+0x162/0x1d0 13) 3044 52 scsi_pool_alloc_command+0x29/0x70 This pushes the total to ~3800 bytes, only a tiny bit more was needed to corrupt the on-kernel-stack thread_info. The fix reduces the stack footprint from 572 bytes to 28 bytes. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-15Merge commit 'v2.6.31-rc6' into core/rcuIngo Molnar
Merge reason: the branch was on pre-rc1 .30, update to latest. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-14Merge branch 'percpu-for-linus' into percpu-for-nextTejun Heo
Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14init: set nr_cpu_ids before setup_per_cpu_areas()Tejun Heo
nr_cpu_ids is dependent only on cpu_possible_map and setup_per_cpu_areas() already depends on cpu_possible_map and will use nr_cpu_ids. Initialize nr_cpu_ids before setting up percpu areas. Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-04perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=yIngo Molnar
If user has already enabled profiling support in the kernel (for oprofile, old-style profiling of ftrace) then offer up perfcounters with a y default in interactive kconfig sessions. Still keep it off by default otherwise. Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-02tracing, perf_counter: Add help text to CONFIG_EVENT_PROFILEPeter Zijlstra
Explain what tracepoint profiling sources are about. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jeff Garzik <jeff@garzik.org> LKML-Reference: <1248856508.6987.3041.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13perf_counter: Fix the tracepoint channel to perfcountersChris Wilson
Fix a missed rename in EVENT_PROFILE support so that it gets built and allows tracepoint tracing from the 'perf' tool. Fix a typo in the (never before built & enabled) portion in perf_counter.c as well, and update that code to the attr.config changes as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Gamari <bgamari.foss@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1246869094-21237-1-git-send-email-chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-03Merge branch 'master' into for-nextTejun Heo
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix changes. As alpha in percpu tree uses 'weak' attribute instead of inline assembly, there's no need for __used attribute. Conflicts: arch/alpha/include/asm/percpu.h arch/mn10300/kernel/vmlinux.lds.S include/linux/percpu-defs.h
2009-06-24rcu: Remove Classic RCUPaul E. McKenney
Remove Classic RCU, given that the combination of Tree RCU and the proposed Bloatwatch RCU do everything that Classic RCU can with fewer bugs. Tree RCU has been default in x86 builds for almost six months, and seems to be quite reliable, so there does not seem to be much justification for keeping the Classic RCU code and config complexity around anymore. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: akpm@linux-foundation.org Cc: niv@us.ibm.com Cc: dvhltc@us.ibm.com Cc: dipankar@in.ibm.com Cc: dhowells@redhat.com Cc: lethal@linux-sh.org Cc: kernel@wantstofly.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-24percpu: use dynamic percpu allocator as the default percpu allocatorTejun Heo
This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use dynamic percpu allocator. The first chunk is allocated using embedding helper and 8k is reserved for modules. This ensures that the new allocator behaves almost identically to the original allocator as long as static percpu variables are concerned, so it shouldn't introduce much breakage. s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing range limit the addressing model imposes. Unfortunately, this breaks if the address is specified using a variable, so for now, the two archs aren't converted. The following architectures are affected by this change. * sh * arm * cris * mips * sparc(32) * blackfin * avr32 * parisc (broken, under investigation) * m32r * powerpc(32) As this change makes the dynamic allocator the default one, CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert - CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted archs. These archs implement their own setup_per_cpu_areas() and the conversion is not trivial. * powerpc(64) * sparc(64) * ia64 * alpha * s390 Boot and batch alloc/free tests on x86_32 with debug code (x86_32 doesn't use default first chunk initialization). Compile tested on sparc(32), powerpc(32), arm and alpha. Kyle McMartin reported that this change breaks parisc. The problem is still under investigation and he is okay with pushing this patch forward and fixing parisc later. [ Impact: use dynamic allocator for most archs w/o custom percpu setup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bryan Wu <cooloney@kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu>
2009-06-24Merge branches 'acerhdf', 'acpi-pci-bind', 'bjorn-pci-root', ↵Len Brown
'bugzilla-12904', 'bugzilla-13121', 'bugzilla-13396', 'bugzilla-13533', 'bugzilla-13612', 'c3_lock', 'hid-cleanups', 'misc-2.6.31', 'pdc-leak-fix', 'pnpacpi', 'power_nocheck', 'thinkpad_acpi', 'video' and 'wmi' into release
2009-06-23mm/init: cpu_hotplug_init() must be initialized before SLABLinus Torvalds
SLAB uses get/put_online_cpus() which use a mutex which is itself only initialized when cpu_hotplug_init() is called. Currently we hang suring boot in SLAB due to doing that too late. Reported by James Bottomley and Sachin Sant (and possibly others). Debugged by Benjamin Herrenschmidt. This just removes the dynamic initialization of the data structures, and replaces it with a static one, avoiding this dependency entirely, and removing one unnecessary special initcall. Tested-by: Sachin Sant <sachinp@in.ibm.com> Tested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18mm: Extend gfp masking to the page allocatorBenjamin Herrenschmidt
The page allocator also needs the masking of gfp flags during boot, so this moves it out of slab/slub and uses it with the page allocator as well. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18kernel: constructor supportPeter Oberparleiter
Call constructors (gcc-generated initcall-like functions) during kernel start and module load. Constructors are e.g. used for gcov data initialization. Disable constructor support for usermode Linux to prevent conflicts with host glibc. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-17mm: Move pgtable_cache_init() earlierBenjamin Herrenschmidt
Some architectures need to initialize SLAB caches to be able to allocate page tables. They do that from pgtable_cache_init() so the later should be called earlier now, best is before vmalloc_init(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-17Merge branch 'akpm'Linus Torvalds
* akpm: (182 commits) fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset fbdev: *bfin*: fix __dev{init,exit} markings fbdev: *bfin*: drop unnecessary calls to memset fbdev: bfin-t350mcqb-fb: drop unused local variables fbdev: blackfin has __raw I/O accessors, so use them in fb.h fbdev: s1d13xxxfb: add accelerated bitblt functions tcx: use standard fields for framebuffer physical address and length fbdev: add support for handoff from firmware to hw framebuffers intelfb: fix a bug when changing video timing fbdev: use framebuffer_release() for freeing fb_info structures radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb? s3c-fb: CPUFREQ frequency scaling support s3c-fb: fix resource releasing on error during probing carminefb: fix possible access beyond end of carmine_modedb[] acornfb: remove fb_mmap function mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF mb862xxfb: restrict compliation of platform driver to PPC Samsung SoC Framebuffer driver: add Alpha Channel support atmel-lcdc: fix pixclock upper bound detection offb: use framebuffer_alloc() to allocate fb_info struct ... Manually fix up conflicts due to kmemcheck in mm/slab.c
2009-06-17cpuset,mm: update tasks' mems_allowed in timeMiao Xie
Fix allocating page cache/slab object on the unallowed node when memory spread is set by updating tasks' mems_allowed after its cpuset's mems is changed. In order to update tasks' mems_allowed in time, we must modify the code of memory policy. Because the memory policy is applied in the process's context originally. After applying this patch, one task directly manipulates anothers mems_allowed, and we use alloc_lock in the task_struct to protect mems_allowed and memory policy of the task. But in the fast path, we didn't use lock to protect them, because adding a lock may lead to performance regression. But if we don't add a lock,the task might see no nodes when changing cpuset's mems_allowed to some non-overlapping set. In order to avoid it, we set all new allowed nodes, then clear newly disallowed ones. [lee.schermerhorn@hp.com: The rework of mpol_new() to extract the adjusting of the node mask to apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind() with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local allocation. Fix this by adding the check for MPOL_PREFERRED and empty node mask to mpol_new_mpolicy(). Remove the now unneeded 'nodes = NULL' from mpol_new(). Note that mpol_new_mempolicy() is always called with a non-NULL 'nodes' parameter now that it has been removed from mpol_new(). Therefore, we don't need to test nodes for NULL before testing it for 'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to verify this assumption.] [lee.schermerhorn@hp.com: I don't think the function name 'mpol_new_mempolicy' is descriptive enough to differentiate it from mpol_new(). This function applies cpuset set context, usually constraining nodes to those allowed by the cpuset. However, when the 'RELATIVE_NODES flag is set, it also translates the nodes. So I settled on 'mpol_set_nodemask()', because the comment block for mpol_new() mentions that we need to call this function to "set nodes". Some additional minor line length, whitespace and typo cleanup.] Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Paul Menage <menage@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16Merge branch 'for-linus2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits) signal: fix __send_signal() false positive kmemcheck warning fs: fix do_mount_root() false positive kmemcheck warning fs: introduce __getname_gfp() trace: annotate bitfields in struct ring_buffer_event net: annotate struct sock bitfield c2port: annotate bitfield for kmemcheck net: annotate inet_timewait_sock bitfields ieee1394/csr1212: fix false positive kmemcheck report ieee1394: annotate bitfield net: annotate bitfields in struct inet_sock net: use kmemcheck bitfields API for skbuff kmemcheck: introduce bitfield API kmemcheck: add opcode self-testing at boot x86: unify pte_hidden x86: make _PAGE_HIDDEN conditional kmemcheck: make kconfig accessible for other architectures kmemcheck: enable in the x86 Kconfig kmemcheck: add hooks for the page allocator kmemcheck: add hooks for page- and sg-dma-mappings kmemcheck: don't track page tables ...
2009-06-16driver core: set default SYSFS_DEPRECATED=nKay Sievers
All recent distros depend on the non-deprecated sysfs layout, so change the default value of the option to reflect that. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2009-06-15Merge commit 'linus/master' into HEADVegard Nossum
Conflicts: MAINTAINERS Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2009-06-15fs: fix do_mount_root() false positive kmemcheck warningVegard Nossum
This false positive is due to the fact that do_mount_root() fakes a mount option (which is normally read from userspace), and the kernel unconditionally reads a whole page for the mount option. Hide the false positive by using the new __getname_gfp() with the __GFP_NOTRACK_FALSE_POSITIVE flag. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2009-06-14Merge branch 'master' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (53 commits) .gitignore: ignore *.lzma files kbuild: add generic --set-str option to scripts/config kbuild: simplify argument loop in scripts/config kbuild: handle non-existing options in scripts/config kallsyms: generalize text region handling kallsyms: support kernel symbols in Blackfin on-chip memory documentation: make version fix kbuild: fix a compile warning gitignore: Add GNU GLOBAL files to top .gitignore kbuild: fix delay in setlocalversion on readonly source README: fix misleading pointer to the defconf directory vmlinux.lds.h update kernel-doc: cleanup perl script Improve vmlinux.lds.h support for arch specific linker scripts kbuild: fix headers_exports with boolean expression kbuild/headers_check: refine extern check kbuild: fix "Argument list too long" error for "make headers_check", ignore *.patch files Remove bashisms from scripts menu: fix embedded menu presentation ...
2009-06-13kmemcheck: add the kmemcheck coreVegard Nossum
General description: kmemcheck is a patch to the linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log. Thanks to Andi Kleen for the set_memory_4k() solution. Andrew Morton suggested documenting the shadow member of struct page. Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> [export kmemcheck_mark_initialized] [build fix for setup_max_cpus] Signed-off-by: Ingo Molnar <mingo@elte.hu> [rebased for mainline inclusion] Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
2009-06-13ACPI: move declaration acpi_early_init() to acpi.hLen Brown
Signed-off-by: Len Brown <len.brown@intel.com>
2009-06-12Merge branch 'perfcounters-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Start documenting HAVE_PERF_COUNTERS requirements perf_counter: Add forward/backward attribute ABI compatibility perf record: Explicity program a default counter perf_counter: Remove PERF_TYPE_RAW special casing perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too powerpc, perf_counter: Fix performance counter event types perf_counter/x86: Add a quirk for Atom processors perf_counter tools: Remove one L1-data alias
2009-06-12perf_counter: Start documenting HAVE_PERF_COUNTERS requirementsMike Frysinger
Help out arch porters who want to support perf counters by listing some basic requirements. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1244827063-24046-1-git-send-email-vapier@gentoo.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12slab,slub: don't enable interrupts during early bootPekka Enberg
As explained by Benjamin Herrenschmidt: Oh and btw, your patch alone doesn't fix powerpc, because it's missing a whole bunch of GFP_KERNEL's in the arch code... You would have to grep the entire kernel for things that check slab_is_available() and even then you'll be missing some. For example, slab_is_available() didn't always exist, and so in the early days on powerpc, we used a mem_init_done global that is set form mem_init() (not perfect but works in practice). And we still have code using that to do the test. Therefore, mask out __GFP_WAIT, __GFP_IO, and __GFP_FS in the slab allocators in early boot code to avoid enabling interrupts. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-12memcg: fix page_cgroup fatal error in FLATMEMKAMEZAWA Hiroyuki
Now, SLAB is configured in very early stage and it can be used in init routine now. But replacing alloc_bootmem() in FLAT/DISCONTIGMEM's page_cgroup() initialization breaks the allocation, now. (Works well in SPARSEMEM case...it supports MEMORY_HOTPLUG and size of page_cgroup is in reasonable size (< 1 << MAX_ORDER.) This patch revive FLATMEM+memory cgroup by using alloc_bootmem. In future, We stop to support FLATMEM (if no users) or rewrite codes for flatmem completely.But this will adds more messy codes and overheads. Reported-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Li Zefan <lizf@cn.fujitsu.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds
* 'for-linus' of git://git.infradead.org/users/eparis/notify: fsnotify: allow groups to set freeing_mark to null inotify/dnotify: should_send_event shouldn't match on FS_EVENT_ON_CHILD dnotify: do not bother to lock entry->lock when reading mask dnotify: do not use ?true:false when assigning to a bool fsnotify: move events should indicate the event was on a child inotify: reimplement inotify using fsnotify fsnotify: handle filesystem unmounts with fsnotify marks fsnotify: fsnotify marks on inodes pin them in core fsnotify: allow groups to add private data to events fsnotify: add correlations between events fsnotify: include pathnames with entries when possible fsnotify: generic notification queue and waitq dnotify: reimplement dnotify using fsnotify fsnotify: parent event notification fsnotify: add marks to inodes so groups can interpret how to handle those inodes fsnotify: unified filesystem notification backend
2009-06-11Merge branch 'for-linus' of git://linux-arm.org/linux-2.6Linus Torvalds
* 'for-linus' of git://linux-arm.org/linux-2.6: kmemleak: Add the corresponding MAINTAINERS entry kmemleak: Simple testing module for kmemleak kmemleak: Enable the building of the memory leak detector kmemleak: Remove some of the kmemleak false positives kmemleak: Add modules support kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash kmemleak: Add the vmalloc memory allocation/freeing hooks kmemleak: Add the slub memory allocation/freeing hooks kmemleak: Add the slob memory allocation/freeing hooks kmemleak: Add the slab memory allocation/freeing hooks kmemleak: Add documentation on the memory leak detector kmemleak: Add the base support Manual conflict resolution (with the slab/earlyboot changes) in: drivers/char/vt.c init/main.c mm/slab.c
2009-06-11Merge branch 'perfcounters-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits) perf_counter: Turn off by default perf_counter: Add counter->id to the throttle event perf_counter: Better align code perf_counter: Rename L2 to LL cache perf_counter: Standardize event names perf_counter: Rename enums perf_counter tools: Clean up u64 usage perf_counter: Rename perf_counter_limit sysctl perf_counter: More paranoia settings perf_counter: powerpc: Implement generalized cache events for POWER processors perf_counters: powerpc: Add support for POWER7 processors perf_counter: Accurate period data perf_counter: Introduce struct for sample data perf_counter tools: Normalize data using per sample period data perf_counter: Annotate exit ctx recursion perf_counter tools: Propagate signals properly perf_counter tools: Small frequency related fixes perf_counter: More aggressive frequency adjustment perf_counter/x86: Fix the model number of Intel Core2 processors perf_counter, x86: Correct some event and umask values for Intel processors ...
2009-06-11inotify: reimplement inotify using fsnotifyEric Paris
Reimplement inotify_user using fsnotify. This should be feature for feature exactly the same as the original inotify_user. This does not make any changes to the in kernel inotify feature used by audit. Those patches (and the eventual removal of in kernel inotify) will come after the new inotify_user proves to be working correctly. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de>
2009-06-11init: introduce mm_init()Pekka Enberg
As suggested by Christoph Lameter, introduce mm_init() now that we initialize all the kernel memory allocations together. Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11vmalloc: use kzalloc() instead of alloc_bootmem()Pekka Enberg
We can call vmalloc_init() after kmem_cache_init() and use kzalloc() instead of the bootmem allocator when initializing vmalloc data structures. Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11slab: setup allocators earlier in the boot sequencePekka Enberg
This patch makes kmalloc() available earlier in the boot sequence so we can get rid of some bootmem allocations. The bulk of the changes are due to kmem_cache_init() being called with interrupts disabled which requires some changes to allocator boostrap code. Note: 32-bit x86 does WP protect test in mem_init() so we must setup traps before we call mem_init() during boot as reported by Ingo Molnar: We have a hard crash in the WP-protect code: [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000 [ 0.000000] EDI 00000188 ESI 00000ac7 EBP c17eaf9c ESP c17eaf8c [ 0.000000] EBX 000014e0 EDX 0000000e ECX 01856067 EAX 00000001 [ 0.000000] err 00000003 EIP c10135b1 CS 00000060 flg 00010002 [ 0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc [ 0.000000] 00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0 [ 0.000000] c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003 [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203 [ 0.000000] Call Trace: [ 0.000000] [<c15357c2>] ? printk+0x14/0x16 [ 0.000000] [<c10135b1>] ? do_test_wp_bit+0x19/0x23 [ 0.000000] [<c17fd410>] ? test_wp_bit+0x26/0x64 [ 0.000000] [<c17fd7e5>] ? mem_init+0x1ba/0x1d8 [ 0.000000] [<c17f1668>] ? start_kernel+0x164/0x2f7 [ 0.000000] [<c17f1322>] ? unknown_bootoption+0x0/0x19c [ 0.000000] [<c17f106a>] ? __init_begin+0x6a/0x6f Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11kmemleak: Add the base supportCatalin Marinas
This patch adds the base support for the kernel memory leak detector. It traces the memory allocation/freeing in a way similar to the Boehm's conservative garbage collector, the difference being that the unreferenced objects are not freed but only shown in /sys/kernel/debug/kmemleak. Enabling this feature introduces an overhead to memory allocations. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2009-06-11Merge branch 'linus' into perfcounters/coreIngo Molnar
Conflicts: arch/x86/kernel/irqinit.c arch/x86/kernel/irqinit_64.c arch/x86/kernel/traps.c arch/x86/mm/fault.c include/linux/sched.h kernel/exit.c
2009-06-11perf_counter: Turn off by defaultIngo Molnar
Perfcounters were enabled by default to help testing - but now that we are submitting it upstream, make it default-disabled. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11Merge branch 'tracing-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits) Revert "x86, bts: reenable ptrace branch trace support" tracing: do not translate event helper macros in print format ftrace/documentation: fix typo in function grapher name tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK tracing: add protection around module events unload tracing: add trace_seq_vprint interface tracing: fix the block trace points print size tracing/events: convert block trace points to TRACE_EVENT() ring-buffer: fix ret in rb_add_time_stamp ring-buffer: pass in lockdep class key for reader_lock tracing: add annotation to what type of stack trace is recorded tracing: fix multiple use of __print_flags and __print_symbolic tracing/events: fix output format of user stack tracing/events: fix output format of kernel stack tracing/trace_stack: fix the number of entries in the header ring-buffer: discard timestamps that are at the start of the buffer ring-buffer: try to discard unneeded timestamps ring-buffer: fix bug in ring_buffer_discard_commit ftrace: do not profile functions when disabled tracing: make trace pipe recognize latency format flag ...
2009-06-11Merge branch 'rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: rcu_sched_grace_period(): kill the bogus flush_signals() rculist: use list_entry_rcu in places where it's appropriate rculist.h: introduce list_entry_rcu() and list_first_entry_rcu() rcu: Update RCU tracing documentation for __rcu_pending rcu: Add __rcu_pending tracing to hierarchical RCU RCU: make treercu be default
2009-06-09menu: fix embedded menu presentationRandy Dunlap
The STRIP_ASM_SYMS kconfig symbol mucks up the embedded menu because STRIP_ASM_SYMS is in the middle of the embedded menu items but it does not depend on EMBEDDED. Move it to beyond the end of the embedded menu so that the menu is presented correctly. Or if STRIP_ASM_SYMS should depend on EMBEDDED, that can also be fixed. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-06-01Merge branch 'linus' into perfcounters/coreIngo Molnar
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6 based - to pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-24Use a format for linux_bannerAlex Riesen
There is no format specifiers left in the linux_banner, and gcc-4.3 complains seeing the printk. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-18Merge commit 'v2.6.30-rc6' into tracing/coreIngo Molnar
Merge reason: we were on an -rc4 base, sync up to -rc6 Signed-off-by: Ingo Molnar <mingo@elte.hu>