linux-fsl-qoriq - Freescale linux tree with Scalys patches

Age	Commit message (Collapse)	Author
2008-08-08	sparc64: Fix end-of-stack checking in save_stack_trace().	David S. Miller
	Bug reported by Alexander Beregalov. Before we dereference the stack frame or try to peek at the pt_regs magic value, make sure the entire object is within the kernel stack bounds. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-07	sparc: don't use asm/of_device.h	Stephen Rothwell
	Use linux/of_device.h instead. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-07	sparc64: Use kernel/uid16.c helpers instead of own copy.	David S. Miller
	Noticed by Adrian Bunk. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Remove all cpumask_t local variables in xcall dispatch.	David S. Miller
	All of the xcall delivery implementation is cpumask agnostic, so we can pass around pointers to const cpumask_t objects everywhere. The sad remaining case is the argument to arch_send_call_function_ipi(). Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Kill error_mask from hypervisor_xcall_deliver().	David S. Miller
	It can eat up a lot of stack space when NR_CPUS is large. We retain some of it's functionality by reporting at least one of the cpu's which are seen in error state. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Build cpu list and mondo block at top-level xcall_deliver().	David S. Miller
	Then modify all of the xcall dispatch implementations get passed and use this information. Now all of the xcall dispatch implementations do not need to be mindful of details such as "is current cpu in the list?" and "is cpu online?" Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Disable local interrupts around xcall_deliver_impl() invocation.	David S. Miller
	Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Make all xcall_deliver's go through common helper function.	David S. Miller
	This just facilitates the next changeset where we'll be building the cpu list and mondo block in this helper function. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Always allocate the send mondo blocks, even on non-sun4v.	David S. Miller
	The idea is that we'll use this cpu list array and mondo block even for non-hypervisor platforms. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Make smp_cross_call_masked() take a cpumask_t pointer.	David S. Miller
	Ideally this could be simplified further such that we could pass the pointer down directly into the xcall_deliver() implementation. But if we do that we need to do the "cpu_online(cpu)" and "cpu != self" checks down in those functions. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Directly call xcall_deliver() in smp_start_sync_tick_client.	David S. Miller
	We know the cpu is online and not the current cpu here. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Call xcall_deliver() directly in some cases.	David S. Miller
	For these cases the callers make sure: 1) The cpus indicated are online. 2) The current cpu is not in the list of indicated cpus. Therefore we can pass a pointer to the mask directly. One of the motivations in this transformation is to make use of "&cpumask_of_cpu(cpu)" which evaluates to a pointer to constant data in the kernel and thus takes up no stack space. Hopefully someone in the future will change the interface of arch_send_call_function_ipi() such that it passes a const cpumask_t pointer so that this will optimize ever further. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Use cpumask_t pointers and for_each_cpu_mask_nr() in xcall_deliver.	David S. Miller
	Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Use xcall_deliver() consistently.	David S. Miller
	There remained some spots still vectoring to the appropriate *_xcall_deliver() function manually. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Use function pointer for cross-call sending.	David S. Miller
	Initialize it using the smp_setup_processor_id() hook. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	arch/sparc64/kernel/signal.c: removed duplicated #include	Huang Weiyi
	Removed duplicated #include <linux/tracehook.h> in arch/sparc64/kernel/signal.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-04	sparc64: Need to disable preemption around smp_tsb_sync().	David S. Miller
	Based upon a bug report by Mariusz Kozlowski It uses smp_call_function_masked() now, which has a preemption-disabled requirement. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-01	sparc64: Do not clobber %g7 in setcontext() trap.	David S. Miller
	That's the userland thread register, so we should never try to change it like this. Based upon glibc bug nptl/6577 and suggestions by Jakub Jelinek. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-01	sparc64: Kill __show_regs().	David S. Miller
	The story is that what we used to do when we actually used smp_report_regs() is that if you specifically only wanted to have the current cpu's registers dumped you would call "__show_regs()" otherwise you would call show_regs() which also invoked smp_report_regs(). Now that we killed off smp_report_regs() there is no longer any reason to have these two routines, just show_regs() is sufficient. Also kill off a stray declaration of show_regs() in sparc64_ksym.c Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-31	sparc64: Kill smp_report_regs().	David S. Miller
	All the call sites are #if 0'd out and we have a much more useful global cpu dumping facility these days. smp_report_regs() is way too verbose to be usable. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-31	sparc64: Kill VERBOSE_SHOWREGS code.	David S. Miller
	It just clutters everything up and even though I wrote that hack I can't remember having used it in the last 5 years or so. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-31	sparc64: Hook up trigger_all_cpu_backtrace().	David S. Miller
	We already have code that does this, but it is only currently attached to sysrq-'y'. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-31	sparc64: Make global reg dumping even more useful.	David S. Miller
	Record one more level of stack frame program counter. Particularly when lockdep and all sorts of spinlock debugging is enabled, figuring out the caller of spin_lock() is difficult when the cpu is stuck on the lock. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-30	sparc64: Kill isa_bus_type.	David S. Miller
	I forgot to delete this when I removed the ISA bus layer from the sparc ports. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-28	sparc64: Fix global reg snapshotting on self-cpu.	David S. Miller
	We were picking %i7 out of the wrong register window stack slot. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-28	sparc64: tracehook: CONFIG_HAVE_ARCH_TRACEHOOK	Roland McGrath
	The sparc64 arch code has all the prerequisites, so set HAVE_ARCH_TRACEHOOK. Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-28	sparc64: tracehook_signal_handler	Roland McGrath
	Call the standard hook after setting up signal handlers. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-28	sparc64: tracehook: TIF_NOTIFY_RESUME	Roland McGrath
	This adds TIF_NOTIFY_RESUME support for sparc64. When set, we call tracehook_notify_resume() on the way to user mode. Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-28	sparc64: tracehook syscall	Roland McGrath
	This changes sparc64 syscall tracing to use the new tracehook.h entry points. [ Add assembly changes to force an immediate -ENOSYS return from the system call when syscall_trace() returns non-zero at syscall entry. -DaveM ] Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-27	sparc, sparc64: use arch/sparc/include	Sam Ravnborg
	The majority of this patch was created by the following script: *** ASM=arch/sparc/include/asm mkdir -p $ASM git mv include/asm-sparc64/ftrace.h $ASM git rm include/asm-sparc64/* git mv include/asm-sparc/* $ASM sed -ie 's/asm-sparc64/asm/g' $ASM/* sed -ie 's/asm-sparc/asm/g' $ASM/* *** The rest was an update of the top-level Makefile to use sparc for header files when sparc64 is being build. And a small fixlet to pick up the correct unistd.h from sparc64 code. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-26	sparc64: use generic show_mem()	Johannes Weiner
	Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - free swap pages, printed by show_swap_cache_info() - pages in swapcache, printed by show_swap_cache_info() - dirty pages, writeback pages, mapped pages, slab pages, pagetables pages, printed by show_free_areas() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc: Wire up new system calls.
2008-07-25	sparc: Wire up new system calls.	David S. Miller
	This wires up the recently added Wire up signalfd4, eventfd2, epoll_create1, dup3, pipe2, and inotify_init1 system calls. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25	kprobes: improve kretprobe scalability with hashed locking	Srinivasa D S
	Currently list of kretprobe instances are stored in kretprobe object (as used_instances,free_instances) and in kretprobe hash table. We have one global kretprobe lock to serialise the access to these lists. This causes only one kretprobe handler to execute at a time. Hence affects system performance, particularly on SMP systems and when return probe is set on lot of functions (like on all systemcalls). Solution proposed here gives fine-grain locks that performs better on SMP system compared to present kretprobe implementation. Solution: 1) Instead of having one global lock to protect kretprobe instances present in kretprobe object and kretprobe hash table. We will have two locks, one lock for protecting kretprobe hash table and another lock for kretporbe object. 2) We hold lock present in kretprobe object while we modify kretprobe instance in kretprobe object and we hold per-hash-list lock while modifying kretprobe instances present in that hash list. To prevent deadlock, we never grab a per-hash-list lock while holding a kretprobe lock. 3) We can remove used_instances from struct kretprobe, as we can track used instances of kretprobe instances using kretprobe hash table. Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system with return probes set on all systemcalls looks like this. cacheline non-cacheline Un-patched kernel aligned patch aligned patch =============================================================================== real 9m46.784s 9m54.412s 10m2.450s user 40m5.715s 40m7.142s 40m4.273s sys 2m57.754s 2m58.583s 3m17.430s =========================================================== Time duration for kernel compilation ("make -j 8) on the same system, when kernel is not probed. ========================= real 9m26.389s user 40m8.775s sys 2m7.283s ========================= Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com> Signed-off-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	Merge branch 'timers-fixes-for-linus' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well nohz: prevent tick stop outside of the idle loop
2008-07-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix cpufreq notifier registry. sparc64: Fix lockdep issues in LDC protocol layer.
2008-07-24	flag parameters: pipe	Ulrich Drepper
	This patch introduces the new syscall pipe2 which is like pipe but it also takes an additional parameter which takes a flag value. This patch implements the handling of O_CLOEXEC for the flag. I did not add support for the new syscall for the architectures which have a special sys_pipe implementation. I think the maintainers of those archs have the chance to go with the unified implementation but that's up to them. The implementation introduces do_pipe_flags. I did that instead of changing all callers of do_pipe because some of the callers are written in assembler. I would probably screw up changing the assembly code. To avoid breaking code do_pipe is now a small wrapper around do_pipe_flags. Once all callers are changed over to do_pipe_flags the old do_pipe function can be removed. The following test must be adjusted for architectures other than x86 and x86-64 and in case the syscall numbers changed. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <sys/syscall.h> #ifndef __NR_pipe2 # ifdef __x86_64__ # define __NR_pipe2 293 # elif defined __i386__ # define __NR_pipe2 331 # else # error "need __NR_pipe2" # endif #endif int main (void) { int fd[2]; if (syscall (__NR_pipe2, fd, 0) != 0) { puts ("pipe2(0) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if (coe & FD_CLOEXEC) { printf ("pipe2(0) set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0) { puts ("pipe2(O_CLOEXEC) failed"); return 1; } for (int i = 0; i < 2; ++i) { int coe = fcntl (fd[i], F_GETFD); if (coe == -1) { puts ("fcntl failed"); return 1; } if ((coe & FD_CLOEXEC) == 0) { printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i); return 1; } } close (fd[0]); close (fd[1]); puts ("OK"); return 0; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures	Andrea Righi
	On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	hugetlb: introduce pud_huge	Andi Kleen
	Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	hugetlb: modular state for hugetlb page size	Andi Kleen
	The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24	mm: move bootmem descriptors definition to a single place	Johannes Weiner
	There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-23	sparc64: Fix cpufreq notifier registry.	David S. Miller
	Based upon a report by Daniel Smolik. We do it too early, which triggers a BUG in cpufreq_register_notifier(). Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-23	sparc64: Fix lockdep issues in LDC protocol layer.	David S. Miller
	We're calling request_irq() with a IRQs disabled. No straightforward fix exists because we want to enable these IRQs and setup state atomically before getting into the IRQ handler the first time. What happens now is that we mark the VIRQ to not be automatically enabled by request_irq(). Then we make explicit enable_irq() calls when we grab the LDC channel. This way we don't need to call request_irq() illegally under the LDC channel lock any more. Bump LDC version and release date. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus	Linus Torvalds
	* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: remove CONFIG_KMOD from core kernel code remove CONFIG_KMOD from lib remove CONFIG_KMOD from sparc64 rework try_then_request_module to do less in non-modular kernels remove mention of CONFIG_KMOD from documentation make CONFIG_KMOD invisible modules: Take a shortcut for checking if an address is in a module module: turn longs into ints for module sizes Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs module: reorder struct module to save space on 64 bit builds module: generic each_symbol iterator function module: don't use stop_machine for waiting rmmod
2008-07-22	remove CONFIG_KMOD from sparc64	Johannes Berg
	One place is just a comment, the other a conditional, unused inclusion of linux/kmod.h. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22	sparc64: fix up bus_id changes in sparc core code	Greg Kroah-Hartman
	This converts all instances of bus_id in the sparc core kernel to use either dev_set_name(), or dev_name() depending on the need. This is done in anticipation of removing the bus_id field from struct driver. Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-22	sysdev: Pass the attribute to the low level sysdev show/store function	Andi Kleen
	This allow to dynamically generate attributes and share show/store functions between attributes. Right now most attributes are generated by special macros and lots of duplicated code. With the attribute passed it's instead possible to attach some data to the attribute and then use that in shared low level functions to do different things. I need this for the dynamically generated bank attributes in the x86 machine check code, but it'll allow some further cleanups. I converted all users in tree to the new show/store prototype. It's a single huge patch to avoid unbisectable sections. Runtime tested: x86-32, x86-64 Compiled only: ia64, powerpc Not compile tested/only grep converted: sh, arm, avr32 Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-22	driver core: remove KOBJ_NAME_LEN define	Kay Sievers
	Kobjects do not have a limit in name size since a while, so stop pretending that they do. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-18	Merge branch 'linus' into timers/nohz	Ingo Molnar

2008-07-18	nohz: prevent tick stop outside of the idle loop	Thomas Gleixner
	Jack Ren and Eric Miao tracked down the following long standing problem in the NOHZ code: scheduler switch to idle task enable interrupts Window starts here ----> interrupt happens (does not set NEED_RESCHED) irq_exit() stops the tick ----> interrupt happens (does set NEED_RESCHED) return from schedule() cpu_idle(): preempt_disable(); Window ends here The interrupts can happen at any point inside the race window. The first interrupt stops the tick, the second one causes the scheduler to rerun and switch away from idle again and we end up with the tick disabled. The fact that it needs two interrupts where the first one does not set NEED_RESCHED and the second one does made the bug obscure and extremly hard to reproduce and analyse. Kudos to Jack and Eric. Solution: Limit the NOHZ functionality to the idle loop to make sure that we can not run into such a situation ever again. cpu_idle() { preempt_disable(); while(1) { tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we are in the idle loop while (!need_resched()) halt(); tick_nohz_restart_sched_tick(); <- disables NOHZ mode preempt_enable_no_resched(); schedule(); preempt_disable(); } } In hindsight we should have done this forever, but ... /me grabs a large brown paperbag. Debugged-by: Jack Ren <jack.ren@marvell.com>, Debugged-by: eric miao <eric.y.miao@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>