summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2013-02-08uprobes: Kill UPROBE_RUN_HANDLER flagOleg Nesterov
Simply remove UPROBE_RUN_HANDLER and the corresponding code. It can only help if uprobe has a single consumer, and in fact it is no longer needed after handler_chain() was changed to use ->register_rwsem, we simply can not race with uprobe_register(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Change filter_chain() to iterate ->consumers listOleg Nesterov
Now that it safe to use ->consumer_rwsem under ->mmap_sem we can almost finish the implementation of filter_chain(). It still lacks the actual uc->filter(...) call but othewrwise it is ready, just it pretends that ->filter() always returns true. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Introduce uprobe->register_rwsemOleg Nesterov
Introduce uprobe->register_rwsem. It is taken for writing around __uprobe_register/unregister. Change handler_chain() to use this sem rather than consumer_rwsem. The main reason for this change is that we have the nasty problem with mmap_sem/consumer_rwsem dependency. filter_chain() needs to protect uprobe->consumers like handler_chain(), but they can not use the same lock. filter_chain() can be called under ->mmap_sem (currently this is always true), but we want to allow ->handler() to play with the probed task's memory, and this needs ->mmap_sem. Alternatively we could use srcu, but synchronize_srcu() is very slow and ->register_rwsem allows us to do more. In particular, we can teach handler_chain() to do remove_breakpoint() if this bp is "nacked" by all consumers, we know that we can't race with the new consumer which does uprobe_register(). See also the next patches. uprobes_mutex[] is almost ready to die. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: _register() should always do register_for_each_vma(true)Oleg Nesterov
To support the filtering uprobe_register() should do register_for_each_vma(true) every time the new consumer comes, we need to install the previously nacked breakpoints. Note: - uprobes_mutex[] should die, what it actually protects is alloc_uprobe(). - UPROBE_RUN_HANDLER should die too, obviously it can't work unless uprobe has a single consumer. The consumer should serialize with _register/_unregister itself. Or this flag should live in uprobe_consumer->state. - Perhaps we can do some optimizations later. For example, if filter_chain() never returns false uprobe can record this fact and avoid the unnecessary register_for_each_vma(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: _unregister() should always do register_for_each_vma(false)Oleg Nesterov
uprobe_unregister() removes the breakpoints only if the last consumer goes away. To support the filtering it should do this every time, we want to remove the breakpoints which nobody else want to keep. Note: given that filter_chain() is not actually implemented, this patch itself doesn't change the behaviour yet, register_for_each_vma(false) is a heavy "nop" unless there are no more consumers. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Introduce filter_chain()Oleg Nesterov
Add the new helper filter_chain(). Currently it is only placeholder, the comment explains what is should do. We will change it later to consult every consumer to decide whether we need to install the swbp. Until then it works as if any consumer returns true, this matches the current behavior. Change install_breakpoint() to call filter_chain() instead of checking uprobe->consumers != NULL. We obviously need this, and this equally closes the race with _unregister(). Change remove_breakpoint() to call this helper too. Currently this is pointless because remove_breakpoint() is only called when the last consumer goes away, but we will change this. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Kill uprobe_consumer->filter()Oleg Nesterov
uprobe_consumer->filter() is pointless in its current form, kill it. We will add it back, but with the different signature/semantics. Perhaps we will even re-introduce the callsite in handler_chain(), but not to just skip uc->handler(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Kill the pointless inode/uc checks in register/unregisterOleg Nesterov
register/unregister verifies that inode/uc != NULL. For what? This really looks like "hide the potential problem", the caller should pass the valid data. register() also checks uc->next == NULL, probably to prevent the double-register but the caller can do other stupid/wrong things. If we do this check, then we should document that uc->next should be cleared before register() and add BUG_ON(). Also add the small comment about the i_size_read() check. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08uprobes: Move __set_bit(UPROBE_SKIP_SSTEP) into alloc_uprobe()Oleg Nesterov
Cosmetic. __set_bit(UPROBE_SKIP_SSTEP) is the part of initialization, it is not clear why it is set in insert_uprobe(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08timeconst.pl: Eliminate Perl warningH. Peter Anvin
defined(@array) is deprecated in Perl and gives off a warning. Restructure the code to remove that warning. [ hpa: it would be interesting to revert to the timeconst.bc script. It appears that the failures reported by akpm during testing of that script was due to a known broken version of make, not a problem with bc. The Makefile rules could probably be restructured to avoid the make bug, or it is probably old enough that it doesn't matter. ] Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org>
2013-02-07srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()Lai Jiangshan
The old SRCU implementation loads sp->completed within an RCU-sched section, courtesy of preempt_disable(). This was required due to the use of synchronize_sched() in the old implemenation's synchronize_srcu(). However, the new implementation does not rely on synchronize_sched(), so it in turn does not require the load of sp->completed and the ->c[] counter to be in a single preempt-disabled region of code. This commit therefore moves the sp->completed access outside of the preempt-disabled region and applies ACCESS_ONCE(). The resulting code is almost as the same as before, but it removes the now-misleading rcu_dereference_index_check() call. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07srcu: Update synchronize_srcu_expedited()'s commentsLai Jiangshan
Because synchronize_srcu_expedited() no longer uses synchronize_rcu_sched_expedited(), synchronize_srcu_expedited() no longer indirectly acquires any CPU-hotplug-related locks. This commit therefore updates the comments accordingly. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07srcu: Update synchronize_srcu()'s commentsLai Jiangshan
The core of SRCU is changed, but synchronize_srcu()'s comments describe the old algorithm. This commit therefore updates them to match the new algorithm. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07srcu: Simple cleanup for cleanup_srcu_struct()Lai Jiangshan
Pack six lines of code into two lines. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07srcu: Add might_sleep() annotation to synchronize_srcu()Lai Jiangshan
Although synchronize_srcu() can sleep, it will not sleep if the fast path succeeds, which means that illegal use of synchronize_rcu() might go unnoticed. This commit therefore adds might_sleep(), which unconditionally catches illegal use of synchronize_rcu() from atomic context. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07srcu: Simplify __srcu_read_unlock() via this_cpu_dec()Lai Jiangshan
This commit replaces disabling of preemption and decrement of a per-CPU variable with this_cpu_dec(), which avoids preemption disabling on x86 and shortens the code on all platforms. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07sched/rt: Move rt specific bits into new header fileClark Williams
Move rt scheduler definitions out of include/linux/sched.h into new file include/linux/sched/rt.h Signed-off-by: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07sched/rt: Add a tuning knob to allow changing SCHED_RR timesliceClark Williams
Add a /proc/sys/kernel scheduler knob named sched_rr_timeslice_ms that allows global changing of the SCHED_RR timeslice value. User visable value is in milliseconds but is stored as jiffies. Setting to 0 (zero) resets to the default (currently 100ms). Signed-off-by: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094704.13751796@riff.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07sched: Move sched.h sysctl bits into separate headerClark Williams
Move the sysctl-related bits from include/linux/sched.h into a new file: include/linux/sched/sysctl.h. Then update source files requiring access to those bits by including the new header file. Signed-off-by: Clark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094659.06dced96@riff.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-06Merge branch 'rcu/next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU build fixlet from Paul E. McKenney. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-05Merge tag 'full-dynticks-cputime-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core Pull full-dynticks (user-space execution is undisturbed and receives no timer IRQs) preparation changes that convert the cputime accounting code to be full-dynticks ready, from Frederic Weisbecker: "This implements the cputime accounting on full dynticks CPUs. Typical cputime stats infrastructure relies on the timer tick and its periodic polling on the CPU to account the amount of time spent by the CPUs and the tasks per high level domains such as userspace, kernelspace, guest, ... Now we are preparing to implement full dynticks capability on Linux for Real Time and HPC users who want full CPU isolation. This feature requires a cputime accounting that doesn't depend on the timer tick. To implement it, this new cputime infrastructure plugs into kernel/user/guest boundaries to take snapshots of cputime and flush these to the stats when needed. This performs pretty much like CONFIG_VIRT_CPU_ACCOUNTING except that context location and cputime snaphots are synchronized between write and read side such that the latter can safely retrieve the pending tickless cputime of a task and add it to its latest cputime snapshot to return the correct result to the user." Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-05sched: Fix signedness bug in yield_to()Dan Carpenter
In 7b270f6099 "sched: Bail out of yield_to when source and target runqueue has one task" we changed this to store -ESRCH so it needs to be signed. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kbuild@01.org Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <efault@gmx.de> Link: http://lkml.kernel.org/r/20130205113751.GA20521@elgon.mountain Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-05hrtimer: Prevent hrtimer_enqueue_reprogram raceLeonid Shatz
hrtimer_enqueue_reprogram contains a race which could result in timer.base switch during unlock/lock sequence. hrtimer_enqueue_reprogram is releasing the lock protecting the timer base for calling raise_softirq_irqsoff() due to a lock ordering issue versus rq->lock. If during that time another CPU calls __hrtimer_start_range_ns() on the same hrtimer, the timer base might switch, before the current CPU can lock base->lock again and therefor the unlock_timer_base() call will unlock the wrong lock. [ tglx: Added comment and massaged changelog ] Signed-off-by: Leonid Shatz <leonid.shatz@ravellosystems.com> Signed-off-by: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1359981217-389-1-git-send-email-izik.eidus@ravellosystems.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-04Merge branch 'nohz/printk-v8' into irq/coreFrederic Weisbecker
Conflicts: kernel/irq_work.c Add support for printk in full dynticks CPU. * Don't stop tick with irq works pending. This fix is generally useful and concerns archs that can't raise self IPIs. * Flush irq works before CPU offlining. * Introduce "lazy" irq works that can wait for the next tick to be executed, unless it's stopped. * Implement klogd wake up using irq work. This removes the ad-hoc printk_tick()/printk_needs_cpu() hooks and make it working even in dynticks mode. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2013-02-04Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three small fixlets" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Fix format string for 32-bit platforms sched: Fix warning in kernel/sched/fair.c sched/rt: Use root_domain of rt_rq not current processor
2013-02-04Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Three fixlets and two small (and low risk) hw-enablement changes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix event group context move x86/perf: Add IvyBridge EP support perf/x86: Fix P6 driver section warning arch/x86/tools/insn_sanity.c: Identify source of messages perf/x86: Enable Intel Lincroft/Penwell/Cloverview Atom support
2013-02-04Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull two small RCU fixlets from Ingo Molnar. * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Make rcu_nocb_poll an early_param instead of module_param rcu: Prevent soft-lockup complaints about no-CBs CPUs
2013-02-04rcu: Allow rcutorture to be built at low optimization levelsSteven Rostedt
The uses of trace_clock_local() are dead code when CONFIG_RCU_TRACE=n, but some compilers might nevertheless generate code calling this function. This commit therefore ensures that trace_clock_local() is invoked only when CONFIG_RCU_TRACE=y. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-02-04sched: Fix select_idle_sibling() bouncing cow syndromeMike Galbraith
If the previous CPU is cache affine and idle, select it. The current implementation simply traverses the sd_llc domain, taking the first idle CPU encountered, which walks buddy pairs hand in hand over the package, inflicting excruciating pain. 1 tbench pair (worst case) in a 10 core + SMT package: pre 15.22 MB/sec 1 procs post 252.01 MB/sec 1 procs Signed-off-by: Mike Galbraith <bitbucket@online.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1359371965.5783.127.camel@marge.simpson.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04Merge branch 'rcu/next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: 1. Changes to rcutorture and to RCU documentation. Posted to LKML at https://lkml.org/lkml/2013/1/26/188. 2. Enhancements to uniprocessor handling in tiny RCU. Posted to LKML at https://lkml.org/lkml/2013/1/27/2. 3. Tag RCU callbacks with grace-period number to simplify callback advancement. Posted to LKML at https://lkml.org/lkml/2013/1/26/203. 4. Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/1/26/204. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04irq_work: Remove return value from the irq_work_queue() functionanish kumar
As no one is using the return value of irq_work_queue(), so it is better to just make it void. Signed-off-by: anish kumar <anish198519851985@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [ Fix stale comments, remove now unnecessary __irq_work_queue() intermediate function ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1359925703-24304-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04Merge branch 'fortglx/3.9/time' of git://git.linaro.org/people/jstultz/linux ↵Thomas Gleixner
into timers/core Trivial conflict in arch/x86/Kconfig Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-03sched/rt: Further simplify pick_rt_task()Kirill Tkhai
Function next_prio() has been removed and pull_rt_task() is the only user of pick_next_highest_task_rt() at the moment. pull_rt_task is not interested in p->nr_cpus_allowed, its only interest is the fact that cpu is allowed to execute p. If nr_cpus_allowed == 1, cpu != task_cpu(p) and cpu is allowed then it means that task p is in the middle of the migration techniques; the task waits until it is moved by migration thread. So, lets pull it earlier. Signed-off-by: Kirill V Tkhai <tkhai@yandex.ru> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> CC: linux-rt-users <linux-rt-users@vger.kernel.org> Link: http://lkml.kernel.org/r/70871359644177@web16d.yandex.ru Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-03perf: Fix event group context moveJiri Olsa
When we have group with mixed events (hw/sw) we want to end up with group leader being in hw context. So if group leader is initialy sw event, we move all the events under hw context. The move is done for each event by removing it from its context and adding it back into proper one. As a part of the removal the event is automatically disabled, which is not what we want at this stage of creating groups. The fix is to initialize event state after removal from sw context. This fix resulted from the following discussion: http://thread.gmane.org/gmane.linux.kernel.perf.user/1144 Reported-by: Andreas Hollmann <hollmann@in.tum.de> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vince@deater.net> Link: http://lkml.kernel.org/r/1359714225-4231-1-git-send-email-jolsa@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-03Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core Pull tracing updated from Steve Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-01tracing: Init current_trace to nop_trace and remove NULL checksSteven Rostedt (Red Hat)
On early boot up, when the ftrace ring buffer is initialized, the static variable current_trace is initialized to &nop_trace. Before this initialization, current_trace is NULL and will never become NULL again. It is always reassigned to a ftrace tracer. Several places check if current_trace is NULL before it uses it, and this check is frivolous, because at the point in time when the checks are made the only way current_trace could be NULL is if ftrace failed its allocations at boot up, and the paths to these locations would probably not be possible. By initializing current_trace to &nop_trace where it is declared, current_trace will never be NULL, and we can remove all these checks of current_trace being NULL which never needed to be checked in the first place. Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-31clockevents: Add generic timer broadcast functionMark Rutland
Currently, the timer broadcast mechanism is defined by a function pointer on struct clock_event_device. As the fundamental mechanism for broadcast is architecture-specific, this means that clock_event_device drivers cannot be shared across multiple architectures. This patch adds an (optional) architecture-specific function for timer tick broadcast, allowing drivers which may require broadcast functionality to be shared across multiple architectures. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Will.Deacon@arm.com Cc: Marc.Zyngier@arm.com Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1358183124-28461-3-git-send-email-mark.rutland@arm.com Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-01-31clockevents: Add generic timer broadcast receiverMark Rutland
Currently the broadcast mechanism used for timers is abstracted by a function pointer on struct clock_event_device. As the fundamental mechanism for broadcast is architecture-specific, this ties each clock_event_device driver to a single architecture, even where the driver is otherwise generic. This patch adds a standard path for the receipt of timer broadcasts, so drivers and/or architecture backends need not manage redundant lists of timers for the purpose of routing broadcast timer ticks. [tglx: Made the implementation depend on the config switch as well ] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: linux-arm-kernel@lists.infradead.org Cc: nico@linaro.org Cc: Will.Deacon@arm.com Cc: Marc.Zyngier@arm.com Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/1358183124-28461-2-git-send-email-mark.rutland@arm.com Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-01-31sched/rt: Do not account zero delta_exec in update_curr_rt()Kirill Tkhai
There are several places of consecutive calls of dequeue_task_rt() and put_prev_task_rt() in the scheduler. For example, function rt_mutex_setprio() does it. The both calls lead to update_curr_rt(), the second of it receives zeroed delta_exec. The only effective action in this case is call of sched_rt_avg_update(), which can change rq->age_stamp and rq->rt_avg. But it is possible in case of ""floating"" rq->clock. This fact is not reasonable to be accounted. Another actions do nothing. Signed-off-by: Kirill V Tkhai <tkhai@yandex.ru> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> CC: linux-rt-users <linux-rt-users@vger.kernel.org> Link: http://lkml.kernel.org/r/931541359550236@web1g.yandex.ru Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "This is a collection of miscellaneous fixes, the most important one is the fix for the Samsung laptop bricking issue (auto-blacklisting the samsung-laptop driver); the efi_enabled() changes you see below are prerequisites for that fix. The other issues fixed are booting on OLPC XO-1.5, an UV fix, NMI debugging, and requiring CAP_SYS_RAWIO for MSR references, just as with I/O port references." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: samsung-laptop: Disable on EFI hardware efi: Make 'efi_enabled' a function to query EFI facilities smp: Fix SMP function call empty cpu mask race x86/msr: Add capabilities check x86/dma-debug: Bump PREALLOC_DMA_DEBUG_ENTRIES x86/olpc: Fix olpc-xo1-sci.c build errors arch/x86/platform/uv: Fix incorrect tlb flush all issue x86-64: Fix unwind annotations in recent NMI changes x86-32: Start out cr0 clean, disable paging before modifying cr3/4
2013-01-31Revert "console: implement lockdep support for console_lock"Dave Airlie
This reverts commit daee779718a319ff9f83e1ba3339334ac650bb22. I'll requeue this after the console locking fixes, so lockdep is useful again for people until fbcon is fixed. Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-01-30tracing: Make a snapshot feature available from userspaceHiraku Toyooka
Ftrace has a snapshot feature available from kernel space and latency tracers (e.g. irqsoff) are using it. This patch enables user applictions to take a snapshot via debugfs. Add "snapshot" debugfs file in "tracing" directory. snapshot: This is used to take a snapshot and to read the output of the snapshot. # echo 1 > snapshot This will allocate the spare buffer for snapshot (if it is not allocated), and take a snapshot. # cat snapshot This will show contents of the snapshot. # echo 0 > snapshot This will free the snapshot if it is allocated. Any other positive values will clear the snapshot contents if the snapshot is allocated, or return EINVAL if it is not allocated. Link: http://lkml.kernel.org/r/20121226025300.3252.86850.stgit@liselsia Cc: Jiri Olsa <jolsa@redhat.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> [ Fixed irqsoff selftest and also a conflict with a change that fixes the update_max_tr. ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-30tracing: Replace static old_tracer check of tracer nameHiraku Toyooka
Currently the trace buffer read functions use a static variable "old_tracer" for detecting if the current tracer changes. This was suitable for a single trace file ("trace"), but to add a snapshot feature that will use the same function for its file, a check against a static variable is not sufficient. To use the output functions for two different files, instead of storing the current tracer in a static variable, as the trace iterator descriptor contains a pointer to the original current tracer's name, that pointer can now be used to check if the current tracer has changed between different reads of the trace file. Link: http://lkml.kernel.org/r/20121226025252.3252.9276.stgit@liselsia Signed-off-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-30tracing: Use sched_clock_cpu for trace_clock_globalNamhyung Kim
For systems with an unstable sched_clock, all cpu_clock() does is enable/ disable local irq during the call to sched_clock_cpu(). And for stable systems they are same. trace_clock_global() already disables interrupts, so it can call sched_clock_cpu() directly. Link: http://lkml.kernel.org/r/1356576585-28782-2-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-30ring-buffer: Add stats field for amount read from trace ring bufferSteven Rostedt (Red Hat)
Add a stat about the number of events read from the ring buffer: # cat /debug/tracing/per_cpu/cpu0/stats entries: 39869 overrun: 870512 commit overrun: 0 bytes: 1449912 oldest event ts: 6561.368690 now ts: 6565.246426 dropped events: 0 read events: 112 <-- Added Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCKJohn Stultz
Jason pointed out the HAS_PERSISTENT_CLOCK name isn't quite accurate for the config, as some systems may have the persistent_clock in some cases, but not always. So change the config name to the more clear ALWAYS_USE_PERSISTENT_CLOCK. Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-01-29tracing/fgraph: Adjust fgraph depth before calling trace return callbackSteven Rostedt (Red Hat)
While debugging the virtual cputime with the function graph tracer with a max_depth of 1 (most common use of the max_depth so far), I found that I was missing kernel execution because of a race condition. The code for the return side of the function has a slight race: ftrace_pop_return_trace(&trace, &ret, frame_pointer); trace.rettime = trace_clock_local(); ftrace_graph_return(&trace); barrier(); current->curr_ret_stack--; The ftrace_pop_return_trace() initializes the trace structure for the callback. The ftrace_graph_return() uses the trace structure for its own use as that structure is on the stack and is local to this function. Then the curr_ret_stack is decremented which is what the trace.depth is set to. If an interrupt comes in after the ftrace_graph_return() but before the curr_ret_stack, then the called function will get a depth of 2. If max_depth is set to 1 this function will be ignored. The problem is that the trace has already been called, and the timestamp for that trace will not reflect the time the function was about to re-enter userspace. Calls to the interrupt will not be traced because the max_depth has prevented this. To solve this issue, the ftrace_graph_return() can safely be moved after the current->curr_ret_stack has been updated. This way the timestamp for the return callback will reflect the actual time. If an interrupt comes in after the curr_ret_stack update and ftrace_graph_return(), it will be traced. It may look a little confusing to see it within the other function, but at least it will not be lost. Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29tracing: Remove second iterator initializerJovi Zhang
The trace iterator is already initialized by trace_init_global_iter(), so there is no need to initialize it again. Link: http://lkml.kernel.org/r/CACV3sb+G1YnO6168JhY3dEadmJi58pA5-2cSZT8E0WVHJNFt9Q@mail.gmail.com Signed-off-by: Jovi Zhang <bookjovi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', ↵Paul E. McKenney
'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation. fixes.2013.01.26a: Miscellaneous fixes. tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to simplify callback advancement. tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.
2013-01-29rcu: Make rcutorture's shuffler task shuffle recently added tasksPaul E. McKenney
A number of kthreads have been added to rcutorture, but the shuffler task was not informed of them, and thus did not shuffle them. This commit therefore adds the requisite shuffling, and, while in the area fixes up some whitespace issues. However, the shuffling is intended to keep randomly selected CPUs idle, which means that the RCU priority boosting kthreads need to avoid waking up every jiffy. This commit also makes that fix. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>