From 8f2f748b0656257153bcf0941df8d6060acc5ca6 Mon Sep 17 00:00:00 2001 From: "Srivatsa S. Bhat" Date: Thu, 23 Feb 2012 15:27:15 +0530 Subject: CPU hotplug, cpusets, suspend: Don't touch cpusets during suspend/resume Currently, during CPU hotplug, the cpuset callbacks modify the cpusets to reflect the state of the system, and this handling is asymmetric. That is, upon CPU offline, that CPU is removed from all cpusets. However when it comes back online, it is put back only to the root cpuset. This gives rise to a significant problem during suspend/resume. During suspend, we offline all non-boot cpus and during resume we online them back. Which means, after a resume, all cpusets (except the root cpuset) will be restricted to just one single CPU (the boot cpu). But the whole point of suspend/resume is to restore the system to a state which is as close as possible to how it was before suspend. So to fix this, don't touch cpusets during suspend/resume. That is, modify the cpuset-related CPU hotplug callback to just ignore CPU hotplug when it is initiated as part of the suspend/resume sequence. Reported-by: Prashanth Nageshappa Signed-off-by: Srivatsa S. Bhat Cc: Linus Torvalds Cc: Andrew Morton Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/4F460D7B.1020703@linux.vnet.ibm.com Signed-off-by: Peter Zijlstra Signed-off-by: Ingo Molnar diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b342f57..33a0676 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6728,7 +6728,7 @@ int __init sched_create_sysfs_power_savings_entries(struct device *dev) static int cpuset_cpu_active(struct notifier_block *nfb, unsigned long action, void *hcpu) { - switch (action & ~CPU_TASKS_FROZEN) { + switch (action) { case CPU_ONLINE: case CPU_DOWN_FAILED: cpuset_update_active_cpus(); @@ -6741,7 +6741,7 @@ static int cpuset_cpu_active(struct notifier_block *nfb, unsigned long action, static int cpuset_cpu_inactive(struct notifier_block *nfb, unsigned long action, void *hcpu) { - switch (action & ~CPU_TASKS_FROZEN) { + switch (action) { case CPU_DOWN_PREPARE: cpuset_update_active_cpus(); return NOTIFY_OK; -- cgit v0.10.2 From 30ce2f7eef095d1b8d070740f1948629814fe3c7 Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Tue, 28 Feb 2012 10:19:38 +0900 Subject: perf/hwbp: Fix a possible memory leak If kzalloc() for TYPE_DATA failed on a given cpu, previous chunk of TYPE_INST will be leaked. Fix it. Thanks to Peter Zijlstra for suggesting this better solution. It should work as long as the initial value of the region is all 0's and that's the case of static (per-cpu) memory allocation. Signed-off-by: Namhyung Kim Acked-by: Frederic Weisbecker Acked-by: Peter Zijlstra Cc: Paul Mackerras Cc: Arnaldo Carvalho de Melo Link: http://lkml.kernel.org/r/1330391978-28070-1-git-send-email-namhyung.kim@lge.com Signed-off-by: Ingo Molnar diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c index b7971d6..ee706ce 100644 --- a/kernel/events/hw_breakpoint.c +++ b/kernel/events/hw_breakpoint.c @@ -651,10 +651,10 @@ int __init init_hw_breakpoint(void) err_alloc: for_each_possible_cpu(err_cpu) { - if (err_cpu == cpu) - break; for (i = 0; i < TYPE_MAX; i++) kfree(per_cpu(nr_task_bp_pinned[i], cpu)); + if (err_cpu == cpu) + break; } return -ENOMEM; -- cgit v0.10.2 From 30e68bcc67e41ab6dab4e4e1efc7ea8ca893c0af Mon Sep 17 00:00:00 2001 From: Namhyung Kim Date: Mon, 20 Feb 2012 10:47:26 +0900 Subject: perf evlist: Return first evsel for non-sample event on old kernel On old kernels that don't support sample_id_all feature, perf_evlist__id2evsel() returns NULL for non-sampling events. This breaks perf top when multiple events are given on command line. Fix it by using first evsel in the evlist. This will also prevent getting the same (potential) problem in such new tool/ old kernel combo. Suggested-by: Arnaldo Carvalho de Melo Cc: Ingo Molnar Cc: Paul Mackerras Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1329702447-25045-1-git-send-email-namhyung.kim@lge.com Signed-off-by: Namhyung Kim Signed-off-by: Arnaldo Carvalho de Melo diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 3f16e08..ea32a06 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -349,6 +349,10 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id) hlist_for_each_entry(sid, pos, head, node) if (sid->id == id) return sid->evsel; + + if (!perf_evlist__sample_id_all(evlist)) + return list_entry(evlist->entries.next, struct perf_evsel, node); + return NULL; } -- cgit v0.10.2 From 26b7952494772f0e695271fbd6cf83a852f60f25 Mon Sep 17 00:00:00 2001 From: Prashanth Nageshappa Date: Fri, 24 Feb 2012 13:11:39 +0530 Subject: perf probe: Ensure offset provided is not greater than function length The perf probe command allows kprobe to be inserted at any offset from a function start, which results in adding kprobes to unintended location. Example: perf probe do_fork+10000 is allowed even though size of do_fork is ~904. This patch will ensure probe addition fails when the offset specified is greater than size of the function. Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Andrew Morton Cc: Jason Baron Cc: Masami Hiramatsu Link: http://lkml.kernel.org/r/4F473F33.4060409@linux.vnet.ibm.com Signed-off-by: Prashanth Nageshappa Signed-off-by: Arnaldo Carvalho de Melo diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c index 5d73262..74bd2e6 100644 --- a/tools/perf/util/probe-finder.c +++ b/tools/perf/util/probe-finder.c @@ -672,7 +672,7 @@ static int find_variable(Dwarf_Die *sc_die, struct probe_finder *pf) static int convert_to_trace_point(Dwarf_Die *sp_die, Dwarf_Addr paddr, bool retprobe, struct probe_trace_point *tp) { - Dwarf_Addr eaddr; + Dwarf_Addr eaddr, highaddr; const char *name; /* Copy the name of probe point */ @@ -683,6 +683,16 @@ static int convert_to_trace_point(Dwarf_Die *sp_die, Dwarf_Addr paddr, dwarf_diename(sp_die)); return -ENOENT; } + if (dwarf_highpc(sp_die, &highaddr) != 0) { + pr_warning("Failed to get end address of %s\n", + dwarf_diename(sp_die)); + return -ENOENT; + } + if (paddr > highaddr) { + pr_warning("Offset specified is greater than size of %s\n", + dwarf_diename(sp_die)); + return -EINVAL; + } tp->symbol = strdup(name); if (tp->symbol == NULL) return -ENOMEM; -- cgit v0.10.2 From cfbd70c17c4535e64be92ea442a2a45078a18184 Mon Sep 17 00:00:00 2001 From: David Ahern Date: Fri, 24 Feb 2012 12:31:38 -0700 Subject: perf tools: Ensure comm string is properly terminated If threads in a multi-threaded process have names shorter than the main thread the comm for the named threads is not properly terminated. E.g., for the process 'namedthreads' where each thread is named noploop%d where %d is the thread number: Before: perf script -f comm,tid,ip,sym,dso noploop:4ads 21616 400a49 noploop (/tmp/namedthreads) The 'ads' in the thread comm bleeds over from the process name. After: perf script -f comm,tid,ip,sym,dso noploop:4 21616 400a49 noploop (/tmp/namedthreads) Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1330111898-68071-1-git-send-email-dsahern@gmail.com Signed-off-by: David Ahern Signed-off-by: Arnaldo Carvalho de Melo diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 2044324..2a6f33c 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -74,6 +74,7 @@ static pid_t perf_event__get_comm_tgid(pid_t pid, char *comm, size_t len) if (size >= len) size = len - 1; memcpy(comm, name, size); + comm[size] = '\0'; } else if (memcmp(bf, "Tgid:", 5) == 0) { char *tgids = bf + 5; -- cgit v0.10.2 From 1c1bc9223387dacc48eb2b61b0baabe7e9cf47f6 Mon Sep 17 00:00:00 2001 From: Prashanth Nageshappa Date: Tue, 28 Feb 2012 09:43:01 +0530 Subject: perf probe: Ensure offset provided is not greater than function length without DWARF info too The 'perf probe' command allows kprobe to be inserted at any offset from a function start, which results in adding kprobes to unintended location. (example: perf probe do_fork+10000 is allowed even though size of do_fork is ~904). My previous patch https://lkml.org/lkml/2012/2/24/42 addressed the case where DWARF info was available for the kernel. This patch fixes the case where perf probe is used on a kernel without debuginfo available. Acked-by: Masami Hiramatsu Cc: Ananth N Mavinakayanahalli Cc: Jason Baron Cc: Masami Hiramatsu Cc: Srikar Dronamraju Cc: Steven Rostedt Cc: Andrew Morton Link: http://lkml.kernel.org/r/4F4C544D.1010909@linux.vnet.ibm.com Signed-off-by: Prashanth Nageshappa Signed-off-by: Arnaldo Carvalho de Melo diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c index 29cb654..e33554a 100644 --- a/tools/perf/util/probe-event.c +++ b/tools/perf/util/probe-event.c @@ -1867,6 +1867,12 @@ static int convert_to_probe_trace_events(struct perf_probe_event *pev, tev->point.symbol); ret = -ENOENT; goto error; + } else if (tev->point.offset > sym->end - sym->start) { + pr_warning("Offset specified is greater than size of %s\n", + tev->point.symbol); + ret = -ENOENT; + goto error; + } return 1; -- cgit v0.10.2 From 847854f5988a04fe7e02d2fdd4fa0df9f96360fe Mon Sep 17 00:00:00 2001 From: Tejun Heo Date: Wed, 29 Feb 2012 05:56:21 +0900 Subject: memblock: Fix size aligning of memblock_alloc_base_nid() memblock allocator aligns @size to @align to reduce the amount of fragmentation. Commit: 7bd0b0f0da ("memblock: Reimplement memblock allocation using reverse free area iterator") Broke it by incorrectly relocating @size aligning to memblock_find_in_range_node(). As the aligned size is not propagated back to memblock_alloc_base_nid(), the actually reserved size isn't aligned. While this increases memory use for memblock reserved array, this shouldn't cause any critical failure; however, it seems that the size aligning was hiding a use-beyond-allocation bug in sparc64 and losing the aligning causes boot failure. The underlying problem is currently being debugged but this is a proper fix in itself, it's already pretty late in -rc cycle for boot failures and reverting the change for debugging isn't difficult. Restore the size aligning moving it to memblock_alloc_base_nid(). Reported-by: Meelis Roos Signed-off-by: Tejun Heo Cc: David S. Miller Cc: Grant Likely Cc: Rob Herring Cc: Linus Torvalds Cc: Andrew Morton Link: http://lkml.kernel.org/r/20120228205621.GC3252@dhcp-172-17-108-109.mtv.corp.google.com Signed-off-by: Ingo Molnar LKML-Reference: diff --git a/mm/memblock.c b/mm/memblock.c index 77b5f22..99f2855 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -99,9 +99,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start, phys_addr_t this_start, this_end, cand; u64 i; - /* align @size to avoid excessive fragmentation on reserved array */ - size = round_up(size, align); - /* pump up @end */ if (end == MEMBLOCK_ALLOC_ACCESSIBLE) end = memblock.current_limit; @@ -731,6 +728,9 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size, { phys_addr_t found; + /* align @size to avoid excessive fragmentation on reserved array */ + size = round_up(size, align); + found = memblock_find_in_range_node(0, max_addr, size, align, nid); if (found && !memblock_reserve(found, size)) return found; -- cgit v0.10.2