summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-04-02samples/bpf: Add simple non-portable kprobe filter exampleAlexei Starovoitov
tracex1_kern.c - C program compiled into BPF. It attaches to kprobe:netif_receive_skb() When skb->dev->name == "lo", it prints sample debug message into trace_pipe via bpf_trace_printk() helper function. tracex1_user.c - corresponding user space component that: - loads BPF program via bpf() syscall - opens kprobes:netif_receive_skb event via perf_event_open() syscall - attaches the program to event via ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd); - prints from trace_pipe Note, this BPF program is non-portable. It must be recompiled with current kernel headers. kprobe is not a stable ABI and BPF+kprobe scripts may no longer be meaningful when kernel internals change. No matter in what way the kernel changes, neither the kprobe, nor the BPF program can ever crash or corrupt the kernel, assuming the kprobes, perf and BPF subsystem has no bugs. The verifier will detect that the program is using bpf_trace_printk() and the kernel will print 'this is a DEBUG kernel' warning banner, which means that bpf_trace_printk() should be used for debugging of the BPF program only. Usage: $ sudo tracex1 ping-19826 [000] d.s2 63103.382648: : skb ffff880466b1ca00 len 84 ping-19826 [000] d.s2 63103.382684: : skb ffff880466b1d300 len 84 ping-19826 [000] d.s2 63104.382533: : skb ffff880466b1ca00 len 84 ping-19826 [000] d.s2 63104.382594: : skb ffff880466b1d300 len 84 Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1427312966-8434-7-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02tracing: Allow BPF programs to call bpf_trace_printk()Alexei Starovoitov
Debugging of BPF programs needs some form of printk from the program, so let programs call limited trace_printk() with %d %u %x %p modifiers only. Similar to kernel modules, during program load verifier checks whether program is calling bpf_trace_printk() and if so, kernel allocates trace_printk buffers and emits big 'this is debug only' banner. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1427312966-8434-6-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02tracing: Allow BPF programs to call bpf_ktime_get_ns()Alexei Starovoitov
bpf_ktime_get_ns() is used by programs to compute time delta between events or as a timestamp Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1427312966-8434-5-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02tracing, perf: Implement BPF programs attached to kprobesAlexei Starovoitov
BPF programs, attached to kprobes, provide a safe way to execute user-defined BPF byte-code programs without being able to crash or hang the kernel in any way. The BPF engine makes sure that such programs have a finite execution time and that they cannot break out of their sandbox. The user interface is to attach to a kprobe via the perf syscall: struct perf_event_attr attr = { .type = PERF_TYPE_TRACEPOINT, .config = event_id, ... }; event_fd = perf_event_open(&attr,...); ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd); 'prog_fd' is a file descriptor associated with BPF program previously loaded. 'event_id' is an ID of the kprobe created. Closing 'event_fd': close(event_fd); ... automatically detaches BPF program from it. BPF programs can call in-kernel helper functions to: - lookup/update/delete elements in maps - probe_read - wraper of probe_kernel_read() used to access any kernel data structures BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is architecture dependent) and return 0 to ignore the event and 1 to store kprobe event into the ring buffer. Note, kprobes are a fundamentally _not_ a stable kernel ABI, so BPF programs attached to kprobes must be recompiled for every kernel version and user must supply correct LINUX_VERSION_CODE in attr.kern_version during bpf_prog_load() call. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1427312966-8434-4-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02tracing: Add kprobe flagAlexei Starovoitov
add TRACE_EVENT_FL_KPROBE flag to differentiate kprobe type of tracepoints, since bpf programs can only be attached to kprobe type of PERF_TYPE_TRACEPOINT perf events. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1427312966-8434-3-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02bpf: Make internal bpf API independent of CONFIG_BPF_SYSCALL #ifdefsDaniel Borkmann
Socket filter code and other subsystems with upcoming eBPF support should not need to deal with the fact that we have CONFIG_BPF_SYSCALL defined or not. Having the bpf syscall as a config option is a nice thing and I'd expect it to stay that way for expert users (I presume one day the default setting of it might change, though), but code making use of it should not care if it's actually enabled or not. Instead, hide this via header files and let the rest deal with it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1427312966-8434-2-git-send-email-ast@plumgrid.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02Merge branch 'perf/timer' into perf/coreIngo Molnar
This WIP branch is now ready to be merged. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-01Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Fix 'perf script' pipe mode segfault, by always initializing ordered_events in perf_session__new(). (Arnaldo Carvalho de Melo) - Fix ppid for synthesized fork events. (David Ahern) - Fix kernel symbol resolution of callchains in S/390 by remembering the cpumode. (David Hildenbrand) Infrastructure changes: - Disable libbabeltrace check by default in the build system. (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-31perf ordered_samples: Remove references to perf_{evlist,tool} and machinesArnaldo Carvalho de Melo
As these can be obtained from the ordered_events pointer, via container_of, reducing the cross section of ordered_samples. These were added to ordered_samples in: commit b7b61cbebd789a3dbca522e3fdb727fe5c95593f Author: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Tue Mar 3 11:58:45 2015 -0300 perf ordered_events: Shorten function signatures By keeping pointers to machines, evlist and tool in ordered_events. But that was more a transitional patch while moving stuff out from perf_session.c to ordered_events.c and possibly not even needed by then, as we could use the container_of() method and instead of having the nr_unordered_samples stats in events_stats, we can have it in ordered_samples. Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4lk0t9js82g0tfc0x1onpkjt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-31perf session: Always initialize ordered_eventsArnaldo Carvalho de Melo
Even when it is not used to actually reorder events, some of its fields are used, like session->ordered_events->tool, to shorten function signatures where tool, for instance, was being passed, as the tool is needed for the ordered_events code, we need it there and might as well use it for other perf_session needs. This fixes a problem where 'perf script' had some condition that made session->ordered_events not to be initialized even with its script->tool ordered_events related flags asking for it to be, which looks like another bug and needs to be investigated further. Always initializing session->ordered_events at least leaves the current assumptions in place, so do it now. Reported-by: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-b1xxk0rwkz2a0gip1uufmjqg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-31perf tools: Fix ppid for synthesized fork eventsDavid Ahern
363b785f38 added synthesized fork events and set a thread's parent id to itself. Since we are already processing /proc/<pid>/status the ppid can be determined properly. Make it so. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Link: http://lkml.kernel.org/r/1427747758-18510-2-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-31perf tools: Refactor comm/tgid lookupDavid Ahern
Rather than parsing /proc/pid/status file one line at a time, read it into a buffer in one shot and search for all strings in one pass. tgid conversion also simplified -- removing the isspace walk. As noted by Arnaldo those are not needed for atoi == strtol calls. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Link: http://lkml.kernel.org/r/1427747758-18510-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-31perf callchain: Fix kernel symbol resolution by remembering the cpumodeDavid Hildenbrand
Commit 2e77784bb7d8 ("perf callchain: Move cpumode resolve code to add_callchain_ip") promised "No change in behavior.". As this commit breaks callchains on s390x (symbols not getting resolved, observed when profiling the kernel), this statement is wrong. The cpumode must be kept when iterating over all ips, otherwise the default (PERF_RECORD_MISC_USER) will be used by error. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1427703060-59883-1-git-send-email-dahi@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-30perf build: Disable libbabeltrace check by defaultJiri Olsa
Disabling libbabeltrace check by default and replacing the NO_LIBBABELTRACE make variable with LIBBABELTRACE. Users wanting the libbabeltrace feature need to build via: $ make LIBBABELTRACE=1 The reason for this is that the libababeltrace interface we use (version 1.3) hasn't been packaged/released yet, thus the failing feature check only slows down build and confuses other (non CTF) developers. Requested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jeremie Galarneau <jgalar@efficios.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20150328103030.GA8431@krava.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-27perf: Add per event clockid supportPeter Zijlstra
While thinking on the whole clock discussion it occurred to me we have two distinct uses of time: 1) the tracking of event/ctx/cgroup enabled/running/stopped times which includes the self-monitoring support in struct perf_event_mmap_page. 2) the actual timestamps visible in the data records. And we've been conflating them. The first is all about tracking time deltas, nobody should really care in what time base that happens, its all relative information, as long as its internally consistent it works. The second however is what people are worried about when having to merge their data with external sources. And here we have the discussion on MONOTONIC vs MONOTONIC_RAW etc.. Where MONOTONIC is good for correlating between machines (static offset), MONOTNIC_RAW is required for correlating against a fixed rate hardware clock. This means configurability; now 1) makes that hard because it needs to be internally consistent across groups of unrelated events; which is why we had to have a global perf_clock(). However, for 2) it doesn't really matter, perf itself doesn't care what it writes into the buffer. The below patch makes the distinction between these two cases by adding perf_event_clock() which is used for the second case. It further makes this configurable on a per-event basis, but adds a few sanity checks such that we cannot combine events with different clocks in confusing ways. And since we then have per-event configurability we might as well retain the 'legacy' behaviour as a default. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27Merge branch 'perf/core' into perf/timer, before applying new changesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27Merge branch 'timers/core' into perf/timer, to apply dependent patchIngo Molnar
An upcoming patch will depend on tai_ns() and NMI-safe ktime_get_raw_fast(), so merge timers/core here in a separate topic branch until it's all cooked and timers/core is merged upstream. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27perf: Fix racy group accessPeter Zijlstra
While looking at some fuzzer output I noticed that we do not hold any locks on leader->ctx and therefore the sibling_list iteration is unsafe. Acquire the relevant ctx->mutex before calling into the pmu specific code. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Link: http://lkml.kernel.org/r/20150225151639.GL5029@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27perf/x86: Remove redundant calls to perf_pmu_{dis|en}able()David Ahern
perf_pmu_disable() is called before pmu->add() and perf_pmu_enable() is called afterwards. No need to call these inside of x86_pmu_add() as well. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424281543-67335-1-git-send-email-dsahern@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27Merge branch 'perf/x86' into perf/core, because it's readyIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27Merge branch 'perf/urgent' into perf/core, to pick up fixes and to refresh ↵Ingo Molnar
the tree Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Add ktime_get_tai_ns()Peter Zijlstra
Because it was the only clock for which we didn't have a _ns() accessor yet. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Introduce tk_fast_rawPeter Zijlstra
Add the NMI safe CLOCK_MONOTONIC_RAW accessor.. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.562746929@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Parametrize all tk_fast_mono usersPeter Zijlstra
In preparation for more tk_fast instances, remove all hard-coded tk_fast_mono references. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.484279927@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Add timerkeeper::tkr_rawPeter Zijlstra
Introduce tkr_raw and make use of it. base_raw -> tkr_raw.base clock->{mult,shift} -> tkr_raw.{mult.shift} Kill timekeeping_get_ns_raw() in favour of timekeeping_get_ns(&tkr_raw), this removes all mono_raw special casing. Duplicate the updates to tkr_mono.cycle_last into tkr_raw.cycle_last, both need the same value. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.422589590@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27time: Rename timekeeper::tkr to timekeeper::tkr_monoPeter Zijlstra
In preparation of adding another tkr field, rename this one to tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base, since the structure is not specific to CLOCK_MONOTONIC and the mono name got added to the tk_read_base instance. Lots of trivial churn. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27perf/x86/intel: Add INST_RETIRED.ALL workaroundsAndi Kleen
On Broadwell INST_RETIRED.ALL cannot be used with any period that doesn't have the lowest 6 bits cleared. And the period should not be smaller than 128. This is erratum BDM11 and BDM55: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf BDM11: When using a period < 100; we may get incorrect PEBS/PMI interrupts and/or an invalid counter state. BDM55: When bit0-5 of the period are !0 we may get redundant PEBS records on overflow. Add a new callback to enforce this, and set it for Broadwell. How does this handle the case when an app requests a specific period with some of the bottom bits set? Short answer: Any useful instruction sampling period needs to be 4-6 orders of magnitude larger than 128, as an PMI every 128 instructions would instantly overwhelm the system and be throttled. So the +-64 error from this is really small compared to the period, much smaller than normal system jitter. Long answer (by Peterz): IFF we guarantee perf_event_attr::sample_period >= 128. Suppose we start out with sample_period=192; then we'll set period_left to 192, we'll end up with left = 128 (we truncate the lower bits). We get an interrupt, find that period_left = 64 (>0 so we return 0 and don't get an overflow handler), up that to 128. Then we trigger again, at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get an overflow). We increment with sample_period so we get left = 128. We fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an overflow). And on and on. So while the individual interrupts are 'wrong' we get then with interval=256,128 in exactly the right ratio to average out at 192. And this works for everything >=128. So the num_samples*fixed_period thing is still entirely correct +- 127, which is good enough I'd say, as you already have that error anyhow. So no need to 'fix' the tools, al we need to do is refuse to create INST_RETIRED:ALL events with sample_period < 128. Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Updated comments and changelog a bit. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27perf/x86/intel: Add Broadwell core supportAndi Kleen
Add Broadwell support for Broadwell to perf. The basic support is very similar to Haswell. We use the new cache event list added for Haswell earlier. The only differences are a few bits related to remote nodes. To avoid an extra, mostly identical, table these are patched up in the initialization code. The constraint list has one new event that needs to be handled over Haswell. Includes code and testing from Kan Liang. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-2-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27perf/x86/intel: Add new cache events table for HaswellAndi Kleen
Haswell offcore events are quite different from Sandy Bridge. Add a new table to handle Haswell properly. Note that the offcore bits listed in the SDM are not quite correct (this is currently being fixed). An uptodate list of bits is in the patch. The basic setup is similar to Sandy Bridge. The prefetch columns have been removed, as prefetch counting is not very reliable on Haswell. One L1 event that is not in the event list anymore has been also removed. - data reads do not include code reads (comparable to earlier Sandy Bridge tables) - data counts include speculative execution (except L1 write, dtlb, bpu) - remote node access includes both remote memory, remote cache, remote mmio. - prefetches are not included in the counts for consistency (different from Sandy Bridge, which includes prefetches in the remote node) Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Removed the HSM30 comments; we don't have them for SNB/IVB either. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1424225886-18652-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo) - Fix garbage output when intermixing syscalls from different threads in 'perf trace' (Arnaldo Carvalho de Melo) - Fix 'perf timechart' SIBGUS error on sparc64 (David Ahern) Infrastructure changes: - Set JOBS based on CPU or processor, making it work on SPARC, where /proc/cpuinfo has "CPU", not "processor" (David Ahern) - Zero should not be considered "not found" in libtraceevent's eval_flag() (Steven Rostedt) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Clean up the code a bitIngo Molnar
Trivial cleanups, to improve the readability of the generic sched_clock() code: - Improve and standardize comments - Standardize the coding style - Use vertical spacing where appropriate - etc. No code changed: md5: 19a053b31e0c54feaeff1492012b019a sched_clock.o.before.asm 19a053b31e0c54feaeff1492012b019a sched_clock.o.after.asm Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Avoid deadlock during read from NMIDaniel Thompson
Currently it is possible for an NMI (or FIQ on ARM) to come in and read sched_clock() whilst update_sched_clock() has locked the seqcount for writing. This results in the NMI handler locking up when it calls raw_read_seqcount_begin(). This patch fixes the NMI safety issues by providing banked clock data. This is a similar approach to the one used in Thomas Gleixner's 4396e058c52e("timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC"). Suggested-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1427397806-20889-6-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Remove redundant notrace from update functionDaniel Thompson
Currently update_sched_clock() is marked as notrace but this function is not called by ftrace. This is trivially fixed by removing the mark up. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1427397806-20889-5-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Remove suspend from clock_read_data()Daniel Thompson
Currently cd.read_data.suspended is read by the hotpath function sched_clock(). This variable need not be accessed on the hotpath. In fact, once it is removed, we can remove the conditional branches from sched_clock() and install a dummy read_sched_clock function to suspend the clock. The new master copy of the function pointer (actual_read_sched_clock) is introduced and is used for all reads of the clock hardware except those within sched_clock itself. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1427397806-20889-4-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Optimize cache line usageDaniel Thompson
Currently sched_clock(), a very hot code path, is not optimized to minimise its cache profile. In particular: 1. cd is not ____cacheline_aligned, 2. struct clock_data does not distinguish between hotpath and coldpath data, reducing locality of reference in the hotpath, 3. Some hotpath data is missing from struct clock_data and is marked __read_mostly (which more or less guarantees it will not share a cache line with cd). This patch corrects these problems by extracting all hotpath data into a separate structure and using ____cacheline_aligned to ensure the hotpath uses a single (64 byte) cache line. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1427397806-20889-3-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27timers, sched/clock: Match scope of read and write seqcountsDaniel Thompson
Currently the scope of the raw_write_seqcount_begin/end() in sched_clock_register() far exceeds the scope of the read section in sched_clock(). This gives the impression of safety during cursory review but achieves little. Note that this is likely to be a latent issue at present because sched_clock_register() is typically called before we enable interrupts, however the issue does risk bugs being needlessly introduced as the code evolves. This patch fixes the problem by increasing the scope of the read locking performed by sched_clock() to cover all data modified by sched_clock_register. We also improve clarity by moving writes to struct clock_data that do not impact sched_clock() outside of the critical section. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> [ Reworked it slightly to apply to tip/timers/core] Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1427397806-20889-2-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-26Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm refcounting fixes from Dave Airlie: "Here is the complete set of i915 bug/warn/refcounting fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/i915: Fixup legacy plane->crtc link for initial fb config drm/i915: Fix atomic state when reusing the firmware fb drm/i915: Keep ring->active_list and ring->requests_list consistent drm/i915: Don't try to reference the fb in get_initial_plane_config() drm: Fixup racy refcounting in plane_force_disable
2015-03-26Merge tag 'dm-4.0-fix-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Fix DM core device cleanup regression -- due to a latent race that was exposed by the bdi changes that were introduced during the 4.0 merge" * tag 'dm-4.0-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix add_disk() NULL pointer due to race with free_dev()
2015-03-26Merge tag 'linux-kselftest-4.0-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan. * tag 'linux-kselftest-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: Fix build failures when invoked from kselftest target
2015-03-26Merge tag 'drm-intel-fixes-2015-03-26' of ↵Dave Airlie
git://anongit.freedesktop.org/drm-intel into drm-fixes This should cover the final warnings in -rc5 with two more backports from our development branch (drm-intel-next-queued). They're the ones from Daniel and Damien, with references to the reports. This is on top of drm-fixes because of the dependency on the two earlier fixes not yet in Linus' tree. There's an additional regression fix from Chris. * tag 'drm-intel-fixes-2015-03-26' of git://anongit.freedesktop.org/drm-intel: drm/i915: Fixup legacy plane->crtc link for initial fb config drm/i915: Fix atomic state when reusing the firmware fb drm/i915: Keep ring->active_list and ring->requests_list consistent
2015-03-26Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of bug fixes for s390. The ftrace comile fix is quite large for a -rc6 release, but it would be nice to have it in 4.0" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/smp: reenable smt after resume s390/mm: limit STACK_RND_MASK for compat tasks s390/ftrace: fix compile error if CONFIG_KPROBES is disabled s390/cpum_sf: add diagnostic sampling event only if it is authorized
2015-03-26tools lib traceevent: Zero should not be considered "not found" in eval_flag()Steven Rostedt
Guilherme Cox found that: There is, however, a potential bug if there is an item with code zero that is not the first one in the symbol list, since eval_flag(..) returns 0 when it doesn't find anything. That is, if you have the following enums: enum { FOO_START = 0, FOO_GO = 1, FOO_END = 2 } and then have: __print_symbolic(foo, FOO_GO, "go", FOO_START, "start", FOO_END, "end") If none of the enums are known to pevent, then eval_flag() will return zero, and it will match it to the first item in the list, which would be FOO_GO, which is not zero. Luckily, in most cases, the first element would be zero, and the parsing would match out of sheer luck. Reported-by: Guilherme Cox <cox@computer.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20150324145813.0bfe95ba@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26perf trace: Fix syscall enter formatting bugArnaldo Carvalho de Melo
commit e596663ebb28a068f5cca57f83285b7b293a2c83 Author: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Fri Feb 13 13:22:21 2015 -0300 perf trace: Handle multiple threads better wrt syscalls being intermixed Introduced a bug where it considered the number of bytes output directly to the output file when formatting the syscall entry buffer that is stored to be finally printed at syscall exit, ending up leaving garbage at the start of syscalls that appeared while another syscall was being processed, in another thread. Fix it. Example of garbage in the output before this patch: 4280.102 ( 0.000 ms): lsmd/763 ... [continued]: select()) = 0 Timeout 4280.107 (275.250 ms): tuned/852 select(tvp: 0x7f41f7ffde50 ) ... 4280.109 ( 0.002 ms): lsmd/763 Xl�� ) = -10 4639.197 ( 0.000 ms): systemd-journa/542 ... [continued]: epoll_wait()) = 1 4639.202 (359.088 ms): lsmd/763 select(n: 6, inp: 0x7ffff21daad0, tvp: 0x7ffff21daac0) ... 4639.207 ( 0.005 ms): systemd-journa/542 Hn�� ) = 106 4639.221 ( 0.002 ms): systemd-journa/542 uname(name: 0x7ffdbaed8e00) = 0 4639.271 ( 0.008 ms): systemd-journa/542 ftruncate(fd: 11</run/log/journal/60cd52417cf440a4a80107518bbd3c20/system.journal>, length: 50331648) = 0 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-9ckfe8mvsedgkg6y80gz1ul8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26perf tools: Set JOBS based on CPU or processorDavid Ahern
Number of JOBS to use is set automatically to the number of processors found in /proc/cpuinfo. SPARC uses 'CPU' lines rather than 'processor'. Update the check in perf's Makefile to work for SPARC. Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1427213455-127249-1-git-send-email-david.ahern@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26perf: Bump max number of cpus to 1024David Ahern
SPARC based systems currently support up to 1024 cpus (e.g. T5-8). Allow perf to work on those systems. Signed-off-by: David Ahern <david.ahern@oracle.com> Link: http://lkml.kernel.org/r/1427213438-127216-1-git-send-email-david.ahern@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26perf evlist: Return the first evsel with an invalid filter in apply_filters()Arnaldo Carvalho de Melo
Use of a bad filter currently generates the message: Error: failed to set filter with 22 (Invalid argument) Add the event name to make it clear to which event the filter failed to apply: Error: Failed to set filter "foo" on event sched:sg_lb_stats: 22: Invalid argument To test it use something like: # perf record -e sched:sched_switch -e sched:*fork --filter parent_pid==1 -e sched:*wait* --filter bla usleep 1 Error: failed to set filter "bla" on event sched:sched_stat_iowait with 22 (Invalid argument) # Based-on-a-patch-by: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-d7gq2fjvaecozp9o2i0siifu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26perf timechart: Fix SIBGUS error on sparc64David Ahern
perf timechart -T on sparc64 is terminating due to SIGBUS. Backtrace: Program received signal SIGBUS, Bus error. 0x0000000000173d7c in perf_evsel__intval (evsel=<value optimized out>, sample=0x7feffffda28, name=0x289b28 "prev_state") at util/evsel.c:1918 1918 util/evsel.c: No such file or directory. in util/evsel.c Missing separate debuginfos, use: debuginfo-install audit-libs-2.3.7-1.0.1.el6.sparc64 bzip2-libs-1.0.5-7.el6_0.sparc64 elfutils-libelf-0.155-2.0.3.el6.sparc64 elfutils-libs-0.155-2.0.3.el6.sparc64 glibc-2.12-1.132.0.8.el6_5.sparc64 numactl-2.0.7-8.el6.sparc64 python-libs-2.6.6-52.0.2.el6.sparc64 slang-2.2.1-1.el6.sparc64 xz-libs-4.999.9-0.3.beta.20091007git.el6.sparc64 zlib-1.2.3-29.el6.sparc64 (gdb) bt 0 0x0000000000173d7c in perf_evsel__intval (evsel=<value optimized out>, sample=0x7feffffda28, name=0x289b28 "prev_state") at util/evsel.c:1918 1 0x0000000000123b94 in process_sample_sched_switch (tchart=0x7feffffe040, evsel=0x4ca850, sample=0x7feffffda28, backtrace=0xc39010 "") at builtin-timechart.c:627 2 0x0000000000122828 in process_sample_event (tool=0x7feffffe040, event=<value optimized out>, sample=0x7feffffda28, evsel=0x4ca850, machine=0x4c9c88) at builtin-timechart.c:569 Another extended load on unaligned pointer. As before fix by copying to a temporary variable using memcpy. Signed-off-by: David Ahern <david.ahern@oracle.com> Link: http://lkml.kernel.org/r/1427228049-51893-1-git-send-email-david.ahern@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-03-26drm/i915: Fixup legacy plane->crtc link for initial fb configDaniel Vetter
This is a very similar bug in the load detect code fixed in commit 9128b040eb774e04bc23777b005ace2b66ab2a85 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Mar 3 17:31:21 2015 +0100 drm/i915: Fix modeset state confusion in the load detect code But this time around it was the initial fb code that forgot to update the plane->crtc pointer. Otherwise it's the exact same bug, with the exact same restrains (any set_config call/ioctl that doesn't disable the pipe papers over the bug for free, so fairly hard to hit in normal testing). So if you want the full explanation just go read that one over there - it's rather long ... Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [Jani: backported to drm-intel-fixes for v4.0-rc] Reference: http://mid.gmane.org/CA+5PVA7ChbtJrknqws1qvZcbrg1CW2pQAFkSMURWWgyASRyGXg@mail.gmail.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-03-26drm/i915: Fix atomic state when reusing the firmware fbDamien Lespiau
Right now, we get a warning when taking over the firmware fb: [drm:drm_atomic_plane_check] FB set but no CRTC with the following backtrace: [<ffffffffa010339d>] drm_atomic_check_only+0x35d/0x510 [drm] [<ffffffffa0103567>] drm_atomic_commit+0x17/0x60 [drm] [<ffffffffa00a6ccd>] drm_atomic_helper_plane_set_property+0x8d/0xd0 [drm_kms_helper] [<ffffffffa00f1fed>] drm_mode_plane_set_obj_prop+0x2d/0x90 [drm] [<ffffffffa00a8a1b>] restore_fbdev_mode+0x6b/0xf0 [drm_kms_helper] [<ffffffffa00aa969>] drm_fb_helper_restore_fbdev_mode_unlocked+0x29/0x80 [drm_kms_helper] [<ffffffffa00aa9e2>] drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper] [<ffffffffa050a71a>] intel_fbdev_set_par+0x1a/0x60 [i915] [<ffffffff813ad444>] fbcon_init+0x4f4/0x580 That's because we update the plane state with the fb from the firmware, but we never associate the plane to that CRTC. We don't quite have the full DRM take over from HW state just yet, so fake enough of the plane atomic state to pass the checks. v2: Fix the state on which we set the CRTC in the case we're sharing the initial fb with another pipe. (Matt) Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [Jani: backported to drm-intel-fixes for v4.0-rc] Reference: http://mid.gmane.org/CA+5PVA7yXH=U757w8V=Zj2U1URG4nYNav20NpjtQ4svVueyPNw@mail.gmail.com Reference: http://lkml.kernel.org/r/CA+55aFweWR=nDzc2Y=rCtL_H8JfdprQiCimN5dwc+TgyD4Bjsg@mail.gmail.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-03-26drm/i915: Keep ring->active_list and ring->requests_list consistentChris Wilson
If we retire requests last, we may use a later seqno and so clear the requests lists without clearing the active list, leading to confusion. Hence we should retire requests first for consistency with the early return. The order used to be important as the lifecycle for the object on the active list was determined by request->seqno. However, the requests themselves are now reference counted removing the constraint from the order of retirement. Fixes regression from commit 1b5a433a4dd967b125131da42b89b5cc0d5b1f57 Author: John Harrison <John.C.Harrison@Intel.com> Date: Mon Nov 24 18:49:42 2014 +0000 drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed ' and a WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140() WARN_ON(!list_empty(&vm->active_list)) Identified by updating WATCH_LISTS: [drm:i915_verify_lists] *ERROR* blitter ring: active list not empty, but no requests WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230() WARN_ON(i915_verify_lists(ring->dev)) Note that this is only a problem in evict_vm where the following happens after a retire_request has cleaned out all requests, but not all active bo: - intel_ring_idle called from i915_gpu_idle notices that no requests are outstanding and immediately returns. - i915_gem_retire_requests_ring called from i915_gem_retire_requests also immediately returns when there's no request, still leaving the bo on the active list. - evict_vm hits the WARN_ON(!list_empty(&vm->active_list)) after evicting all active objects that there's still stuff left that shouldn't be there. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>