2016-05-09  perf tests: Add test to check backward ring buffer  (Wang Nan)

This test checks reading from a backward ring buffer. Test result:

    # ~/perf test 'ring buffer'
    45: Test backward reading from ring buffer : Ok

The test case is a while loop which calls prctl(PR_SET_NAME) multiple times. Each prctl call should issue two events: one PERF_RECORD_SAMPLE and one PERF_RECORD_COMM. The first round creates a relatively large ring buffer (256 pages), which can hold all events; read from it and check the count of each type of event. The second round creates a small ring buffer (1 page), makes it overwritable and checks the correctness of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
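A minimal sketch of the kind of workload the test description implies (not the actual test source; the thread name and loop count are illustrative):

    #include <sys/prctl.h>

    /* Each iteration renames the thread, so the kernel emits a
     * PERF_RECORD_COMM event; the event the test opens on the prctl
     * syscall is expected to contribute one PERF_RECORD_SAMPLE per call. */
    static void testcase(int loops)
    {
            while (loops--)
                    prctl(PR_SET_NAME, "perf-test", 0, 0, 0);
    }
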
2016-05-09  perf tools: Support reading from backward ring buffer  (Wang Nan)

perf_evlist__mmap_read_backward() is introduced for reading a backward ring buffer. Since the direction for reading such a ring buffer is different from the direction the kernel writes to it, and since the user needs to fetch the most recent records from it, perf_evlist__mmap_read_catchup() is introduced to move the reading pointer to the end of the buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
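A hedged usage sketch of the two helpers named above; the exact signatures and the consumer function are assumptions based on the commit description:

    static void drain_backward(struct perf_evlist *evlist, int idx)
    {
            union perf_event *event;

            /* move the read pointer to the end of the buffer, i.e. to the
             * most recent data the kernel has written */
            perf_evlist__mmap_read_catchup(evlist, idx);

            /* then walk the records in the backward ring buffer */
            while ((event = perf_evlist__mmap_read_backward(evlist, idx)) != NULL)
                    handle_event(event);    /* handle_event() is a hypothetical consumer */
    }
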
2016-05-09  perf script: Fix incorrect python db-export error message  (Chris Phlipot)

The error message printed when attempting and failing to create the call path root incorrectly references the call return processor. This change fixes the message to properly reference the failure to create the call path root. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-09  perf stat: Scale values by unit before metrics  (Andi Kleen)
Scale values by unit before passing them to the metrics printing functions. This is needed for TopDown, because it needs to scale the slots correctly by pipeline width / SMTness. For existing metrics it shouldn't make any difference, as those generally use events that don't have any units. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
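A minimal sketch of the scaling step described above, assuming the event carries a unit multiplier; the names are illustrative, not the exact perf stat code:

    static double scaled_value(u64 raw_count, double unit_scale)
    {
            /* apply the event's unit multiplier before any metric math,
             * so e.g. TopDown slots arrive in the right units */
            return (double)raw_count * unit_scale;
    }
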
2016-05-09  perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support  (He Kuang)

There is no need to check for DWARF unwinding support when using the 'dwarf' callchain record method, as this only asks the kernel to collect stack dumps for later DWARF CFI processing, which can be done on another machine, where the support for DWARF unwinding does need to be present. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-07  Merge tag 'perf-core-for-mingo-20160506' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core  (Ingo Molnar)

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Fix ordering of kernel/user entries in 'caller' mode, where the kernel and user parts were being correctly inverted but kept in place wrt each other, i.e. 'callee' (k1, k2, u3, u4) became 'caller' (k2, k1, u4, u3) when it should be 'caller' (u4, u3, k2, k1) (Chris Phlipot)

- In 'perf trace' don't print the raw syscall args for a syscall that has no arguments, like gettid(). This was happening because just checking if the syscall args list is NULL may mean that there are no args (e.g.: gettid) or that there is no tracepoint info (e.g.: clone) (Arnaldo Carvalho de Melo)

- Add extra output of counter values with 'perf stat -vv' (Andi Kleen)

Infrastructure changes:

- Expose callchain db export via the python API (Chris Phlipot)

Code reorganization:

- Move some more syscall arg beautifiers from the 'perf trace' main file to separate files in tools/perf/trace/beauty/, to reduce the main file line count (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-06  perf trace: Move futex_op beautifier to tools/perf/trace/beauty/  (Arnaldo Carvalho de Melo)
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Move open_flags beautifier to tools/perf/trace/beauty/  (Arnaldo Carvalho de Melo)
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Move signum beautifier to tools/perf/trace/beauty/  (Arnaldo Carvalho de Melo)
To reduce the size of builtin-trace.c. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf stat: Add extra output of counter values with -vv  (Andi Kleen)
Add debug output of raw counter values per CPU when perf stat -v is specified, together with their cpu numbers. This is very useful to debug problems with per core counters, where we can normally only see aggregated values. v2: Make it depend on -vv, not -v Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf script: Update export-to-postgresql to support callchain export  (Chris Phlipot)

Update the export-to-postgresql.py script to support the newly introduced callchain export. Callchains are added into the existing call_paths table and can now be associated with samples when the "callchains" commandline option is used with the script. Ex.:

    $ perf script -s export-to-postgresql.py example_db all callchains

Includes the following changes to enable callchain export via the python export APIs:

- Add the "callchains" commandline option, which is used to enable callchain export by setting the perf_db_export_callchains global

- Add perf_db_export_callchains checks for call_path table creation and population.

- Add call_path_id to samples_table to conform with the new API

Example usage and output using a small test app:

test_app.c:

    volatile int x = 0;

    void inc_x_loop()
    {
            int i;
            for (i = 0; i < 100000000; i++)
                    x++;
    }

    void a() { inc_x_loop(); }

    void b() { inc_x_loop(); }

    int main()
    {
            a();
            b();
            return 0;
    }

Example usage:

    $ gcc -g -O0 test_app.c
    $ perf record --call-graph=dwarf ./a.out
    [ perf record: Woken up 77 times to write data ]
    [ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]
    $ perf script -s scripts/python/export-to-postgresql.py example_db all callchains
    $ psql example_db
    example_db=# SELECT
                     (SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
                     (SELECT name FROM symbols
                       WHERE id = (SELECT symbol_id from call_paths
                                    where id = cps.parent_id)) as parent_symbol,
                     sum(period) as event_count
                 FROM samples join call_paths as cps on call_path_id = cps.id
                 GROUP BY cps.id, evsel_id
                 ORDER BY event_count DESC
                 LIMIT 5;

          symbol      |      parent_symbol       | event_count
    ------------------+--------------------------+-------------
     inc_x_loop       | a                        |   734250982
     inc_x_loop       | b                        |   731028057
     unknown          | unknown                  |     1335858
     task_tick_fair   | scheduler_tick           |     1238842
     update_wall_time | tick_do_update_jiffies64 |      650373
    (5 rows)

The above data shows total "self time" in cycles for each call path that was sampled. It is intended to demonstrate how it accounts separately for the two ways to reach the "inc_x_loop" function (via "a" and "b"). Recursive common table expressions can be used to get cumulative time spent in a function as well, but that is beyond the scope of this basic example.

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf script: Expose usage of the callchain db export via the python api  (Chris Phlipot)

This change allows python scripts to utilize the recent changes to the db export api allowing the export of call_paths derived from sampled callchains. These call paths are also now associated with the samples from which they were derived.

- This feature is enabled by setting "perf_db_export_callchains" to true

- When enabled, samples that have callchain information will have the callchains exported via call_path_table

- The call_path_id field is added to sample_table to enable association of samples with the corresponding callchain stored in the call paths table. A call_path_id of 0 will be exported if there is no corresponding callchain.

- When "perf_db_export_callchains" and "perf_db_export_calls" are both set to True, the call path root data structure will be shared. This prevents duplication of the data and call path ids that would result from building two separate call path trees in memory.

- The call_return_processor structure definition was relocated to the header file to make its contents visible to db-export.c. This enables the sharing of call path trees between the two features, as mentioned above.

This change is visible to python scripts using the python db export api. The change is backwards compatible with scripts written against the previous API, assuming that the scripts model the sample_table function after the one in the export-to-postgresql.py script by allowing for additional arguments to be added in the future, i.e. using *x as the final argument of the sample_table function. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf script: Add call path id to exported sample in db export  (Chris Phlipot)

The exported sample now contains a reference to the call_path_id that represents its callchain. While callchains themselves are nice to have, being able to associate them with samples makes them much more useful, and can allow for such things as determining how much cumulative time is spent in a particular function. This information is normally possible to get from the call return processor. However, when doing normal sampling, call/return information is not available, thus necessitating the need for associating samples directly with call paths. This commit includes changes to the db-export layer to make this information available for subsequent patches in this change set, but by itself does not make any changes visible to the user. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf script: Enable db export to output sampled callchains  (Chris Phlipot)

This change enables the db export api to export callchains. This is accomplished by adding callchains obtained from samples to the call_path_root structure and exporting them via the current call path export API. While the current API does support exporting call paths, this is not supported when sampling. This commit addresses that missing feature by allowing the export of call paths when callchains are present in samples.

Summary:

- This feature is activated by initializing the call_path_root member inside the db_export structure to a non-null value.

- Callchains are resolved with thread__resolve_callchain() and then stored and exported by adding a call path under call path root.

- Symbol and DSO for each callchain node are exported via db_ids_from_al()

This commit puts in place infrastructure to be used by subsequent commits, and by itself, does not introduce any user-visible changes. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com [ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf tools: Refactor code to move call path handling out of thread-stack  (Chris Phlipot)

Move the call path handling code out of thread-stack.c and thread-stack.h to allow other components that are not part of thread-stack to create call paths.

Summary:

- Create call-path.c and call-path.h and add them to the build.

- Move all call path related code out of thread-stack.c and thread-stack.h and into call-path.c and call-path.h.

- A small subset of structures and functions are now visible through call-path.h, which is required for thread-stack.c to continue to compile.

This change is a prerequisite for subsequent patches in this change set and by itself contains no user-visible changes. Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf callchain: Fix incorrect ordering of entries  (Chris Phlipot)

The existing implementation of thread__resolve_callchain, under certain circumstances, can assemble callchain entries in the incorrect order. The callchain entries are resolved incorrectly for a sample when all of the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the sample.

The fix is accomplished by reversing the order in which thread__resolve_callchain_sample and unwind__get_entries are called when callchain_param.order is set to ORDER_CALLER. Unwind-specific code from thread__resolve_callchain is also moved into a new static function to improve readability of the fix.

How to reproduce the existing bug:

Modifying perf script to print call trees in the opposite order, or applying the remaining patches from this series and comparing the results output from export-to-postgresql.py, are the easiest ways to see the bug, however it can still be seen in current builds using perf report. Here is how I can reproduce the bug using perf report:

    # perf record --call-graph=dwarf stress -c 1 -t 5

When I run this command:

    # perf report --call-graph=flat,0,0,callee

This callchain, containing kernel (handle_irq_event, etc.) and userspace samples (__libc_start_main, etc.), is contained in the output, which looks correct (callee order):

    gen8_irq_handler
    handle_irq_event_percpu
    handle_irq_event
    handle_edge_irq
    handle_irq
    do_IRQ
    ret_from_intr
    __random
    rand
    0x558f2a04dded
    0x558f2a04c774
    __libc_start_main
    0x558f2a04dcd9

Now run this command using caller order:

    # perf report --call-graph=flat,0,0,caller

It is expected to see the exact reverse of the above when using caller order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom) in the output, but it is nowhere to be found. Instead you see this:

    ret_from_intr
    do_IRQ
    handle_irq
    handle_edge_irq
    handle_irq_event
    handle_irq_event_percpu
    gen8_irq_handler
    0x558f2a04dcd9
    __libc_start_main
    0x558f2a04c774
    0x558f2a04dded
    rand
    __random

Notice how internally the kernel symbols are reversed and the user space symbols are reversed, but the kernel symbols still appear above the user space symbols. If this patch is applied and perf report is re-run, you will see the expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom):

    0x558f2a04dcd9
    __libc_start_main
    0x558f2a04c774
    0x558f2a04dded
    rand
    __random
    ret_from_intr
    do_IRQ
    handle_irq
    handle_edge_irq
    handle_irq_event
    handle_irq_event_percpu
    gen8_irq_handler

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Do not print raw args list for syscalls with no args  (Arnaldo Carvalho de Melo)

The test to check if the arg format had been read from the syscall:sys_enter_name/format file was looking at the list of non-common fields, and if that is empty, it would think it had failed to read it, because it doesn't exist, for instance, for the clone() syscall. So instead, before dumping the raw syscall args list, check IS_ERR(sc->tp_format); if that is true, then an attempt was made to read the format file and failed, in which case dump the raw arg list values. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ls7pmdqb2xy9339vdburwvnk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
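A hedged fragment of the check described above; only sc->tp_format is named in the commit, the surrounding helper and arguments are illustrative:

    if (IS_ERR(sc->tp_format)) {
            /* The tracepoint format file was read and that read failed,
             * as opposed to a syscall that simply has no arguments:
             * fall back to dumping the raw argument values. */
            return print_raw_arg_values(sc, args);   /* hypothetical fallback helper */
    }
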
2016-05-06  Merge tag 'perf-core-for-mingo-20160505' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core  (Ingo Molnar)

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Order output of 'perf trace --summary' better: now the threads will appear in ascending order of number of events, and then, for each, in descending order of syscalls by the time spent in the syscalls, so that the last page produced can be the one about the most interesting thread straced, suggested by Milian Wolff (Arnaldo Carvalho de Melo)

- Do not show the runtime_ms for a thread when not collecting it, that is done so far only with 'perf trace --sched' (Arnaldo Carvalho de Melo)

- Fix kallsyms perf test on ppc64le (Naveen N. Rao)

Infrastructure changes:

- Move global variables related to presence of some keys in the sort order to a per hist struct, to allow code like the hists browser to work with multiple hists with different lists of columns (Jiri Olsa)

- Add support for generating bpf prologue in powerpc (Naveen N. Rao)

- Fix kprobe and kretprobe handling with kallsyms on ppc64le (Naveen N. Rao)

- evlist mmap changes, prep work for supporting reading backwards (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-06  perf evlist: Rename variable in perf_mmap__read()  (Wang Nan)

In perf_mmap__read(), give better names to the pointers. The original names 'old' and 'head' directly related to pointers in the ring buffer control page. For a backward ring buffer, the meaning of the 'head' pointer is not 'the first byte of free space', but 'the first byte of the last record'. To reduce confusion, rename 'old' to 'start' and 'head' to 'end': 'start' -> 'end' is the direction the records should be read in. Change the parameter order. Change 'overwrite' to 'check_messup': when reading from 'head', there is no need to check for messup for a backward ring buffer. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461723563-67451-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf evlist: Extract perf_mmap__read()  (Wang Nan)

Extract the event reader from perf_evlist__mmap_read() into perf_mmap__read(). A future commit will feed it with manually computed 'head' and 'old' pointers. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461723563-67451-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf symbols: Fix kallsyms perf test on ppc64le  (Naveen N. Rao)

ppc64le functions have a Global Entry Point (GEP) and a Local Entry Point (LEP). While placing a probe, we always prefer the LEP since it catches function calls through both the GEP and the LEP. In order to do this, we fix up the function entry points during ELF symbol table lookup to point to the LEPs. This works, but breaks 'perf test kallsyms' since the symbols loaded from the symbol table (pointing to the LEP) do not match the symbols in kallsyms. To fix this, we do not adjust all the symbols during symbol table load. Instead, we note down st_other in a newly introduced arch-specific member of the perf symbol structure, and later use this to adjust the probe trace point. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/6be7c2b17e370100c2f79dd444509df7929bdd3e.1460451721.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
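A hedged sketch of the two steps described above; the member name and the use of the standard ELF PPC64_LOCAL_ENTRY_OFFSET() helper are assumptions, not the exact perf code:

    /* at symbol-table load: remember st_other instead of adjusting the address */
    sym->arch_sym = elf_sym.st_other;           /* member name is illustrative */

    /* later, when placing the probe: derive the LEP from st_other */
    probe_addr = sym->start + PPC64_LOCAL_ENTRY_OFFSET(sym->arch_sym);
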
2016-05-06  perf powerpc: Fix kprobe and kretprobe handling with kallsyms on ppc64le  (Naveen N. Rao)
So far, we used to treat probe point offsets as being offset from the LEP. However, userspace applications (objdump/readelf) always show disassembly and offsets from the function GEP. This is confusing to the user as we will end up probing at an address different from what the user expects when looking at the function disassembly with readelf/objdump. Fix this by changing how we modify probe address with perf. If only the function name is provided, we assume the user needs the LEP. Otherwise, if an offset is specified, we assume that the user knows the exact address to probe based on function disassembly, and so we just place the probe from the GEP offset. Finally, kretprobe was also broken with kallsyms as we were trying to specify an offset. This patch also fixes that issue. Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/75df860aad8216bf4b9bcd10c6351ecc0e3dee54.1460451721.git.naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
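A hedged sketch of the probe-placement policy described above; the helper and its parameters are illustrative only:

    /* Bare function name: the user most likely wants the LEP, which catches
     * calls through both entry points.  An explicit offset is interpreted
     * relative to the GEP, matching what objdump/readelf display. */
    static u64 choose_probe_address(u64 gep, u64 lep, u64 offset, bool have_offset)
    {
            return have_offset ? gep + offset : lep;
    }
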
2016-05-06  perf hists: Move sort__has_comm into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_comm into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__has_thread into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_thread into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-7-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__has_socket into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_socket into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__has_dso into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_dso into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-5-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__has_sym into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_sym into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__has_parent into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__has_parent into struct perf_hpp_list. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf hists: Move sort__need_collapse into struct perf_hpp_list  (Jiri Olsa)

Now that we have sort dimensions private to struct hists, we need to make the dimension booleans hists-specific as well. Move sort__need_collapse into struct perf_hpp_list. Add a hists__has macro to easily access this info from a struct hists object. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1462276488-26683-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
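A hedged sketch of the accessor pattern this series introduces; the exact macro definition and call site in perf may differ slightly:

    #define hists__has(__h, __f) ((__h)->hpp_list->__f)

    /* callers test per-hists sort state instead of a global flag */
    if (hists__has(hists, need_collapse))
            root = &hists->entries_collapsed;
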
2016-05-06  perf tools powerpc: Add support for generating bpf prologue  (Naveen N. Rao)
Generalize existing macros to serve the purpose. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Wang Nan <wangnan0@huawei.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1462461799-17518-1-git-send-email-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Do not show the runtime_ms for a thread when not collecting it  (Arnaldo Carvalho de Melo)

That field is only updated when we use the "sched:sched_stat_runtime" tracepoint, and that is only done so far when we use the '--stat' command line option; without it we get just zeros, confusing the users:

Without this patch:

    # trace -a -s sleep 1
    <SNIP>
     qemu-system-x86 (9931), 468 events, 9.6%, 0.000 msec

       syscall      calls    total       min       avg       max      stddev
                             (msec)    (msec)    (msec)    (msec)        (%)
       ----------  ------ --------- --------- --------- --------- ------
       ppoll           98   982.374     0.000    10.024    29.983     12.65%
       write           34     0.401     0.005     0.012     0.027      5.49%
       ioctl          102     0.347     0.002     0.003     0.007      3.08%

     firefox (10871), 1856 events, 38.2%, 0.000 msec

       syscall      calls    total       min       avg       max      stddev
                             (msec)    (msec)    (msec)    (msec)        (%)
       ----------  ------ --------- --------- --------- --------- ------
       poll           395   934.873     0.000     2.367    17.120     11.51%
       recvmsg        395     0.988     0.001     0.003     0.021      4.20%
       read           106     0.460     0.002     0.004     0.007      3.17%
       futex           24     0.108     0.001     0.004     0.010     10.05%
       mmap             2     0.041     0.016     0.021     0.026     23.92%
       write            6     0.027     0.004     0.004     0.005      2.52%

After this patch that ', 0.000 msecs' gets suppressed when --stat is not in use.

Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p7emqrsw7900tdkg43v9l1e1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Sort syscalls stats by msecs in --summary  (Arnaldo Carvalho de Melo)

    # trace -a -s sleep 1
    <SNIP>
     Xorg (1965), 788 events, 19.0%, 0.000 msec

       syscall           calls    total       min       avg       max      stddev
                                  (msec)    (msec)    (msec)    (msec)        (%)
       ---------------  -------- --------- --------- --------- --------- ------
       select                 89   731.038     0.000     8.214   175.218     36.71%
       ioctl                  22     0.661     0.010     0.030     0.072     10.43%
       writev                 42     0.253     0.002     0.006     0.011      5.94%
       recvmsg                60     0.185     0.001     0.003     0.009      5.90%
       setitimer              60     0.127     0.001     0.002     0.006      6.14%
       read                   52     0.102     0.001     0.002     0.005      8.55%
       rt_sigprocmask         45     0.092     0.001     0.002     0.023     23.65%
       poll                   12     0.021     0.001     0.002     0.003      7.21%
       epoll_wait             12     0.019     0.001     0.002     0.002      2.71%

     firefox (10871), 1080 events, 26.1%, 0.000 msec

       syscall           calls    total       min       avg       max      stddev
                                  (msec)    (msec)    (msec)    (msec)        (%)
       ---------------  -------- --------- --------- --------- --------- ------
       poll                  240   979.562     0.000     4.082    17.132     11.33%
       recvmsg               240     0.532     0.001     0.002     0.007      3.69%
       read                   60     0.303     0.003     0.005     0.029      8.50%

Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-52kdkuyxihq0kvc0n2aalhay@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf trace: Sort summary output by number of events  (Arnaldo Carvalho de Melo)

    # trace -a -s sleep 1 |& grep events | tail
     gmain (1733), 34 events, 1.0%, 0.000 msec
     hexchat (9765), 46 events, 1.4%, 0.000 msec
     ssh (11109), 80 events, 2.4%, 0.000 msec
     sleep (32631), 81 events, 2.4%, 0.000 msec
     qemu-system-x86 (10021), 272 events, 8.2%, 0.000 msec
     Xorg (1965), 322 events, 9.7%, 0.000 msec
     SoftwareVsyncTh (10922), 366 events, 11.1%, 0.000 msec
     gnome-shell (2231), 446 events, 13.5%, 0.000 msec
     qemu-system-x86 (9931), 468 events, 14.1%, 0.000 msec
     firefox (10871), 1098 events, 33.2%, 0.000 msec
    [root@jouet ~]#

Suggested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ye4cnprhfeiq32ar4lt60dqs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06  perf tools: Add template for generating rbtree resort class  (Arnaldo Carvalho de Melo)

Sometimes we want to sort an existing rbtree by a different key; introduce a template for that, which needs only to be provided the rbtree root and the number of entries in it. To do that, a new rbtree will be created with extra space for each entry, where possibly pre-calculated keys will be stored to be used in the resort process and also later, when using the newly sorted rbtree. Please check the following two changesets to see it in use for resorting stats for threads and their syscalls in 'perf trace --summary'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l6e1q34lmf3wwdeewstyakg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
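A hedged sketch of the idea behind such a resort template (a generic illustration, not the actual template's macro names): an array of nr_entries nodes like the one below is allocated up front, filled while walking the source tree, and each node is then inserted into a second rbtree ordered by the pre-computed key.

    /* one node in the resorted tree: points back at the original entry and
     * carries a pre-computed key so comparisons stay cheap */
    struct resort_entry {
            struct rb_node  rb_node;
            void            *original;      /* entry in the source rbtree */
            u64             key;            /* pre-computed sort key */
    };
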
2016-05-06  perf machine: Introduce number of threads member  (Arnaldo Carvalho de Melo)
To be used, for instance, for pre-allocating an rb_tree array for sorting by other keys besides the current pid one. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ja0ifkwue7ttjhbwijn6g6eu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-05  perf/x86/intel/pt: Convert ACCESS_ONCE()s  (Alexander Shishkin)
This patch converts remaining ACCESS_ONCE() instances into READ_ONCE() and WRITE_ONCE() as appropriate. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461857746-31346-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
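A hedged before/after illustration of the conversion described above; the field names are illustrative:

    /* before */
    old_head = ACCESS_ONCE(buf->head);

    /* after */
    old_head = READ_ONCE(buf->head);
    WRITE_ONCE(buf->head, new_head);
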
2016-05-05  perf/x86/intel/pt: Export CPU frequency ratios needed by PT decoders  (Alexander Shishkin)

Intel PT decoders need access to various bits of timing related information to be able to correctly decode timing packets from a PT stream (MTC and CBR packets). This patch exports all the necessary bits as sysfs attributes for the sake of consistency:

* max_nonturbo_ratio: ratio between the invariant TSC and base clock;

* tsc_art_ratio: TSC to core crystal clock ratio (also available as CPUID.15H).

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/87zisdvibe.fsf@ashishki-desk.ger.corp.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/x86/intel/pt: Bypass PT vs. LBR exclusivity if the core supports it  (Alexander Shishkin)
Not all cores prevent using Intel PT and LBRs simultaneously, although most of them still do as of today. This patch adds an opt-in flag for such cores to disable mutual exclusivity between PT and LBR; also flip it on for Goldmont. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461857746-31346-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/arm: Special-case heterogeneous CPUs  (Mark Rutland)

Commit:

  26657848502b7847 ("perf/core: Verify we have a single perf_hw_context PMU")

forcefully prevents multiple PMUs from sharing perf_hw_context, as this generally doesn't make sense. It is a common bug for uncore PMUs to use perf_hw_context rather than perf_invalid_context, which this detects. However, systems exist with heterogeneous CPUs (and hence heterogeneous HW PMUs), for which sharing perf_hw_context is necessary, and possible in some limited cases.

To make this work we have to perform some gymnastics, as we did in these commits:

  66eb579e66ecfea5 ("perf: allow for PMU-specific event filtering")
  c904e32a69b7c779 ("arm: perf: filter unschedulable events")

To allow those systems to work, we must allow PMUs for heterogeneous CPUs to share perf_hw_context, though we must still disallow sharing otherwise to detect the common misuse of perf_hw_context. This patch adds a new PERF_PMU_CAP_HETEROGENEOUS_CPUS for this, updates the core logic to account for this, and makes use of it in the arm_pmu code that is used for systems with heterogeneous CPUs. Comments are added to make the rationale clear and hopefully avoid accidental abuse.

Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20160426103346.GA20836@leverpostej Signed-off-by: Ingo Molnar <mingo@kernel.org>
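A hedged sketch of how a heterogeneous CPU PMU driver might opt in to the new capability; the field usage is an assumption based on the description, not the exact arm_pmu code:

    /* a CPU PMU for one class of cores in a big.LITTLE-style system */
    cpu_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
    cpu_pmu->pmu.task_ctx_nr   = perf_hw_context;   /* sharing is now permitted */
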
2016-05-05  perf/core: Let userspace know if the PMU supports address filters  (Alexander Shishkin)
Export an additional common attribute for PMUs that support address range filtering to let the perf userspace identify such PMUs in a uniform way. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-8-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/x86/intel/pt: Add support for address range filtering in PT  (Alexander Shishkin)

Newer versions of Intel PT support address ranges, which can be used to define IP address range-based filters or TraceSTOP regions. The number of ranges is enumerated via CPUID. This patch implements PMU callbacks and related low-level code to allow filter validation, configuration and programming into the hardware. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-7-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/core: Introduce address range filtering  (Alexander Shishkin)

Many instruction tracing PMUs out there support address range-based filtering, which would, for example, generate trace data only for a given range of instruction addresses, which is useful for tracing individual functions, modules or libraries. Other PMUs may also utilize this functionality to allow filtering to or filtering out code at certain address ranges.

This patch introduces the interface for userspace to specify these filters and for the PMU drivers to apply these filters to hardware configuration. The user interface is an ASCII string that is passed via an ioctl() and specifies address ranges within certain object files or within the kernel. There is no special treatment for kernel modules yet, but it might be a worthy pursuit.

The PMU driver interface basically adds two extra callbacks to the PMU driver structure, one of which validates the filter configuration proposed by the user against what the hardware is actually capable of doing and the other one translates hardware-independent filter configuration into something that can be programmed into the hardware.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-6-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
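A hedged userspace sketch of setting such a filter via the existing PERF_EVENT_IOC_SET_FILTER ioctl; the exact filter-string syntax, path and range below are assumptions based on the description above, and event_fd is a previously opened perf event descriptor:

    #include <sys/ioctl.h>
    #include <linux/perf_event.h>

    /* trace only a 0x1000-byte range inside a given binary (illustrative) */
    const char *filter = "filter 0x401000/0x1000@/usr/bin/myapp";

    if (ioctl(event_fd, PERF_EVENT_IOC_SET_FILTER, filter) < 0)
            perror("PERF_EVENT_IOC_SET_FILTER");
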
2016-05-05  perf/core: Extend perf_event_aux_ctx() to optionally iterate through more events  (Alexander Shishkin)
Trace filtering code needs an iterator that can go through all events in a context, including inactive and filtered, to be able to update their filters' address ranges based on mmap or exec events. This patch changes perf_event_aux_ctx() to optionally do this. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-5-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/x86/intel/pt: Add IP filtering register/CPUID bits  (Alexander Shishkin)
New versions of Intel PT support address range-based filtering. Add the new registers, bit definitions and relevant CPUID bits. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-4-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/x86/intel/pt: Move PT specific MSR bit definitions to a private header  (Alexander Shishkin)
Nothing outside of the Intel PT driver should ever care about its MSR bits, so there is no reason to keep them in msr-index.h. This patch moves them to a pt-local header. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/core: Move set_filter() out of CONFIG_EVENT_TRACING  (Alexander Shishkin)
For instruction trace filtering, namely, for communicating filter definitions from userspace, I'd like to re-use the SET_FILTER code that the tracepoints are using currently. To that end, move the relevant code out from behind the CONFIG_EVENT_TRACING dependency. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1461771888-10409-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  Merge branch 'perf/urgent' into perf/core, to pick up fixes  (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-05  perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs  (Peter Zijlstra)
The new sanity check introduced by: 26657848502b ("perf/core: Verify we have a single perf_hw_context PMU") ... triggered on the AMD IOMMU driver. IOMMUs are not per logical CPU, they cannot have per-task counters. Fix it. Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: jroedel@suse.de Cc: suravee.suthikulpanit@amd.com Link: http://lkml.kernel.org/r/20160423224255.GB3430@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
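A hedged sketch of the kind of fix described: an uncore-like PMU declares that it has no per-task context (struct layout trimmed, names illustrative):

    static struct pmu perf_iommu_pmu = {
            .task_ctx_nr = perf_invalid_context,  /* uncore-like: no per-task counters */
            /* ... event_init, add, del, start, stop, read ... */
    };
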
2016-05-05  perf/x86: Add model numbers for Kabylake CPUs  (Andi Kleen)
Everything the same as Skylake, just new model numbers. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-01  Linux 4.6-rc6  (Linus Torvalds)