summaryrefslogtreecommitdiff
path: root/kernel/trace/trace.h
AgeCommit message (Collapse)Author
2014-07-18tracing: let user specify tracing_thresh after selecting function_graphStanislav Fomichev
Currently, tracing_thresh works only if we specify it before selecting function_graph tracer. If we do the opposite, tracing_thresh will change it's value, but it will not be applied. To fix it, we add update_thresh callback which is called whenever tracing_thresh is updated and for function_graph tracer we register handler which reinitializes tracer depending on tracing_thresh. Link: http://lkml.kernel.org/p/20140718111727.GA3206@stfomichev-desktop.yandex.net Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-10tracing: Fix check of ftrace_trace_arrays list_empty() checkSteven Rostedt (Red Hat)
The check that tests if ftrace_trace_arrays is empty in top_trace_array(), uses the .prev pointer: if (list_empty(ftrace_trace_arrays.prev)) instead of testing the variable itself: if (list_empty(&ftrace_trace_arrays)) Although it is technically correct, it is awkward and confusing. Use the proper method. Link: http://lkml.kernel.org/r/87oay1bas8.fsf@sejong.aot.lge.com Reported-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06tracing: Return error if ftrace_trace_arrays list is emptyYoshihiro YUNOMAE
ftrace_trace_arrays links global_trace.list. However, global_trace is not added to ftrace_trace_arrays if trace_alloc_buffers() failed. As the result, ftrace_trace_arrays becomes an empty list. If ftrace_trace_arrays is an empty list, current top_trace_array() returns an invalid pointer. As the result, the kernel can induce memory corruption or panic. Current implementation does not check whether ftrace_trace_arrays is empty list or not. So, in this patch, if ftrace_trace_arrays is empty list, top_trace_array() returns NULL. Moreover, this patch makes all functions calling top_trace_array() handle it appropriately. Link: http://lkml.kernel.org/p/20140605223517.32311.99233.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-21tracing: Add funcgraph_tail option to print function name after closing bracesRobert Elliott
In the function-graph tracer, add a funcgraph_tail option to print the function name on all } lines, not just functions whose first line is no longer in the trace buffer. If a function calls other traced functions, its total time appears on its } line. This change allows grep to be used to determine the function for which the line corresponds. Update Documentation/trace/ftrace.txt to describe this new option. Link: http://lkml.kernel.org/p/20140520221041.8359.6782.stgit@beardog.cce.hp.com Signed-off-by: Robert Elliott <elliott@hp.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-21tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx definesRobert Elliott
Eliminate duplicate TRACE_GRAPH_PRINT_xx defines in trace_functions_graph.c that are already in trace.h. Add TRACE_GRAPH_PRINT_IRQS to trace.h, which is the only one that is missing. Link: http://lkml.kernel.org/p/20140520221031.8359.24733.stgit@beardog.cce.hp.com Signed-off-by: Robert Elliott <elliott@hp.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-30tracing: Remove mock up poll wait functionSteven Rostedt (Red Hat)
Now that the ring buffer has a built in way to wake up readers when there's data, using irq_work such that it is safe to do it in any context. But it was still using the old "poor man's" wait polling that checks every 1/10 of a second to see if it should wake up a waiter. This makes the latency for a wake up excruciatingly long. No need to do that anymore. Completely remove the different wait_poll types from the tracers and have them all use the default one now. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-24tracing/stack_trace: Skip 4 instead of 3 when using ftrace_ops_list_funcJiaxing Wang
When using ftrace_ops_list_func, we should skip 4 instead of 3, to avoid ftrace_call+0x5/0xb appearing in the stack trace: Depth Size Location (110 entries) ----- ---- -------- 0) 2956 0 update_curr+0xe/0x1e0 1) 2956 68 ftrace_call+0x5/0xb 2) 2888 92 enqueue_entity+0x53/0xe80 3) 2796 80 enqueue_task_fair+0x47/0x7e0 4) 2716 28 enqueue_task+0x45/0x70 5) 2688 12 activate_task+0x22/0x30 Add a function using_ftrace_ops_list_func() to test for this while keeping ftrace_ops_list_func to remain static. Link: http://lkml.kernel.org/p/1398006644-5935-2-git-send-email-wangjiaxing@insigma.com.cn Signed-off-by: Jiaxing Wang <wangjiaxing@insigma.com.cn> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21tracing: Move ftrace_max_lock into trace_arraySteven Rostedt (Red Hat)
In preparation for having tracers enabled in instances, the max_lock should be unique as updating the max for one tracer is a separate operation than updating it for another tracer using a different max. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21tracing: Move tracing_max_latency into trace_arraySteven Rostedt (Red Hat)
In preparation for letting the latency tracers be used by instances, remove the global tracing_max_latency variable and add a max_latency field to the trace_array that the latency tracers will now use. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21ftrace: Remove global function list and call function directlySteven Rostedt (Red Hat)
Instead of having a list of global functions that are called, as only one global function is allow to be enabled at a time, there's no reason to have a list. Instead, simply have all the users of the global ops, use the global ops directly, instead of registering their own ftrace_ops. Just switch what function is used before enabling the function tracer. This removes a lot of code as well as the complexity involved with it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-07kernel: use macros from compiler.h instead of __attribute__((...))Gideon Israel Dsouza
To increase compiler portability there is <linux/compiler.h> which provides convenience macros for various gcc constructs. Eg: __weak for __attribute__((weak)). I've replaced all instances of gcc attributes with the right macro in the kernel subsystem. Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-20ftrace: Allow for function tracing instance to filter functionsSteven Rostedt (Red Hat)
Create a "set_ftrace_filter" and "set_ftrace_notrace" files in the instance directories to let users filter of functions to trace for the given instance. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20ftrace: Allow instances to use function tracingSteven Rostedt (Red Hat)
Allow instances (sub-buffers) to enable function tracing. Each instance will have its own function tracing capability. For now, instances will not have function stack tracing, or will they be able to pick and choose what functions they can trace. Picking and choosing their own functions will come later. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Convert tracer->enabled to counterSteven Rostedt (Red Hat)
As tracers will soon be used by instances, the tracer enabled field needs to be converted to a counter instead of a boolean. This counter is protected by the trace_types_lock mutex. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Set up infrastructure to allow tracers for instancesSteven Rostedt (Red Hat)
Currently the tracers (function, function_graph, irqsoff, etc) can only be used by the top level tracing directory (not for instances). This sets up the infrastructure to allow instances to be able to run a separate tracer apart from the what the top level tracing is doing. As tracers need to adapt for being used by instances, the tracers must flag if they can be used by instances or not. Currently only the 'nop' tracer can be used by all instances. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to flag_changed callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the flag_changed() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-20tracing: Pass trace_array to set_flag callbackSteven Rostedt (Red Hat)
As options (flags) may affect instances instead of being global the set_flag() callbacks need to receive the trace_array descriptor of the instance they will be modifying. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02tracing: Fix rcu handling of event_trigger_data filter fieldSteven Rostedt (Red Hat)
The filter field of the event_trigger_data structure is protected under RCU sched locks. It was not annotated as such, and after doing so, sparse pointed out several locations that required fix ups. Reported-by: kbuild test robot <fengguang.wu@intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-01-02tracing: Add generic tracing_lseek() functionSteven Rostedt (Red Hat)
Trace event triggers added a lseek that uses the ftrace_filter_lseek() function. Unfortunately, when function tracing is not configured in that function is not defined and the kernel fails to build. This is the second time that function was added to a file ops and it broke the build due to requiring special config dependencies. Make a generic tracing_lseek() that all the tracing utilities may use. Also, modify the old ftrace_filter_lseek() to return 0 instead of 1 on WRONLY. Not sure why it was a 1 as that does not make sense. This also changes the old tracing_seek() to modify the file pos pointer on WRONLY as well. Reported-by: kbuild test robot <fengguang.wu@intel.com> Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com> Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-22tracing: Add and use generic set_trigger_filter() implementationTom Zanussi
Add a generic event_command.set_trigger_filter() op implementation and have the current set of trigger commands use it - this essentially gives them all support for filters. Syntactically, filters are supported by adding 'if <filter>' just after the command, in which case only events matching the filter will invoke the trigger. For example, to add a filter to an enable/disable_event command: echo 'enable_event:system:event if common_pid == 999' > \ .../othersys/otherevent/trigger The above command will only enable the system:event event if the common_pid field in the othersys:otherevent event is 999. As another example, to add a filter to a stacktrace command: echo 'stacktrace if common_pid == 999' > \ .../somesys/someevent/trigger The above command will only trigger a stacktrace if the common_pid field in the event is 999. The filter syntax is the same as that described in the 'Event filtering' section of Documentation/trace/events.txt. Because triggers can now use filters, the trigger-invoking logic needs to be moved in those cases - e.g. for ftrace_raw_event_calls, if a trigger has a filter associated with it, the trigger invocation now needs to happen after the { assign; } part of the call, in order for the trigger condition to be tested. There's still a SOFT_DISABLED-only check at the top of e.g. the ftrace_raw_events function, so when an event is soft disabled but not because of the presence of a trigger, the original SOFT_DISABLED behavior remains unchanged. There's also a bit of trickiness in that some triggers need to avoid being invoked while an event is currently in the process of being logged, since the trigger may itself log data into the trace buffer. Thus we make sure the current event is committed before invoking those triggers. To do that, we split the trigger invocation in two - the first part (event_triggers_call()) checks the filter using the current trace record; if a command has the post_trigger flag set, it sets a bit for itself in the return value, otherwise it directly invoks the trigger. Once all commands have been either invoked or set their return flag, event_triggers_call() returns. The current record is then either committed or discarded; if any commands have deferred their triggers, those commands are finally invoked following the close of the current event by event_triggers_post_call(). To simplify the above and make it more efficient, the TRIGGER_COND bit is introduced, which is set only if a soft-disabled trigger needs to use the log record for filter testing or needs to wait until the current log record is closed. The syscall event invocation code is also changed in analogous ways. Because event triggers need to be able to create and free filters, this also adds a couple external wrappers for the existing create_filter and free_filter functions, which are too generic to be made extern functions themselves. Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-22tracing: Add 'enable_event' and 'disable_event' event trigger commandsTom Zanussi
Add 'enable_event' and 'disable_event' event_command commands. enable_event and disable_event event triggers are added by the user via these commands in a similar way and using practically the same syntax as the analagous 'enable_event' and 'disable_event' ftrace function commands, but instead of writing to the set_ftrace_filter file, the enable_event and disable_event triggers are written to the per-event 'trigger' files: echo 'enable_event:system:event' > .../othersys/otherevent/trigger echo 'disable_event:system:event' > .../othersys/otherevent/trigger The above commands will enable or disable the 'system:event' trace events whenever the othersys:otherevent events are hit. This also adds a 'count' version that limits the number of times the command will be invoked: echo 'enable_event:system:event:N' > .../othersys/otherevent/trigger echo 'disable_event:system:event:N' > .../othersys/otherevent/trigger Where N is the number of times the command will be invoked. The above commands will will enable or disable the 'system:event' trace events whenever the othersys:otherevent events are hit, but only N times. This also makes the find_event_file() helper function extern, since it's useful to use from other places, such as the event triggers code, so make it accessible. Link: http://lkml.kernel.org/r/f825f3048c3f6b026ee37ae5825f9fc373451828.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-22tracing: Add 'snapshot' event trigger commandTom Zanussi
Add 'snapshot' event_command. snapshot event triggers are added by the user via this command in a similar way and using practically the same syntax as the analogous 'snapshot' ftrace function command, but instead of writing to the set_ftrace_filter file, the snapshot event trigger is written to the per-event 'trigger' files: echo 'snapshot' > .../somesys/someevent/trigger The above command will turn on snapshots for someevent i.e. whenever someevent is hit, a snapshot will be done. This also adds a 'count' version that limits the number of times the command will be invoked: echo 'snapshot:N' > .../somesys/someevent/trigger Where N is the number of times the command will be invoked. The above command will snapshot N times for someevent i.e. whenever someevent is hit N times, a snapshot will be done. Also adds a new tracing_alloc_snapshot() function - the existing tracing_snapshot_alloc() function is a special version of tracing_snapshot() that also does the snapshot allocation - the snapshot triggers would like to be able to do just the allocation but not take a snapshot; the existing tracing_snapshot_alloc() in turn now also calls tracing_alloc_snapshot() underneath to do that allocation. Link: http://lkml.kernel.org/r/c9524dd07ce01f9dcbd59011290e0a8d5b47d7ad.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> [ fix up from kbuild test robot <fengguang.wu@intel.com report ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-12-20tracing: Add basic event trigger frameworkTom Zanussi
Add a 'trigger' file for each trace event, enabling 'trace event triggers' to be set for trace events. 'trace event triggers' are patterned after the existing 'ftrace function triggers' implementation except that triggers are written to per-event 'trigger' files instead of to a single file such as the 'set_ftrace_filter' used for ftrace function triggers. The implementation is meant to be entirely separate from ftrace function triggers, in order to keep the respective implementations relatively simple and to allow them to diverge. The event trigger functionality is built on top of SOFT_DISABLE functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file flags which is checked when any trace event fires. Triggers set for a particular event need to be checked regardless of whether that event is actually enabled or not - getting an event to fire even if it's not enabled is what's already implemented by SOFT_DISABLE mode, so trigger mode directly reuses that. Event trigger essentially inherit the soft disable logic in __ftrace_event_enable_disable() while adding a bit of logic and trigger reference counting via tm_ref on top of that in a new trace_event_trigger_enable_disable() function. Because the base __ftrace_event_enable_disable() code now needs to be invoked from outside trace_events.c, a wrapper is also added for those usages. The triggers for an event are actually invoked via a new function, event_triggers_call(), and code is also added to invoke them for ftrace_raw_event calls as well as syscall events. The main part of the patch creates a new trace_events_trigger.c file to contain the trace event triggers implementation. The standard open, read, and release file operations are implemented here. The open() implementation sets up for the various open modes of the 'trigger' file. It creates and attaches the trigger iterator and sets up the command parser. If opened for reading set up the trigger seq_ops. The read() implementation parses the event trigger written to the 'trigger' file, looks up the trigger command, and passes it along to that event_command's func() implementation for command-specific processing. The release() implementation does whatever cleanup is needed to release the 'trigger' file, like releasing the parser and trigger iterator, etc. A couple of functions for event command registration and unregistration are added, along with a list to add them to and a mutex to protect them, as well as an (initially empty) registration function to add the set of commands that will be added by future commits, and call to it from the trace event initialization code. also added are a couple trigger-specific data structures needed for these implementations such as a trigger iterator and a struct for trigger-specific data. A couple structs consisting mostly of function meant to be implemented in command-specific ways, event_command and event_trigger_ops, are used by the generic event trigger command implementations. They're being put into trace.h alongside the other trace_event data structures and functions, in the expectation that they'll be needed in several trace_event-related files such as trace_events_trigger.c and trace_events.c. The event_command.func() function is meant to be called by the trigger parsing code in order to add a trigger instance to the corresponding event. It essentially coordinates adding a live trigger instance to the event, and arming the triggering the event. Every event_command func() implementation essentially does the same thing for any command: - choose ops - use the value of param to choose either a number or count version of event_trigger_ops specific to the command - do the register or unregister of those ops - associate a filter, if specified, with the triggering event The reg() and unreg() ops allow command-specific implementations for event_trigger_op registration and unregistration, and the get_trigger_ops() op allows command-specific event_trigger_ops selection to be parameterized. When a trigger instance is added, the reg() op essentially adds that trigger to the triggering event and arms it, while unreg() does the opposite. The set_filter() function is used to associate a filter with the trigger - if the command doesn't specify a set_filter() implementation, the command will ignore filters. Each command has an associated trigger_type, which serves double duty, both as a unique identifier for the command as well as a value that can be used for setting a trigger mode bit during trigger invocation. The signature of func() adds a pointer to the event_command struct, used to invoke those functions, along with a command_data param that can be passed to the reg/unreg functions. This allows func() implementations to use command-specific blobs and supports code re-use. The event_trigger_ops.func() command corrsponds to the trigger 'probe' function that gets called when the triggering event is actually invoked. The other functions are used to list the trigger when needed, along with a couple mundane book-keeping functions. This also moves event_file_data() into trace.h so it can be used outside of trace_events.c. Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Idea-by: Steve Rostedt <rostedt@goodmis.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-16Merge tag 'trace-3.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing update from Steven Rostedt: "This batch of changes is mostly clean ups and small bug fixes. The only real feature that was added this release is from Namhyung Kim, who introduced "set_graph_notrace" filter that lets you run the function graph tracer and not trace particular functions and their call chain. Tom Zanussi added some updates to the ftrace multibuffer tracing that made it more consistent with the top level tracing. One of the fixes for perf function tracing required an API change in RCU; the addition of "rcu_is_watching()". As Paul McKenney is pushing that change in this release too, he gave me a branch that included all the changes to get that working, and I pulled that into my tree in order to complete the perf function tracing fix" * tag 'trace-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add rcu annotation for syscall trace descriptors tracing: Do not use signed enums with unsigned long long in fgragh output tracing: Remove unused function ftrace_off_permanent() tracing: Do not assign filp->private_data to freed memory tracing: Add helper function tracing_is_disabled() tracing: Open tracer when ftrace_dump_on_oops is used tracing: Add support for SOFT_DISABLE to syscall events tracing: Make register/unregister_ftrace_command __init tracing: Update event filters for multibuffer recordmcount.pl: Add support for __fentry__ ftrace: Have control op function callback only trace when RCU is watching rcu: Do not trace rcu_is_watching() functions ftrace/x86: skip over the breakpoint for ftrace caller trace/trace_stat: use rbtree postorder iteration helper instead of opencoding ftrace: Add set_graph_notrace filter ftrace: Narrow down the protected area of graph_lock ftrace: Introduce struct ftrace_graph_data ftrace: Get rid of ftrace_graph_filter_enabled tracing: Fix potential out-of-bounds in trace_get_user() tracing: Show more exact help information about snapshot
2013-11-11tracing: Add rcu annotation for syscall trace descriptorsSteven Rostedt (Red Hat)
sparse complains about the enter/exit_sysycall_files[] variables being dereferenced with rcu_dereference_sched(). The fields need to be annotated with __rcu. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-11ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHEDPeter Zijlstra
Since the introduction of PREEMPT_NEED_RESCHED in: f27dde8deef3 ("sched: Add NEED_RESCHED to the preempt_count") we need to be able to look at both TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED to understand the full preemption behaviour. Add it to the trace output. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com> Link: http://lkml.kernel.org/r/20131004152826.GP3081@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06tracing: Do not use signed enums with unsigned long long in fgragh outputSteven Rostedt (Red Hat)
The duration field of print_graph_duration() can also be used to do the space filling by passing an enum in it: DURATION_FILL_FULL DURATION_FILL_START DURATION_FILL_END The problem is that these are enums and defined as negative, but the duration field is unsigned long long. Most archs are fine with this but blackfin fails to compile because of it: kernel/built-in.o: In function `print_graph_duration': kernel/trace/trace_functions_graph.c:782: undefined reference to `__ucmpdi2' Overloading a unsigned long long with an signed enum is just bad in principle. We can accomplish the same thing by using part of the flags field instead. Cc: Mike Frysinger <vapier@gentoo.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Add helper function tracing_is_disabled()Geyslan G. Bem
This patch creates the function 'tracing_is_disabled', which can be used outside of trace.c. Link: http://lkml.kernel.org/r/1382141754-12155-1-git-send-email-geyslan@gmail.com Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Add support for SOFT_DISABLE to syscall eventsTom Zanussi
The original SOFT_DISABLE patches didn't add support for soft disable of syscall events; this adds it. Add an array of ftrace_event_file pointers indexed by syscall number to the trace array and remove the existing enabled bitmaps, which as a result are now redundant. The ftrace_event_file structs in turn contain the soft disable flags we need for per-syscall soft disable accounting. Adding ftrace_event_files also means we can remove the USE_CALL_FILTER bit, thus enabling multibuffer filter support for syscall events. Link: http://lkml.kernel.org/r/6e72b566e85d8df8042f133efbc6c30e21fb017e.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Update event filters for multibufferTom Zanussi
The trace event filters are still tied to event calls rather than event files, which means you don't get what you'd expect when using filters in the multibuffer case: Before: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 2048 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 Setting the filter in tracing/instances/test1/events shouldn't affect the same event in tracing/events as it does above. After: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 We'd like to just move the filter directly from ftrace_event_call to ftrace_event_file, but there are a couple cases that don't yet have multibuffer support and therefore have to continue using the current event_call-based filters. For those cases, a new USE_CALL_FILTER bit is added to the event_call flags, whose main purpose is to keep the old behavior for those cases until they can be updated with multibuffer support; at that point, the USE_CALL_FILTER flag (and the new associated call_filter_check_discard() function) can go away. The multibuffer support also made filter_current_check_discard() redundant, so this change removes that function as well and replaces it with filter_check_discard() (or call_filter_check_discard() as appropriate). Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-19ftrace: Add set_graph_notrace filterNamhyung Kim
The set_graph_notrace filter is analogous to set_ftrace_notrace and can be used for eliminating uninteresting part of function graph trace output. It also works with set_graph_function nicely. # cd /sys/kernel/debug/tracing/ # echo do_page_fault > set_graph_function # perf ftrace live true 2) | do_page_fault() { 2) | __do_page_fault() { 2) 0.381 us | down_read_trylock(); 2) 0.055 us | __might_sleep(); 2) 0.696 us | find_vma(); 2) | handle_mm_fault() { 2) | handle_pte_fault() { 2) | __do_fault() { 2) | filemap_fault() { 2) | find_get_page() { 2) 0.033 us | __rcu_read_lock(); 2) 0.035 us | __rcu_read_unlock(); 2) 1.696 us | } 2) 0.031 us | __might_sleep(); 2) 2.831 us | } 2) | _raw_spin_lock() { 2) 0.046 us | add_preempt_count(); 2) 0.841 us | } 2) 0.033 us | page_add_file_rmap(); 2) | _raw_spin_unlock() { 2) 0.057 us | sub_preempt_count(); 2) 0.568 us | } 2) | unlock_page() { 2) 0.084 us | page_waitqueue(); 2) 0.126 us | __wake_up_bit(); 2) 1.117 us | } 2) 7.729 us | } 2) 8.397 us | } 2) 8.956 us | } 2) 0.085 us | up_read(); 2) + 12.745 us | } 2) + 13.401 us | } ... # echo handle_mm_fault > set_graph_notrace # perf ftrace live true 1) | do_page_fault() { 1) | __do_page_fault() { 1) 0.205 us | down_read_trylock(); 1) 0.041 us | __might_sleep(); 1) 0.344 us | find_vma(); 1) 0.069 us | up_read(); 1) 4.692 us | } 1) 5.311 us | } ... Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-19ftrace: Get rid of ftrace_graph_filter_enabledNamhyung Kim
The ftrace_graph_filter_enabled means that user sets function filter and it always has same meaning of ftrace_graph_count > 0. Link: http://lkml.kernel.org/r/1381739066-7531-2-git-send-email-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-09-09Merge tag 'trace-3.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Not much changes for the 3.12 merge window. The major tracing changes are still in flux, and will have to wait for 3.13. The changes for 3.12 are mostly clean ups and minor fixes. H Peter Anvin added a check to x86_32 static function tracing that helps a small segment of the kernel community. Oleg Nesterov had a few changes from 3.11, but were mostly clean ups and not worth pushing in the -rc time frame. Li Zefan had small clean up with annotating a raw_init with __init. I fixed a slight race in updating function callbacks, but the race is so small and the bug that happens when it occurs is so minor it's not even worth pushing to stable. The only real enhancement is from Alexander Z Lam that made the tracing_cpumask work for trace buffer instances, instead of them all sharing a global cpumask" * tag 'trace-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/rcu: Do not trace debug_lockdep_rcu_enabled() x86-32, ftrace: Fix static ftrace when early microcode is enabled ftrace: Fix a slight race in modifying what function callback gets traced tracing: Make tracing_cpumask available for all instances tracing: Kill the !CONFIG_MODULES code in trace_events.c tracing: Don't pass file_operations array to event_create_dir() tracing: Kill trace_create_file_ops() and friends tracing/syscalls: Annotate raw_init function with __init
2013-09-03Merge branch 'rcu/next' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: " * Update RCU documentation. These were posted to LKML at https://lkml.org/lkml/2013/8/19/611. * Miscellaneous fixes. These were posted to LKML at https://lkml.org/lkml/2013/8/19/619. * Full-system idle detection. This is for use by Frederic Weisbecker's adaptive-ticks mechanism. Its purpose is to allow the timekeeping CPU to shut off its tick when all other CPUs are idle. These were posted to LKML at https://lkml.org/lkml/2013/8/19/648. * Improve rcutorture test coverage. These were posted to LKML at https://lkml.org/lkml/2013/8/19/675. " Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-22tracing: Make tracing_cpumask available for all instancesAlexander Z Lam
Allow tracer instances to disable tracing by cpu by moving the static global tracing_cpumask into trace_array. Link: http://lkml.kernel.org/r/921622317f239bfc2283cac2242647801ef584f2.1375980149.git.azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-26tracing: Add __tracepoint_string() to export string pointersSteven Rostedt (Red Hat)
There are several tracepoints (mostly in RCU), that reference a string pointer and uses the print format of "%s" to display the string that exists in the kernel, instead of copying the actual string to the ring buffer (saves time and ring buffer space). But this has an issue with userspace tools that read the binary buffers that has the address of the string but has no access to what the string itself is. The end result is just output that looks like: rcu_dyntick: ffffffff818adeaa 1 0 rcu_dyntick: ffffffff818adeb5 0 140000000000000 rcu_dyntick: ffffffff818adeb5 0 140000000000000 rcu_utilization: ffffffff8184333b rcu_utilization: ffffffff8184333b The above is pretty useless when read by the userspace tools. Ideally we would want something that looks like this: rcu_dyntick: Start 1 0 rcu_dyntick: End 0 140000000000000 rcu_dyntick: Start 140000000000000 0 rcu_callback: rcu_preempt rhp=0xffff880037aff710 func=put_cred_rcu 0/4 rcu_callback: rcu_preempt rhp=0xffff880078961980 func=file_free_rcu 0/5 rcu_dyntick: End 0 1 The trace_printk() which also only stores the address of the string format instead of recording the string into the buffer itself, exports the mapping of kernel addresses to format strings via the printk_format file in the debugfs tracing directory. The tracepoint strings can use this same method and output the format to the same file and the userspace tools will be able to decipher the address without any modification. The tracepoint strings need its own section to save the strings because the trace_printk section will cause the trace_printk() buffers to be allocated if anything exists within the section. trace_printk() is only used for debugging and should never exist in the kernel, we can not use the trace_printk sections. Add a new tracepoint_str section that will also be examined by the output of the printk_format file. Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-24tracing: Kill trace_cpu struct/membersOleg Nesterov
After the previous changes trace_array_cpu->trace_cpu and trace_array->trace_cpu becomes write-only. Remove these members and kill "struct trace_cpu" as well. As a side effect this also removes memset(per_cpu_memory, 0). It was not needed, alloc_percpu() returns zero-filled memory. Link: http://lkml.kernel.org/r/20130723152613.GA23741@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-19tracing: Kill trace_array->waiterOleg Nesterov
Trivial. trace_array->waiter has no users since 6eaaa5d5 "tracing/core: use appropriate waiting on trace_pipe". Link: http://lkml.kernel.org/r/20130719142036.GA1594@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-19tracing: Add ref_data to function and fgraph tracer structsSteven Rostedt (Red Hat)
The selftest for function and function graph tracers are defined as __init, as they are only executed at boot up. The "tracer" structs that are associated to those tracers are not setup as __init as they are used after boot. To stop mismatch warnings, those structures need to be annotated with __ref_data. Currently, the tracer structures are defined to __read_mostly, as they do not really change. But in the future they should be converted to consts, but that will take a little work because they have a "next" pointer that gets updated when they are registered. That will have to wait till the next major release. Link: http://lkml.kernel.org/r/1373596735.17876.84.camel@gandalf.local.home Reported-by: kbuild test robot <fengguang.wu@intel.com> Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-11Merge tag 'trace-3.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing changes from Steven Rostedt: "The majority of the changes here are cleanups for the large changes that were added to 3.10, which includes several bug fixes that have been marked for stable. As for new features, there were a few, but nothing to write to LWN about. These include: New function trigger called "dump" and "cpudump" that will cause ftrace to dump its buffer to the console when the function is called. The difference between "dump" and "cpudump" is that "dump" will dump the entire contents of the ftrace buffer, where as "cpudump" will only dump the contents of the ftrace buffer for the CPU that called the function. Another small enhancement is a new sysctl switch called "traceoff_on_warning" which, when enabled, will disable tracing if any WARN_ON() is triggered. This is useful if you want to debug what caused a warning and do not want to risk losing your trace data by the ring buffer overwriting the data before you can disable it. There's also a kernel command line option that will make this enabled at boot up called the same thing" * tag 'trace-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (34 commits) tracing: Make tracing_open_generic_{tr,tc}() static tracing: Remove ftrace() function tracing: Remove TRACE_EVENT_TYPE enum definition tracing: Make tracer_tracing_{off,on,is_on}() static tracing: Fix irqs-off tag display in syscall tracing uprobes: Fix return value in error handling path tracing: Fix race between deleting buffer and setting events tracing: Add trace_array_get/put() to event handling tracing: Get trace_array ref counts when accessing trace files tracing: Add trace_array_get/put() to handle instance refs better tracing: Protect ftrace_trace_arrays list in trace_events.c tracing: Make trace_marker use the correct per-instance buffer ftrace: Do not run selftest if command line parameter is set tracing/kprobes: Don't pass addr=ip to perf_trace_buf_submit() tracing: Use flag buffer_disabled for irqsoff tracer tracing/kprobes: Turn trace_probe->files into list_head tracing: Fix disabling of soft disable tracing: Add missing syscall_metadata comment tracing: Simplify code for showing of soft disabled flag tracing/kprobes: Kill probe_enable_lock ...
2013-07-03tracing: Remove ftrace() functionzhangwei(Jovi)
The only caller of function ftrace(...) was removed a long time ago, so remove the function body as well. Link: http://lkml.kernel.org/r/1365564393-10972-10-git-send-email-jovi.zhangwei@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-03tracing: Remove TRACE_EVENT_TYPE enum definitionzhangwei(Jovi)
TRACE_EVENT_TYPE enum is not used at present, remove it. Link: http://lkml.kernel.org/r/1365564393-10972-8-git-send-email-jovi.zhangwei@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02tracing: Add trace_array_get/put() to event handlingSteven Rostedt (Red Hat)
Commit a695cb58162 "tracing: Prevent deleting instances when they are being read" tried to fix a race between deleting a trace instance and reading contents of a trace file. But it wasn't good enough. The following could crash the kernel: # cd /sys/kernel/debug/tracing/instances # ( while :; do mkdir foo; rmdir foo; done ) & # ( while :; do echo 1 > foo/events/sched/sched_switch 2> /dev/null; done ) & Luckily this can only be done by root user, but it should be fixed regardless. The problem is that a delete of the file can happen after the write to the event is opened, but before the enabling happens. The solution is to make sure the trace_array is available before succeeding in opening for write, and incerment the ref counter while opened. Now the instance can be deleted when the events are writing to the buffer, but the deletion of the instance will disable all events before the instance is actually deleted. Cc: stable@vger.kernel.org # 3.10 Reported-by: Alexander Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02tracing: Protect ftrace_trace_arrays list in trace_events.cAlexander Z Lam
There are multiple places where the ftrace_trace_arrays list is accessed in trace_events.c without the trace_types_lock held. Link: http://lkml.kernel.org/r/1372732674-22726-1-git-send-email-azl@google.com Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Cc: stable@vger.kernel.org # 3.10 Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-07-02ftrace: Do not run selftest if command line parameter is setSteven Rostedt (Red Hat)
If the kernel command line ftrace filter parameters are set (ftrace_filter or ftrace_notrace), force the function self test to pass, with a warning why it was forced. If the user adds a filter to the kernel command line, it is assumed that they know what they are doing, and the self test should just not run instead of failing (which disables function tracing) or clearing the filter, as that will probably annoy the user. If the user wants the selftest to run, the message will tell them why it did not. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-06-11tracing: Fix outputting formats of x86-tsc and counter when use trace_clockYoshihiro YUNOMAE
Outputting formats of x86-tsc and counter should be a raw format, but after applying the patch(2b6080f28c7cc3efc8625ab71495aae89aeb63a0), the format was changed to nanosec. This is because the global variable trace_clock_id was used. When we use multiple buffers, clock_id of each sub-buffer should be used. Then, this patch uses tr->clock_id instead of the global variable trace_clock_id. [ Basically, this fixes a regression where the multibuffer code changed the trace_clock file to update tr->clock_id but the traces still use the old global trace_clock_id variable, negating the file's effect. The global trace_clock_id variable is obsolete and removed. - SR ] Link: http://lkml.kernel.org/r/20130423013239.22334.7394.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-30Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Features: - Add "uretprobes" - an optimization to uprobes, like kretprobes are an optimization to kprobes. "perf probe -x file sym%return" now works like kretprobes. By Oleg Nesterov. - Introduce per core aggregation in 'perf stat', from Stephane Eranian. - Add memory profiling via PEBS, from Stephane Eranian. - Event group view for 'annotate' in --stdio, --tui and --gtk, from Namhyung Kim. - Add support for AMD NB and L2I "uncore" counters, by Jacob Shin. - Add Ivy Bridge-EP uncore support, by Zheng Yan - IBM zEnterprise EC12 oprofile support patchlet from Robert Richter. - Add perf test entries for checking breakpoint overflow signal handler issues, from Jiri Olsa. - Add perf test entry for for checking number of EXIT events, from Namhyung Kim. - Add perf test entries for checking --cpu in record and stat, from Jiri Olsa. - Introduce perf stat --repeat forever, from Frederik Deweerdt. - Add --no-demangle to report/top, from Namhyung Kim. - PowerPC fixes plus a couple of cleanups/optimizations in uprobes and trace_uprobes, by Oleg Nesterov. Various fixes and refactorings: - Fix dependency of the python binding wrt libtraceevent, from Naohiro Aota. - Simplify some perf_evlist methods and to allow 'stat' to share code with 'record' and 'trace', by Arnaldo Carvalho de Melo. - Remove dead code in related to libtraceevent integration, from Namhyung Kim. - Revert "perf sched: Handle PERF_RECORD_EXIT events" to get 'perf sched lat' back working, by Arnaldo Carvalho de Melo - We don't use Newt anymore, just plain libslang, by Arnaldo Carvalho de Melo. - Kill a bunch of die() calls, from Namhyung Kim. - Fix build on non-glibc systems due to libio.h absence, from Cody P Schafer. - Remove some perf_session and tracing dead code, from David Ahern. - Honor parallel jobs, fix from Borislav Petkov - Introduce tools/lib/lk library, initially just removing duplication among tools/perf and tools/vm. from Borislav Petkov ... and many more I missed to list, see the shortlog and git log for more details." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (136 commits) perf/x86/intel/P4: Robistify P4 PMU types perf/x86/amd: Fix AMD NB and L2I "uncore" support perf/x86/amd: Remove old-style NB counter support from perf_event_amd.c perf/x86: Check all MSRs before passing hw check perf/x86/amd: Add support for AMD NB and L2I "uncore" counters perf/x86/intel: Add Ivy Bridge-EP uncore support perf/x86/intel: Fix SNB-EP CBO and PCU uncore PMU filter management perf/x86: Avoid kfree() in CPU_{STARTING,DYING} uprobes/perf: Avoid perf_trace_buf_prepare/submit if ->perf_events is empty uprobes/tracing: Don't pass addr=ip to perf_trace_buf_submit() uprobes/tracing: Change create_trace_uprobe() to support uretprobes uprobes/tracing: Make seq_printf() code uretprobe-friendly uprobes/tracing: Make register_uprobe_event() paths uretprobe-friendly uprobes/tracing: Make uprobe_{trace,perf}_print() uretprobe-friendly uprobes/tracing: Introduce is_ret_probe() and uretprobe_dispatcher() uprobes/tracing: Introduce uprobe_{trace,perf}_print() helpers uprobes/tracing: Generalize struct uprobe_trace_entry_head uprobes/tracing: Kill the pointless local_save_flags/preempt_count calls uprobes/tracing: Kill the pointless seq_print_ip_sym() call uprobes/tracing: Kill the pointless task_pt_regs() calls ...
2013-04-13uprobes/tracing: Generalize struct uprobe_trace_entry_headOleg Nesterov
struct uprobe_trace_entry_head has a single member for reporting, "unsigned long ip". If we want to support uretprobes we need to create another struct which has "func" and "ret_ip" and duplicate a lot of functions, like trace_kprobe.c does. To avoid this copy-and-paste horror we turn ->ip into ->vaddr[] and add couple of trivial helpers to calculate sizeof/data. This uglifies the code a bit, but this allows us to avoid a lot more complications later, when we add the support for ret-probes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Tested-by: Anton Arapov <anton@redhat.com>
2013-03-15tracing: Move find_event_field() into trace_events.czhangwei(Jovi)
By moving find_event_field() and trace_find_field() into trace_events.c, the ftrace_common_fields list and trace_get_fields() can become local to the trace_events.c file. find_event_field() is renamed to trace_find_event_field() to conform to the tracing global function names. Link: http://lkml.kernel.org/r/513D8426.9070109@huawei.com Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> [ rostedt: Modified trace_find_field() to trace_find_event_field() ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-15tracing: Add function-trace option to disable function tracing of latency ↵Steven Rostedt (Red Hat)
tracers Currently, the only way to stop the latency tracers from doing function tracing is to fully disable the function tracer from the proc file system: echo 0 > /proc/sys/kernel/ftrace_enabled This is a big hammer approach as it disables function tracing for all users. This includes kprobes, perf, stack tracer, etc. Instead, create a function-trace option that the latency tracers can check to determine if it should enable function tracing or not. This option can be set or cleared even while the tracer is active and the tracers will disable or enable function tracing depending on how the option was set. Instead of using the proc file, disable latency function tracing with echo 0 > /debug/tracing/options/function-trace Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Clark Williams <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>