summaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)Author
2011-03-18trace, filters: Initialize the match variable in process_ops() properlyIngo Molnar
Make sure the 'match' variable always has a value. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-17trace, documentation: Fix branch profiling location in debugfsDavid Rientjes
The debugfs interface for branch profiling is through /sys/kernel/debug/tracing/trace_stat/branch_annotated /sys/kernel/debug/tracing/trace_stat/branch_all so update the Kconfig accordingly. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <alpine.DEB.2.00.1103161716320.11407@chino.kir.corp.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-11blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.Tao Ma
In blk_add_trace_rq, we only chose the minor 2 bits from request's cmd_flags and did some check for discard. so most of other flags(e.g, REQ_SYNC) are missing. For example, with a sync write after blkparse we get: 8,16 1 1 0.001776503 7509 A WS 1349632 + 1024 <- (8,17) 1347584 8,16 1 2 0.001776813 7509 Q WS 1349632 + 1024 [dd] 8,16 1 3 0.001780395 7509 G WS 1349632 + 1024 [dd] 8,16 1 5 0.001783186 7509 I W 1349632 + 1024 [dd] 8,16 1 11 0.001816987 7509 D W 1349632 + 1024 [dd] 8,16 0 2 0.006218192 0 C W 1349632 + 1024 [0] Since now we have integrated the flags of both bio and request, it is safe to pass rq->cmd_flags directly to __blk_add_trace. With this patch, after a sync write we get: 8,16 1 1 0.001776900 5425 A WS 1189888 + 1024 <- (8,17) 1187840 8,16 1 2 0.001777179 5425 Q WS 1189888 + 1024 [dd] 8,16 1 3 0.001780797 5425 G WS 1189888 + 1024 [dd] 8,16 1 5 0.001783402 5425 I WS 1189888 + 1024 [dd] 8,16 1 11 0.001817468 5425 D WS 1189888 + 1024 [dd] 8,16 0 2 0.005640709 0 C WS 1189888 + 1024 [0] Signed-off-by: Tao Ma <boyu.mt@taobao.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10tracing: Fix irqoff selftest expanding max bufferSteven Rostedt
If the kernel command line declares a tracer "ftrace=sometracer" and that tracer is either not defined or is enabled after irqsoff, then the irqs off selftest will fail with the following error: Testing tracer irqsoff: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/autotest/nobackup/linux-test.git/kernel/trace/tra ce.c:713 update_max_tr_single+0xfa/0x11b() Hardware name: Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.38-rc8-test #1 Call Trace: [<c0441d9d>] ? warn_slowpath_common+0x65/0x7a [<c049adb2>] ? update_max_tr_single+0xfa/0x11b [<c0441dc1>] ? warn_slowpath_null+0xf/0x13 [<c049adb2>] ? update_max_tr_single+0xfa/0x11b [<c049e454>] ? stop_critical_timing+0x154/0x204 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049e529>] ? time_hardirqs_on+0x25/0x28 [<c0468bca>] ? trace_hardirqs_on_caller+0x18/0x12f [<c0468cec>] ? trace_hardirqs_on+0xb/0xd [<c049b54b>] ? trace_selftest_startup_irqsoff+0x5b/0xc1 [<c049b6b8>] ? register_tracer+0xf8/0x1a3 [<c14e93fe>] ? init_irqsoff_tracer+0xd/0x11 [<c040115e>] ? do_one_initcall+0x71/0x121 [<c14e93f1>] ? init_irqsoff_tracer+0x0/0x11 [<c14ce3a9>] ? kernel_init+0x13a/0x1b6 [<c14ce26f>] ? kernel_init+0x0/0x1b6 [<c0403842>] ? kernel_thread_helper+0x6/0x10 ---[ end trace e93713a9d40cd06c ]--- .. no entries found ..FAILED! What happens is the "ftrace=..." will expand the ring buffer to its default size (from its minimum size) but it will not expand the max ring buffer (the ring buffer to store maximum latencies). When the irqsoff test runs, it will call the ring buffer swap routine that checks if the max ring buffer is the same size as the normal ring buffer, and will fail if it is not. This causes the test to fail. The solution is to expand the max ring buffer before running the self test if the max ring buffer is used by that tracer and the normal ring buffer is expanded. The max ring buffer should be shrunk again after the test is done to save space. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Align 4 byte ints together in struct tracerSteven Rostedt
Move elements in struct tracer for better alignment. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Export trace_set_clr_event()Yuanhan Liu
Trace events belonging to a module only exists when the module is loaded. Well, we can use trace_set_clr_event funtion to enable some trace event at the module init routine, so that we will not miss something while loading then module. So, Export the trace_set_clr_event function so that module can use it. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> LKML-Reference: <1289196312-25323-1-git-send-email-yuanhan.liu@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Explain about unstable clock on resume with ring buffer warningJiri Olsa
The "Delta way too big" warning might appear on a system with a unstable shed clock right after the system is resumed and tracing was enabled at time of suspend. Since it's not realy a bug, and the unstable sched clock is working fast and reliable otherwise, Steven suggested to keep using the sched clock in any case and just to make note in the warning itself. v2 changes: - added #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20110218145219.GD2604@jolsa.brq.redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Adjust conditional expression latency formatting.David Sharp
Formatting change only to improve code readability. No code changes except to introduce intermediate variables. Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291421609-14665-13-git-send-email-dhsharp@google.com> [ Keep variable declarations and assignment separate ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeupDavid Sharp
Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291421609-14665-6-git-send-email-dhsharp@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10tracing: Remove lock_depth from event entrySteven Rostedt
The lock_depth field in the event headers was added as a temporary data point for help in removing the BKL. Now that the BKL is pretty much been removed, we can remove this field. This in turn changes the header from 12 bytes to 8 bytes, removing the 4 byte buffer that gcc would insert if the first field in the data load was 8 bytes in size. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-09ring-buffer: Remove unused #include <linux/trace_irq.h>David Sharp
Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291421609-14665-3-git-send-email-dhsharp@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-09tracing: Add an 'overwrite' trace_option.David Sharp
Add an "overwrite" trace_option for ftrace to control whether the buffer should be overwritten on overflow or not. The default remains to overwrite old events when the buffer is full. This patch adds the option to instead discard newest events when the buffer is full. This is useful to get a snapshot of traces just after enabling traces. Dropping the current event is also a simpler code path. Signed-off-by: David Sharp <dhsharp@google.com> LKML-Reference: <1291844807-15481-1-git-send-email-dhsharp@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-08Merge commit 'v2.6.38-rc8' into perf/coreIngo Molnar
Merge reason: Merge latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-03blktrace: Remove blk_fill_rwbs_rq.Tao Ma
If we enable trace events to trace block actions, We use blk_fill_rwbs_rq to analyze the corresponding actions in request's cmd_flags, but we only choose the minor 2 bits from it, so most of other flags(e.g, REQ_SYNC) are missing. For example, with a sync write we get: write_test-2409 [001] 160.013869: block_rq_insert: 3,64 W 0 () 258135 + = 8 [write_test] Since now we have integrated the flags of both bio and request, it is safe to pass rq->cmd_flags directly to blk_fill_rwbs and blk_fill_rwbs_rq isn't needed any more. With this patch, after a sync write we get: write_test-2417 [000] 226.603878: block_rq_insert: 3,64 WS 0 () 258135 += 8 [write_test] Signed-off-by: Tao Ma <boyu.mt@taobao.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-03Merge branch '/tip/perf/filter' of ↵Frederic Weisbecker
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git into perf/core
2011-02-18Revert "tracing: Add unstable sched clock note to the warning"Ingo Molnar
This reverts commit 5e38ca8f3ea423442eaafe1b7e206084aa38120a. Breaks the build of several !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures. Cc: Jiri Olsa <jolsa@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Message-ID: <20110217171823.GB17058@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-17Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2011-02-16Merge branch 'perf/urgent' into perf/coreIngo Molnar
Merge reason: we need to queue up dependent patch Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-15Merge branch 'master' into for-nextJiri Kosina
2011-02-14Merge branch 'tip/perf/urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2011-02-14tracing/kprobe: Fix NULL pointer deref checkMasami Hiramatsu
Add NULL check for avoiding NULL pointer deref. This bug has been introduced by: 1ff511e35ed8: tracing/kprobes: Add bitfield type which causes a null pointer dereference bug when kprobe-tracer parses an argument without type. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110214054807.8919.69740.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2011-02-11ftrace: Fix memory leak with function graph and cpu hotplugSteven Rostedt
When the fuction graph tracer starts, it needs to make a special stack for each task to save the real return values of the tasks. All running tasks have this stack created, as well as any new tasks. On CPU hot plug, the new idle task will allocate a stack as well when init_idle() is called. The problem is that cpu hotplug does not create a new idle_task. Instead it uses the idle task that existed when the cpu went down. ftrace_graph_init_task() will add a new ret_stack to the task that is given to it. Because a clone will make the task have a stack of its parent it does not check if the task's ret_stack is already NULL or not. When the CPU hotplug code starts a CPU up again, it will allocate a new stack even though one already existed for it. The solution is to treat the idle_task specially. In fact, the function_graph code already does, just not at init_idle(). Instead of using the ftrace_graph_init_task() for the idle task, which that function expects the task to be a clone, have a separate ftrace_graph_init_idle_task(). Also, we will create a per_cpu ret_stack that is used by the idle task. When we call ftrace_graph_init_idle_task() it will check if the idle task's ret_stack is NULL, if it is, then it will assign it the per_cpu ret_stack. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stable Tree <stable@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-09Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: cdrom: support devices that have check_events but not media_changed cfq-iosched: Don't wait if queue already has requests. blkio-throttle: Avoid calling blkiocg_lookup_group() for root group cfq: rename a function to give it more appropriate name cciss: make cciss_revalidate not loop through CISS_MAX_LUNS volumes unnecessarily. drivers/block/aoe/Makefile: replace the use of <module>-objs with <module>-y loop: queue_lock NULL pointer derefence in blk_throtl_exit drivers/block/Makefile: replace the use of <module>-objs with <module>-y blktrace: Don't output messages if NOTIFY isn't set.
2011-02-08tracing: Deprecate tracing_enabled for tracing_onSteven Rostedt
tracing_enabled should not be used, it is heavy weight and does not do much in helping lower the overhead. tracing_on should be used instead. Warn users to use tracing_on when tracing_enabled is used as it will soon be removed from the tracing directory. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing: Remove obsolete sched_switch tracerSteven Rostedt
The trace events sched_switch and sched_wakeup do the same thing as the stand alone sched_switch tracer does. It is no longer needed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing: Add unstable sched clock note to the warningJiri Olsa
The warning "Delta way too big" warning might appear on a system with unstable shed clock right after the system is resumed and tracing was enabled during the suspend. Since it's not realy bug, and the unstable sched clock is working fast and reliable otherwise, Steven suggested to keep using the sched clock in any case and just to make note in the warning itself. Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <1296649698-6003-1-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/syscalls: Early terminate search for sys_ni_syscallIan Munsie
Many system calls are unimplemented and mapped to sys_ni_syscall, but at boot ftrace would still search through every syscall metadata entry for a match which wouldn't be there. This patch adds causes the search to terminate early if the system call is not mapped. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <1296703645-18718-7-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/syscalls: Allow arch specific syscall symbol matchingIan Munsie
Some architectures have unusual symbol names and the generic code to match the symbol name with the function name for the syscall metadata will fail. For example, symbols on PPC64 start with a period and the generic code will fail to match them. This patch moves the match logic out into a separate function which an arch can override by defining ARCH_HAS_SYSCALL_MATCH_SYM_NAME in asm/ftrace.h and implementing arch_syscall_match_sym_name. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <1296703645-18718-5-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/syscalls: Make arch_syscall_addr weakIan Munsie
Some architectures use non-trivial system call tables and will not work with the generic arch_syscall_addr code. For example, PowerPC64 uses a table of twin long longs. This patch makes the generic arch_syscall_addr weak to allow architectures with non-trivial system call tables to override it. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <1296703645-18718-4-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/syscalls: Convert redundant syscall_nr checks into WARN_ONIan Munsie
With the ftrace events now checking if the syscall_nr is valid upon initialisation it should no longer be possible to register or unregister a syscall event without a valid syscall_nr since they should not be created. This adds a WARN_ON_ONCE in the register and unregister functions to locate potential regressions in the future. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <1296703645-18718-3-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/syscalls: Don't add events for unmapped syscallsIan Munsie
FTRACE_SYSCALLS would create events for each and every system call, even if it had failed to map the system call's name with it's number. This resulted in a number of events being created that would not behave as expected. This could happen, for example, on architectures who's symbol names are unusual and will not match the system call name. It could also happen with system calls which were mapped to sys_ni_syscall. This patch changes the default system call number in the metadata to -1. If the system call name from the metadata is not successfully mapped to a system call number during boot, than the event initialisation routine will now return an error, preventing the event from being created. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> LKML-Reference: <1296703645-18718-2-git-send-email-imunsie@au1.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Remove synchronize_sched() from __alloc_preds()Steven Rostedt
Because the filters are processed first and then activated (added to the call), we no longer need to worry about the preds of the filter in __alloc_preds() being used. As the filter that is allocating preds is not activated yet. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Swap entire filter of eventsSteven Rostedt
When creating a new filter, instead of allocating the filter to the event call first and then processing the filter, it is easier to process a temporary filter and then just swap it with the call filter. By doing this, it simplifies the code. A filter is allocated and processed, when it is done, it is swapped with the call filter, synchronize_sched() is called to make sure all callers are done with the old filter (filters are called with premption disabled), and then the old filter is freed. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Increase the max preds to 2^14Steven Rostedt
Now that the filter logic does not require to save the pred results on the stack, we can increase the max number of preds we allow. As the preds are index by a short value, and we use the MSBs as flags we can increase the max preds to 2^14 (16384) which should be way more than enough. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Move MAX_FILTER_PRED to local tracing directorySteven Rostedt
The MAX_FILTER_PRED is only needed by the kernel/trace/*.c files. Move it to kernel/trace/trace.h. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Optimize filter by folding the treeSteven Rostedt
There are many cases that a filter will contain multiple ORs or ANDs together near the leafs. Walking up and down the tree to get to the next compare can be a waste. If there are several ORs or ANDs together, fold them into a single pred and allocate an array of the conditions that they check. This will speed up the filter by linearly walking an array and can still break out if a short circuit condition is met. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Check the created pred treeSteven Rostedt
Since the filter walks a tree to determine if a match is made or not, if the tree was incorrectly created, it could cause an infinite loop. Add a check to walk the entire tree before assigning it as a filter to make sure the tree is correct. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Optimize short ciruit checkSteven Rostedt
The test if we should break out early for OR and AND operations can be optimized by comparing the current result with (pred->op == OP_OR) That is if the result is true and the op is an OP_OR, or if the result is false and the op is not an OP_OR (thus an OP_AND) we can break out early in either case. Otherwise we continue processing. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Use a tree instead of stack for filter_match_preds()Steven Rostedt
Currently the filter_match_preds() requires a stack to push and pop the preds to determine if the filter matches the record or not. This has two drawbacks: 1) It requires a stack to store state information. As this is done in fast paths we can't allocate the storage for this stack, and we can't use a global as it must be re-entrant. The stack is stored on the kernel stack and this greatly limits how many preds we may allow. 2) All conditions are calculated even when a short circuit exists. a || b will always calculate a and b even though a was determined to be true. Using a tree we can walk a constant structure that will save the state as we go. The algorithm is simply: pred = root; do { switch (move) { case MOVE_DOWN: if (OR or AND) { pred = left; continue; } if (pred == root) break; match = pred->fn(); pred = pred->parent; move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT; continue; case MOVE_UP_FROM_LEFT: /* Only OR or AND can be a parent */ if (match && OR || !match && AND) { /* short circuit */ if (pred == root) break; pred = pred->parent; move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT; continue; } pred = pred->right; move = MOVE_DOWN; continue; case MOVE_UP_FROM_RIGHT: if (pred == root) break; pred = pred->parent; move = left child ? MOVE_UP_FROM_LEFT : MOVE_UP_FROM_RIGHT; continue; } done = 1; } while (!done); This way there's no strict limit to how many preds we allow and it also will short circuit the logical operations when possible. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Free pred array on disabling of filterSteven Rostedt
When a filter is disabled, free the preds. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Allocate the preds in an arraySteven Rostedt
Currently we allocate an array of pointers to filter_preds, and then allocate a separate filter_pred for each item in the array. This adds slight overhead in the filters as it needs to derefernce twice to get to the op condition. Allocating the preds themselves in a single array removes a dereference as well as helps on the cache footprint. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Call synchronize_sched() just once for system filtersSteven Rostedt
By separating out the reseting of the filter->n_preds to zero from the reallocation of preds for the filter, we can reset groups of filters first, call synchronize_sched() just once, and then reallocate each of the filters in the system group. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Dynamically allocate predsSteven Rostedt
For every filter that is made, we create predicates to hold every operation within the filter. We have a max of 32 predicates that we can hold. Currently, we allocate all 32 even if we only need to use one. Part of the reason we do this is that the filter can be used at any moment by any event. Fortunately, the filter is only used with preemption disabled. By reseting the count of preds used "n_preds" to zero, then performing a synchronize_sched(), we can safely free and reallocate a new array of preds. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Move OR and AND logic out of fn() methodSteven Rostedt
The ops OR and AND act different from the other ops, as they are the only ones to take other ops as their arguements. These ops als change the logic of the filter_match_preds. By removing the OR and AND fn's we can also remove the val1 and val2 that is passed to all other fn's and are unused. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-08tracing/filter: Have no filter return a matchSteven Rostedt
The n_preds field of a file can change at anytime, and even can become zero, just as the filter is about to be processed by an event. In the case that is zero on entering the filter, return 1, telling the caller the event matchs and should be trace. Also use a variable and assign it with ACCESS_ONCE() such that the count stays consistent within the function. Cc: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-07tracing/kprobes: Add bitfield typeMasami Hiramatsu
Add bitfield type for tracing arguments on kprobe-tracer. The syntax of a bitfield type is: b<bit-size>@<bit-offset>/<container-size> e.g. Accessing 2 bits-width field with 4 bits-offset in 32 bits-width data at 4 bytes offseted from the address pointed by AX register: +4(%ax):b2@4/32 Since the width of container data depends on the arch, so I just added the container-size at the end. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20110204125205.9507.11363.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07tracing/kprobes: Support longer (>128 bytes) commandMasami Hiramatsu
Expand command line buffer of kprobe-tracer to 4096 bytes. Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20110204125159.9507.20895.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07tracing/kprobes: Cleanup strict_strtol() using codeMasami Hiramatsu
Since strict_strtol() accepts minus digits started with '-', it doesn't need to invert after converting. Cc: 2nddept-manager@sdl.hitachi.co.jp Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20110204125153.9507.49335.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-03tracing: Replace syscall_meta_data struct array with pointer arraySteven Rostedt
Currently the syscall_meta structures for the syscall tracepoints are placed in the __syscall_metadata section, and at link time, the linker makes one large array of all these syscall metadata structures. On boot up, this array is read (much like the initcall sections) and the syscall data is processed. The problem is that there is no guarantee that gcc will place complex structures nicely together in an array format. Two structures in the same file may be placed awkwardly, because gcc has no clue that they are suppose to be in an array. A hack was used previous to force the alignment to 4, to pack the structures together. But this caused alignment issues with other architectures (sparc). Instead of packing the structures into an array, the structures' addresses are now put into the __syscall_metadata section. As pointers are always the natural alignment, gcc should always pack them tightly together (otherwise initcall, extable, etc would also fail). By having the pointers to the structures in the section, we can still iterate the trace_events without causing unnecessary alignment problems with other architectures, or depending on the current behaviour of gcc that will likely change in the future just to tick us kernel developers off a little more. The __syscall_metadata section is also moved into the .init.data section as it is now only needed at boot up. Suggested-by: David Miller <davem@davemloft.net> Acked-by: David S. Miller <davem@davemloft.net> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-02-03tracing: Replace trace_event struct array with pointer arraySteven Rostedt
Currently the trace_event structures are placed in the _ftrace_events section, and at link time, the linker makes one large array of all the trace_event structures. On boot up, this array is read (much like the initcall sections) and the events are processed. The problem is that there is no guarantee that gcc will place complex structures nicely together in an array format. Two structures in the same file may be placed awkwardly, because gcc has no clue that they are suppose to be in an array. A hack was used previous to force the alignment to 4, to pack the structures together. But this caused alignment issues with other architectures (sparc). Instead of packing the structures into an array, the structures' addresses are now put into the _ftrace_event section. As pointers are always the natural alignment, gcc should always pack them tightly together (otherwise initcall, extable, etc would also fail). By having the pointers to the structures in the section, we can still iterate the trace_events without causing unnecessary alignment problems with other architectures, or depending on the current behaviour of gcc that will likely change in the future just to tick us kernel developers off a little more. The _ftrace_event section is also moved into the .init.data section as it is now only needed at boot up. Suggested-by: David Miller <davem@davemloft.net> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>