summaryrefslogtreecommitdiff
path: root/kernel/trace/trace.c
AgeCommit message (Collapse)Author
2012-10-07Merge branch 'virtio-next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio changes from Rusty Russell: "New workflow: same git trees pulled by linux-next get sent straight to Linus. Git is awkward at shuffling patches compared with quilt or mq, but that doesn't happen often once things get into my -next branch." * 'virtio-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (24 commits) lguest: fix occasional crash in example launcher. virtio-blk: Disable callback in virtblk_done() virtio_mmio: Don't attempt to create empty virtqueues virtio_mmio: fix off by one error allocating queue drivers/virtio/virtio_pci.c: fix error return code virtio: don't crash when device is buggy virtio: remove CONFIG_VIRTIO_RING virtio: add help to CONFIG_VIRTIO option. virtio: support reserved vqs virtio: introduce an API to set affinity for a virtqueue virtio-ring: move queue_index to vring_virtqueue virtio_balloon: not EXPERIMENTAL any more. virtio-balloon: dependency fix virtio-blk: fix NULL checking in virtblk_alloc_req() virtio-blk: Add REQ_FLUSH and REQ_FUA support to bio path virtio-blk: Add bio-based IO path for virtio-blk virtio: console: fix error handling in init() function tools: Fix pthread flag for Makefile of trace-agent used by virtio-trace tools: Add guest trace agent as a user tool virtio/console: Allocate scatterlist according to the current pipe size ...
2012-10-02Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...
2012-09-28ftrace: Allow stealing pages from pipe bufferMasami Hiramatsu
Use generic steal operation on pipe buffer to allow stealing ring buffer's read page from pipe buffer. Note that this could reduce the performance of splice on the splice_write side operation without affinity setting. Since the ring buffer's read pages are allocated on the tracing-node, but the splice user does not always execute splice write side operation on the same node. In this case, the page will be accessed from the another node. Thus, it is strongly recommended to assign the splicing thread to corresponding node. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-09-24tracing: Add an option for disabling markersMandeep Singh Baines
In our application, we have trace markers spread through user-space. We have markers in GL, X, etc. These are super handy for Chrome's about:tracing feature (Chrome + system + kernel trace view), but can be very distracting when you're trying to debug a kernel issue. I normally, use "grep -v tracing_mark_write" but it would be nice if I could just temporarily disable markers all together. Link: http://lkml.kernel.org/r/1347066739-26285-1-git-send-email-msb@chromium.org CC: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-09-18userns: Teach trace to use from_kuidEric W. Biederman
- When tracing capture the kuid. - When displaying the data to user space convert the kuid into the user namespace of the process that opened the report file. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-08-21Merge branch 'tip/perf/core-2' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core Pull ftrace fixlets from Steve Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-08-07tracing/trivial: Fix some typos in kernel/traceWang Tianhong
Fix some typos in kernel/trace. Link: http://lkml.kernel.org/r/1343887320.2228.9.camel@louis-ThinkPad-T410 Signed-off-by: Wang Tianhong <wangthbj@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-07-26Merge tag 'staging-3.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging tree patches from Greg Kroah-Hartman: "Here's the big staging tree merge for the 3.6-rc1 merge window. There are some patches in here outside of drivers/staging/, notibly the iio code (which is still stradeling the staging / not staging boundry), the pstore code, and the tracing code. All of these have gotten acks from the various subsystem maintainers to be included in this tree. The pstore and tracing patches are related, and are coming here as they replace one of the android staging drivers. Otherwise, the normal staging mess. Lots of cleanups and a few new drivers (some iio drivers, and the large csr wireless driver abomination.) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed up trivial conflicts in drivers/staging/comedi/drivers/s626.h and drivers/staging/gdm72xx/netlink_k.c * tag 'staging-3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1108 commits) staging: csr: delete a bunch of unused library functions staging: csr: remove csr_utf16.c staging: csr: remove csr_pmem.h staging: csr: remove CsrPmemAlloc staging: csr: remove CsrPmemFree() staging: csr: remove CsrMemAllocDma() staging: csr: remove CsrMemCalloc() staging: csr: remove CsrMemAlloc() staging: csr: remove CsrMemFree() and CsrMemFreeDma() staging: csr: remove csr_util.h staging: csr: remove CsrOffSetOf() stating: csr: remove unneeded #includes in csr_util.c staging: csr: make CsrUInt16ToHex static staging: csr: remove CsrMemCpy() staging: csr: remove CsrStrLen() staging: csr: remove CsrVsnprintf() staging: csr: remove CsrStrDup staging: csr: remove CsrStrChr() staging: csr: remove CsrStrNCmp staging: csr: remove CsrStrCmp ...
2012-07-17tracing: Fix initialization failure path in tracing_set_tracer()Anton Vorontsov
If tracer->init() fails, current code will leave current_tracer pointing to an unusable tracer, which at best makes 'current_tracer' report inaccurate value. Fix the issue by pointing current_tracer to nop tracer, and only update current_tracer with the new one after all the initialization succeeds. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-07-11tracing: Check for allocation failure in __tracing_open()Dan Carpenter
Clean up and return -ENOMEM on if the kzalloc() fails. This also prevents a potential crash, as the pointer that failed to allocate would be later used. Link: http://lkml.kernel.org/r/20120711063507.GF11812@elgon.mountain Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-07-06Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core Pull tracing updates from Steve Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-03Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block bits from Jens Axboe: "As vacation is coming up, thought I'd better get rid of my pending changes in my for-linus branch for this iteration. It contains: - Two patches for mtip32xx. Killing a non-compliant sysfs interface and moving it to debugfs, where it belongs. - A few patches from Asias. Two legit bug fixes, and one killing an interface that is no longer in use. - A patch from Jan, making the annoying partition ioctl warning a bit less annoying, by restricting it to !CAP_SYS_RAWIO only. - Three bug fixes for drbd from Lars Ellenberg. - A fix for an old regression for umem, it hasn't really worked since the plugging scheme was changed in 3.0. - A few fixes from Tejun. - A splice fix from Eric Dumazet, fixing an issue with pipe resizing." * 'for-linus' of git://git.kernel.dk/linux-block: scsi: Silence unnecessary warnings about ioctl to partition block: Drop dead function blk_abort_queue() block: Mitigate lock unbalance caused by lock switching block: Avoid missed wakeup in request waitqueue umem: fix up unplugging splice: fix racy pipe->buffers uses drbd: fix null pointer dereference with on-congestion policy when diskless drbd: fix list corruption by failing but already aborted reads drbd: fix access of unallocated pages and kernel panic xen/blkfront: Add WARN to deal with misbehaving backends. blkcg: drop local variable @q from blkg_destroy() mtip32xx: Create debugfs entries for troubleshooting mtip32xx: Remove 'registers' and 'flags' from sysfs blkcg: fix blkg_alloc() failure path block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED block: fix return value on cfq_init() failure mtip32xx: Remove version.h header file inclusion xen/blkback: Copy id field when doing BLKIF_DISCARD.
2012-06-28tracing: Remove NR_CPUS array from trace_iteratorSteven Rostedt
Replace the NR_CPUS array of buffer_iter from the trace_iterator with an allocated array. This will just create an array of possible CPUS instead of the max number specified. The use of NR_CPUS in that array caused allocation failures for machines that were tight on memory. This did not cause any failures to the system itself (no crashes), but caused unnecessary failures for reading the trace files. Added a helper function called 'trace_buffer_iter()' that returns the buffer_iter item or NULL if it is not defined or the array was not allocated. Some routines do not require the array (tracing_open_pipe() for one). Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-28tracing/selftest: Add a WARN_ON() if a tracer test failsSteven Rostedt
Add a WARN_ON() output on test failures so that they are easier to detect in automated tests. Although, the WARN_ON() will not print if the test causes the system to crash, obviously. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-13splice: fix racy pipe->buffers usesEric Dumazet
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered by splice_shrink_spd() called from vmsplice_to_pipe() commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes) added capability to adjust pipe->buffers. Problem is some paths don't hold pipe mutex and assume pipe->buffers doesn't change for their duration. Fix this by adding nr_pages_max field in struct splice_pipe_desc, and use it in place of pipe->buffers where appropriate. splice_shrink_spd() loses its struct pipe_inode_info argument. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Tom Herbert <therbert@google.com> Cc: stable <stable@vger.kernel.org> # 2.6.35 Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-07tracing: Have tracing_off() actually turn tracing offSteven Rostedt
A recent update to have tracing_on/off() only affect the ftrace ring buffers instead of all ring buffers had a cut and paste error. The tracing_off() did the exact same thing as tracing_on() and would not actually turn off tracing. Unfortunately, tracing_off() is more important to be working than tracing_on() as this is a key development tool, as it lets the developer turn off tracing as soon as a problem is discovered. It is also used by panic and oops code. This bug also breaks the 'echo func:traceoff > set_ftrace_filter' Cc: <stable@vger.kernel.org> # 3.4 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-21Merge branch 'tip/perf/core-2' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
2012-05-21Merge branch 'perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-19tracing: Remove kernel_lock annotationsRichard Weinberger
The BKL is gone, these annotations are useless. Link: http://lkml.kernel.org/r/1320654202-4433-1-git-send-email-richard@nod.at Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-19tracing: Fix initial buffer_size_kb stateVaibhav Nagarnaik
Make sure that the state of buffer_size_kb is initialized correctly and returns actual size of the ring buffer. Link: http://lkml.kernel.org/r/1336066834-1673-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-18Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo
Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit e7c72d8 perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-16tracing: change CPU ring buffer state from tracing_cpumaskVaibhav Nagarnaik
According to Documentation/trace/ftrace.txt: tracing_cpumask: This is a mask that lets the user only trace on specified CPUS. The format is a hex string representing the CPUS. The tracing_cpumask currently doesn't affect the tracing state of per-CPU ring buffers. This patch enables/disables CPU recording as its corresponding bit in tracing_cpumask is set/unset. Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16tracing: Check return value of tracing_dentry_percpu()Namhyung Kim
If tracing_dentry_percpu() failed, tracing_init_debugfs_percpu() will try to create each cpu directories on debugfs' root directory as d_percpu is NULL. Link: http://lkml.kernel.org/r/1335143517-2285-1-git-send-email-namhyung.kim@lge.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ring-buffer: Make removal of ring buffer pages atomicVaibhav Nagarnaik
This patch adds the capability to remove pages from a ring buffer without destroying any existing data in it. This is done by removing the pages after the tail page. This makes sure that first all the empty pages in the ring buffer are removed. If the head page is one in the list of pages to be removed, then the page after the removed ones is made the head page. This removes the oldest data from the ring buffer and keeps the latest data around to be read. To do this in a non-racey manner, tracing is stopped for a very short time while the pages to be removed are identified and unlinked from the ring buffer. The pages are freed after the tracing is restarted to minimize the time needed to stop tracing. The context in which the pages from the per-cpu ring buffer are removed runs on the respective CPU. This minimizes the events not traced to only NMI trace contexts. Link: http://lkml.kernel.org/r/1336096792-25373-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16tracing: Clean up tracing_mark_write()Steven Rostedt
On gcc 4.5 the function tracing_mark_write() would give a warning of page2 being uninitialized. This is due to a bug in gcc because the logic prevents page2 from being used uninitialized, and gcc 4.6+ does not complain (correctly). Instead of adding a "unitialized" around page2, which could show a bug later on, I combined page1 and page2 into an array map_pages[]. This binds the two and the two are modified according to nr_pages (what gcc 4.5 seems to ignore). This no longer gives a warning with gcc 4.5 nor with gcc 4.6. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-09tracing: Remove ftrace_disable/enable_cpu()Steven Rostedt
The ftrace_disable_cpu() and ftrace_enable_cpu() functions were needed back before the ring buffer was lockless. Now that the ring buffer is lockless (and has been for some time), these functions serve no purpose, and unnecessarily slow down operations of the tracer. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-09tracing: Use seq_*_private interface for some seq filesJiri Olsa
It's appropriate to use __seq_open_private interface to open some of trace seq files, because it covers all steps we are duplicating in tracing code - zallocating the iterator and setting it as seq_file's private. Using this for following files: trace available_filter_functions enabled_functions Link: http://lkml.kernel.org/r/1335342219-2782-5-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> [ Fixed warnings for: kernel/trace/trace.c: In function '__tracing_open': kernel/trace/trace.c:2418:11: warning: unused variable 'ret' [-Wunused-variable] kernel/trace/trace.c:2417:19: warning: unused variable 'm' [-Wunused-variable] ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-24ring-buffer: Add per_cpu ring buffer control filesVaibhav Nagarnaik
Add a debugfs entry under per_cpu/ folder for each cpu called buffer_size_kb to control the ring buffer size for each CPU independently. If the global file buffer_size_kb is used to set size, the individual ring buffers will be adjusted to the given size. The buffer_size_kb will report the common size to maintain backward compatibility. If the buffer_size_kb file under the per_cpu/ directory is used to change buffer size for a specific CPU, only the size of the respective ring buffer is updated. When tracing/buffer_size_kb is read, it reports 'X' to indicate that sizes of per_cpu ring buffers are not equivalent. Link: http://lkml.kernel.org/r/1328212844-11889-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Michael Rubin <mrubin@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Justin Teravest <teravest@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-24tracing: Remove an unneeded check in trace_seq_buffer()Dan Carpenter
memcpy() returns a pointer to "bug". Hopefully, it's not NULL here or we would already have Oopsed. Link: http://lkml.kernel.org/r/20120420063145.GA22649@elgon.mountain Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-24tracing: Add percpu buffers for trace_printk()Steven Rostedt
Currently, trace_printk() uses a single buffer to write into to calculate the size and format needed to save the trace. To do this safely in an SMP environment, a spin_lock() is taken to only allow one writer at a time to the buffer. But this could also affect what is being traced, and add synchronization that would not be there otherwise. Ideally, using percpu buffers would be useful, but since trace_printk() is only used in development, having per cpu buffers for something never used is a waste of space. Thus, the use of the trace_bprintk() format section is changed to be used for static fmts as well as dynamic ones. Then at boot up, we can check if the section that holds the trace_printk formats is non-empty, and if it does contain something, then we know a trace_printk() has been added to the kernel. At this time the trace_printk per cpu buffers are allocated. A check is also done at module load time in case a module is added that contains a trace_printk(). Once the buffers are allocated, they are never freed. If you use a trace_printk() then you should know what you are doing. A buffer is made for each type of context: normal softirq irq nmi The context is checked and the appropriate buffer is used. This allows for totally lockless usage of trace_printk(), and they no longer even disable interrupts. Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-04-16tracing: Fix regression with tracing_onSteven Rostedt
The change to make tracing_on affect only the ftrace ring buffer, caused a bug where it wont affect any ring buffer. The problem was that the buffer of the trace_array was passed to the write function and not the trace array itself. The trace_array can change the buffer when running a latency tracer. If this happens, then the buffer being disabled may not be the buffer currently used by ftrace. This will cause the tracing_on file to become useless. The simple fix is to pass the trace_array to the write function instead of the buffer. Then the actual buffer may be changed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-03-27tracing: Fix ent_size in trace outputSteven Rostedt
When reading the trace file, the records of each of the per_cpu buffers are examined to find the next event to print out. At the point of looking at the event, the size of the event is recorded. But if the first event is chosen, the other events in the other CPU buffers will reset the event size that is stored in the iterator descriptor, causing the event size passed to the output functions to be incorrect. In most cases this is not a problem, but for the case of stack traces, it is. With the change to the stack tracing to record a dynamic number of back traces, the output depends on the size of the entry instead of the fixed 8 back traces. When the entry size is not correct, the back traces would not be fully printed. Note, reading from the per-cpu trace files were not affected. Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-03-02tracing: Keep NMI watchdog from triggering when dumping traceSteven Rostedt
As ftrace_dump() (called by ftrace_dump_on_oops) disables interrupts as it dumps its output to the console, it can keep interrupts disabled for long periods of time. This is likely to trigger the NMI watchdog, and it can disrupt the output of critical data. Add a touch_nmi_watchdog() to each event that is written to the screen to keep the NMI watchdog from affecting the output. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-22tracing/ring-buffer: Only have tracing_on disable tracing buffersSteven Rostedt
As the ring-buffer code is being used by other facilities in the kernel, having tracing_on file disable *all* buffers is not a desired affect. It should only disable the ftrace buffers that are being used. Move the code into the trace.c file and use the buffer disabling for tracing_on() and tracing_off(). This way only the ftrace buffers will be affected by them and other kernel utilities will not be confused to why their output suddenly stopped. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-13ftrace: sched_switch plugin is deprecatedGeunsik Lim
Actually, sched_switch function tracer is merged into wakeup/wakeup_rt Update 'mini-HOWTO' for ftrace(Kernel function tracer). If we want to trace "sched:sched_switch" to trace sched_switch func, We may utilize event option.(e.g: trace-cmd list -e | grep sched) This patch is based on Linux-3.3.rc2-SMP-PREEMPT Link: http://lkml.kernel.org/r/1328695537-15081-1-git-send-email-geunsik.lim@gmail.com Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-01-08Merge branch 'for-linus2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc/*/mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount * ...
2012-01-06Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits) perf kvm: Fix copy & paste error in description perf script: Kill script_spec__delete perf top: Fix a memory leak perf stat: Introduce get_ratio_color() helper perf session: Remove impossible condition check perf tools: Fix feature-bits rework fallout, remove unused variable perf script: Add generic perl handler to process events perf tools: Use for_each_set_bit() to iterate over feature flags perf tools: Unify handling of features when writing feature section perf report: Accept fifos as input file perf tools: Moving code in some files perf tools: Fix out-of-bound access to struct perf_session perf tools: Continue processing header on unknown features perf tools: Improve macros for struct feature_ops perf: builtin-record: Document and check that mmap_pages must be a power of two. perf: builtin-record: Provide advice if mmap'ing fails with EPERM. perf tools: Fix truncated annotation perf script: look up thread using tid instead of pid perf tools: Look up thread names for system wide profiling perf tools: Fix comm for processes with named threads ...
2012-01-04switch debugfs to umode_tAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-12-11trace: Allow ftrace_dump() to be called from modulesPaul E. McKenney
Add an EXPORT_SYMBOL_GPL() so that rcutorture can dump the trace buffer upon detection of an RCU error. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-11-17tracing: Add entries in buffer and total entries to default output headerSteven Rostedt
Knowing the number of event entries in the ring buffer compared to the total number that were written is useful information. The latency format gives this information and there's no reason that the default format does not. This information is now added to the default header, along with the number of online CPUs: # tracer: nop # # entries-in-buffer/entries-written: 159836/64690869 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] ...2 49.442971: local_touch_nmi <-cpu_idle <idle>-0 [000] d..2 49.442973: enter_idle <-cpu_idle <idle>-0 [000] d..2 49.442974: atomic_notifier_call_chain <-enter_idle <idle>-0 [000] d..2 49.442976: __atomic_notifier_call_chain <-atomic_notifier The above shows that the trace contains 159836 entries, but 64690869 were written. One could figure out that there were 64531033 entries that were dropped. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-17tracing: Add irq, preempt-count and need resched info to default trace outputSteven Rostedt
People keep asking how to get the preempt count, irq, and need resched info and we keep telling them to enable the latency format. Some developers think that traces without this info is completely useless, and for a lot of tasks it is useless. The first option was to enable the latency trace as the default format, but the header for the latency format is pretty useless for most tracers and it also does the timestamp in straight microseconds from the time the trace started. This is sometimes more difficult to read as the default trace is seconds from the start of boot up. Latency format: # tracer: nop # # nop latency trace v1.1.5 on 3.2.0-rc1-test+ # -------------------------------------------------------------------- # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule default format: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule The latency format header has latency information that is pretty meaningless for most tracers. Although some of the header is useful, and we can add that later to the default format as well. What is really useful with the latency format is the irqs-off, need-resched hard/softirq context and the preempt count. This commit adds the option irq-info which is on by default that adds this information: # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle <idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle If a user wants the old format, they can disable the 'irq-info' option: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] 49.309309: need_resched <-mwait_idle <idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] 49.309315: need_resched <-mwait_idle Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-11Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
2011-11-07tracing/latency: Fix header output for latency tracersJiri Olsa
In case the the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the function tracer (CONFIG_FUNCTION_TRACER) are not set, the latency tracers do not display proper latency header. The involved/fixed latency tracers are: wakeup_rt wakeup preemptirqsoff preemptoff irqsoff The patch adds proper handling of tracer configuration options for latency tracers, and displaying correct header info accordingly. * The current output (for wakeup tracer) with both graph and function tracers disabled is: # tracer: wakeup # <idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0 <idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up ... * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0 cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up * The current output (for wakeup tracer) with only function tracer enabled is: # tracer: wakeup # cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0 cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / <idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1 <idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-26Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits) perf symbols: Increase symbol KSYM_NAME_LEN size perf hists browser: Refuse 'a' hotkey on non symbolic views perf ui browser: Use libslang to read keys perf tools: Fix tracing info recording perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads perf hists: Don't consider filtered entries when calculating column widths perf hists: Don't decay total_period for filtered entries perf hists browser: Honour symbol_conf.show_{nr_samples,total_period} perf hists browser: Do not exit on tab key with single event perf annotate browser: Don't change selection line when returning from callq perf tools: handle endianness of feature bitmap perf tools: Add prelink suggestion to dso update message perf script: Fix unknown feature comment perf hists browser: Apply the dso and thread filters when merging new batches perf hists: Move the dso and thread filters from hist_browser perf ui browser: Honour the xterm colors perf top tui: Give color hints just on the percentage, like on --stdio perf ui browser: Make the colors configurable and change the defaults perf tui: Remove unneeded call to newtCls on startup perf hists: Don't format the percentage on hist_entry__snprintf ... Fix up conflicts in arch/x86/kernel/kprobes.c manually. Ingo's tree did the insane "add volatile to const array", which just doesn't make sense ("volatile const"?). But we could remove the const *and* make the array volatile to make doubly sure that gcc doesn't optimize it away.. Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the reader_lock has been turned into a raw lock by the core locking merge, and there was a new user of it introduced in this perf core merge. Make sure that new use also uses the raw accessor functions.
2011-10-14tracing: Fix returning of duplicate data after EOF in trace_pipe_rawSteven Rostedt
The trace_pipe_raw handler holds a cached page from the time the file is opened to the time it is closed. The cached page is used to handle the case of the user space buffer being smaller than what was read from the ring buffer. The left over buffer is held in the cache so that the next read will continue where the data left off. After EOF is returned (no more data in the buffer), the index of the cached page is set to zero. If a user app reads the page again after EOF, the check in the buffer will see that the cached page is less than page size and will return the cached page again. This will cause reading the trace_pipe_raw again after EOF to return duplicate data, making the output look like the time went backwards but instead data is just repeated. The fix is to not reset the index right after all data is read from the cache, but to reset it after all data is read and more data exists in the ring buffer. Cc: stable <stable@kernel.org> Reported-by: Jeremy Eder <jeder@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-14ftrace: Fix README to state tracing_on to start/stop tracingGeunsik Lim
tracing_enabled option is deprecated. To start/stop tracing, write to /sys/kernel/debug/tracing/tracing_on without tracing_enabled. This patch is based on Linux 3.1.0-rc1 Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com> Link: http://lkml.kernel.org/r/1313127022-23830-1-git-send-email-leemgs1@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-11tracing: Do not allocate buffer for trace_markerSteven Rostedt
When doing intense tracing, the kmalloc inside trace_marker can introduce side effects to what is being traced. As trace_marker() is used by userspace to inject data into the kernel ring buffer, it needs to do so with the least amount of intrusion to the operations of the kernel or the user space application. As the ring buffer is designed to write directly into the buffer without the need to make a temporary buffer, and userspace already went through the hassle of knowing how big the write will be, we can simply pin the userspace pages and write the data directly into the buffer. This improves the impact of tracing via trace_marker tremendously! Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the use of get_user_pages_fast() and kmap_atomic(). Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-11tracing: Warn on output if the function tracer was found corruptedSteven Rostedt
As the function tracer is very intrusive, lots of self checks are performed on the tracer and if something is found to be strange it will shut itself down keeping it from corrupting the rest of the kernel. This shutdown may still allow functions to be traced, as the tracing only stops new modifications from happening. Trying to stop the function tracer itself can cause more harm as it requires code modification. Although a WARN_ON() is executed, a user may not notice it. To help the user see that something isn't right with the tracing of the system a big warning is added to the output of the tracer that lets the user know that their data may be incomplete. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-19tracing: Add a counter clock for those that do not trust clocksSteven Rostedt
When debugging tight race conditions, it can be helpful to have a synchronized tracing method. Although in most cases the global clock provides this functionality, if timings is not the issue, it is more comforting to know that the order of events really happened in a precise order. Instead of using a clock, add a "counter" that is simply an incrementing atomic 64bit counter that orders the events as they are perceived to happen. The trace_clock_counter() is added from the attempt by Peter Zijlstra trying to convert the trace_clock_global() to it. I took Peter's counter code and made trace_clock_counter() instead, and added it to the choice of clocks. Just echo counter > /debug/tracing/trace_clock to activate it. Requested-by: Thomas Gleixner <tglx@linutronix.de> Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-13locking, tracing: Annotate tracing locks as rawThomas Gleixner
The tracing locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>