summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2012-06-06Merge branch 'for-upstream' of git://github.com/agraf/linux-2.6 into nextAvi Kivity
Alex says: "Changes this time include: - Generalize KVM_GUEST support to overall ePAPR code - Fix reset for Book3S HV - Fix machine check deferral when CONFIG_KVM_GUEST=y - Add support for BookE register DECAR" * 'for-upstream' of git://github.com/agraf/linux-2.6: KVM: PPC: Not optimizing MSR_CE and MSR_ME with paravirt. KVM: PPC: booke: Added DECAR support KVM: PPC: Book3S HV: Make the guest hash table size configurable KVM: PPC: Factor out guest epapr initialization Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-06KVM: disable uninitialized var warningMichael S. Tsirkin
I see this in 3.5-rc1: arch/x86/kvm/mmu.c: In function ‘kvm_test_age_rmapp’: arch/x86/kvm/mmu.c:1271: warning: ‘iter.desc’ may be used uninitialized in this function The line in question was introduced by commit 1e3f42f03c38c29c1814199a6f0a2f01b919ea3f static int kvm_test_age_rmapp(struct kvm *kvm, unsigned long *rmapp, unsigned long data) { - u64 *spte; + u64 *sptep; + struct rmap_iterator iter; <- line 1271 int young = 0; /* The reason I think is that the compiler assumes that the rmap value could be 0, so static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter) { if (!rmap) return NULL; if (!(rmap & 1)) { iter->desc = NULL; return (u64 *)rmap; } iter->desc = (struct pte_list_desc *)(rmap & ~1ul); iter->pos = 0; return iter->desc->sptes[iter->pos]; } will not initialize iter.desc, but the compiler isn't smart enough to see that for (sptep = rmap_get_first(*rmapp, &iter); sptep; sptep = rmap_get_next(&iter)) { will immediately exit in this case. I checked by adding if (!*rmapp) goto out; on top which is clearly equivalent but disables the warning. This patch uses uninitialized_var to disable the warning without increasing code size. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-06KVM: Cleanup the kvm_print functions and introduce pr_XX wrappersChristoffer Dall
Introduces a couple of print functions, which are essentially wrappers around standard printk functions, with a KVM: prefix. Functions introduced or modified are: - kvm_err(fmt, ...) - kvm_info(fmt, ...) - kvm_debug(fmt, ...) - kvm_pr_unimpl(fmt, ...) - pr_unimpl(vcpu, fmt, ...) -> vcpu_unimpl(vcpu, fmt, ...) Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: VMX: Fix KVM_SET_SREGS with big real mode segmentsOrit Wasserman
For example migration between Westmere and Nehelem hosts, caught in big real mode. The code that fixes the segments for real mode guest was moved from enter_rmode to vmx_set_segments. enter_rmode calls vmx_set_segments for each segment. Signed-off-by: Orit Wasserman <owasserm@rehdat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: MMU: do not iterate over all VMs in mmu_shrink()Gleb Natapov
mmu_shrink() needlessly iterates over all VMs even though it will not attempt to free mmu pages from more than one on them. Fix that and also check used mmu pages count outside of VM lock to skip inactive VMs faster. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: ia64: Mark ia64 KVM as BROKENAvi Kivity
Practically all patches to ia64 KVM are build fixes; numerous warnings remain; the last patch from the maintainer was committed more than three years ago. It is clear that no one is using this thing. Mark as BROKEN to ensure people don't get hit by pointless build problems. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: VMX: Use EPT Access bit in response to memory notifiersXudong Hao
Signed-off-by: Haitao Shan <haitao.shan@intel.com> Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: VMX: Enable EPT A/D bits if supported by turning on relevant bit in EPTPXudong Hao
In EPT page structure entry, Enable EPT A/D bits if processor supported. Signed-off-by: Haitao Shan <haitao.shan@intel.com> Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: VMX: Add parameter to control A/D bits support, default is onXudong Hao
Add kernel parameter to control A/D bits support, it's on by default. Signed-off-by: Haitao Shan <haitao.shan@intel.com> Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: VMX: Add EPT A/D bits definitionsXudong Hao
Signed-off-by: Haitao Shan <haitao.shan@intel.com> Signed-off-by: Xudong Hao <xudong.hao@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: Avoid wasting pages for small lpage_info arraysTakuya Yoshikawa
lpage_info is created for each large level even when the memory slot is not for RAM. This means that when we add one slot for a PCI device, we end up allocating at least KVM_NR_PAGE_SIZES - 1 pages by vmalloc(). To make things worse, there is an increasing number of devices which would result in more pages being wasted this way. This patch mitigates this problem by using kvm_kvzalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-04fixups for signal breakageAl Viro
Obvious brainos spotted by Geert. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-04Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The clocksource driver is pure hardware enablement and the skew option is default off, well tested and non dangerous." * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick: Move skew_tick option into the HIGH_RES_TIMER section clocksource: em_sti: Add DT support clocksource: em_sti: Emma Mobile STI driver clockevents: Make clockevents_config() a global symbol tick: Add tick skew boot option
2012-06-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull straggler x86 fixes from Peter Anvin: "Three groups of patches: - EFI boot stub documentation and the ability to print error messages; - Removal for PTRACE_ARCH_PRCTL for x32 (obsolete interface which should never have been ported, and the port is broken and potentially dangerous.) - ftrace stack corruption fixes. I'm not super-happy about the technical implementation, but it is probably the least invasive in the short term. In the future I would like a single method for nesting the debug stack, however." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32 x86, efi: Add EFI boot stub documentation x86, efi; Add EFI boot stub console support x86, efi: Only close open files in error path ftrace/x86: Do not change stacks in DEBUG when calling lockdep x86: Allow nesting of the debug stack IDT setting x86: Reset the debug_stack update counter ftrace: Use breakpoint method to update ftrace caller ftrace: Synchronize variable setting with breakpoints
2012-06-02Merge 'for-linus' branches from ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/{vfs,signal} Pull vfs fix and a fix from the signal changes for frv from Al Viro. The __kernel_nlink_t for powerpc got scrogged because 64-bit powerpc actually depended on the default "unsigned long", while 32-bit powerpc had an explicit override to "unsigned short". Al didn't notice, and made both of them be the unsigned short. The frv signal fix is fallout from simplifying the do_notify_resume() code, and leaving an extra parenthesis. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: powerpc: Fix size of st_nlink on 64bit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: frv: Remove bogus closing parenthesis
2012-06-02powerpc: Fix size of st_nlink on 64bitAnton Blanchard
commit e57f93cc53b7 (powerpc: get rid of nlink_t uses, switch to explicitly-sized type) changed the size of st_nlink on ppc64 from a long to a short, resulting in boot failures. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-02frv: Remove bogus closing parenthesisGeert Uytterhoeven
Introduced by commit 6fd84c0831ec78d98736b76dc5e9b849f1dbfc9e ("TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01Merge tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6Linus Torvalds
Pull fbdev updates from Florian Tobias Schandinat: - driver for AUO-K1900 and AUO-K1901 epaper controller - large updates for OMAP (e.g. decouple HDMI audio and video) - some updates for Exynos and SH Mobile - various other small fixes and cleanups * tag 'fbdev-updates-for-3.5' of git://github.com/schandinat/linux-2.6: (130 commits) video: bfin_adv7393fb: Fix cleanup code video: exynos_dp: reduce delay time when configuring video setting video: exynos_dp: move sw reset prioir to enabling sw defined function video: exynos_dp: use devm_ functions fb: handle NULL pointers in framebuffer release OMAPDSS: HDMI: OMAP4: Update IRQ flags for the HPD IRQ request OMAPDSS: Apply VENC timings even if panel is disabled OMAPDSS: VENC/DISPC: Delay dividing Y resolution for managers connected to VENC OMAPDSS: DISPC: Support rotation through TILER OMAPDSS: VRFB: remove compiler warnings when CONFIG_BUG=n OMAPFB: remove compiler warnings when CONFIG_BUG=n OMAPDSS: remove compiler warnings when CONFIG_BUG=n OMAPDSS: DISPC: fix usage of dispc_ovl_set_accu_uv OMAPDSS: use DSI_FIFO_BUG workaround only for manual update displays OMAPDSS: DSI: Support command mode interleaving during video mode blanking periods OMAPDSS: DISPC: Update Accumulator configuration for chroma plane drivers/video: fsl-diu-fb: don't initialize the THRESHOLDS registers video: exynos mipi dsi: support reverse panel type video: exynos mipi dsi: Properly interpret the interrupt source flags video: exynos mipi dsi: Avoid races in probe() ...
2012-06-01Merge tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtdLinus Torvalds
Pull mtd update from David Woodhouse: - More robust parsing especially of xattr data in JFFS2 - Updates to mxc_nand and gpmi drivers to support new boards and device tree - Improve consistency of information about ECC strength in NAND devices - Clean up partition handling of plat_nand - Support NAND drivers without dedicated access to OOB area - BCH hardware ECC support for OMAP - Other fixes and cleanups, and a few new device IDs Fixed trivial conflict in drivers/mtd/nand/gpmi-nand/gpmi-nand.c due to added include files next to each other. * tag 'for-linus-3.5-20120601' of git://git.infradead.org/linux-mtd: (75 commits) mtd: mxc_nand: move ecc strengh setup before nand_scan_tail mtd: block2mtd: fix recursive call of mtd_writev mtd: gpmi-nand: define ecc.strength mtd: of_parts: fix breakage in Kconfig mtd: nand: fix scan_read_raw_oob mtd: docg3 fix in-middle of blocks reads mtd: cfi_cmdset_0002: Slight cleanup of fixup messages mtd: add fixup for S29NS512P NOR flash. jffs2: allow to complete xattr integrity check on first GC scan jffs2: allow to discriminate between recoverable and non-recoverable errors mtd: nand: omap: add support for hardware BCH ecc ARM: OMAP3: gpmc: add BCH ecc api and modes mtd: nand: check the return code of 'read_oob/read_oob_raw' mtd: nand: remove 'sndcmd' parameter of 'read_oob/read_oob_raw' mtd: m25p80: Add support for Winbond W25Q80BW jffs2: get rid of jffs2_sync_super jffs2: remove unnecessary GC pass on sync jffs2: remove unnecessary GC pass on umount jffs2: remove lock_super mtd: gpmi: add gpmi support for mx6q ...
2012-06-01Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into ↵H. Peter Anvin
x86-urgent-for-linus
2012-06-01Merge branch 'ux500/hickup' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull arm fixes for ux500 mismerge mishap from Arnd Bergmann: "The device tree conversion for arm/ux500 in 3.5 turns out to be incomplete because of a mismerge done by Linus Walleij that I failed to notice early enough and that Lee Jones as the original author of those patches did not manage to fix during the -next cycle. While we originally to get a much larger set of ux500 device tree enablement patches merged, this did not happen in time. After some discussion at Linaro Connect conference this week, Lee has been able to do damage control and provide a series to put the broken platform back into usable shape for both DT and non-DT based booting. This series has not been part of linux-next and is based on top of the current state of the upstream kernel rather than an -rc, but this is the best we could manage given the earlier breakage." * 'ux500/hickup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: ux500: Enable probing of pinctrl through Device Tree ARM: ux500: Add support for ab8500 regulators into the Device Tree ARM: ux500: Provide regulator support for SMSC911x via Device Tree ARM: ux500: Allow PRCMU regulator to be probed during a DT enabled boot ARM: ux500: Apply db8500-prcmu regulator information to db8500 Device Tree ARM: ux500: Only initialise STE's UIBs on boards which support them ARM: ux500: Disable platform setup of the ab8500 when DT is enabled ARM: ux500: Use correct format for dynamic IRQ assignment ARM: ux500: Re-enable SMSC911x platform code registration during non-DT boots ARM: ux500: PRCMU related configuration and layout corrections for Device Tree ARM: ux500: Remove DB8500 PRCMU platform registration when DT is enabled ARM: ux500: Disable SMSC911x platform code registration when DT is enabled ARM: ux500: New DT:ed u8500_init_devices for one-by-one device enablement ARM: ux500: New DT:ed snowball_platform_devs for one-by-one device enablement pinctrl-nomadik: Allow Device Tree driver probing
2012-06-01x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32H.J. Lu
When I added x32 ptrace to 3.4 kernel, I also include PTRACE_ARCH_PRCTL support for x32 GDB For ARCH_GET_FS/GS, it takes a pointer to int64. But at user level, ARCH_GET_FS/GS takes a pointer to int32. So I have to add x32 ptrace to glibc to handle it with a temporary int64 passed to kernel and copy it back to GDB as int32. Roland suggested that PTRACE_ARCH_PRCTL is obsolete and x32 GDB should use fs_base and gs_base fields of user_regs_struct instead. Accordingly, remove PTRACE_ARCH_PRCTL completely from the x32 code to avoid possible memory overrun when pointer to int32 is passed to kernel. Link: http://lkml.kernel.org/r/CAMe9rOpDzHfS7NH7m1vmD9QRw8SSj4Sc%2BaNOgcWm_WJME2eRsQ@mail.gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org> v3.4
2012-06-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of signal handling patches from Al Viro: "This time it's mostly helpers and conversions to them; there's a lot of stuff remaining in the tree, but that'll either go in -rc2 (isolated bug fixes, ideally via arch maintainers' trees) or will sit there until the next cycle." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: x86: get rid of calling do_notify_resume() when returning to kernel mode blackfin: check __get_user() return value whack-a-mole with TIF_FREEZE FRV: Optimise the system call exit path in entry.S [ver #2] FRV: Shrink TIF_WORK_MASK [ver #2] FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions new helper: signal_delivered() powerpc: get rid of restore_sigmask() most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set set_restore_sigmask() is never called without SIGPENDING (and never should be) TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set don't call try_to_freeze() from do_signal() pull clearing RESTORE_SIGMASK into block_sigmask() sh64: failure to build sigframe != signal without handler openrisc: tracehook_signal_handler() is supposed to be called on success new helper: sigmask_to_save() new helper: restore_saved_sigmask() new helpers: {clear,test,test_and_clear}_restore_sigmask() HAVE_RESTORE_SIGMASK is defined on all architectures now
2012-06-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs changes from Al Viro. "A lot of misc stuff. The obvious groups: * Miklos' atomic_open series; kills the damn abuse of ->d_revalidate() by NFS, which was the major stumbling block for all work in that area. * ripping security_file_mmap() and dealing with deadlocks in the area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in general. * ->encode_fh() switched to saner API; insane fake dentry in mm/cleancache.c gone. * assorted annotations in fs (endianness, __user) * parts of Artem's ->s_dirty work (jff2 and reiserfs parts) * ->update_time() work from Josef. * other bits and pieces all over the place. Normally it would've been in two or three pull requests, but signal.git stuff had eaten a lot of time during this cycle ;-/" Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the 'truncate_range' inode method was removed by the VM changes, the VFS update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due to sparse fix added twice, with other changes nearby). * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits) nfs: don't open in ->d_revalidate vfs: retry last component if opening stale dentry vfs: nameidata_to_filp(): don't throw away file on error vfs: nameidata_to_filp(): inline __dentry_open() vfs: do_dentry_open(): don't put filp vfs: split __dentry_open() vfs: do_last() common post lookup vfs: do_last(): add audit_inode before open vfs: do_last(): only return EISDIR for O_CREAT vfs: do_last(): check LOOKUP_DIRECTORY vfs: do_last(): make ENOENT exit RCU safe vfs: make follow_link check RCU safe vfs: do_last(): use inode variable vfs: do_last(): inline walk_component() vfs: do_last(): make exit RCU safe vfs: split do_lookup() Btrfs: move over to use ->update_time fs: introduce inode operation ->update_time reiserfs: get rid of resierfs_sync_super reiserfs: mark the superblock as dirty a bit later ...
2012-06-01x86: get rid of calling do_notify_resume() when returning to kernel modeAl Viro
If we end up calling do_notify_resume() with !user_mode(refs), it does nothing (do_signal() explicitly bails out and we can't get there with TIF_NOTIFY_RESUME in such situations). Then we jump to resume_userspace_sig, which rechecks the same thing and bails out to resume_kernel, thus breaking the loop. It's easier and cheaper to check *before* calling do_notify_resume() and bail out to resume_kernel immediately. And kill the check in do_signal()... Note that on amd64 we can't get there with !user_mode() at all - asm glue takes care of that. Acked-and-reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01blackfin: check __get_user() return valueAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01whack-a-mole with TIF_FREEZEAl Viro
blackfin has reintroduced it, completely unused. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01FRV: Optimise the system call exit path in entry.S [ver #2]David Howells
Optimise the system call exit path in entry.S by packing some instructions. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01FRV: Shrink TIF_WORK_MASK [ver #2]David Howells
Shrink TIF_WORK_MASK so that it will fit in the 12-bit signed immediate operand field of an ANDI instruction. Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptionsDavid Howells
Move the test for kernel mode processing from do_signal() into entry.S to also prevent system call exit tracing and userspace resumption notification handling happening when returning from kernel exceptions. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01new helper: signal_delivered()Al Viro
Does block_sigmask() + tracehook_signal_handler(); called when sigframe has been successfully built. All architectures converted to it; block_sigmask() itself is gone now (merged into this one). I'm still not too happy with the signature, but that's a separate story (IMO we need a structure that would contain signal number + siginfo + k_sigaction, so that get_signal_to_deliver() would fill one, signal_delivered(), handle_signal() and probably setup...frame() - take one). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01powerpc: get rid of restore_sigmask()Al Viro
... it's just a call of set_current_blocked() now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from setAl Viro
Only 3 out of 63 do not. Renamed the current variant to __set_current_blocked(), added set_current_blocked() that will exclude unblockable signals, switched open-coded instances to it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01set_restore_sigmask() is never called without SIGPENDING (and never should be)Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is setAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01don't call try_to_freeze() from do_signal()Al Viro
get_signal_to_deliver() will handle it itself Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01pull clearing RESTORE_SIGMASK into block_sigmask()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01sh64: failure to build sigframe != signal without handlerAl Viro
it's actually "send me SIGSEGV"... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01openrisc: tracehook_signal_handler() is supposed to be called on successAl Viro
... not if sigframe couldn't have been built. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01new helper: sigmask_to_save()Al Viro
replace boilerplate "should we use ->saved_sigmask or ->blocked?" with calls of obvious inlined helper... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01new helper: restore_saved_sigmask()Al Viro
first fruits of ..._restore_sigmask() helpers: now we can take boilerplate "signal didn't have a handler, clear RESTORE_SIGMASK and restore the blocked mask from ->saved_mask" into a common helper. Open-coded instances switched... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01new helpers: {clear,test,test_and_clear}_restore_sigmask()Al Viro
helpers parallel to set_restore_sigmask(), used in the next commits Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-06-01x86, efi: Add EFI boot stub documentationMatt Fleming
Since we can't expect every user to read the EFI boot stub code it seems prudent to have a couple of paragraphs explaining what it is and how it works. The "initrd=" option in particular is tricky because it only understands absolute EFI-style paths (backslashes as directory separators), and until now this hasn't been documented anywhere. This has tripped up a couple of users. Cc: Matthew Garrett <mjg@redhat.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1331907517-3985-4-git-send-email-matt@console-pimps.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01x86, efi; Add EFI boot stub console supportMatt Fleming
We need a way of printing useful messages to the user, for example when we fail to open an initrd file, instead of just hanging the machine without giving the user any indication of what went wrong. So sprinkle some error messages throughout the EFI boot stub code to make it easier for users to diagnose/report problems. Reported-by: Keshav P R <the.ridikulus.rat@gmail.com> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1331907517-3985-3-git-send-email-matt@console-pimps.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01x86, efi: Only close open files in error pathMatt Fleming
The loop at the 'close_handles' label in handle_ramdisks() should be using 'i', which represents the number of initrd files that were successfully opened, not 'nr_initrds' which is the number of initrd= arguments passed on the command line. Currently, if we execute the loop to close all file handles and we failed to open any initrds we'll try to call the close function on a garbage pointer, causing the machine to hang. Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Link: http://lkml.kernel.org/r/1331907517-3985-2-git-send-email-matt@console-pimps.org Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-01ftrace/x86: Do not change stacks in DEBUG when calling lockdepSteven Rostedt
When both DYNAMIC_FTRACE and LOCKDEP are set, the TRACE_IRQS_ON/OFF will call into the lockdep code. The lockdep code can call lots of functions that may be traced by ftrace. When ftrace is updating its code and hits a breakpoint, the breakpoint handler will call into lockdep. If lockdep happens to call a function that also has a breakpoint attached, it will jump back into the breakpoint handler resetting the stack to the debug stack and corrupt the contents currently on that stack. The 'do_sym' call that calls do_int3() is protected by modifying the IST table to point to a different location if another breakpoint is hit. But the TRACE_IRQS_OFF/ON are outside that protection, and if a breakpoint is hit from those, the stack will get corrupted, and the kernel will crash: [ 1013.243754] BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 [ 1013.272665] IP: [<ffff880145cc0000>] 0xffff880145cbffff [ 1013.285186] PGD 1401b2067 PUD 14324c067 PMD 0 [ 1013.298832] Oops: 0010 [#1] PREEMPT SMP [ 1013.310600] CPU 2 [ 1013.317904] Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables crc32c_intel ghash_clmulni_intel microcode usb_debug serio_raw pcspkr iTCO_wdt i2c_i801 iTCO_vendor_support e1000e nfsd nfs_acl auth_rpcgss lockd sunrpc i915 video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: scsi_wait_scan] [ 1013.401848] [ 1013.407399] Pid: 112, comm: kworker/2:1 Not tainted 3.4.0+ #30 [ 1013.437943] RIP: 8eb8:[<ffff88014630a000>] [<ffff88014630a000>] 0xffff880146309fff [ 1013.459871] RSP: ffffffff8165e919:ffff88014780f408 EFLAGS: 00010046 [ 1013.477909] RAX: 0000000000000001 RBX: ffffffff81104020 RCX: 0000000000000000 [ 1013.499458] RDX: ffff880148008ea8 RSI: ffffffff8131ef40 RDI: ffffffff82203b20 [ 1013.521612] RBP: ffffffff81005751 R08: 0000000000000000 R09: 0000000000000000 [ 1013.543121] R10: ffffffff82cdc318 R11: 0000000000000000 R12: ffff880145cc0000 [ 1013.564614] R13: ffff880148008eb8 R14: 0000000000000002 R15: ffff88014780cb40 [ 1013.586108] FS: 0000000000000000(0000) GS:ffff880148000000(0000) knlGS:0000000000000000 [ 1013.609458] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1013.627420] CR2: 0000000000000002 CR3: 0000000141f10000 CR4: 00000000001407e0 [ 1013.649051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1013.670724] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1013.692376] Process kworker/2:1 (pid: 112, threadinfo ffff88013fe0e000, task ffff88014020a6a0) [ 1013.717028] Stack: [ 1013.724131] ffff88014780f570 ffff880145cc0000 0000400000004000 0000000000000000 [ 1013.745918] cccccccccccccccc ffff88014780cca8 ffffffff811072bb ffffffff81651627 [ 1013.767870] ffffffff8118f8a7 ffffffff811072bb ffffffff81f2b6c5 ffffffff81f11bdb [ 1013.790021] Call Trace: [ 1013.800701] Code: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a <e7> d7 64 81 ff ff ff ff 01 00 00 00 00 00 00 00 65 d9 64 81 ff [ 1013.861443] RIP [<ffff88014630a000>] 0xffff880146309fff [ 1013.884466] RSP <ffff88014780f408> [ 1013.901507] CR2: 0000000000000002 The solution was to reuse the NMI functions that change the IDT table to make the debug stack keep its current stack (in kernel mode) when hitting a breakpoint: call debug_stack_set_zero TRACE_IRQS_ON call debug_stack_reset If the TRACE_IRQS_ON happens to hit a breakpoint then it will keep the current stack and not crash the box. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-01x86: Allow nesting of the debug stack IDT settingSteven Rostedt
When the NMI handler runs, it checks if it preempted a debug handler and if that handler is using the debug stack. If it is, it changes the IDT table not to update the stack, otherwise it will reset the debug stack and corrupt the debug handler it preempted. Now that ftrace uses breakpoints to change functions from nops to callers, many more places may hit a breakpoint. Unfortunately this includes some of the calls that lockdep performs. Which causes issues with the debug stack. It too needs to change the debug stack before tracing (if called from the debug handler). Allow the debug_stack_set_zero() and debug_stack_reset() to be nested so that the debug handlers can take advantage of them too. [ Used this_cpu_*() over __get_cpu_var() as suggested by H. Peter Anvin ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-01x86: Reset the debug_stack update counterSteven Rostedt
When an NMI goes off and it sees that it preempted the debug stack, to keep the debug stack safe, it changes the IDT to point to one that does not modify the stack on breakpoint (to allow breakpoints in NMIs). But the variable that gets set to know to undo it on exit never gets cleared on exit. Thus every NMI will reset it on exit the first time it is done even if it does not need to be reset. [ Added H. Peter Anvin's suggestion to use this_cpu_read/write ] Cc: <stable@vger.kernel.org> # v3.3 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-01ftrace: Use breakpoint method to update ftrace callerSteven Rostedt
On boot up and module load, it is fine to modify the code directly, without the use of breakpoints. This is because boot up modification is done before SMP is initialized, thus the modification is serial, and module load is done before the module executes. But after that we must use a SMP safe method to modify running code. Otherwise, if we are running the function tracer and update its function (by starting off the stack tracer, or perf tracing) the change of the function called by the ftrace trampoline is done directly. If this is being executed on another CPU, that CPU may take a GPF and crash the kernel. The breakpoint method is used to change the nops at all the functions, but the change of the ftrace callback handler itself was still using a direct modification. If tracing was enabled and the function callback was changed then another CPU could fault if it was currently calling the original callback. This modification must use the breakpoint method too. Note, the direct method is still used for boot up and module load. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-01ftrace: Synchronize variable setting with breakpointsSteven Rostedt
When the function tracer starts modifying the code via breakpoints it sets a variable (modifying_ftrace_code) to inform the breakpoint handler to call the ftrace int3 code. But there's no synchronization between setting this code and the handler, thus it is possible for the handler to be called on another CPU before it sees the variable. This will cause a kernel crash as the int3 handler will not know what to do with it. I originally added smp_mb()'s to force the visibility of the variable but H. Peter Anvin suggested that I just make it atomic. [ Added comments as suggested by Peter Zijlstra ] Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>