summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2011-10-17x25: Prevent skb overreads when checking call user dataMatthew Daley
x25_find_listener does not check that the amount of call user data given in the skb is big enough in per-socket comparisons, hence buffer overreads may occur. Fix this by adding a check. Signed-off-by: Matthew Daley <mattjd@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Acked-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17x25: Handle undersized/fragmented skbsMatthew Daley
There are multiple locations in the X.25 packet layer where a skb is assumed to be of at least a certain size and that all its data is currently available at skb->data. These assumptions are not checked, hence buffer overreads may occur. Use pskb_may_pull to check these minimal size assumptions and ensure that data is available at skb->data when necessary, as well as use skb_copy_bits where needed. Signed-off-by: Matthew Daley <mattjd@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Acked-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17x25: Validate incoming call user data lengthsMatthew Daley
X.25 call user data is being copied in its entirety from incoming messages without consideration to the size of the destination buffers, leading to possible buffer overflows. Validate incoming call user data lengths before these copies are performed. It appears this issue was noticed some time ago, however nothing seemed to come of it: see http://www.spinics.net/lists/linux-x25/msg00043.html and commit 8db09f26f912f7c90c764806e804b558da520d4f. Signed-off-by: Matthew Daley <mattjd@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17udplite: fast-path computation of checksum coverageGerrit Renker
Commit 903ab86d195cca295379699299c5fc10beba31c7 of 1 March this year ("udp: Add lockless transmit path") introduced a new fast TX path that broke the checksum coverage computation of UDP-lite, which so far depended on up->len (only set if the socket is locked and 0 in the fast path). Fixed by providing both fast- and slow-path computation of checksum coverage. The latter can be removed when UDP(-lite)v6 also uses a lockless transmit path. Reported-by: Thomas Volkert <thomas@homer-conferencing.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-17Avoid using variable-length arrays in kernel/sys.cLinus Torvalds
The size is always valid, but variable-length arrays generate worse code for no good reason (unless the function happens to be inlined and the compiler sees the length for the simple constant it is). Also, there seems to be some code generation problem on POWER, where Henrik Bakken reports that register r28 can get corrupted under some subtle circumstances (interrupt happening at the wrong time?). That all indicates some seriously broken compiler issues, but since variable length arrays are bad regardless, there's little point in trying to chase it down. "Just don't do that, then". Reported-by: Henrik Grindal Bakken <henribak@cisco.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-17ARM: 7114/1: cache-l2x0: add resume entry for l2 in secure modeBarry Song
we save the l2x0 registers at the first initialization, and platform codes can get them to restore l2x0 status after wakeup. Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Barry Song <Baohua.Song@csr.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7023/1: L2x0: Add interrupts property to OF bindingMark Rutland
Following the discussion here: http://lists.ozlabs.org/pipermail/devicetree-discuss/2011-August/007301.html The L2x0 L2 Cache Controllers support a combined interrupt line which can be used for several events (e.g. read/write/parity errors on tag/data RAM, event counter increment/overflow). Unfortunately the OF binding added in c519ecf2 ("ARM: 7009/1: l2x0: Add OF based initialization") does not represent the interrupt. This patch adds an "interrupts" property to the L2x0 OF binding, representing the combined interrupt line. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> Cc: Barry Song <21cnbao@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7090/1: CACHE-L2X0: filter start address can be 0 and is often 0Barry Song
this patch fixes the error in Rob Herring's ARM: 7009/1: l2x0: Add OF based initialization http://www.spinics.net/lists/arm-kernel/msg131123.html it has been in rmk/for-next with commit 41c86ff5b Cc: Shawn Guo <shawn.guo@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Barry Song <Baohua.Song@csr.com> Acked-by: Rob Herring <robherring2@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7089/1: L2X0: add explicit cpu_relax() for busy wait loopBarry Song
using cpu_relax in busy loops is a well-known idiom in the kernel. It's more for documentation purposes than technically needed here. Signed-off-by: Barry Song <Baohua.Song@csr.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7009/1: l2x0: Add OF based initializationRob Herring
This adds probing for ARM L2x0 cache controllers via device tree. Support includes the L210, L220, and PL310 controllers. The binding allows setting up cache RAM latencies and filter addresses (PL310 only). Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Barry Song <21cnbao@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7116/1: debug: provide dummy default option for DEBUG_LL UART choiceWill Deacon
Defaulting to DEBUG_ICEDCC will cause systems to hang during boot unless a hardware debugger is listening to the debug comms. channel. This patch adds a dummy UART option as the default DEBUG_LL choice which requires the platform to do the right thing. Acked-by: Stephen Boyd <sboyd@codeaurora.org> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7097/1: debug: Move DEBUG_ICEDCC into the DEBUG_LL choiceStephen Boyd
DEBUG_ICEDCC support is just another DEBUG_LL choice and selecting it along with other DEBUG_LL options doesn't make much sense. Put it into the DEBUG_LL choice to avoid confusion. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7096/1: debug: Add UART1 config choicesStephen Boyd
ARM patch 7072/1 (debug: use kconfig choice for selecting DEBUG_LL UART) didn't notice that the Kconfigs relied on being unselected to configure a different serial port. Since there is no NONE option in a choice menu, explicitly add the other option so that both serial ports can be selected. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7073/1: debug: augment DEBUG_LL Kconfig help to clarify behaviourWill Deacon
Enabled DEBUG_LL hardcodes the UART address into the kernel and results in a non-portable kernel image. Since this option is only intended for use when debugging early boot failures, supporting multiple platforms in such a configuration is not the intended use-case. This patch documents this limitation in the DEBUG_LL Kconfig help text, so that users are aware of the portability restrictions that are associated with enabling low-level debugging support. Reported-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7072/1: debug: use kconfig choice for selecting DEBUG_LL UARTWill Deacon
Enabling CONFIG_DEBUG_LL (which is required for earlyprintk) hardwires the debug UART address into the kernel, so that we can print before the platform is initialised. If the user inadvertently selects multiple platforms with DEBUG_LL enabled, the UART address may not be correct and will likely cause the kernel to hang in the very early stages of boot. This patch, based on a skeleton from Russell, uses a Kconfig choice for selecting the DEBUG_LL UART, therefore allowing the user to make a choice about the supported platform when DEBUG_LL is enabled. Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7115/4: move __exception and friends to asm/exception.hJamie Iles
The definition of __exception_irq_entry for CONFIG_FUNCTION_GRAPH_TRACER=y needs linux/ftrace.h, but this creates a circular dependency with it's current home in asm/system.h. Create asm/exception.h and update all current users. v4: - rebase to rmk/for-next v3: - remove redundant includes of linux/ftrace.h v2: - document the usage restricitions of __exception* Cc: Zoltan Devai <zdevai@gmail.com> Signed-off-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7124/1: smp: Add a localtimer handler callable from C codeShawn Guo
In order to be able to handle localtimer directly from C code instead of assembly code, introduce handle_local_timer(), which is modeled after handle_IRQ(). Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7123/1: smp: Add an IPI handler callable from C codeShawn Guo
In order to be able to handle IPI directly from C code instead of assembly code, introduce handle_IPI(), which is modeled after handle_IRQ(). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7100/1: smp_scu: remove __init annotation from scu_enable()Shawn Guo
When Cortex-A9 MPCore resumes from Dormant or Shutdown modes, SCU needs to be re-enabled. This patch removes __init annotation from function scu_enable(), so that platform resume procedure can call it to re-enable SCU. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7061/1: gic: convert logical CPU numbers into physical numbersWill Deacon
The GIC driver must convert logical CPU numbers passed in from Linux into physical CPU numbers that are understood by the hardware. This patch uses the new cpu_logical_map macro for performing the conversion inside the GIC driver. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7060/1: smp: populate logical CPU mapping during bootWill Deacon
To allow booting Linux on a CPU with physical ID != 0, we need to provide a mapping from the logical CPU number to the physical CPU number. This patch adds such a mapping and populates it during boot. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-17ARM: 7011/1: Add ARM cpu topology definitionVincent Guittot
The affinity between ARM processors is defined in the MPIDR register. We can identify which processors are in the same cluster, and which ones have performance interdependency. We can define the cpu topology of ARM platform, that is then used by sched_mc and sched_smt. The default state of sched_mc and sched_smt config is disable. When enabled, the behavior of the scheduler can be modified with sched_mc_power_savings and sched_smt_power_savings sysfs interfaces. Changes since v4 : * Remove unnecessary parentheses and blank lines Changes since v3 : * Update the format of printk message * Remove blank line Changes since v2 : * Update the commit message and some comments Changes since v1 : * Update the commit message * Add read_cpuid_mpidr in arch/arm/include/asm/cputype.h * Modify header of arch/arm/kernel/topology.c * Modify tests and manipulation of MPIDR's bitfields * Modify the place and dependancy of the config * Modify Noop functions Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-16Merge branch 'fixes' of ↵Linus Torvalds
http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS ARM: 7122/1: localtimer: add header linux/errno.h explicitly ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9 ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES
2011-10-15ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUSZoltan Devai
This is unneeded and causes an abort on the SPMP8000 platform. Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Zoltan Devai <zoss@devai.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-15ARM: 7122/1: localtimer: add header linux/errno.h explicitlyShawn Guo
Per the text in Documentation/SubmitChecklist as below, we should explicitly have header linux/errno.h in localtimer.h for ENXIO reference. 1: If you use a facility then #include the file that defines/declares that facility. Don't depend on other header files pulling in ones that you use. Otherwise, we may run into some compiling error like the following one, if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined. arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’: arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function) Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-15ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9Will Deacon
Using COHERENT_LINE_{MISS,HIT} for cache misses and references respectively is completely wrong. Instead, use the L1D events which are a better and more useful approximation despite ignoring instruction traffic. Reported-by: Alasdair Grant <alasdair.grant@arm.com> Reported-by: Matt Horsnell <matt.horsnell@arm.com> Reported-by: Michael Williams <michael.williams@arm.com> Cc: stable@kernel.org Cc: Jean Pihet <j-pihet@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-14Merge branch 'hwmon-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (w83627ehf) Properly report thermal diode sensors
2011-10-14intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.David Woodhouse
We really don't want this to work in the general case; device drivers *shouldn't* care whether they are behind an IOMMU or not. But the integrated graphics is a special case, because the IOMMU and the GTT are all kind of smashed into one and generally horrifically buggy, so it's reasonable for the graphics driver to want to know when the IOMMU is active for the graphics hardware. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-10-14intel-iommu: Workaround IOTLB hang on Ironlake GPUDavid Woodhouse
To work around a hardware issue, we have to submit IOTLB flushes while the graphics engine is idle. The graphics driver will (we hope) go to great lengths to ensure that it gets that right on the affected chipset(s)... so let's not screw it over by deferring the unmap and doing it later. That wouldn't be very helpful. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-10-14Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6: gpio-pca953x: fix gpio_base gpio/omap: fix build error with certain OMAP1 configs
2011-10-14Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfsLinus Torvalds
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: revert to using a kthread for AIL pushing xfs: force the log if we encounter pinned buffers in .iop_pushbuf xfs: do not update xa_last_pushed_lsn for locked items
2011-10-14Merge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tileLinus Torvalds
* 'stable' of git://github.com/cmetcalf-tilera/linux-tile: tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
2011-10-14Merge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tipLinus Torvalds
* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip: x86: Default to vsyscall=native for now
2011-10-14x86, mrst: use a temporary variable for SFI irqMika Westerberg
SFI tables reside in RAM and should not be modified once they are written. Current code went to set pentry->irq to zero which causes subsequent reads to fail with invalid SFI table checksum. This will break kexec as the second kernel fails to validate SFI tables. To fix this we use temporary variable for irq number. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-13hwmon: (w83627ehf) Properly report thermal diode sensorsJean Delvare
The w83627ehf driver is improperly reporting thermal diode sensors as type 2, instead of 3. This caused "sensors" and possibly other monitoring tools to report these sensors as "transistor" instead of "thermal diode". Furthermore, diode subtype selection (CPU vs. external) is only supported by the original W83627EHF/EHG. All later models only support CPU diode type, and some (NCT6776F) don't even have the register in question so we should avoid reading from it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
2011-10-13gpio-pca953x: fix gpio_baseHartmut Knaack
gpio_base was set to 0 if no system platform data or open firmware platform data was provided. This led to conflicts, if any other gpiochip with a gpiobase of 0 was instantiated already. Setting it to -1 will automatically use the first one available. Signed-off-by: Hartmut Knaack <knaack.h@gmx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-13gpio/omap: fix build error with certain OMAP1 configsJanusz Krzysztofik
With commit f64ad1a0e21a, "gpio/omap: cleanup _set_gpio_wakeup(), remove ifdefs", access to build time conditionally omitted 'suspend_wakeup' member of the 'gpio_bank' structure has been placed unconditionally in function _set_gpio_wakeup(), which is always built. This resulted in the driver compilation broken for certain OMAP1, i.e., non-OMAP16xx, configurations. Really required or not in previously excluded cases, define this structure member unconditionally as a fix. Tested with a custom OMAP1510 only configuration. Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> Acked-by: Kevin Hilman <khilman@ti.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-10-13tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm filesChris Metcalf
The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h> for atomic support routines in assembly. To make this more explicit, I've turned those includes into includes of <asm/atomic_32.h>, which should hopefully make it clear that they shouldn't be bombed into <linux/atomic.h> in any cleanups. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-10-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: mscan: too much data copied to CAN frame due to 16 bit accesses gro: refetch inet6_protos[] after pulling ext headers bnx2x: fix cl_id allocation for non-eth clients for NPAR mode mlx4_en: fix endianness with blue frame support
2011-10-13ide: Fix file references in drivers/ide/Johann Felix Soden
Fix file references in drivers/ide/ There are a lot of file references to now moved or deleted files in the whole tree, especially in documentation and Kconfig files. This patch fixes the references in drivers/ide/. Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-13Merge branch 'btrfs-3.0' of git://github.com/chrismason/linuxLinus Torvalds
* 'btrfs-3.0' of git://github.com/chrismason/linux: Btrfs: make sure not to defrag extents past i_size Btrfs: fix recursive auto-defrag
2011-10-12sparc: Avoid calling sigprocmask()David S. Miller
Use set_current_blocked() instead. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-12sparc: Use set_current_blocked()Matt Fleming
As described in e6fa16ab ("signal: sigprocmask() should do retarget_shared_pending()") the modification of current->blocked is incorrect as we need to check whether the signal we're about to block is pending in the shared queue. Cc: Oleg Nesterov <oleg@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-12IPVS netns shutdown/startup dead-lockHans Schillstrom
ip_vs_mutext is used by both netns shutdown code and startup and both implicit uses sk_lock-AF_INET mutex. cleanup CPU-1 startup CPU-2 ip_vs_dst_event() ip_vs_genl_set_cmd() sk_lock-AF_INET __ip_vs_mutex sk_lock-AF_INET __ip_vs_mutex * DEAD LOCK * A new mutex placed in ip_vs netns struct called sync_mutex is added. Comments from Julian and Simon added. This patch has been running for more than 3 month now and it seems to work. Ver. 3 IP_VS_SO_GET_DAEMON in do_ip_vs_get_ctl protected by sync_mutex instead of __ip_vs_mutex as sugested by Julian. Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-10-11xfs: revert to using a kthread for AIL pushingChristoph Hellwig
Currently we have a few issues with the way the workqueue code is used to implement AIL pushing: - it accidentally uses the same workqueue as the syncer action, and thus can be prevented from running if there are enough sync actions active in the system. - it doesn't use the HIGHPRI flag to queue at the head of the queue of work items At this point I'm not confident enough in getting all the workqueue flags and tweaks right to provide a perfectly reliable execution context for AIL pushing, which is the most important piece in XFS to make forward progress when the log fills. Revert back to use a kthread per filesystem which fixes all the above issues at the cost of having a task struct and stack around for each mounted filesystem. In addition this also gives us much better ways to diagnose any issues involving hung AIL pushing and removes a small amount of code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: force the log if we encounter pinned buffers in .iop_pushbufChristoph Hellwig
We need to check for pinned buffers even in .iop_pushbuf given that inode items flush into the same buffers that may be pinned directly due operations on the unlinked inode list operating directly on buffers. To do this add a return value to .iop_pushbuf that tells the AIL push about this and use the existing log force mechanisms to unpin it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11xfs: do not update xa_last_pushed_lsn for locked itemsChristoph Hellwig
If an item was locked we should not update xa_last_pushed_lsn and thus skip it when restarting the AIL scan as we need to be able to lock and write it out as soon as possible. Otherwise heavy lock contention might starve AIL pushing too easily, especially given the larger backoff once we moved xa_last_pushed_lsn all the way to the target lsn. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
2011-10-11Btrfs: make sure not to defrag extents past i_sizeChris Mason
The btrfs file defrag code will loop through the extents and force COW on them. But there is a concurrent truncate in the middle of the defrag, it might end up defragging the same range over and over again. The problem is that writepage won't go through and do anything on pages past i_size, so the cow won't happen, so the file will appear to still be fragmented. defrag will end up hitting the same extents again and again. In the worst case, the truncate can actually live lock with the defrag because the defrag keeps creating new ordered extents which the truncate code keeps waiting on. The fix here is to make defrag check for i_size inside the main loop, instead of just once before the looping starts. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-10-11x86: Default to vsyscall=native for nowAdrian Bunk
This UML breakage: linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790 linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790 Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add vsyscall= parameter") - the vsyscall emulation code is not fully cooked yet as UML relies on some rather fragile SIGSEGV semantics. Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default to vsyscall=native for now, this patch implements that. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Andrew Lutomirski <luto@mit.edu> Cc: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-10intel-iommu: Fix AB-BA lockdep reportRoland Dreier
When unbinding a device so that I could pass it through to a KVM VM, I got the lockdep report below. It looks like a legitimate lock ordering problem: - domain_context_mapping_one() takes iommu->lock and calls iommu_support_dev_iotlb(), which takes device_domain_lock (inside iommu->lock). - domain_remove_one_dev_info() starts by taking device_domain_lock then takes iommu->lock inside it (near the end of the function). So this is the classic AB-BA deadlock. It looks like a safe fix is to simply release device_domain_lock a bit earlier, since as far as I can tell, it doesn't protect any of the stuff accessed at the end of domain_remove_one_dev_info() anyway. BTW, the use of device_domain_lock looks a bit unsafe to me... it's at least not obvious to me why we aren't vulnerable to the race below: iommu_support_dev_iotlb() domain_remove_dev_info() lock device_domain_lock find info unlock device_domain_lock lock device_domain_lock find same info unlock device_domain_lock free_devinfo_mem(info) do stuff with info after it's free However I don't understand the locking here well enough to know if this is a real problem, let alone what the best fix is. Anyway here's the full lockdep output that prompted all of this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.39.1+ #1 ------------------------------------------------------- bash/13954 is trying to acquire lock: (&(&iommu->lock)->rlock){......}, at: [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 but task is already holding lock: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (device_domain_lock){-.-...}: [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f8350>] domain_context_mapping_one+0x600/0x750 [<ffffffff812f84df>] domain_context_mapping+0x3f/0x120 [<ffffffff812f9175>] iommu_prepare_identity_map+0x1c5/0x1e0 [<ffffffff81ccf1ca>] intel_iommu_init+0x88e/0xb5e [<ffffffff81cab204>] pci_iommu_init+0x16/0x41 [<ffffffff81002165>] do_one_initcall+0x45/0x190 [<ffffffff81ca3d3f>] kernel_init+0xe3/0x168 [<ffffffff8157ac24>] kernel_thread_helper+0x4/0x10 -> #0 (&(&iommu->lock)->rlock){......}: [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10 [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 [<ffffffff812f8b42>] device_notifier+0x72/0x90 [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0 [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0 [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0 [<ffffffff81373ccf>] device_release_driver+0x2f/0x50 [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0 [<ffffffff813724ac>] drv_attr_store+0x2c/0x30 [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170 [<ffffffff8117569e>] vfs_write+0xce/0x190 [<ffffffff811759e4>] sys_write+0x54/0xa0 [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b other info that might help us debug this: 6 locks held by bash/13954: #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff811e4464>] sysfs_write_file+0x44/0x170 #1: (s_active#3){++++.+}, at: [<ffffffff811e44ed>] sysfs_write_file+0xcd/0x170 #2: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81372edb>] driver_unbind+0x9b/0xc0 #3: (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff81373cc7>] device_release_driver+0x27/0x50 #4: (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [<ffffffff8108974f>] __blocking_notifier_call_chain+0x5f/0xb0 #5: (device_domain_lock){-.-...}, at: [<ffffffff812f6508>] domain_remove_one_dev_info+0x208/0x230 stack backtrace: Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1 Call Trace: [<ffffffff810993a7>] print_circular_bug+0xf7/0x100 [<ffffffff8109bf3e>] __lock_acquire+0x195e/0x1e10 [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff8109d57d>] ? trace_hardirqs_on_caller+0x13d/0x180 [<ffffffff8109ca9d>] lock_acquire+0x9d/0x130 [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230 [<ffffffff81571475>] _raw_spin_lock_irqsave+0x55/0xa0 [<ffffffff812f6421>] ? domain_remove_one_dev_info+0x121/0x230 [<ffffffff810972bd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff812f6421>] domain_remove_one_dev_info+0x121/0x230 [<ffffffff812f8b42>] device_notifier+0x72/0x90 [<ffffffff8157555c>] notifier_call_chain+0x8c/0xc0 [<ffffffff81089768>] __blocking_notifier_call_chain+0x78/0xb0 [<ffffffff810897b6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff81373a5c>] __device_release_driver+0xbc/0xe0 [<ffffffff81373ccf>] device_release_driver+0x2f/0x50 [<ffffffff81372ee3>] driver_unbind+0xa3/0xc0 [<ffffffff813724ac>] drv_attr_store+0x2c/0x30 [<ffffffff811e4506>] sysfs_write_file+0xe6/0x170 [<ffffffff8117569e>] vfs_write+0xce/0x190 [<ffffffff811759e4>] sys_write+0x54/0xa0 [<ffffffff81579a82>] system_call_fastpath+0x16/0x1b Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>