summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-04-03ARM: 7685/1: delay: use private ticks_per_jiffy field for timer-based delay opsWill Deacon
Commit 70264367a243 ("ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock") fixed a problem with our timer-based delay loop, where loops_per_jiffy is scaled by cpufreq yet used directly by the timer delay ops. This patch fixes the problem in a more elegant way by keeping a private ticks_per_jiffy field in the delay ops, independent of loops_per_jiffy and therefore not subject to scaling. The loop-based delay continues to use loops_per_jiffy directly, as it should. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB ↵Catalin Marinas
operations) On Cortex-A15 (r0p0..r3p2) the TLBI/DSB are not adequately shooting down all use of the old entries. This patch implements the erratum workaround which consists of: 1. Dummy TLBIMVAIS and DSB on the CPU doing the TLBI operation. 2. Send IPI to the CPUs that are running the same mm (and ASID) as the one being invalidated (or all the online CPUs for global pages). 3. CPU receiving the IPI executes a DMB and CLREX (part of the exception return code already). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03ARM: 7682/1: cache-l2x0: fix masking of RTL revision numbering and set_debug ↵Rob Herring
init Commit b8db6b8 (ARM: 7547/4: cache-l2x0: add support for Aurora L2 cache ctrl) moved the masking of the part ID which caused the RTL version to be lost. Commit 6248d06 (ARM: 7545/1: cache-l2x0: make outer_cache_fns a field of l2x0_of_data) changed how .set_debug is initialized. Both commits break commit 74ddcdb (ARM: 7608/1: l2x0: Only set .set_debug on PL310 r3p0 and earlier) which uses the RTL version to conditionally set .set_debug function pointer. Commit b8db6b8 also caused the printed cache ID to be missing the version information. Fix this by reverting how the part number is masked so the RTL version info is maintained. The cache-id-part DT property does not set the RTL bits so masking them should have no effect. Also, re-arrange the order of the function pointer init so the .set_debug function can be overridden. Reported-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: Gregory CLEMENT <gregory.clement@free-electrons.com> Cc: Yehuda Yitschak <yehuday@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-04-03ARM: iWMMXt: always enable iWMMXt support with PJ4 CPUsRussell King
Jason Cooper reports these build errors: arch/arm/kernel/built-in.o: In function `iwmmxt_do': /.../arch/arm/kernel/pj4-cp0.c:36: undefined reference to `iwmmxt_task_release' /.../arch/arm/kernel/pj4-cp0.c:40: undefined reference to `iwmmxt_task_switch' make: *** [vmlinux] Error 1 This is caused because the PJ4 code explicitly references the iWMMXt code, but doesn't require it to be built. Fix this by ensuring that iWMMXt is always enabled with PJ4. Reported-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22ARM: 7681/1: hw_breakpoint: use warn_once to avoid spam from reset_ctrl_regs()Santosh Shilimkar
CPU debug features like hardware break, watchpoints can be used only when the debug mode is enabled and available. Unfortunately on OMAP4 based devices, after a CPU power cycle, the debug feature gets disabled which leads to a flood of messages coming from reset_ctrl_regs() which gets called on every CPU_PM_EXIT with CPUidle enabled. So make use of warn_once() so that system is usable. Thanks to Will for pointers and Lokesh for the analysis of the issue. Tested-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22ARM: 7678/1: Work around faulty ISAR0 register in some Krait CPUsStepan Moskovchenko
Some early versions of the Krait CPU design incorrectly indicate that they only support the UDIV and SDIV instructions in Thumb mode when they actually support them in ARM and Thumb mode. It seems that these CPUs follow the DDI0406B ARM ARM which has two possible values for the divide instructions field, instead of the DDI0406C document which has three possible values. Work around this problem by checking the MIDR against Krait CPUs with this faulty ISAR0 register and force the hwcaps to indicate support in both modes. [sboyd: Rewrote commit text to reflect real reasoning now that we autodetect udiv/sdiv] Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22ARM: 7680/1: Detect support for SDIV/UDIV from ISAR0 registerStephen Boyd
The ISAR0 register indicates support for the SDIV and UDIV instructions in both the Thumb and ARM instruction set. Read the register to detect the supported instructions and update the elf_hwcap mask as appropriate. This is better than adding more and more cpuid checks in proc-v7.S for each new cpu variant that supports these instructions. Acked-by: Will Deacon <will.deacon@arm.com> Cc: Stepan Moskovchenko <stepanm@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22ARM: 7679/1: Clear IDIVT hwcap if CONFIG_ARM_THUMB=nStephen Boyd
Don't advertise support for the SDIV/UDIV thumb instructions if the kernel is not compiled with support for thumb userspace. This is in line with how we remove the THUMB hwcap in these configurations. Acked-by: Will Deacon <will.deacon@arm.com> Cc: Stepan Moskovchenko <stepanm@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22ARM: 7677/1: LPAE: Fix mapping in alloc_init_section for unaligned addressesSricharan R
With LPAE enabled, alloc_init_section() does not map the entire address space for unaligned addresses. The issue also reproduced with CMA + LPAE. CMA tries to map 16MB with page granularity mappings during boot. alloc_init_pte() is called and out of 16MB, only 2MB gets mapped and rest remains unaccessible. Because of this OMAP5 boot is broken with CMA + LPAE enabled. Fix the issue by ensuring that the entire addresses are mapped. Signed-off-by: R Sricharan <r.sricharan@ti.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <chris@cloudcar.com> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Laura Abbott <lauraa@codeaurora.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Christoffer Dall <chris@cloudcar.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-03-22Merge branch 'kvm-arm/vgic-fixes' of ↵Russell King
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into fixes
2013-03-22Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull CIFS fixes from Steve French: "Three small CIFS Fixes (the most important of the three fixes a recent problem authenticating to Windows 8 using cifs rather than SMB2)" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: ignore everything in SPNEGO blob after mechTypes cifs: delay super block destruction until all cifsFileInfo objects are gone cifs: map NT_STATUS_SHARING_VIOLATION to EBUSY instead of ETXTBSY
2013-03-22Merge tag 'ext4_for_linue' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix a number of regression and other bugs in ext4, most of which were relatively obscure cornercases or races that were found using regression tests." * tag 'ext4_for_linue' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits) ext4: fix data=journal fast mount/umount hang ext4: fix ext4_evict_inode() racing against workqueue processing code ext4: fix memory leakage in mext_check_coverage ext4: use s_extent_max_zeroout_kb value as number of kb ext4: use atomic64_t for the per-flexbg free_clusters count jbd2: fix use after free in jbd2_journal_dirty_metadata() ext4: reserve metadata block for every delayed write ext4: update reserved space after the 'correction' ext4: do not use yield() ext4: remove unused variable in ext4_free_blocks() ext4: fix WARN_ON from ext4_releasepage() ext4: fix the wrong number of the allocated blocks in ext4_split_extent() ext4: update extent status tree after an extent is zeroed out ext4: fix wrong m_len value after unwritten extent conversion ext4: add self-testing infrastructure to do a sanity check ext4: avoid a potential overflow in ext4_es_can_be_merged() ext4: invalidate extent status tree during extent migration ext4: remove unnecessary wait for extent conversion in ext4_fallocate() ext4: add warning to ext4_convert_unwritten_extents_endio ext4: disable merging of uninitialized extents ...
2013-03-21cifs: ignore everything in SPNEGO blob after mechTypesJeff Layton
We've had several reports of people attempting to mount Windows 8 shares and getting failures with a return code of -EINVAL. The default sec= mode changed recently to sec=ntlmssp. With that, we expect and parse a SPNEGO blob from the server in the NEGOTIATE reply. The current decode_negTokenInit function first parses all of the mechTypes and then tries to parse the rest of the negTokenInit reply. The parser however currently expects a mechListMIC or nothing to follow the mechTypes, but Windows 8 puts a mechToken field there instead to carry some info for the new NegoEx stuff. In practice, we don't do anything with the fields after the mechTypes anyway so I don't see any real benefit in continuing to parse them. This patch just has the kernel ignore the fields after the mechTypes. We'll probably need to reinstate some of this if we ever want to support NegoEx. Reported-by: Jason Burgess <jason@jacknife2.dns2go.com> Reported-by: Yan Li <elliot.li.tech@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
2013-03-21Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal management fixes from Zhang Rui. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: exynos_thermal: return a proper error code while thermal_zone_device_register fail. thermal: rcar_thermal: propagate return value of thermal_zone_device_register Thermal: kirkwood: Convert to devm_ioremap_resource() Thermal: rcar: Convert to devm_ioremap_resource() Thermal: dove: Convert to devm_ioremap_resource() thermal: rcar: fix missing unlock on error in rcar_thermal_update_temp()
2013-03-21Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "A fair chunk of the linecount comes from a fix for a tracing bug that corrupts latency tracing buffers when the overwrite mode is changed on the fly - the rest is mostly assorted fewliner fixlets." * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Add SNB/SNB-EP scheduling constraints for cycle_activity event kprobes/x86: Check Interrupt Flag modifier when registering probe kprobes: Make hash_64() as always inlined perf: Generate EXIT event only once per task context perf: Reset hwc->last_period on sw clock events tracing: Prevent buffer overwrite disabled for latency tracers tracing: Keep overwrite in sync between regular and snapshot buffers tracing: Protect tracer flags with trace_types_lock perf tools: Fix LIBNUMA build with glibc 2.12 and older. tracing: Fix free of probe entry by calling call_rcu_sched() perf/POWER7: Create a sysfs format entry for Power7 events perf probe: Fix segfault libtraceevent: Remove hard coded include to /usr/local/include in Makefile perf record: Fix -C option perf tools: check if -DFORTIFY_SOURCE=2 is allowed perf report: Fix build with NO_NEWT=1 perf annotate: Fix build with NO_NEWT=1 tracing: Fix race in snapshot swapping
2013-03-21Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie: "Radeon, intel and nouveau, along with one mgag200 fix - intel fix for an ioctl overflow, along with a regression fix for some phantom irqs on Ironlake. - nouveau has a lockdep warning and a bunch of thermal fixes - radeon has new pci ids and some minor fixes." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (26 commits) drm/mgag200: Bug fix: Modified pll algorithm for EH project drm/i915: stop using GMBUS IRQs on Gen4 chips drm/nv50/kms: prevent lockdep false-positive in page flipping path drm/nouveau/core: fix return value of nouveau_object_del() MAINTAINERS: intel-gfx is no longer subscribers-only drm/i915: Use the fixed pixel clock for eDP in intel_dp_set_m_n() drm/nouveau/hwmon: do not expose a buggy temperature if it is unavailable drm/nouveau/therm: display the availability of the internal sensor drm/nouveau/therm: disable temperature management if the sensor isn't readable drm/nouveau/therm: disable auto fan management if temperature is not available drm/nv40/therm: reserve negative temperatures for errors drm/nv40/therm: disable temperature reading if the bios misses some parameters drm/nouveau/therm-ic: the temperature is off by sensor_constant, warn the user drm/nouveau/therm: remove some confusion introduced by therm_mode drm/nouveau/therm: do not make assumptions on temperature drm/nv40/therm: increase the sensor's settling delay to 20ms drm/nv40/therm: improve selection between the old and the new style Revert "drm/i915: try to train DP even harder" drm/radeon: add Richland pci ids drm/radeon: add support for Richland APUs ...
2013-03-21Merge tag 'dm-3.9-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull device-mapper fixes from Alasdair G Kergon: "Fix reported data loss with discards and thin snapshots; avoid a deadlock observed in dm verity; fix a race in the new dm cache code along with some other minor bugs; store the cache policy version on disk to make the stored hints format future-proof." * tag 'dm-3.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm cache: policy ignore hints if generated by different version dm cache: policy change version from string to integer set dm cache: fix race in writethrough implementation dm cache: metadata clear dirty bits on clean shutdown dm cache: avoid calling policy destructor twice on error dm cache: detect cache_create failure dm cache: avoid 64 bit division on 32 bit dm verity: avoid deadlock dm thin: fix non power of two discard granularity calc dm thin: fix discard corruption
2013-03-21Merge branch 'drm-intel-fixes' of ↵Dave Airlie
git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Bunch of fixes, all pretty high-priority - Fix execbuf argument checking (Kees Cook) - Optionally obfuscate kernel addresses in dumps (Kees Cook) - Two patches from Takashi Iwai to fix DP link training regressions he's seen. - intel-gfx is no longer subscribers-only (well, just no longer moderated in an annoying way for non-subscribers), update MAINTAINERS - gm45 gmbus irq fallout fix (Jiri Kosina) * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: stop using GMBUS IRQs on Gen4 chips MAINTAINERS: intel-gfx is no longer subscribers-only drm/i915: Use the fixed pixel clock for eDP in intel_dp_set_m_n() Revert "drm/i915: try to train DP even harder" drm/i915: bounds check execbuffer relocation count drm/i915: restrict kernel address leak in debugfs
2013-03-21drm/mgag200: Bug fix: Modified pll algorithm for EH projectJulia Lemire
While testing the mgag200 kms driver on the HP ProLiant Gen8, a bug was seen. Once the bootloader would load the selected kernel, the screen would go black. At first it was assumed that the mgag200 kms driver was hanging. But after setting up the grub serial output, it was seen that the driver was being loaded properly. After trying serval monitors, one finaly displayed the message "Frequency Out of Range". By comparing the kms pll algorithm with the previous mgag200 xorg driver pll algorithm, discrepencies were found. Once the kms pll algorithm was modified, the expected pll values were produced. This fix was tested on several monitors of varying native resolutions. Signed-off-by: Julia Lemire <jlemire@matrox.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-03-20dm cache: policy ignore hints if generated by different versionMike Snitzer
When reading the dm cache metadata from disk, ignore the policy hints unless they were generated by the same major version number of the same policy module. The hints are considered to be private data belonging to the specific module that generated them and there is no requirement for them to make sense to different versions of the policy that generated them. Policy modules are all required to work fine if no previous hints are supplied (or if existing hints are lost). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: policy change version from string to integer setMike Snitzer
Separate dm cache policy version string into 3 unsigned numbers corresponding to major, minor and patchlevel and store them at the end of the on-disk metadata so we know which version of the policy generated the hints in case a future version wants to use them differently. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: fix race in writethrough implementationJoe Thornber
We have found a race in the optimisation used in the dm cache writethrough implementation. Currently, dm core sends the cache target two bios, one for the origin device and one for the cache device and these are processed in parallel. This patch avoids the race by changing the code back to a simpler (slower) implementation which processes the two writes in series, one after the other, until we can develop a complete fix for the problem. When the cache is in writethrough mode it needs to send WRITE bios to both the origin and cache devices. Previously we've been implementing this by having dm core query the cache target on every write to find out how many copies of the bio it wants. The cache will ask for two bios if the block is in the cache, and one otherwise. Then main problem with this is it's racey. At the time this check is made the bio hasn't yet been submitted and so isn't being taken into account when quiescing a block for migration (promotion or demotion). This means a single bio may be submitted when two were needed because the block has since been promoted to the cache (catastrophic), or two bios where only one is needed (harmless). I really don't want to start entering bios into the quiescing system (deferred_set) in the get_num_write_bios callback. Instead this patch simplifies things; only one bio is submitted by the core, this is first written to the origin and then the cache device in series. Obviously this will have a latency impact. deferred_writethrough_bios is introduced to record bios that must be later issued to the cache device from the worker thread. This deferred submission, after the origin bio completes, is required given that we're in interrupt context (writethrough_endio). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: metadata clear dirty bits on clean shutdownJoe Thornber
When writing the dirty bitset to the metadata device on a clean shutdown, clear the dirty bits. Previously they were left indicating the cache was dirty. This led to confusion about whether there really was dirty data in the cache or not. (This was a harmless bug.) Reported-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: avoid calling policy destructor twice on errorHeinz Mauelshagen
If the cache policy's config values are not able to be set we must set the policy to NULL after destroying it in create_cache_policy() so we don't attempt to destroy it a second time later. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: detect cache_create failureHeinz Mauelshagen
Return error if cache_create() fails. A missing return check made cache_ctr continue even after an error in cache_create() resulting in the cache object being destroyed. So a simple failure like an odd number of cache policy config value arguments would result in an oops. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm cache: avoid 64 bit division on 32 bitJoe Thornber
Squash various 32bit link errors. >> on i386: >> drivers/built-in.o: In function `is_discarded_oblock': >> dm-cache-target.c:(.text+0x1ea28e): undefined reference to `__udivdi3' ... Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm verity: avoid deadlockMikulas Patocka
A deadlock was found in the prefetch code in the dm verity map function. This patch fixes this by transferring the prefetch to a worker thread and skipping it completely if kmalloc fails. If generic_make_request is called recursively, it queues the I/O request on the current->bio_list without making the I/O request and returns. The routine making the recursive call cannot wait for the I/O to complete. The deadlock occurs when one thread grabs the bufio_client mutex and waits for an I/O to complete but the I/O is queued on another thread's current->bio_list and is waiting to get the mutex held by the first thread. The fix recognises that prefetching is not essential. If memory can be allocated, it queues the prefetch request to the worker thread, but if not, it does nothing. Signed-off-by: Paul Taysom <taysom@chromium.org> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: stable@kernel.org
2013-03-20dm thin: fix non power of two discard granularity calcJoe Thornber
Fix a discard granularity calculation to work for non power of 2 block sizes. In order for thinp to passdown discard bios to the underlying data device, the data device must have a discard granularity that is a factor of the thinp block size. Originally this check was done by using bitops since the block_size was known to be a power of two. Introduced by commit f13945d75730081830b6f3360266950e2b7c9067 ("dm thin: support a non power of 2 discard_granularity"). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20dm thin: fix discard corruptionJoe Thornber
Fix a bug in dm_btree_remove that could leave leaf values with incorrect reference counts. The effect of this was that removal of a shared block could result in the space maps thinking the block was no longer used. More concretely, if you have a thin device and a snapshot of it, sending a discard to a shared region of the thin could corrupt the snapshot. Thinp uses a 2-level nested btree to store it's mappings. This first level is indexed by thin device, and the second level by logical block. Often when we're removing an entry in this mapping tree we need to rebalance nodes, which can involve shadowing them, possibly creating a copy if the block is shared. If we do create a copy then children of that node need to have their reference counts incremented. In this way reference counts percolate down the tree as shared trees diverge. The rebalance functions were incrementing the children at the appropriate time, but they were always assuming the children were internal nodes. This meant the leaf values (in our case packed block/flags entries) were not being incremented. Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2013-03-20ext4: fix data=journal fast mount/umount hangTheodore Ts'o
In data=journal mode, if we unmount the file system before a transaction has a chance to complete, when the journal inode is being evicted, we can end up calling into jbd2_log_wait_commit() for the last transaction, after the journalling machinery has been shut down. Arguably we should adjust ext4_should_journal_data() to return FALSE for the journal inode, but the only place it matters is ext4_evict_inode(), and so to save a bit of CPU time, and to make the patch much more obviously correct by inspection(tm), we'll fix it by explicitly not trying to waiting for a journal commit when we are evicting the journal inode, since it's guaranteed to never succeed in this case. This can be easily replicated via: mount -t ext4 -o data=journal /dev/vdb /vdb ; umount /vdb ------------[ cut here ]------------ WARNING: at /usr/projects/linux/ext4/fs/jbd2/journal.c:542 __jbd2_log_start_commit+0xba/0xcd() Hardware name: Bochs JBD2: bad log_start_commit: 3005630206 3005630206 0 0 Modules linked in: Pid: 2909, comm: umount Not tainted 3.8.0-rc3 #1020 Call Trace: [<c015c0ef>] warn_slowpath_common+0x68/0x7d [<c02b7e7d>] ? __jbd2_log_start_commit+0xba/0xcd [<c015c177>] warn_slowpath_fmt+0x2b/0x2f [<c02b7e7d>] __jbd2_log_start_commit+0xba/0xcd [<c02b8075>] jbd2_log_start_commit+0x24/0x34 [<c0279ed5>] ext4_evict_inode+0x71/0x2e3 [<c021f0ec>] evict+0x94/0x135 [<c021f9aa>] iput+0x10a/0x110 [<c02b7836>] jbd2_journal_destroy+0x190/0x1ce [<c0175284>] ? bit_waitqueue+0x50/0x50 [<c028d23f>] ext4_put_super+0x52/0x294 [<c020efe3>] generic_shutdown_super+0x48/0xb4 [<c020f071>] kill_block_super+0x22/0x60 [<c020f3e0>] deactivate_locked_super+0x22/0x49 [<c020f5d6>] deactivate_super+0x30/0x33 [<c0222795>] mntput_no_expire+0x107/0x10c [<c02233a7>] sys_umount+0x2cf/0x2e0 [<c02233ca>] sys_oldumount+0x12/0x14 [<c08096b8>] syscall_call+0x7/0xb ---[ end trace 6a954cc790501c1f ]--- jbd2_log_wait_commit: error: j_commit_request=-1289337090, tid=0 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
2013-03-20ext4: fix ext4_evict_inode() racing against workqueue processing codeTheodore Ts'o
Commit 84c17543ab56 (ext4: move work from io_end to inode) triggered a regression when running xfstest #270 when the file system is mounted with dioread_nolock. The problem is that after ext4_evict_inode() calls ext4_ioend_wait(), this guarantees that last io_end structure has been freed, but it does not guarantee that the workqueue structure, which was moved into the inode by commit 84c17543ab56, is actually finished. Once ext4_flush_completed_IO() calls ext4_free_io_end() on CPU #1, this will allow ext4_ioend_wait() to return on CPU #2, at which point the evict_inode() codepath can race against the workqueue code on CPU #1 accessing EXT4_I(inode)->i_unwritten_work to find the next item of work to do. Fix this by calling cancel_work_sync() in ext4_ioend_wait(), which will be renamed ext4_ioend_shutdown(), since it is only used by ext4_evict_inode(). Also, move the call to ext4_ioend_shutdown() until after truncate_inode_pages() and filemap_write_and_wait() are called, to make sure all dirty pages have been written back and flushed from the page cache first. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e *pdpt = 0000000030bc3001 *pde = 0000000000000000 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: Pid: 6, comm: kworker/u:0 Not tainted 3.8.0-rc3-00013-g84c1754-dirty #91 Bochs Bochs EIP: 0060:[<c01dda6a>] EFLAGS: 00010046 CPU: 0 EIP is at cwq_activate_delayed_work+0x3b/0x7e EAX: 00000000 EBX: 00000000 ECX: f505fe54 EDX: 00000000 ESI: ed5b697c EDI: 00000006 EBP: f64b7e8c ESP: f64b7e84 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: 00000000 CR3: 30bc2000 CR4: 000006f0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process kworker/u:0 (pid: 6, ti=f64b6000 task=f64b4160 task.ti=f64b6000) Stack: f505fe00 00000006 f64b7e9c c01de3d7 f6435540 00000003 f64b7efc c01def1d f6435540 00000002 00000000 0000008a c16d0808 c040a10b c16d07d8 c16d08b0 f505fe00 c16d0780 00000000 00000000 ee153df4 c1ce4a30 c17d0e30 00000000 Call Trace: [<c01de3d7>] cwq_dec_nr_in_flight+0x71/0xfb [<c01def1d>] process_one_work+0x5d8/0x637 [<c040a10b>] ? ext4_end_bio+0x300/0x300 [<c01e3105>] worker_thread+0x249/0x3ef [<c01ea317>] kthread+0xd8/0xeb [<c01e2ebc>] ? manage_workers+0x4bb/0x4bb [<c023a370>] ? trace_hardirqs_on+0x27/0x37 [<c0f1b4b7>] ret_from_kernel_thread+0x1b/0x28 [<c01ea23f>] ? __init_kthread_worker+0x71/0x71 Code: 01 83 15 ac ff 6c c1 00 31 db 89 c6 8b 00 a8 04 74 12 89 c3 30 db 83 05 b0 ff 6c c1 01 83 15 b4 ff 6c c1 00 89 f0 e8 42 ff ff ff <8b> 13 89 f0 83 05 b8 ff 6c c1 6c c1 00 31 c9 83 EIP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e SS:ESP 0068:f64b7e84 CR2: 0000000000000000 ---[ end trace a1923229da53d8a4 ]--- Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz>
2013-03-20Merge branch 'drm-fixes-3.9' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next Alex writes: "Mostly just small bug fixes. Big change is new pci ids for Richland APUs." * 'drm-fixes-3.9' of git://people.freedesktop.org/~agd5f/linux: drm/radeon: add Richland pci ids drm/radeon: add support for Richland APUs drm/radeon/benchmark: allow same domains for dma copy drm/radeon/benchmark: make sure bo blit copy exists before using it drm/radeon: fix backend map setup on 1 RB trinity boards drm/radeon: fix S/R on VM systems (cayman/TN/SI)
2013-03-20Merge branch 'drm-nouveau-fixes-3.9' of ↵Dave Airlie
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next Lots of thermal fixes and fix a lockdep warning we've been seeing. * 'drm-nouveau-fixes-3.9' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: drm/nv50/kms: prevent lockdep false-positive in page flipping path drm/nouveau/core: fix return value of nouveau_object_del() drm/nouveau/hwmon: do not expose a buggy temperature if it is unavailable drm/nouveau/therm: display the availability of the internal sensor drm/nouveau/therm: disable temperature management if the sensor isn't readable drm/nouveau/therm: disable auto fan management if temperature is not available drm/nv40/therm: reserve negative temperatures for errors drm/nv40/therm: disable temperature reading if the bios misses some parameters drm/nouveau/therm-ic: the temperature is off by sensor_constant, warn the user drm/nouveau/therm: remove some confusion introduced by therm_mode drm/nouveau/therm: do not make assumptions on temperature drm/nv40/therm: increase the sensor's settling delay to 20ms drm/nv40/therm: improve selection between the old and the new style
2013-03-20Merge tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson. * tag 'vfio-v3.9-rc4' of git://github.com/awilliam/linux-vfio: vfio: include <linux/slab.h> for kmalloc
2013-03-20Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Marcelo Tosatti. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) KVM: x86: fix deadlock in clock-in-progress request handling KVM: allow host header to be included even for !CONFIG_KVM
2013-03-19drm/i915: stop using GMBUS IRQs on Gen4 chipsJiri Kosina
Commit 28c70f162 ("drm/i915: use the gmbus irq for waits") switched to using GMBUS irqs instead of GPIO bit-banging for chipset generations 4 and above. It turns out though that on many systems this leads to spurious interrupts being generated, long after the register write to disable the IRQs has been issued. Typically this results in the spurious interrupt source getting disabled: [ 9.636345] irq 16: nobody cared (try booting with the "irqpoll" option) [ 9.637915] Pid: 4157, comm: ifup Tainted: GF 3.9.0-rc2-00341-g0863702 #422 [ 9.639484] Call Trace: [ 9.640731] <IRQ> [<ffffffff8109b40d>] __report_bad_irq+0x1d/0xc7 [ 9.640731] [<ffffffff8109b7db>] note_interrupt+0x15b/0x1e8 [ 9.640731] [<ffffffff810999f7>] handle_irq_event_percpu+0x1bf/0x214 [ 9.640731] [<ffffffff81099a88>] handle_irq_event+0x3c/0x5c [ 9.640731] [<ffffffff8109c139>] handle_fasteoi_irq+0x7a/0xb0 [ 9.640731] [<ffffffff8100400e>] handle_irq+0x1a/0x24 [ 9.640731] [<ffffffff81003d17>] do_IRQ+0x48/0xaf [ 9.640731] [<ffffffff8142f1ea>] common_interrupt+0x6a/0x6a [ 9.640731] <EOI> [<ffffffff8142f952>] ? system_call_fastpath+0x16/0x1b [ 9.640731] handlers: [ 9.640731] [<ffffffffa000d771>] usb_hcd_irq [usbcore] [ 9.640731] [<ffffffffa0306189>] yenta_interrupt [yenta_socket] [ 9.640731] Disabling IRQ #16 The really curious thing is now that irq 16 is _not_ the interrupt for the i915 driver when using MSI, but it _is_ the interrupt when not using MSI. So by all indications it seems like gmbus is able to generate a legacy (shared) interrupt in MSI mode on some configurations. I've tried to reproduce this and the differentiating thing seems to be that on unaffected systems no other device uses irq 16 (which seems to be the non-MSI intel gfx interrupt on all gm45). I have no idea how that even can happen. To avoid tempting this elephant into a rage, just disable gmbus interrupt support on gen 4. v2: Improve the commit message with exact details of what's going on. Also add a comment in the code to warn against this particular elephant in the room. v3: Move the comment explaing how gen4 blows up next to the definition of HAS_GMBUS_IRQ to keep the code-flow straight. Suggested by Chris Wilson. Signed-off-by: Jiri Kosina <jkosina@suse.cz> (v1) Acked-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://lkml.org/lkml/2013/3/8/325 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-03-19Merge tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull XFS fixes from Ben Myers: - Fix for a potential infinite loop which was introduced in commit 4d559a3bcb73 ("xfs: limit speculative prealloc near ENOSPC thresholds") - Fix for the return type of xfs_iomap_eof_prealloc_initial_size from commit a1e16c26660b ("xfs: limit speculative prealloc size on sparse files") - Fix for a failed buffer readahead causing subsequent callers to fail incorrectly * tag 'for-linus-v3.9-rc4' of git://oss.sgi.com/xfs/xfs: xfs: ensure we capture IO errors correctly xfs: fix xfs_iomap_eof_prealloc_initial_size type xfs: fix potential infinite loop in xfs_iomap_prealloc_size()
2013-03-19PCI: Use ROM images from firmware only if no other ROM source availableMatthew Garrett
Mantas Mikulėnas reported that his graphics hardware failed to initialise after commit f9a37be0f02a ("x86: Use PCI setup data"). The aim of this commit was to ensure that ROM images were available on some Apple systems that don't expose the GPU ROM via any other source. In this case, UEFI appears to have provided a broken ROM image that we were using even though there was a perfectly valid ROM available via other sources. The simplest way to handle this seems to be to just re-order pci_map_rom() and leave any firmare-supplied ROM to last. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Tested-by: Mantas Mikulėnas <grawity@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc fixes from David Miller: "Just some minor fixups, a sunsu console setup panic cure, and recognition of a Fujitsu sun4v cpu." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: remove unused "config BITS" sparc: delete "if !ULTRA_HAS_POPULATION_COUNT" sparc64: correctly recognize SPARC64-X chips sparc,leon: fix GRPCI2 device0 PCI config space access sunsu: Fix panic in case of nonexistent port at "console=ttySY" cmdline option
2013-03-19Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 fixes from Catalin Marinas: - Fix !SMP build error. - Fix padding computation in struct ucontext (no ABI change). - Minor clean-up after the signal patches (unused var). - Two old Kconfig options clean-up. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED arm64: fix padding computation in struct ucontext arm64: Fix build error with !SMP arm64: Removed unused variable in compat_setup_rt_frame()
2013-03-19sparc: remove unused "config BITS"Paul Bolle
sparc's asm/module.h got removed in commit 786d35d45cc40b2a51a18f73e14e135d47fdced7 ("Make most arch asm/module.h files use asm-generic/module.h"). That removed the only two uses of this Kconfig symbol. So we can remove its entry too. > >From arch/sparc/Makefile: > ifeq ($(CONFIG_SPARC32),y) > [...] > > [...] > export BITS := 32 > [...] > > else > [...] > > [...] > export BITS := 64 > [...] > > So $(BITS) is set depending on whether CONFIG_SPARC32 is set or not. > Using $(BITS) in sparc's Makefiles is not using CONFIG_BITS. That > doesn't count as usage of "config BITS". Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang. 2) Insufficient space reserved for bridge netlink values, fix from Stephen Hemminger. 3) Some dst_neigh_lookup*() callers don't interpret error pointer correctly, fix from Zhouyi Zhou. 4) Fix transport match in SCTP active_path loops, from Xugeng Zhang. 5) Fix qeth driver handling of multi-order SKB frags, from Frank Blaschka. 6) fec driver is missing napi_disable() call, resulting in crashes on unload, from Georg Hofmann. 7) Don't try to handle PMTU events on a listening socket, fix from Eric Dumazet. 8) Fix timestamp location calculations in IP option processing, from David Ward. 9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig tests, from Denis V Lunev. 10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings. 11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd Bergmann. 12) bnx2x statistics don't handle 4GB rollover correctly, fix from Maciej Żenczykowski. 13) Openvswitch bug fixes for vport del/new error reporting, missing genlmsg_end() call in netlink processing, and mis-parsing of LLC/SNAP ethernet types. From Rich Lane. 14) SKB pfmemalloc state should only be propagated from the head page of a compound page, fix from Pavel Emelyanov. 15) Fix link handling in tg3 driver for 5715 chips when autonegotation is disabled. From Nithin Sujir. 16) Fix inverted test of cpdma_check_free_tx_desc return value in davinci_emac driver, from Mugunthan V N. 17) vlan_depth is incorrectly calculated in skb_network_protocol(), from Li RongQing. 18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM device mode backwards compat in cdc_ncm driver. From Bjørn Mork. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) inet: limit length of fragment queue hash table bucket lists qeth: Fix scatter-gather regression qeth: Fix invalid router settings handling qeth: delay feature trace tcp: dont handle MTU reduction on LISTEN socket bnx2x: fix occasional statistics off-by-4GB error vhost/net: fix heads usage of ubuf_info bridge: Add support for setting BR_ROOT_BLOCK flag. bnx2x: add missing napi deletion in error path drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc() ethernet/tulip: DE4x5 needs VIRT_TO_BUS isdn: hisax: netjet requires VIRT_TO_BUS net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility rtnetlink: Mask the rta_type when range checking Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally" Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug smsc75xx: configuration help incorrectly mentions smsc95xx net: fec: fix missing napi_disable call net: fec: restart the FEC when PHY speed changes skb: Propagate pfmemalloc on skb from head page only ...
2013-03-19sparc: delete "if !ULTRA_HAS_POPULATION_COUNT"Paul Bolle
Commit 2d78d4beb64eb07d50665432867971c481192ebf ("[PATCH] bitops: sparc64: use generic bitops") made the default of GENERIC_HWEIGHT depend on !ULTRA_HAS_POPULATION_COUNT. But since there's no Kconfig symbol with that name, this always evaluates to true. Delete this dependency. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-19KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)Andy Honig
If the guest specifies a IOAPIC_REG_SELECT with an invalid value and follows that with a read of the IOAPIC_REG_WINDOW KVM does not properly validate that request. ioapic_read_indirect contains an ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in non-debug builds. In recent kernels this allows a guest to cause a kernel oops by reading invalid memory. In older kernels (pre-3.3) this allows a guest to read from large ranges of host memory. Tested: tested against apic unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-19KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions ↵Andy Honig
(CVE-2013-1797) There is a potential use after free issue with the handling of MSR_KVM_SYSTEM_TIME. If the guest specifies a GPA in a movable or removable memory such as frame buffers then KVM might continue to write to that address even after it's removed via KVM_SET_USER_MEMORY_REGION. KVM pins the page in memory so it's unlikely to cause an issue, but if the user space component re-purposes the memory previously used for the guest, then the guest will be able to corrupt that memory. Tested: Tested against kvmclock unit test Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-19KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME ↵Andy Honig
(CVE-2013-1796) If the guest sets the GPA of the time_page so that the request to update the time straddles a page then KVM will write onto an incorrect page. The write is done byusing kmap atomic to get a pointer to the page for the time structure and then performing a memcpy to that page starting at an offset that the guest controls. Well behaved guests always provide a 32-byte aligned address, however a malicious guest could use this to corrupt host kernel memory. Tested: Tested against kvmclock unit test. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-19arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORSPaul Bolle
The Kconfig entry for DEBUG_ERRORS is a verbatim copy of the former arm entry for that symbol. It got removed in v2.6.39 because it wasn't actually used anywhere. There are still no users of DEBUG_ERRORS so remove this entry too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> [catalin.marinas@arm.com: removed option from defconfig] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-03-19arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATEDPaul Bolle
Config option GENERIC_HARDIRQS_NO_DEPRECATED was removed in commit 78c89825649a9a5ed526c507603196f467d781a5 ("genirq: Remove the now obsolete config options and select statements"), but the select was accidentally reintroduced in commit 8c2c3df31e3b87cb5348e48776c366ebd1dc5a7a ("arm64: Build infrastructure"). Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2013-03-19inet: limit length of fragment queue hash table bucket listsHannes Frederic Sowa
This patch introduces a constant limit of the fragment queue hash table bucket list lengths. Currently the limit 128 is choosen somewhat arbitrary and just ensures that we can fill up the fragment cache with empty packets up to the default ip_frag_high_thresh limits. It should just protect from list iteration eating considerable amounts of cpu. If we reach the maximum length in one hash bucket a warning is printed. This is implemented on the caller side of inet_frag_find to distinguish between the different users of inet_fragment.c. I dropped the out of memory warning in the ipv4 fragment lookup path, because we already get a warning by the slab allocator. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jesper Dangaard Brouer <jbrouer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-19qeth: Fix scatter-gather regressionFrank Blaschka
This patch fixes a scatter-gather regression introduced with commit 5640f768 net: use a per task frag allocator Now the qeth driver can cope with bigger framents and split a fragment in sub framents if required. Signed-off-by: Frank Blaschka <blaschka@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>