summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2015-04-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.1 The most interesting bit here is irqfd/ioeventfd support for ARM and ARM64. Summary: ARM/ARM64: fixes for live migration, irqfd and ioeventfd support (enabling vhost, too), page aging s390: interrupt handling rework, allowing to inject all local interrupts via new ioctl and to get/set the full local irq state for migration and introspection. New ioctls to access memory by virtual address, and to get/set the guest storage keys. SIMD support. MIPS: FPU and MIPS SIMD Architecture (MSA) support. Includes some patches from Ralf Baechle's MIPS tree. x86: bugfixes (notably for pvclock, the others are small) and cleanups. Another small latency improvement for the TSC deadline timer" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits) KVM: use slowpath for cross page cached accesses kvm: mmu: lazy collapse small sptes into large sptes KVM: x86: Clear CR2 on VCPU reset KVM: x86: DR0-DR3 are not clear on reset KVM: x86: BSP in MSR_IA32_APICBASE is writable KVM: x86: simplify kvm_apic_map KVM: x86: avoid logical_map when it is invalid KVM: x86: fix mixed APIC mode broadcast KVM: x86: use MDA for interrupt matching kvm/ppc/mpic: drop unused IRQ_testbit KVM: nVMX: remove unnecessary double caching of MAXPHYADDR KVM: nVMX: checks for address bits beyond MAXPHYADDR on VM-entry KVM: x86: cache maxphyaddr CPUID leaf in struct kvm_vcpu KVM: vmx: pass error code with internal error #2 x86: vdso: fix pvclock races with task migration KVM: remove kvm_read_hva and kvm_read_hva_atomic KVM: x86: optimize delivery of TSC deadline timer interrupt KVM: x86: extract blocking logic from __vcpu_run kvm: x86: fix x86 eflags fixed bit KVM: s390: migrate vcpu interrupt state ...
2015-04-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs and fs fixes from Al Viro: "Several AIO and OCFS2 fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ocfs2: _really_ sync the right range ocfs2_file_write_iter: keep return value and current position update in sync [regression] ocfs2: do *not* increment ->ki_pos twice ioctx_alloc(): fix vma (and file) leak on failure fix mremap() vs. ioctx_kill() race
2015-04-11Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is our remaining set of three fixes for 4.0: two oops fixes(one for cable pulls triggering oopses and the other be2iscsi specific) and one warn on in sysfs on multipath devices using enclosures" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: Defer processing of REQ_PREEMPT requests for blocked devices be2iscsi: Fix kernel panic when device initialization fails enclosure: fix WARN_ON removing an adapter in multi-path devices
2015-04-10Merge tag 'pm+acpi-4.0-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are stable-candidate fixes of some recently reported issues in the cpufreq core, cpuidle core, the ACPI cpuidle driver and the hibernate core. Specifics: - Revert a 3.17 hibernate commit that was supposed to fix an issue related to e820 reserved regions, but broke resume from hibernation on Lenovo x230 (Rafael J Wysocki). - Prevent the ACPI cpuidle driver from overwriting the name and description of the C0 state set by the core when the list of C-states changes (Thomas Schlichter). - Remove the no longer needed state_count field from struct cpuidle_device which prevents the list of C-states shown by the sysfs interface from becoming incorrect when the current number of them is different from the number of C-states on boot (Bartlomiej Zolnierkiewicz). - The cpufreq core updates the policy object of the only online CPU during system resume to make it reflect the current hardware state, but it always assumes that CPU to be CPU0 which need not be the case, so fix the code to avoid that assumption (Viresh Kumar)" * tag 'pm+acpi-4.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions" cpuidle: ACPI: do not overwrite name and description of C0 cpuidle: remove state_count field from struct cpuidle_device cpufreq: Schedule work for the first-online CPU on resume
2015-04-09Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle'Rafael J. Wysocki
* pm-sleep: Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions" * pm-cpufreq: cpufreq: Schedule work for the first-online CPU on resume * pm-cpuidle: cpuidle: ACPI: do not overwrite name and description of C0 cpuidle: remove state_count field from struct cpuidle_device
2015-04-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "Three fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: numa: disable change protection for vma(VM_HUGETLB) include/linux/dmapool.h: declare struct device mm: move zone lock to a different cache line than order-0 free page lists
2015-04-08Defer processing of REQ_PREEMPT requests for blocked devicesBart Van Assche
SCSI transport drivers and SCSI LLDs block a SCSI device if the transport layer is not operational. This means that in this state no requests should be processed, even if the REQ_PREEMPT flag has been set. This patch avoids that a rescan shortly after a cable pull sporadically triggers the following kernel oops: BUG: unable to handle kernel paging request at ffffc9001a6bc084 IP: [<ffffffffa04e08f2>] mlx4_ib_post_send+0xd2/0xb30 [mlx4_ib] Process rescan-scsi-bus (pid: 9241, threadinfo ffff88053484a000, task ffff880534aae100) Call Trace: [<ffffffffa0718135>] srp_post_send+0x65/0x70 [ib_srp] [<ffffffffa071b9df>] srp_queuecommand+0x1cf/0x3e0 [ib_srp] [<ffffffffa0001ff1>] scsi_dispatch_cmd+0x101/0x280 [scsi_mod] [<ffffffffa0009ad1>] scsi_request_fn+0x411/0x4d0 [scsi_mod] [<ffffffff81223b37>] __blk_run_queue+0x27/0x30 [<ffffffff8122a8d2>] blk_execute_rq_nowait+0x82/0x110 [<ffffffff8122a9c2>] blk_execute_rq+0x62/0xf0 [<ffffffffa000b0e8>] scsi_execute+0xe8/0x190 [scsi_mod] [<ffffffffa000b2f3>] scsi_execute_req+0xa3/0x130 [scsi_mod] [<ffffffffa000c1aa>] scsi_probe_lun+0x17a/0x450 [scsi_mod] [<ffffffffa000ce86>] scsi_probe_and_add_lun+0x156/0x480 [scsi_mod] [<ffffffffa000dc2f>] __scsi_scan_target+0xdf/0x1f0 [scsi_mod] [<ffffffffa000dfa3>] scsi_scan_host_selected+0x183/0x1c0 [scsi_mod] [<ffffffffa000edfb>] scsi_scan+0xdb/0xe0 [scsi_mod] [<ffffffffa000ee13>] store_scan+0x13/0x20 [scsi_mod] [<ffffffff811c8d9b>] sysfs_write_file+0xcb/0x160 [<ffffffff811589de>] vfs_write+0xce/0x140 [<ffffffff81158b53>] sys_write+0x53/0xa0 [<ffffffff81464592>] system_call_fastpath+0x16/0x1b [<00007f611c9d9300>] 0x7f611c9d92ff Reported-by: Max Gurtuvoy <maxg@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-04-08KVM: x86: BSP in MSR_IA32_APICBASE is writableNadav Amit
After reset, the CPU can change the BSP, which will be used upon INIT. Reset should return the BSP which QEMU asked for, and therefore handled accordingly. To quote: "If the MP protocol has completed and a BSP is chosen, subsequent INITs (either to a specific processor or system wide) do not cause the MP protocol to be repeated." [Intel SDM 8.4.2: MP Initialization Protocol Requirements and Restrictions] Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Message-Id: <1427933438-12782-3-git-send-email-namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-08Merge tag 'media/v3.20-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: "A series of fixup patches for version 4.0: - one VB2 core fixup, when stopping the stream; - one VB2 core fixup for dma-contig memory type; - driver fixes at rtl28xx, s5p (tv, jpeg, mfc, soc-camera, sh_veu, cx23885, gspca" * tag 'media/v3.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] rtl28xxu: return success for unimplemented FE callback [media] rtl2832: disable regmap register cache [media] vb2: Fix dma_dir setting for dma-contig mem type [media] media: s5p-mfc: fix broken pointer cast on 64bit arch [media] media: s5p-mfc: fix mmap support for 64bit arch [media] cx23885: fix querycap [media] sh_veu: v4l2_dev wasn't set [media] s5p-mfc: Fix NULL pointer dereference caused by not set q->lock [media] s5p-jpeg: exynos3250: fix erroneous reset procedure [media] s5p-tv: hdmi needs I2C support [media] s5p-jpeg: Initialize cb and cr to zero [media] media: fix gspca drivers build dependencies [media] soc-camera: Fix devm_kfree() in soc_of_bind() [media] media: atmel-isi: increase the burst length to improve the performance [media] vb2: fix 'UNBALANCED' warnings when calling vb2_thread_stop()
2015-04-07include/linux/dmapool.h: declare struct deviceMark Brown
dmapool uses struct device in function arguments but relies on an implicit inclusion to declare struct device causing warnings in some configurations: include/linux/dmapool.h:31:7: warning: 'struct device' declared inside parameter list Fix this by adding a struct device declaration to the file. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-07mm: move zone lock to a different cache line than order-0 free page listsMel Gorman
Huang Ying reported the following problem due to commit 3484b2de9499 ("mm: rearrange zone fields into read-only, page alloc, statistics and page reclaim lines") from the Intel performance tests 24b7e5819ad5cbef 3484b2de9499df23c4604a513b ---------------- -------------------------- %stddev %change %stddev \ | \ 152288 \261 0% -46.2% 81911 \261 0% aim7.jobs-per-min 237 \261 0% +85.6% 440 \261 0% aim7.time.elapsed_time 237 \261 0% +85.6% 440 \261 0% aim7.time.elapsed_time.max 25026 \261 0% +70.7% 42712 \261 0% aim7.time.system_time 2186645 \261 5% +32.0% 2885949 \261 4% aim7.time.voluntary_context_switches 4576561 \261 1% +24.9% 5715773 \261 0% aim7.time.involuntary_context_switches The problem is specific to very large machines under stress. It was not reproducible with the machines I had used to justify the original patch because large numbers of CPUs are required. When pressure is high enough, the cache line is bouncing between CPUs trying to acquire the lock and the holder of the lock adjusting free lists. The intention was that the acquirer of the lock would automatically have the cache line holding the free lists but according to Huang, this is not a universal win. One possibility is to move the zone lock to its own cache line but it increases the size of the zone. This patch moves the lock to the other end of the free lists where they do not contend under high pressure. It does mean the page allocator paths now require more cache lines but Huang reports that it restores performance to previous levels on large machines %stddev %change %stddev \ | \ 84568 \261 1% +94.3% 164280 \261 1% aim7.jobs-per-min 2881944 \261 2% -35.1% 1870386 \261 8% aim7.time.voluntary_context_switches 681 \261 1% -3.4% 658 \261 0% aim7.time.user_time 5538139 \261 0% -12.1% 4867884 \261 0% aim7.time.involuntary_context_switches 44174 \261 1% -46.0% 23848 \261 1% aim7.time.system_time 426 \261 1% -48.4% 219 \261 1% aim7.time.elapsed_time 426 \261 1% -48.4% 219 \261 1% aim7.time.elapsed_time.max 468 \261 1% -43.1% 266 \261 2% uptime.boot Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Huang Ying <ying.huang@intel.com> Tested-by: Huang Ying <ying.huang@intel.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-07Merge tag 'kvm-s390-next-20150331' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD Features and fixes for 4.1 (kvm/next) 1. Assorted changes 1.1 allow more feature bits for the guest 1.2 Store breaking event address on program interrupts 2. Interrupt handling rework 2.1 Fix copy_to_user while holding a spinlock (cc stable) 2.2 Rework floating interrupts to follow the priorities 2.3 Allow to inject all local interrupts via new ioctl 2.4 allow to get/set the full local irq state, e.g. for migration and introspection
2015-04-07Merge tag 'kvm-arm-for-4.1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next' KVM/ARM changes for v4.1: - fixes for live migration - irqfd support - kvm-io-bus & vgic rework to enable ioeventfd - page ageing for stage-2 translation - various cleanups
2015-04-07Merge tag 'kvm-arm-fixes-4.0-rc5' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next' Fixes for KVM/ARM for 4.0-rc5. Fixes page refcounting issues in our Stage-2 page table management code, fixes a missing unlock in a gicv3 error path, and fixes a race that can cause lost interrupts if signals are pending just prior to entering the guest.
2015-04-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) In TCP, don't register an FRTO for cumulatively ACK'd data that was previously SACK'd, from Neal Cardwell. 2) Need to hold RNL mutex in ipv4 multicast code namespace cleanup, from Cong WANG. 3) Similarly we have to hold RNL mutex for fib_rules_unregister(), also from Cong WANG. 4) Revert and rework netns nsid allocation fix, from Nicolas Dichtel. 5) When we encapsulate for a tunnel device, skb->sk still points to the user socket. So this leads to cases where we retraverse the ipv4/ipv6 output path with skb->sk being of some other address family (f.e. AF_PACKET). This can cause things to crash since the ipv4 output path is dereferencing an AF_PACKET socket as if it were an ipv4 one. The short term fix for 'net' and -stable is to elide these socket checks once we've entered an encapsulation sequence by testing xmit_recursion. Longer term we have a better solution wherein we pass the tunnel's socket down through the output paths, but that is way too invasive for 'net' and -stable. From Hannes Frederic Sowa. 6) l2tp_init() failure path forgets to unregister per-net ops, from Cong WANG. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/mlx4_core: Fix error message deprecation for ConnectX-2 cards net: dsa: fix filling routing table from OF description l2tp: unregister l2tp_net_ops on failure path mvneta: dont call mvneta_adjust_link() manually ipv6: protect skb->sk accesses from recursive dereference inside the stack netns: don't allocate an id for dead netns Revert "netns: don't clear nsid too early on removal" ip6mr: call del_timer_sync() in ip6mr_free_table() net: move fib_rules_unregister() under rtnl lock ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup tcp: fix FRTO undo on cumulative ACK of SACKed range xen-netfront: transmit fully GSO-sized packets
2015-04-06fix mremap() vs. ioctx_kill() raceAl Viro
teach ->mremap() method to return an error and have it fail for aio mappings in process of being killed Note that in case of ->mremap() failure we need to undo move_page_tables() we'd already done; we could call ->mremap() first, but then the failure of move_page_tables() would require undoing whatever _successful_ ->mremap() has done, which would be a lot more headache in general. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-06ipv6: protect skb->sk accesses from recursive dereference inside the stackhannes@stressinduktion.org
We should not consult skb->sk for output decisions in xmit recursion levels > 0 in the stack. Otherwise local socket settings could influence the result of e.g. tunnel encapsulation process. ipv6 does not conform with this in three places: 1) ip6_fragment: we do consult ipv6_npinfo for frag_size 2) sk_mc_loop in ipv6 uses skb->sk and checks if we should loop the packet back to the local socket 3) ip6_skb_dst_mtu could query the settings from the user socket and force a wrong MTU Furthermore: In sk_mc_loop we could potentially land in WARN_ON(1) if we use a PF_PACKET socket ontop of an IPv6-backed vxlan device. Reuse xmit_recursion as we are currently only interested in protecting tunnel devices. Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem fixes from Dmitry Torokhov: "A fix for ALPS driver for issue introduced in the latest update and a tweak for yet another Lenovo box in Synaptics. There will be more ALPS tweaks coming.." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: define INPUT_PROP_ACCELEROMETER behavior Input: synaptics - fix min-max quirk value for E440 Input: synaptics - add quirk for Thinkpad E440 Input: ALPS - fix max coordinates for v5 and v7 protocols Input: add MT_TOOL_PALM
2015-04-03Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block layer fix from Jens Axboe: "Just one patch in this pull request, fixing a regression caused by a 'mathematically correct' change to lcm()" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix blk_stack_limits() regression due to lcm() change
2015-04-03cpuidle: remove state_count field from struct cpuidle_deviceBartlomiej Zolnierkiewicz
Thomas Schlichter reports the following issue on his Samsung NC20: "The C-states C1 and C2 to the OS when connected to AC, and additionally provides the C3 C-state when disconnected from AC. However, the number of C-states shown in sysfs is fixed to the number of C-states present at boot. If I boot with AC connected, I always only see the C-states up to C2 even if I disconnect AC. The reason is commit 130a5f692425 (ACPI / cpuidle: remove dev->state_count setting). It removes the update of dev->state_count, but sysfs uses exactly this variable to show the C-states. The fix is to use drv->state_count in sysfs. As this is currently the last user of dev->state_count, this variable can be completely removed." Remove dev->state_count as per the above. Reported-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: 3.14+ <stable@vger.kernel.org> # 3.14+ [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-02Merge tag 'irqchip-fixes-4.0-2' of git://git.infradead.org/users/jcooper/linuxLinus Torvalds
Pull irqchip fixes from Jason Cooper: "This is the second round of fixes for irqchip. It contains some fixes found while the arm64 guys were writing the kvm gicv3 its emulation. GICv3 ITS: - Small batch of fixes discovered while writing the kvm ITS emulation" * tag 'irqchip-fixes-4.0-2' of git://git.infradead.org/users/jcooper/linux: irqchip: gicv3-its: Use non-cacheable accesses when no shareability irqchip: gicv3-its: Fix PROP/PEND and BASE/CBASE confusion irqchip: gicv3-its: Fix device ID encoding irqchip: gicv3-its: Fix encoding of collection's target redistributor
2015-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix use-after-free with mac80211 RX A-MPDU reorder timer, from Johannes Berg. 2) iwlwifi leaks memory every module load/unload cycles, fix from Larry Finger. 3) Need to use for_each_netdev_safe() in rtnl_group_changelink() otherwise we can crash, from WANG Cong. 4) mlx4 driver does register_netdev() too early in the probe sequence, from Ido Shamay. 5) Don't allow router discovery hop limit to decrease the interface's hop limit, from D.S. Ljungmark. 6) tx_packets and tx_bytes improperly accounted for certain classes of USB network devices, fix from Ben Hutchings. 7) ip{6}mr_rules_init() mistakenly use plain kfree to release the ipmr tables in the error path, they must instead use ip{6}mr_free_table(). Fix from WANG Cong. 8) cxgb4 doesn't properly quiesce all RX activity before unregistering the netdevice. Fix from Hariprasad Shenai. 9) Fix hash corruptions in ipvlan driver, from Jiri Benc. 10) nla_memcpy(), like a real memcpy, should fully initialize the destination buffer, even if the source attribute is smaller. Fix from Jiri Benc. 11) Fix wrong error code returned from iucv_sock_sendmsg(). We should use whatever sock_alloc_send_skb() put into 'err'. From Eugene Crosser. 12) Fix slab object leak on module unload in TIPC, from Ying Xue. 13) Need a READ_ONCE() when reading the cached RX socket route in tcp_v{4,6}_early_demux(). From Michal Kubecek. 14) Still too many problems with TPC support in the ath9k driver, so disable it for now. From Felix Fietkau. 15) When in AP mode the rtlwifi driver can leak DMA mappings, fix from Larry Finger. 16) Missing kzalloc() failure check in gs_usb CAN driver, from Colin Ian King. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) cxgb4: Fix to dump devlog, even if FW is crashed cxgb4: Firmware macro changes for fw verison 1.13.32.0 bnx2x: Fix kdump when iommu=on bnx2x: Fix kdump on 4-port device mac80211: fix RX A-MPDU session reorder timer deletion MAINTAINERS: Update Intel Wired Ethernet Driver info tipc: fix a slab object leak net/usb/r8152: add device id for Lenovo TP USB 3.0 Ethernet af_iucv: fix AF_IUCV sendmsg() errno openvswitch: Return vport module ref before destruction netlink: pad nla_memcpy dest buffer with zeroes bonding: Bonding Overriding Configuration logic restored. ipvlan: fix check for IP addresses in control path ipvlan: do not use rcu operations for address list ipvlan: protect against concurrent link removal ipvlan: fix addr hash list corruption net: fec: setup right value for mdio hold time net: tcp6: fix double call of tcp_v6_fill_cb() cxgb4vf: Fix sparse warnings netns: don't clear nsid too early on removal ...
2015-04-01Merge tag 'lazytime_fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull lazytime fixes from Ted Ts'o: "This fixes a problem in the lazy time patches, which can cause frequently updated inods to never have their timestamps updated. These changes guarantee that no timestamp on disk will be stale by more than 24 hours" * tag 'lazytime_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fs: add dirtytime_expire_seconds sysctl fs: make sure the timestamps for lazytime inodes eventually get written
2015-04-01Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd fixes from Bruce Fields: "Two main issues: - We found that turning on pNFS by default (when it's configured at build time) was too aggressive, so we want to switch the default before the 4.0 release. - Recent client changes to increase open parallelism uncovered a serious bug lurking in the server's open code. Also fix a krb5/selinux regression. The rest is mainly smaller pNFS fixes" * 'for-4.0' of git://linux-nfs.org/~bfields/linux: sunrpc: make debugfs file creation failure non-fatal nfsd: require an explicit option to enable pNFS NFSD: Fix bad update of layout in nfsd4_return_file_layout NFSD: Take care the return value from nfsd4_encode_stateid NFSD: Printk blocklayout length and offset as format 0x%llx nfsd: return correct lockowner when there is a race on hash insert nfsd: return correct openowner when there is a race to put one in the hash NFSD: Put exports after nfsd4_layout_verify fail NFSD: Error out when register_shrinker() fail NFSD: Take care the return value from nfsd4_decode_stateid NFSD: Check layout type when returning client layouts NFSD: restore trace event lost in mismerge
2015-03-31KVM: s390: migrate vcpu interrupt stateJens Freimann
This patch adds support to migrate vcpu interrupts. Two new vcpu ioctls are added which get/set the complete status of pending interrupts in one go. The ioctls are marked as available with the new capability KVM_CAP_S390_IRQ_STATE. We can not use a ONEREG, as the number of pending local interrupts is not constant and depends on the number of CPUs. To retrieve the interrupt state we add an ioctl KVM_S390_GET_IRQ_STATE. Its input parameter is a pointer to a struct kvm_s390_irq_state which has a buffer and length. For all currently pending interrupts, we copy a struct kvm_s390_irq into the buffer and pass it to userspace. To store interrupt state into a buffer provided by userspace, we add an ioctl KVM_S390_SET_IRQ_STATE. It passes a struct kvm_s390_irq_state into the kernel and injects all interrupts contained in the buffer. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-03-31KVM: s390: add ioctl to inject local interruptsJens Freimann
We have introduced struct kvm_s390_irq a while ago which allows to inject all kinds of interrupts as defined in the Principles of Operation. Add ioctl to inject interrupts with the extended struct kvm_s390_irq Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-03-31sunrpc: make debugfs file creation failure non-fatalJeff Layton
We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Symptoms were failing krb5 mounts on systems using gss-proxy and selinux. Fixes: 388f0c776781 "sunrpc: add a debugfs rpc_xprt directory..." Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31block: fix blk_stack_limits() regression due to lcm() changeMike Snitzer
Linux 3.19 commit 69c953c ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n") caused blk_stack_limits() to not properly stack queue_limits for stacked devices (e.g. DM). Fix this regression by establishing lcm_not_zero() and switching blk_stack_limits() over to using it. DM uses blk_set_stacking_limits() to establish the initial top-level queue_limits that are then built up based on underlying devices' limits using blk_stack_limits(). In the case of optimal_io_size (io_opt) blk_set_stacking_limits() establishes a default value of 0. With commit 69c953c, lcm(0, n) is no longer n, which compromises proper stacking of the underlying devices' io_opt. Test: $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536 $ cat /sys/block/sde/queue/optimal_io_size 786432 $ dmsetup create node --table "0 100 linear /dev/sde 0" Before this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 0 After this fix: $ cat /sys/block/dm-5/queue/optimal_io_size 786432 Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 3.19+ Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-03-30nfsd: require an explicit option to enable pNFSChristoph Hellwig
Turns out sending out layouts to any client is a bad idea if they can't get at the storage device, so require explicit admin action to enable pNFS. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-30KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO busAndre Przywara
Currently we have struct kvm_exit_mmio for encapsulating MMIO abort data to be passed on from syndrome decoding all the way down to the VGIC register handlers. Now as we switch the MMIO handling to be routed through the KVM MMIO bus, it does not make sense anymore to use that structure already from the beginning. So we keep the data in local variables until we put them into the kvm_io_bus framework. Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private structure. On that way we replace the data buffer in that structure with a pointer pointing to a single location in a local variable, so we get rid of some copying on the way. With all of the virtual GIC emulation code now being registered with the kvm_io_bus, we can remove all of the old MMIO handling code and its dispatching functionality. I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something), because that touches a lot of code lines without any good reason. This is based on an original patch by Nikolay. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-30KVM: arm/arm64: prepare GICv3 emulation to use kvm_io_bus MMIO handlingAndre Przywara
Using the framework provided by the recent vgic.c changes, we register a kvm_io_bus device on mapping the virtual GICv3 resources. The distributor mapping is pretty straight forward, but the redistributors need some more love, since they need to be tagged with the respective redistributor (read: VCPU) they are connected with. We use the kvm_io_bus framework to register one devices per VCPU. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-29irqchip: gicv3-its: Use non-cacheable accesses when no shareabilityMarc Zyngier
If the ITS or the redistributors report their shareability as zero, then it is important to make sure they will no generate any cacheable traffic, as this is unlikely to produce the expected result. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/1427465705-17126-5-git-send-email-marc.zyngier@arm.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-29irqchip: gicv3-its: Fix PROP/PEND and BASE/CBASE confusionMarc Zyngier
The ITS driver sometime mixes up the use of GICR_PROPBASE bitfields for the GICR_PENDBASE register, and GITS_BASER for GICR_CBASE. This does not lead to any observable bug because similar bits are at the same location, but this just make the code even harder to understand... This patch provides the required #defines and fixes the mixup. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Link: https://lkml.kernel.org/r/1427465705-17126-4-git-send-email-marc.zyngier@arm.com Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2015-03-29usbnet: Fix tx_bytes statistic running backward in cdc_ncmBen Hutchings
cdc_ncm disagrees with usbnet about how much framing overhead should be counted in the tx_bytes statistics, and tries 'fix' this by decrementing tx_bytes on the transmit path. But statistics must never be decremented except due to roll-over; this will thoroughly confuse user-space. Also, tx_bytes is only incremented by usbnet in the completion path. Fix this by requiring drivers that set FLAG_MULTI_FRAME to set a tx_bytes delta along with the tx_packets count. Fixes: beeecd42c3b4 ("net: cdc_ncm/cdc_mbim: adding NCM protocol statistics") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Bjørn Mork <bjorn@mork.no>
2015-03-29usbnet: Fix tx_packets stat for FLAG_MULTI_FRAME driversBen Hutchings
Currently the usbnet core does not update the tx_packets statistic for drivers with FLAG_MULTI_PACKET and there is no hook in the TX completion path where they could do this. cdc_ncm and dependent drivers are bumping tx_packets stat on the transmit path while asix and sr9800 aren't updating it at all. Add a packet count in struct skb_data so these drivers can fill it in, initialise it to 1 for other drivers, and add the packet count to the tx_packets statistic on completion. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Tested-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-27MIPS: KVM: Wire up MSA capabilityJames Hogan
Now that the code is in place for KVM to support MIPS SIMD Architecutre (MSA) in MIPS guests, wire up the new KVM_CAP_MIPS_MSA capability. For backwards compatibility, the capability must be explicitly enabled in order to detect or make use of MSA from the guest. The capability is not supported if the hardware supports MSA vector partitioning, since the extra support cannot be tested yet and it extends the state that the userland program would have to save. Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-api@vger.kernel.org Cc: linux-doc@vger.kernel.org
2015-03-27MIPS: KVM: Wire up FPU capabilityJames Hogan
Now that the code is in place for KVM to support FPU in MIPS KVM guests, wire up the new KVM_CAP_MIPS_FPU capability. For backwards compatibility, the capability must be explicitly enabled in order to detect or make use of the FPU from the guest. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Cc: linux-api@vger.kernel.org Cc: linux-doc@vger.kernel.org
2015-03-26KVM: arm/arm64: prepare GICv2 emulation to be handled by kvm_io_busAndre Przywara
Using the framework provided by the recent vgic.c changes we register a kvm_io_bus device when initializing the virtual GICv2. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26KVM: arm/arm64: implement kvm_io_bus MMIO handling for the VGICAndre Przywara
Currently we use a lot of VGIC specific code to do the MMIO dispatching. Use the previous reworks to add kvm_io_bus style MMIO handlers. Those are not yet called by the MMIO abort handler, also the actual VGIC emulator function do not make use of it yet, but will be enabled with the following patches. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26KVM: move iodev.h from virt/kvm/ to include/kvmAndre Przywara
iodev.h contains definitions for the kvm_io_bus framework. This is needed both by the generic KVM code in virt/kvm as well as by architecture specific code under arch/. Putting the header file in virt/kvm and using local includes in the architecture part seems at least dodgy to me, so let's move the file into include/kvm, so that a more natural "#include <kvm/iodev.h>" can be used by all of the code. This also solves a problem later when using struct kvm_io_device in arm_vgic.h. Fixing up the FSF address in the GPL header and a wrong include path on the way. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-26KVM: Redesign kvm_io_bus_ API to pass VCPU structure to the callbacks.Nikolay Nikolaev
This is needed in e.g. ARM vGIC emulation, where the MMIO handling depends on the VCPU that does the access. Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-03-25mm: numa: slow PTE scan rate if migration failures occurMel Gorman
Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226 Across the board the 4.0-rc1 numbers are much slower, and the degradation is far worse when using the large memory footprint configs. Perf points straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config: - 56.07% 56.07% [kernel] [k] default_send_IPI_mask_sequence_phys - default_send_IPI_mask_sequence_phys - 99.99% physflat_send_IPI_mask - 99.37% native_send_call_func_ipi smp_call_function_many - native_flush_tlb_others - 99.85% flush_tlb_page ptep_clear_flush try_to_unmap_one rmap_walk try_to_unmap migrate_pages migrate_misplaced_page - handle_mm_fault - 99.73% __do_page_fault trace_do_page_fault do_async_page_fault + async_page_fault 0.63% native_send_call_func_single_ipi generic_exec_single smp_call_function_single This is showing excessive migration activity even though excessive migrations are meant to get throttled. Normally, the scan rate is tuned on a per-task basis depending on the locality of faults. However, if migrations fail for any reason then the PTE scanner may scan faster if the faults continue to be remote. This means there is higher system CPU overhead and fault trapping at exactly the time we know that migrations cannot happen. This patch tracks when migration failures occur and slows the PTE scanner. Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-03-25Merge branch 'for-4.0-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fix from Tejun Heo: "One patch to fix a regression from the recent switch to blk-mq tag allocation which can cause oops on SAS-attached SATA drives" * 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ata: Add a new flag to destinguish sas controller
2015-03-24Merge tag 'regulator-fix-v4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Two fixes here, one typo fix in the documentation and one fix for a system hang with one of the Palmas chips caused by the use of an incorrect offset being provided for one of the registers" * tag 'regulator-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Fix documentation for regmap in the config regulator: palmas: Correct TPS659038 register definition for REGEN2
2015-03-24Merge tag 'regmap-fix-v4.0-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "This patch fixes a bad interaction between the support that was added for having regmaps without devices for early system controller initialization and the trace support. There's a very good analysis of the actual issue in the commit message for the change" * tag 'regmap-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: introduce regmap_name to fix syscon regmap trace events
2015-03-23Merge tag 'kvm-s390-next-20150318' of ↵Marcelo Tosatti
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into queue KVM: s390: Features and fixes for 4.1 (kvm/next) 1. Fixes 2. Implement access register mode in KVM 3. Provide a userspace post handler for the STSI instruction 4. Provide an interface for compliant memory accesses 5. Provide an interface for getting/setting the guest storage key 6. Fixup for the vector facility patches: do not announce the vector facility in the guest for old QEMUs. 1-5 were initially shown as RFC in http://www.spinics.net/lists/kvm/msg114720.html some small review changes - added some ACKs - have the AR mode patches first - get rid of unnecessary AR_INVAL define - typos and language 6. two new patches The two new patches fixup the vector support patches that were introduced in the last pull request for QEMU versions that dont know about vector support and guests that do. (We announce the facility bit, but dont enable the facility so vector aware guests will crash on vector instructions).
2015-03-23x86: kvm: Revert "remove sched notifier for cross-cpu migrations"Marcelo Tosatti
The following point: 2. per-CPU pvclock time info is updated if the underlying CPU changes. Is not true anymore since "KVM: x86: update pvclock area conditionally, on cpu migration". Add task migration notification back. Problem noticed by Andy Lutomirski. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> CC: stable@kernel.org # 3.11+
2015-03-23Merge remote-tracking branches 'regulator/fix/doc' and ↵Mark Brown
'regulator/fix/palmas' into regulator-linus
2015-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Validate iov ranges before feeding them into iov_iter_init(), from Al Viro. 2) We changed copy_from_msghdr_from_user() to zero out the msg_namelen is a NULL pointer is given for the msg_name. Do the same in the compat code too. From Catalin Marinas. 3) Fix partially initialized tuples in netfilter conntrack helper, from Ian Wilson. 4) Missing continue; statement in nft_hash walker can lead to crashes, from Herbert Xu. 5) tproxy_tg6_check looks for IP6T_INV_PROTO in ->flags instead of ->invflags, fix from Pablo Neira Ayuso. 6) Incorrect memory account of TCP FINs can result in negative socket memory accounting values. Fix from Josh Hunt. 7) Don't allow virtual functions to enable VLAN promiscuous mode in be2net driver, from Vasundhara Volam. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set cx82310_eth: wait for firmware to become ready net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour be2net: use PCI MMIO read instead of config read for errors be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs be2net: Prevent VFs from enabling VLAN promiscuous mode tcp: fix tcp fin memory accounting ipv6: fix backtracking for throw routes net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5} ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check() netfilter: restore rule tracing via nfnetlink_log netfilter: nf_tables: allow to change chain policy without hook if it exists netfilter: Fix potential crash in nft_hash walker netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()
2015-03-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix missing initialization of tuple structure in nfnetlink_cthelper to avoid mismatches when looking up to attach userspace helpers to flows, from Ian Wilson. 2) Fix potential crash in nft_hash when we hit -EAGAIN in nft_hash_walk(), from Herbert Xu. 3) We don't need to indicate the hook information to update the basechain default policy in nf_tables. 4) Restore tracing over nfnetlink_log due to recent rework to accomodate logging infrastructure into nf_tables. 5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY. 6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and REJECT6 from xt over nftables. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>