summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2017-09-25fsl_qbman: SDK DPAA 1.x QBMan driversMadalin Bucur
commit 4b184a137767069ef1982c94da389f557e1681cb [qbman part] Signed-off-by: Roy Pledge <roy.pledge@nxp.com> Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Integrated-by: Zhao Qiang <qiang.zhao@nxp.com>
2017-07-14soc: fsl: add GUTS driver for QorIQ platformsyangbo lu
The global utilities block controls power management, I/O device enabling, power-onreset(POR) configuration monitoring, alternate function selection for multiplexed signals,and clock control. This patch adds a driver to manage and access global utilities block. Initially only reading SVR and registering soc device are supported. Other guts accesses, such as reading RCW, should eventually be moved into this driver as well. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> [remove makefile/kconfig] Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
2017-07-14drivers/mtd: Add deep sleep support for IFCPrabhakar Kushwaha
Add support of suspend, resume function to support deep sleep. Also make sure of SRAM initialization during resume. Signed-off-by: Raghav Dogra <raghav.dogra@nxp.com> Signed-off-by: Prabhakar Kushwaha <prabhakar.kushwaha@nxp.com>
2017-07-14add devm_alloc_percpu supportZhao Qiang
Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
2017-07-14net: readd skb_recycle()Madalin Bucur
Adding back skb_recycle() as it's used by the DPAA Ethernet driver. This was removed from the upstream kernel because it was lacking users. Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
2017-07-14fmd: SDK DPAA 1.x FMan driverMadalin Bucur
commit 40382fa630cac60ef72149e0f2bc074501f1e4b4 [add svr.h] Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Integrated-by: Zhao Qiang <qiang.zhao@nxp.com>
2017-07-14net: make ndo_get_stats64 a void functionstephen hemminger
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-14net: centralize net_device min/max MTU checkingJarod Wilson
While looking into an MTU issue with sfc, I started noticing that almost every NIC driver with an ndo_change_mtu function implemented almost exactly the same range checks, and in many cases, that was the only practical thing their ndo_change_mtu function was doing. Quite a few drivers have either 68, 64, 60 or 46 as their minimum MTU value checked, and then various sizes from 1500 to 65535 for their maximum MTU value. We can remove a whole lot of redundant code here if we simple store min_mtu and max_mtu in net_device, and check against those in net/core/dev.c's dev_set_mtu(). In theory, there should be zero functional change with this patch, it just puts the infrastructure in place. Subsequent patches will attempt to start using said infrastructure, with theoretically zero change in functionality. CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-14base: soc: Introduce soc_device_match() interfaceArnd Bergmann
We keep running into cases where device drivers want to know the exact version of the a SoC they are currently running on. In the past, this has usually been done through a vendor specific API that can be called by a driver, or by directly accessing some kind of version register that is not part of the device itself but that belongs to a global register area of the chip. Common reasons for doing this include: - A machine is not using devicetree or similar for passing data about on-chip devices, but just announces their presence using boot-time platform devices, and the machine code itself does not care about the revision. - There is existing firmware or boot loaders with existing DT binaries with generic compatible strings that do not identify the particular revision of each device, but the driver knows which SoC revisions include which part. - A prerelease version of a chip has some quirks and we are using the same version of the bootloader and the DT blob on both the prerelease and the final version. An update of the DT binding seems inappropriate because that would involve maintaining multiple copies of the dts and/or bootloader. This patch introduces the soc_device_match() interface that is meant to work like of_match_node() but instead of identifying the version of a device, it identifies the SoC itself using a vendor-agnostic interface. Unlike of_match_node(), we do not do an exact string compare but instead use glob_match() to allow wildcards in strings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-14core/linux: usb :fsl: Remove USB Errata checking codeNikhil Badola
Remove USB errata checking code from driver. Applicability of erratum is retrieved by reading corresponding property in device tree. This property is written during device tree fixup. Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Signed-off-by: Nikhil Badola <nikhil.badola@freescale.com> Signed-off-by: yinbo.zhu <yinbo.zhu@nxp.com>
2017-07-14core/linux: fsl/usb: Stops USB controller init if PLL fails to lockRamneek Mehresh
USB erratum-A006918 workaround tries to start internal PHY inside uboot (when PLL fails to lock). However, if the workaround also fails, then USB initialization is also stopped inside Linux. Erratum-A006918 workaround failure creates "fsl,erratum_a006918" node in device-tree. Presence of this node in device-tree is used to stop USB controller initialization in Linux Signed-off-by: Ramneek Mehresh <ramneek.mehresh@freescale.com> Signed-off-by: Suresh Gupta <suresh.gupta@freescale.com> Signed-off-by: yinbo.zhu <yinbo.zhu@nxp.com>
2017-06-30 Merge tag 'v4.9.35' into linux-linaro-lsk-v4.9Alex Shi
This is the 4.9.35 stable release
2017-06-29time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accountingJohn Stultz
commit 3d88d56c5873f6eebe23e05c3da701960146b801 upstream. Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-29time: Fix clock->read(clock) race around clocksource changesJohn Stultz
commit ceea5e3771ed2378668455fa21861bead7504df5 upstream. In tests, which excercise switching of clocksources, a NULL pointer dereference can be observed on AMR64 platforms in the clocksource read() function: u64 clocksource_mmio_readl_down(struct clocksource *c) { return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; } This is called from the core timekeeping code via: cycle_now = tkr->read(tkr->clock); tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed then tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere including NULL. This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence: now = tk->clock->read(tk->clock); Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issue the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in. Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather then just a dummy read function. Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Daniel Mentz <danielmentz@google.com> Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-28Merge branch 'lsk/kdump/for-v4.9' into linux-linaro-lsk-v4.9Alex Shi
2017-06-24mm: larger stack guard gap, between vmasHugh Dickins
commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunatelly. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [wt: backport to 4.11: adjust context] [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-24USB: hub: fix SS max number of portsJohan Hovold
commit 93491ced3c87c94b12220dbac0527e1356702179 upstream. Add define for the maximum number of ports on a SuperSpeed hub as per USB 3.1 spec Table 10-5, and use it when verifying the retrieved hub descriptor. This specifically avoids benign attempts to update the DeviceRemovable mask for non-existing ports (should we get that far). Fixes: dbe79bbe9dcb ("USB 3.0 Hub Changes") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-18 Merge tag 'v4.9.33' into linux-linaro-lsk-v4.9Alex Shi
This is the 4.9.33 stable release
2017-06-17netfilter: nft_log: restrict the log prefix length to 127Liping Zhang
[ Upstream commit 5ce6b04ce96896e8a79e6f60740ced911eaac7a4 ] First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127, at nf_log_packet(), so the extra part is useless. Second, after adding a log rule with a very very long prefix, we will fail to dump the nft rules after this _special_ one, but acctually, they do exist. For example: # name_65000=$(printf "%0.sQ" {1..65000}) # nft add rule filter output log prefix "$name_65000" # nft add rule filter output counter # nft add rule filter output counter # nft list chain filter output table ip filter { chain output { type filter hook output priority 0; policy accept; } } So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17kernel/watchdog: prevent false hardlockup on overloaded systemDon Zickus
[ Upstream commit b94f51183b0617e7b9b4fb4137d4cf1cab7547c2 ] On an overloaded system, it is possible that a change in the watchdog threshold can be delayed long enough to trigger a false positive. This can easily be achieved by having a cpu spinning indefinitely on a task, while another cpu updates watchdog threshold. What happens is while trying to park the watchdog threads, the hrtimers on the other cpus trigger and reprogram themselves with the new slower watchdog threshold. Meanwhile, the nmi watchdog is still programmed with the old faster threshold. Because the one cpu is blocked, it prevents the thread parking on the other cpus from completing, which is needed to shutdown the nmi watchdog and reprogram it correctly. As a result, a false positive from the nmi watchdog is reported. Fix this by setting a park_in_progress flag to block all lockups until the parking is complete. Fix provided by Ulrich Obergfell. [akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/] Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com Signed-off-by: Don Zickus <dzickus@redhat.com> Reviewed-by: Aaron Tomlin <atomlin@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17kernel/watchdog.c: move shared definitions to nmi.hBabu Moger
[ Upstream commit 249e52e35580fcfe5dad53a7dcd7c1252788749c ] Patch series "Clean up watchdog handlers", v2. This is an attempt to cleanup watchdog handlers. Right now, kernel/watchdog.c implements both softlockup and hardlockup detectors. Softlockup code is generic. Hardlockup code is arch specific. Some architectures don't use hardlockup detectors. They use their own watchdog detectors. To make both these combination work, we have numerous #ifdefs in kernel/watchdog.c. We are trying here to make these handlers independent of each other. Also provide an interface for architectures to implement their own handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined as weak such that architectures can override its definitions. Thanks to Don Zickus for his suggestions. Here are our previous discussions http://www.spinics.net/lists/sparclinux/msg16543.html http://www.spinics.net/lists/sparclinux/msg16441.html This patch (of 3): Move shared macros and definitions to nmi.h so that watchdog.c, new file watchdog_hld.c or any other architecture specific handler can use those definitions. Link: http://lkml.kernel.org/r/1478034826-43888-2-git-send-email-babu.moger@oracle.com Signed-off-by: Babu Moger <babu.moger@oracle.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Andi Kleen <andi@firstfloor.org> Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Josh Hunt <johunt@akamai.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17net: phy: micrel: add support for KSZ8795Sean Nyekjaer
[ Upstream commit 9d162ed69f51cbd9ee5a0c7e82aba7acc96362ff ] This is adds support for the PHYs in the KSZ8795 5port managed switch. It will allow to detect the link between the switch and the soc and uses the same read_status functions as the KSZ8873MLL switch. Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17drm: Don't race connector registrationDaniel Vetter
[ Upstream commit e6e7b48b295afa5a5ab440de0a94d9ad8b3ce2d0 ] I was under the misconception that the sysfs dev stuff can be fully set up, and then registered all in one step with device_add. That's true for properties and property groups, but not for parents and child devices. Those must be fully registered before you can register a child. Add a bit of tracking to make sure that asynchronous mst connector hotplugging gets this right. For consistency we rely upon the implicit barriers of the connector->mutex, which is taken anyway, to ensure that at least either the connector or device registration call will work out. Mildly tested since I can't reliably reproduce this on my mst box here. Reported-by: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484237756-2720-1-git-send-email-daniel.vetter@ffwll.ch Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17drm: prevent double-(un)registration for connectorsDaniel Vetter
[ Upstream commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e ] If we're unlucky then the registration from a hotplugged connector might race with the final registration step on driver load. And since MST topology discover is asynchronous that's even somewhat likely. v2: Also update the kerneldoc for @registered! v3: Review from Chris: - Improve kerneldoc for late_register/early_unregister callbacks. - Use mutex_destroy. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Sean Paul <seanpaul@chromium.org> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161218133545.2106-1-daniel.vetter@ffwll.ch Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17ipv6: fix flow labels when the traffic class is non-0Dimitris Michailidis
[ Upstream commit 90427ef5d2a4b9a24079889bf16afdcdaebc4240 ] ip6_make_flowlabel() determines the flow label for IPv6 packets. It's supposed to be passed a flow label, which it returns as is if non-0 and in some other cases, otherwise it calculates a new value. The problem is callers often pass a flowi6.flowlabel, which may also contain traffic class bits. If the traffic class is non-0 ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and returns the whole thing. Thus it can return a 'flow label' longer than 20b and the low 20b of that is typically 0 resulting in packets with 0 label. Moreover, different packets of a flow may be labeled differently. For a TCP flow with ECN non-payload and payload packets get different labels as exemplified by this pair of consecutive packets: (pure ACK) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49 Payload Length: 32 Next Header: TCP (6) (payload) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000 Payload Length: 688 Next Header: TCP (6) This patch allows ip6_make_flowlabel() to be passed more than just a flow label and has it extract the part it really wants. This was simpler than modifying the callers. With this patch packets like the above become Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 32 Next Header: TCP (6) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 688 Next Header: TCP (6) Signed-off-by: Dimitris Michailidis <dmichail@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17fscache: Fix dead object requeueDavid Howells
[ Upstream commit e26bfebdfc0d212d366de9990a096665d5c0209a ] Under some circumstances, an fscache object can become queued such that it fscache_object_work_func() can be called once the object is in the OBJECT_DEAD state. This results in the kernel oopsing when it tries to invoke the handler for the state (which is hard coded to 0x2). The way this comes about is something like the following: (1) The object dispatcher is processing a work state for an object. This is done in workqueue context. (2) An out-of-band event comes in that isn't masked, causing the object to be queued, say EV_KILL. (3) The object dispatcher finishes processing the current work state on that object and then sees there's another event to process, so, without returning to the workqueue core, it processes that event too. It then follows the chain of events that initiates until we reach OBJECT_DEAD without going through a wait state (such as WAIT_FOR_CLEARANCE). At this point, object->events may be 0, object->event_mask will be 0 and oob_event_mask will be 0. (4) The object dispatcher returns to the workqueue processor, and in due course, this sees that the object's work item is still queued and invokes it again. (5) The current state is a work state (OBJECT_DEAD), so the dispatcher jumps to it - resulting in an OOPS. When I'm seeing this, the work state in (1) appears to have been either LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is fscache_osm_lookup_oob). The window for (2) is very small: (A) object->event_mask is cleared whilst the event dispatch process is underway - though there's no memory barrier to force this to the top of the function. The window, therefore is from the time the object was selected by the workqueue processor and made requeueable to the time the mask was cleared. (B) fscache_raise_event() will only queue the object if it manages to set the event bit and the corresponding event_mask bit was set. The enqueuement is then deferred slightly whilst we get a ref on the object and get the per-CPU variable for workqueue congestion. This slight deferral slightly increases the probability by allowing extra time for the workqueue to make the item requeueable. Handle this by giving the dead state a processor function and checking the for the dead state address rather than seeing if the processor function is address 0x2. The dead state processor function can then set a flag to indicate that it's occurred and give a warning if it occurs more than once per object. If this race occurs, an oops similar to the following is seen (note the RIP value): BUG: unable to handle kernel NULL pointer dereference at 0000000000000002 IP: [<0000000000000002>] 0x1 PGD 0 Oops: 0010 [#1] SMP Modules linked in: ... CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015 Workqueue: fscache_object fscache_object_work_func [fscache] task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000 RIP: 0010:[<0000000000000002>] [<0000000000000002>] 0x1 RSP: 0018:ffff880717547df8 EFLAGS: 00010202 RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200 RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480 RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510 R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000 R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600 FS: 0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00 ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000 ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900 Call Trace: [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache] [<ffffffff8109d5db>] process_one_work+0x17b/0x470 [<ffffffff8109e4ac>] worker_thread+0x21c/0x400 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5acf>] kthread+0xcf/0xe0 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 [<ffffffff816460d8>] ret_from_fork+0x58/0x90 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140 Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeremy McNicoll <jeremymc@redhat.com> Tested-by: Frank Sorenson <sorenson@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17net: fix ndo_features_check/ndo_fix_features comment orderingDimitris Michailidis
[ Upstream commit 1a2a14444d32b89b28116daea86f63ced1716668 ] Commit cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()") inadvertently moved the doc comment for .ndo_fix_features instead of .ndo_features_check. Fix the comment ordering. Fixes: cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()") Signed-off-by: Dimitris Michailidis <dmichail@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17log2: make order_base_2() behave correctly on const input value zeroArd Biesheuvel
commit 29905b52fad0854351f57bab867647e4982285bf upstream. The function order_base_2() is defined (according to the comment block) as returning zero on input zero, but subsequently passes the input into roundup_pow_of_two(), which is explicitly undefined for input zero. This has gone unnoticed until now, but optimization passes in GCC 7 may produce constant folded function instances where a constant value of zero is passed into order_base_2(), resulting in link errors against the deliberately undefined '____ilog2_NaN'. So update order_base_2() to adhere to its own documented interface. [ See http://marc.info/?l=linux-kernel&m=147672952517795&w=2 and follow-up discussion for more background. The gcc "optimization pass" is really just broken, but now the GCC trunk problem seems to have escaped out of just specially built daily images, so we need to work around it in mainline. - Linus ] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17PCI/PM: Add needs_resume flag to avoid suspend complete optimizationImre Deak
commit 4d071c3238987325b9e50e33051a40d1cce311cc upstream. Some drivers - like i915 - may not support the system suspend direct complete optimization due to differences in their runtime and system suspend sequence. Add a flag that when set resumes the device before calling the driver's system suspend handlers which effectively disables the optimization. Needed by a future patch fixing suspend/resume on i915. Suggested by Rafael. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org (rebased on v4.8, added kernel version to commit message stable tag) Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-15Merge remote-tracking branch 'lts/linux-4.9.y' into linux-linaro-lsk-v4.9Alex Shi
Conflicts: arch/arm64/kernel/entry.S compatiable with PAN in arch/arm64/kernel/traps.c
2017-06-14fs: add i_blocksize()Fabian Frederick
commit 93407472a21b82f39c955ea7787e5bc7da100642 upstream. Replace all 1 << inode->i_blkbits and (1 << inode->i_blkbits) in fs branch. This patch also fixes multiple checkpatch warnings: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' Thanks to Andrew Morton for suggesting more appropriate function instead of macro. [geliangtang@gmail.com: truncate: use i_blocksize()] Link: http://lkml.kernel.org/r/9c8b2cd83c8f5653805d43debde9fa8817e02fc4.1484895804.git.geliangtang@gmail.com Link: http://lkml.kernel.org/r/1481319905-10126-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14cpuset: consider dying css as offlineTejun Heo
commit 41c25707d21716826e3c1f60967f5550610ec1c9 upstream. In most cases, a cgroup controller don't care about the liftimes of cgroups. For the controller, a css becomes online when ->css_online() is called on it and offline when ->css_offline() is called. However, cpuset is special in that the user interface it exposes cares whether certain cgroups exist or not. Combined with the RCU delay between cgroup removal and css offlining, this can lead to user visible behavior oddities where operations which should succeed after cgroup removals fail for some time period. The effects of cgroup removals are delayed when seen from userland. This patch adds css_is_dying() which tests whether offline is pending and updates is_cpuset_online() so that the function returns false also while offline is pending. This gets rid of the userland visible delays. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com> Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14cgroup: Prevent kill_css() from being called more than onceWaiman Long
commit 33c35aa4817864e056fd772230b0c6b552e36ea2 upstream. The kill_css() function may be called more than once under the condition that the css was killed but not physically removed yet followed by the removal of the cgroup that is hosting the css. This patch prevents any harmm from being done when that happens. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14ptrace: Properly initialize ptracer_cred on forkEric W. Biederman
commit c70d9d809fdeecedb96972457ee45c49a232d97f upstream. When I introduced ptracer_cred I failed to consider the weirdness of fork where the task_struct copies the old value by default. This winds up leaving ptracer_cred set even when a process forks and the child process does not wind up being ptraced. Because ptracer_cred is not set on non-ptraced processes whose parents were ptraced this has broken the ability of the enlightenment window manager to start setuid children. Fix this by properly initializing ptracer_cred in ptrace_init_task This must be done with a little bit of care to preserve the current value of ptracer_cred when ptrace carries through fork. Re-reading the ptracer_cred from the ptracing process at this point is inconsistent with how PT_PTRACE_CAP has been maintained all of these years. Tested-by: Takashi Iwai <tiwai@suse.de> Fixes: 64b875f7ac8a ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14net: ping: do not abuse udp_poll()Eric Dumazet
[ Upstream commit 77d4b1d36926a9b8387c6b53eeba42bcaaffcea3 ] Alexander reported various KASAN messages triggered in recent kernels The problem is that ping sockets should not use udp_poll() in the first place, and recent changes in UDP stack finally exposed this old bug. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <alexander.levin@verizon.com> Cc: Solar Designer <solar@openwall.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Lorenzo Colitti <lorenzo@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Tested-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-08memblock: add memblock_cap_memory_range()AKASHI Takahiro
Add memblock_cap_memory_range() which will remove all the memblock regions except the memory range specified in the arguments. In addition, rework is done on memblock_mem_limit_remove_map() to re-implement it using memblock_cap_memory_range(). This function, like memblock_mem_limit_remove_map(), will not remove memblocks with MEMMAP_NOMAP attribute as they may be mapped and accessed later as "device memory." See the commit a571d4eb55d8 ("mm/memblock.c: add new infrastructure to address the mem limit issue"). This function is used, in a succeeding patch in the series of arm64 kdump suuport, to limit the range of usable memory, or System RAM, on crash dump kernel. (Please note that "mem=" parameter is of little use for this purpose.) Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Dennis Chen <dennis.chen@arm.com> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-06-08memblock: add memblock_clear_nomap()AKASHI Takahiro
This function, with a combination of memblock_mark_nomap(), will be used in a later kdump patch for arm64 when it temporarily isolates some range of memory from the other memory blocks in order to create a specific kernel mapping at boot time. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-06-08 Merge tag 'v4.9.31' into linux-linaro-lsk-v4.9Alex Shi
This is the 4.9.31 stable release
2017-06-07mm: consider memblock reservations for deferred memory initialization sizingMichal Hocko
commit 864b9a393dcb5aed09b8fd31b9bbda0fdda99374 upstream. We have seen an early OOM killer invocation on ppc64 systems with crashkernel=4096M: kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0 kthreadd cpuset=/ mems_allowed=7 CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1 Call Trace: dump_stack+0xb0/0xf0 (unreliable) dump_header+0xb0/0x258 out_of_memory+0x5f0/0x640 __alloc_pages_nodemask+0xa8c/0xc80 kmem_getpages+0x84/0x1a0 fallback_alloc+0x2a4/0x320 kmem_cache_alloc_node+0xc0/0x2e0 copy_process.isra.25+0x260/0x1b30 _do_fork+0x94/0x470 kernel_thread+0x48/0x60 kthreadd+0x264/0x330 ret_from_kernel_thread+0x5c/0xa4 Mem-Info: active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 slab_reclaimable:5 slab_unreclaimable:73 mapped:0 shmem:0 pagetables:0 bounce:0 free:0 free_pcp:0 free_cma:0 Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 0 0 0 Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB 0 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 819200 pages RAM 0 pages HighMem/MovableOnly 817481 pages reserved 0 pages cma reserved 0 pages hwpoisoned the reason is that the managed memory is too low (only 110MB) while the rest of the the 50GB is still waiting for the deferred intialization to be done. update_defer_init estimates the initial memoty to initialize to 2GB at least but it doesn't consider any memory allocated in that range. In this particular case we've had Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB) so the low 2GB is mostly depleted. Fix this by considering memblock allocations in the initial static initialization estimation. Move the max_initialise to reset_deferred_meminit and implement a simple memblock_reserved_memory helper which iterates all reserved blocks and sums the size of all that start below the given address. The cumulative size is than added on top of the initial estimation. This is still not ideal because reset_deferred_meminit doesn't consider holes and so reservation might be above the initial estimation whihch we ignore but let's make the logic simpler until we really need to handle more complicated cases. Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07ipv4: add reference counting to metricsEric Dumazet
[ Upstream commit 3fb07daff8e99243366a081e5129560734de4ada ] Andrey Konovalov reported crashes in ipv4_mtu() I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152 : 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run following loop : while : do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 done Cong Wang attempted to add back rt->fi in commit 82486aa6f1b9 ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects) I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: 2860583fe840 ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07vlan: Fix tcp checksum offloads in Q-in-Q vlansVlad Yasevich
[ Upstream commit 35d2f80b07bbe03fb358afb0bdeff7437a7d67ff ] It appears that TCP checksum offloading has been broken for Q-in-Q vlans. The behavior was execerbated by the series commit afb0bc972b52 ("Merge branch 'stacked_vlan_tso'") that that enabled accleleration features on stacked vlans. However, event without that series, it is possible to trigger this issue. It just requires a lot more specialized configuration. The root cause is the interaction between how netdev_intersect_features() works, the features actually set on the vlan devices and HW having the ability to run checksum with longer headers. The issue starts when netdev_interesect_features() replaces NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM, if the HW advertises IP|IPV6 specific checksums. This happens for tagged and multi-tagged packets. However, HW that enables IP|IPV6 checksum offloading doesn't gurantee that packets with arbitrarily long headers can be checksummed. This patch disables IP|IPV6 checksums on the packet for multi-tagged packets. CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> CC: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-07net/mlx5: Avoid using pending command interface slotsMohamad Haj Yahia
[ Upstream commit 73dd3a4839c1d27c36d4dcc92e1ff44225ecbeb7 ] Currently when firmware command gets stuck or it takes long time to complete, the driver command will get timeout and the command slot is freed and can be used for new commands, and if the firmware receive new command on the old busy slot its behavior is unexpected and this could be harmful. To fix this when the driver command gets timeout we return failure, but we don't free the command slot and we wait for the firmware to explicitly respond to that command. Once all the entries are busy we will stop processing new firmware commands. Fixes: 9cba4ebcf374 ('net/mlx5: Fix potential deadlock in command mode change') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-27Merge branch 'v4.9/topic/PANemulation' into linux-linaro-lsk-v4.9Alex Shi
2017-05-27thread_info: include <current.h> for THREAD_INFO_IN_TASKMark Rutland
When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info() macro relies on current having been defined prior to its use. However, not all users of current_thread_info() include <asm/current.h>, and thus current is not guaranteed to be defined. When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that get_current() / current are based upon current_thread_info(), and <asm/current.h> includes <asm/thread_info.h>. Thus always including <asm/current.h> would result in circular dependences on some platforms. To ensure both cases work, this patch includes <asm/current.h>, but only when CONFIG_THREAD_INFO_IN_TASK is selected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit dc3d2a679cd8631b8a570fc8ca5f4712d7d25698) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2017-05-27thread_info: factor out restart_blockMark Rutland
Since commit f56141e3e2d9aabf ("all arches, signal: move restart_block to struct task_struct"), thread_info and restart_block have been logically distinct, yet struct restart_block is still defined in <linux/thread_info.h>. At least one architecture (erroneously) uses restart_block as part of its thread_info, and thus the definition of restart_block must come before the include of <asm/thread_info>. Subsequent patches in this series need to shuffle the order of includes and definitions in <linux/thread_info.h>, and will make this ordering fragile. This patch moves the definition of restart_block out to its own header. This serves as generic cleanup, logically separating thread_info and restart_block, and also makes it easier to avoid fragility. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> (cherry picked from commit 53d74d056a4e306a72b8883d325b5d853c0618e6) Signed-off-by: Alex Shi <alex.shi@linaro.org>
2017-05-26 Merge tag 'v4.9.30' into linux-linaro-lsk-v4.9Alex Shi
This is the 4.9.30 stable release
2017-05-25tracing/kprobes: Enforce kprobes teardown after testingThomas Gleixner
commit 30e7d894c1478c88d50ce94ddcdbd7f9763d9cdd upstream. Enabling the tracer selftest triggers occasionally the warning in text_poke(), which warns when the to be modified page is not marked reserved. The reason is that the tracer selftest installs kprobes on functions marked __init for testing. These probes are removed after the tests, but that removal schedules the delayed kprobes_optimizer work, which will do the actual text poke. If the work is executed after the init text is freed, then the warning triggers. The bug can be reproduced reliably when the work delay is increased. Flush the optimizer work and wait for the optimizing/unoptimizing lists to become empty before returning from the kprobes tracer selftest. That ensures that all operations which were queued due to the probes removal have completed. Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 6274de498 ("kprobes: Support delayed unoptimizing") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25iio: hid-sensor: Store restore poll and hysteresis on S3Srinivas Pandruvada
commit 5d9854eaea776441b38a9a45b4e6879524c4f48c upstream. This change undo the change done by 'commit 3bec24747446 ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3")' as this breaks some USB/i2c sensor hubs. Instead of relying on HW for restoring poll and hysteresis, driver stores and restores on resume (S3). In this way user space modified settings are not lost for any kind of sensor hub behavior. In this change, whenever user space modifies sampling frequency or hysteresis driver will get the feature value from the hub and store in the per device hid_sensor_common data structure. On resume callback from S3, system will set the feature to sensor hub, if user space ever modified the feature value. Fixes: 3bec24747446 ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3") Reported-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Song, Hongyan <hongyan.song@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-23tee: generic TEE subsystemJens Wiklander
Initial patch for generic TEE subsystem. This subsystem provides: * Registration/un-registration of TEE drivers. * Shared memory between normal world and secure world. * Ioctl interface for interaction with user space. * Sysfs implementation_id of TEE driver A TEE (Trusted Execution Environment) driver is a driver that interfaces with a trusted OS running in some secure environment, for example, TrustZone on ARM cpus, or a separate secure co-processor etc. The TEE subsystem can serve a TEE driver for a Global Platform compliant TEE, but it's not limited to only Global Platform TEEs. This patch builds on other similar implementations trying to solve the same problem: * "optee_linuxdriver" by among others Jean-michel DELORME<jean-michel.delorme@st.com> and Emmanuel MICHEL <emmanuel.michel@st.com> * "Generic TrustZone Driver" by Javier González <javier@javigon.com> Acked-by: Andreas Dannenberg <dannenberg@ti.com> Tested-by: Jerome Forissier <jerome.forissier@linaro.org> (HiKey) Tested-by: Volodymyr Babchuk <vlad.babchuk@gmail.com> (RCAR H3) Tested-by: Scott Branden <scott.branden@broadcom.com> Reviewed-by: Javier González <javier@javigon.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> (cherry picked from commit 967c9cca2cc50569efc65945325c173cecba83bd) Signed-off-by: Alex Shi <alex.shi@linaro.org> Conflicts: remove fsi in drivers/Kconfig and drivers/Makefile
2017-05-14block: get rid of blk_integrity_revalidate()Ilya Dryomov
commit 19b7ccf8651df09d274671b53039c672a52ad84d upstream. Commit 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") introduced blk_integrity_revalidate(), which seems to assume ownership of the stable pages flag and unilaterally clears it if no blk_integrity profile is registered: if (bi->profile) disk->queue->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES; else disk->queue->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES; It's called from revalidate_disk() and rescan_partitions(), making it impossible to enable stable pages for drivers that support partitions and don't use blk_integrity: while the call in revalidate_disk() can be trivially worked around (see zram, which doesn't support partitions and hence gets away with zram_revalidate_disk()), rescan_partitions() can be triggered from userspace at any time. This breaks rbd, where the ceph messenger is responsible for generating/verifying CRCs. Since blk_integrity_{un,}register() "must" be used for (un)registering the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES setting there. This way drivers that call blk_integrity_register() and use integrity infrastructure won't interfere with drivers that don't but still want stable pages. Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> [idryomov@gmail.com: backport to < 4.11: bdi is embedded in queue] Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>