summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-02-25perf: Fix scaling vs. perf_event_enable_on_exec()Peter Zijlstra
The recent commit 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling") caused this by moving task_ctx_sched_out() from before __perf_event_mask_enable() to after it. The overlooked consequence of that change is that task_ctx_sched_out() would update the ctx time fields, and now __perf_event_mask_enable() uses stale time. In order to fix this, explicitly stop our context's time before enabling the event(s). Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: 3e349507d12d ("perf: Fix perf_enable_on_exec() event scheduling") Link: http://lkml.kernel.org/r/20160224174948.159242158@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Fix ctx time tracking by introducing EVENT_TIMEPeter Zijlstra
Currently any ctx_sched_in() call will re-start the ctx time tracking, this means that calls like: ctx_sched_in(.event_type = EVENT_PINNED); ctx_sched_in(.event_type = EVENT_FLEXIBLE); will have a hole in their ctx time tracking. This is likely harmless but can confuse things a little. By adding EVENT_TIME, we can have the first ctx_sched_in() (is_active: 0 -> !0) start the time and any further ctx_sched_in() will leave the timestamps alone. Secondly, this allows for an early disable like: ctx_sched_out(.event_type = EVENT_TIME); which would update the ctx time (if the ctx is active) and any further calls to ctx_sched_out() would not further modify the ctx time. For ctx_sched_in() any 0 -> !0 transition will automatically include EVENT_TIME. For ctx_sched_out(), any transition that clears EVENT_ALL will automatically clear EVENT_TIME. These two rules ensure that under normal circumstances we need not bother with EVENT_TIME and get natural ctx time behaviour. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.100446561@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Cure event->pending_disable racePeter Zijlstra
Because event_sched_out() checks event->pending_disable _before_ actually disabling the event, it can happen that the event fires after it checks but before it gets disabled. This would leave event->pending_disable set and the queued irq_work will try and process it. However, if the event trigger was during schedule(), the event might have been de-scheduled by the time the irq_work runs, and perf_event_disable_local() will fail. Fix this by checking event->pending_disable _after_ we call event->pmu->del(). This depends on the latter being a compiler barrier, such that the compiler does not lift the load and re-creates the problem. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Fix race between event install and jump_labelsPeter Zijlstra
perf_install_in_context() relies upon the context switch hooks to have scheduled in events when the IPI misses its target -- after all, if the task has moved from the CPU (or wasn't running at all), it will have to context switch to run elsewhere. This however doesn't appear to be happening. It is possible for the IPI to not happen (task wasn't running) only to later observe the task running with an inactive context. The only possible explanation is that the context switch hooks are not called. Therefore put in a sync_sched() after toggling the jump_label to guarantee all CPUs will have them enabled before we install an event. A simple if (0->1) sync_sched() will not in fact work, because any further increment can race and complete before the sync_sched(). Therefore we must jump through some hoops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.980211985@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Fix cloningPeter Zijlstra
Alexander reported that when the 'original' context gets destroyed, no new clones happen. This can happen irrespective of the ctx switch optimization, any task can die, even the parent, and we want to continue monitoring the task hierarchy until we either close the event or no tasks are left in the hierarchy. perf_event_init_context() will attempt to pin the 'parent' context during clone(). At that point current is the parent, and since current cannot have exited while executing clone(), its context cannot have passed through perf_event_exit_task_context(). Therefore perf_pin_task_context() cannot observe ctx->task == TASK_TOMBSTONE. However, since inherit_event() does: if (parent_event->parent) parent_event = parent_event->parent; it looks at the 'original' event when it does: is_orphaned_event(). This can return true if the context that contains the this event has passed through perf_event_exit_task_context(). And thus we'll fail to clone the perf context. Fix this by adding a new state: STATE_DEAD, which is set by perf_release() to indicate that the filedesc (or kernel reference) is dead and there are no observers for our data left. Only for STATE_DEAD will is_orphaned_event() be true and inhibit cloning. STATE_EXIT is otherwise preserved such that is_event_hup() remains functional and will report when the observed task hierarchy becomes empty. Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20160224174947.919845295@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Only update context time when activePeter Zijlstra
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.860690919@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Allow perf_release() with !event->ctxPeter Zijlstra
In the err_file: fput(event_file) case, the event will not yet have been attached to a context. However perf_release() does assume it has been. Cure this. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.793996260@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Do not double freePeter Zijlstra
In case of: err_file: fput(event_file), we'll end up calling perf_release() which in turn will free the event. Do not then free the event _again_. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-25perf: Close install vs. exit racePeter Zijlstra
Consider the following scenario: CPU0 CPU1 ctx = find_get_ctx(); perf_event_exit_task_context() mutex_lock(&ctx->mutex); perf_install_in_context(ctx, ...); /* NO-OP */ mutex_unlock(&ctx->mutex); ... perf_release() WARN_ON_ONCE(event->state != STATE_EXIT); Since the event doesn't pass through perf_remove_from_context() because perf_install_in_context() NO-OPs because the ctx is dead, and perf_event_exit_task_context() will not observe the event because its not attached yet, the event->state will not be set. Solve this by revalidating ctx->task after we acquire ctx->mutex and failing the event creation as a whole. Tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.626853419@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-02-24Merge tag 'arc-4.5-rc6-fixes-upd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - Fix for csd deadlock due to missing self IPI - Accompanying IPI cleanups / optimization - Brown paper bag bug in one of the cleanups above - Boot reporting updates for new hardware features - Don't force DEVTMPFS if INITRAMFS * tag 'arc-4.5-rc6-fixes-upd' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: SMP: CONFIG_ARC_IPI_DBG cleanup ARC: SMP: No need for CONFIG_ARC_IPI_DBG ARCv2: Elide sending new cross core intr if receiver didn't ack prev ARCv2: SMP: Push IPI_IRQ into IPI provider ARC: [intc-compact] Remove IPI setup from ARCompact port ARCv2: SMP: Emulate IPI to self using software triggered interrupt arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCE ARCv2: boot report CCMs (Closely Coupled Memories) ARCv2: boot print Low Latency Memory ARC: Assume multiplier is always present
2016-02-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes - xattr one from this cycle, the rest - stable fodder" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/pnode.c: treat zero mnt_group_id-s as unequal affs_do_readpage_ofs(): just use kmap_atomic() around memcpy() xattr handlers: plug a lock leak in simple_xattr_list fs: allow no_seek_end_llseek to actually seek
2016-02-24thp: call pmdp_invalidate() with correct virtual addressKirill A. Shutemov
Sebastian Ott and Gerald Schaefer reported random crashes on s390. It was bisected to my THP refcounting patchset. The problem is that pmdp_invalidated() called with wrong virtual address. It got offset up by HPAGE_PMD_SIZE by loop over ptes. The solution is to introduce new variable to be used in loop and don't touch 'haddr'. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-and-tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reported-and-tested-by Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-24arc: SMP: CONFIG_ARC_IPI_DBG cleanupValentin Rothberg
Previous Commit ("ARC: SMP: No need for CONFIG_ARC_IPI_DBG") removed the Kconfig option ARC_IPI_DBG. Remove the last reference on this option. Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24ARC: SMP: No need for CONFIG_ARC_IPI_DBGVineet Gupta
This was more relevant during SMP bringup. The warning for bogus msg better be visible always. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24ARCv2: Elide sending new cross core intr if receiver didn't ack prevVineet Gupta
ARConnect/MCIP IPI sending has a retry-wait loop in case caller had not seen a previous such interrupt. Turns out that it is not needed at all. Linux cross core calling allows coalescing multiple IPIs to same receiver - it is fine as long as there is one. This logic is built into upper layer already, at a higher level of abstraction. ipi_send_msg_one() sets the actual msg payload, but it only calls MCIP IPI sending if msg holder was empty (using atomic-set-new-and-get-old construct). Thus it is unlikely that the retry-wait looping was ever getting exercised at all. Cc: Chuck Jordan <cjordan@synopsys.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24ARCv2: SMP: Push IPI_IRQ into IPI providerVineet Gupta
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24ARC: [intc-compact] Remove IPI setup from ARCompact portVineet Gupta
There is no real ARC700 based SMP SoC so remove IPI definition. EZChip's SMP ARC700 is going to use a different intc and IPI provider anyways. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24ARCv2: SMP: Emulate IPI to self using software triggered interruptVineet Gupta
ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to local core. So use core intc capability to trigger software interrupt to self, using an unsued IRQ #21. This showed up as csd deadlock with LTP trace_sched on a dual core system. This test acts as scheduler fuzzer, triggering all sorts of schedulting activity. Trouble starts with IPI to self, which doesn't get delivered (effectively lost due to H/w capability), but the msg intended to be sent remain enqueued in per-cpu @ipi_data. All subsequent IPIs to this core from other cores get elided due to the IPI coalescing optimization in ipi_send_msg_one() where a pending msg implies an IPI already sent and assumes other core is yet to ack it. After the elided IPI, other core simply goes into csd_lock_wait() but never comes out as this core never sees the interrupt. Fixes STAR 9001008624 Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> [4.2] Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-24Merge tag 'dm-4.5-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Fix a 112 byte leak for each IO request that is requeued while DM multipath is handling faults due to path failures. This leak does not happen if blk-mq DM multipath is used. It only occurs if .request_fn DM multipath is stacked ontop of blk-mq paths (e.g. scsi-mq devices)" * tag 'dm-4.5-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq paths
2016-02-24Merge tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds
Pull MMC fix from Ulf Hansson: "Here's an mmc fix intended for v4.5 rc6. MMC host: - omap_hsmmc: Fix PM regression for deferred probe" * tag 'mmc-v4.5-rc4' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: omap_hsmmc: Fix PM regression with deferred probe for pm_runtime_reinit
2016-02-24Merge tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfixes from Trond Myklebust: "Stable bugfixes: - Fix nfs_size_to_loff_t - NFSv4: Fix a dentry leak on alias use Other bugfixes: - Don't schedule a layoutreturn if the layout segment can be freed immediately. - Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode - rpcrdma_bc_receive_call() should init rq_private_buf.len - fix stateid handling for the NFS v4.2 operations - pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page - fix panic in gss_pipe_downcall() in fips mode - Fix a race between layoutget and pnfs_destroy_layout - Fix a race between layoutget and bulk recalls" * tag 'nfs-for-4.5-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4.x/pnfs: Fix a race between layoutget and bulk recalls NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layout auth_gss: fix panic in gss_pipe_downcall() in fips mode pnfs/blocklayout: fix a memeory leak when using,vmalloc_to_page nfs4: fix stateid handling for the NFS v4.2 operations NFSv4: Fix a dentry leak on alias use xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.len pNFS: Always set NFS_LAYOUT_RETURN_REQUESTED with lo->plh_return_iomode pNFS: Fix pnfs_mark_matching_lsegs_return() nfs: fix nfs_size_to_loff_t
2016-02-24x86: fix SMAP in 32-bit environmentsLinus Torvalds
In commit 11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses") I changed how the stac/clac instructions were generated around the user space accesses, which then made it possible to do batched accesses efficiently for user string copies etc. However, in doing so, I completely spaced out, and didn't even think about the 32-bit case. And nobody really even seemed to notice, because SMAP doesn't even exist until modern Skylake processors, and you'd have to be crazy to run 32-bit kernels on a modern CPU. Which brings us to Andy Lutomirski. He actually tested the 32-bit kernel on new hardware, and noticed that it doesn't work. My bad. The trivial fix is to add the required uaccess begin/end markers around the raw accesses in <asm/uaccess_32.h>. I feel a bit bad about this patch, just because that header file really should be cleaned up to avoid all the duplicated code in it, and this commit just expands on the problem. But this just fixes the bug without any bigger cleanup surgery. Reported-and-tested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-23arc: get rid of DEVTMPFS dependency on INITRAMFS_SOURCEAlexey Brodkin
Even though DEVTMPFS is required when our pre-built initramfs is used it is not the case in general. It is perfectly possible to use initramfs with device nodes already populated or there could be other usages, see discussion below for more detials: http://thread.gmane.org/gmane.comp.embedded.openwrt.devel/37819/focus=37821 This change removes mentioned dependency from arch/arc/Kconfig updating instead those defconfigs that are usually used with this kind of pre-build initramfs. And while at it all touched defconfigs were regenerated via savedefconfig and some options were removed: * USB is selected by other options implicitly * VGA_CONSOLE is disableb for ARC since 031e29b5877f31676739dc2f847d04c2c0732034 * EXT3_FS automatically selects EXT4_FS * MTDxxx and JFFS2_FS make no sense for AXS because AXS NAND controller is not upstreamed * NET_OSCI_LAN is not in upstream as well * ARCPGU_xxx options make no sense because ARC PGU is not yet in upstream and when it gets there all config options would be taken from devicetree Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-02-22NFSv4.x/pnfs: Fix a race between layoutget and bulk recallsTrond Myklebust
Replace another case where the layout 'plh_block_lgets' can trigger infinite loops in send_layoutget(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-02-22NFSv4.x/pnfs: Fix a race between layoutget and pnfs_destroy_layoutTrond Myklebust
If the server reboots while there is a layoutget outstanding, then the call to pnfs_choose_layoutget_stateid() will fail with an EAGAIN error, which causes an infinite loop in send_layoutget(). The reason why we never break out of the loop is that the layout 'plh_block_lgets' field is never cleared. Fix is to replace plh_block_lgets with NFS_LAYOUT_INVALID_STID, which can be reset after a new layoutget. Fixes: ab7d763e477c5 ("pNFS: Ensure nfs4_layoutget_prepare returns...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-02-22Merge tag 'trace-fixes-v4.5-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two more small fixes. One is by Yang Shi who added a READ_ONCE_NOCHECK() to the scan of the stack made by the stack tracer. As the stack tracer scans the entire kernel stack, KASAN triggers seeing it as a "stack out of bounds" error. As the scan is looking at the contents of the stack from parent functions. The NOCHECK() tells KASAN that this is done on purpose, and is not some kind of stack overflow. The second fix is to the ftrace selftests, to retrieve the PID of executed commands from the shell with '$!' and not by parsing 'jobs'" * tag 'trace-fixes-v4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing, kasan: Silence Kasan warning in check_stack of stack_tracer ftracetest: Fix instance test to use proper shell command for pids
2016-02-22Merge tag 'for-linus-4.5-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Two scsiback fixes (resource leak and spurious warning). - Fix DMA mapping of compound pages on arm/arm64. - Fix some pciback regressions in MSI-X handling. - Fix a pcifront crash due to some uninitialize state. * tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. xen/pcifront: Report the errors better. xen/pciback: Save the number of MSI-X entries to be copied later. xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY xen: fix potential integer overflow in queue_reply xen/arm: correctly handle DMA mapping of compound pages xen/scsiback: avoid warnings when adding multiple LUNs to a domain xen/scsiback: correct frontend counting
2016-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Looks like a lot, but mostly driver fixes scattered all over as usual. Of note: 1) Add conditional sched in nf conntrack in cleanup to avoid NMI watchdogs. From Florian Westphal. 2) Fix deadlock in nfnetlink cttimeout, also from Floarian. 3) Fix handling of slaves in bonding ARP monitor validation, from Jay Vosburgh. 4) Callers of ip_cmsg_send() are responsible for freeing IP options, some were not doing so. Fix from Eric Dumazet. 5) Fix per-cpu bugs in mvneta driver, from Gregory CLEMENT. 6) Fix vlan handling in mv88e6xxx DSA driver, from Vivien Didelot. 7) bcm7xxx PHY driver bug fixes from Florian Fainelli. 8) Avoid unaligned accesses to protocol headers wrt. GRE, from Alexander Duyck. 9) SKB leaks and other problems in arc_emac driver, from Alexander Kochetkov. 10) tcp_v4_inbound_md5_hash() releases listener socket instead of request socket on error path, oops. Fix from Eric Dumazet. 11) Missing socket release in pppoe_rcv_core() that seems to have existed basically forever. From Guillaume Nault. 12) Missing slave_dev unregister in dsa_slave_create() error path, from Florian Fainelli. 13) crypto_alloc_hash() never returns NULL, fix return value check in __tcp_alloc_md5sig_pool. From Insu Yun. 14) Properly expire exception route entries in ipv4, from Xin Long. 15) Fix races in tcp/dccp listener socket dismantle, from Eric Dumazet. 16) Don't set IFF_TX_SKB_SHARING in vxlan, geneve, or GRE, it's not legal. These drivers modify the SKB on transmit. From Jiri Benc. 17) Fix regression in the initialziation of netdev->tx_queue_len. From Phil Sutter. 18) Missing unlock in tipc_nl_add_bc_link() error path, from Insu Yun. 19) SCTP port hash sizing does not properly ensure that table is a power of two in size. From Neil Horman. 20) Fix initializing of software copy of MAC address in fmvj18x_cs driver, from Ken Kawasaki" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (129 commits) bnx2x: Fix 84833 phy command handler bnx2x: Fix led setting for 84858 phy. bnx2x: Correct 84858 PHY fw version bnx2x: Fix 84833 RX CRC bnx2x: Fix link-forcing for KR2 net: ethernet: davicom: fix devicetree irq resource fmvj18x_cs: fix incorrect indexing of dev->dev_addr[] when copying the MAC address Driver: Vmxnet3: Update Rx ring 2 max size net: netcp: rework the code for get/set sw_data in dma desc soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_data net: ti: netcp: restore get/set_pad_info() functionality MAINTAINERS: Drop myself as xen netback maintainer sctp: Fix port hash table size computation can: ems_usb: Fix possible tx overflow Bluetooth: hci_core: Avoid mixing up req_complete and req_complete_skb net: bcmgenet: Fix internal PHY link state af_unix: Don't use continue to re-execute unix_stream_read_generic loop unix_diag: fix incorrect sign extension in unix_lookup_by_ino bnxt_en: Failure to update PHY is not fatal condition. bnxt_en: Remove unnecessary call to update PHY settings. ...
2016-02-22Merge tag 'hwmon-for-linus-v4.5-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Two fixes headed for stable: - Remove an unnecessary speed_index lookup for thermal hook in the gpio-fan driver. The unnecessary speed lookup can hog the system. - Handle negative conversion values correctly in the ads1015 driver" * tag 'hwmon-for-linus-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook hwmon: (ads1015) Handle negative conversion values correctly
2016-02-22Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "One ocrdma fix: - The new CQ API support was added to ocrdma, but they got the arming logic wrong, so without this, transfers eventually fail when they fail to arm the interrupt properly under load Two related fixes for mlx4: - When we added the 64bit extended counters support to the core IB code, they forgot to update the RoCE side of the mlx4 driver (the IB side they properly updated). I debated whether or not to include these patches as they could be considered feature enablement patches, but the existing code will blindy copy the 32bit counters, whether any counters were requested at all (a bug). These two patches make it (a) check to see that counters were requested and (b) copy the right counters (the 64bit support is new, the 32bit is not). For that reason I went ahead and took them" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/mlx4: Add support for the port info class for RoCE ports IB/mlx4: Add support for extended counters over RoCE ports RDMA/ocrdma: Fix arm logic to align with new cq API
2016-02-22Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some bugfixes from I2C for you: A fix for a RuntimePM regression with OMAP, a fix to enable TCO for Lewisburg platforms, and a typo fix while we are here" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i801: Adding Intel Lewisburg support for iTCO i2c: uniphier: fix typos in error messages i2c: omap: Fix PM regression with deferred probe for pm_runtime_reinit
2016-02-22Merge tag 'linux-can-fixes-for-4.5-20160221' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-02-21 this is a pull reqeust of one patch for net/master. The patch is by Gerhard Uttenthaler and fixes a potential tx overflow in the ems_usb driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22Merge branch 'bnx2x-848xx-phy-fixes'David S. Miller
Yuval Mintz says: ==================== bnx2x: Fix 848xx phys This series contains link-related fixes, mostly for the 848xx phys [2 patches are for 84833, and 2 patches are for 84858]. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22bnx2x: Fix 84833 phy command handlerYuval Mintz
Current initialization sequence is lacking, causing some configurations to fail. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22bnx2x: Fix led setting for 84858 phy.Yuval Mintz
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22bnx2x: Correct 84858 PHY fw versionYuval Mintz
The phy's firmware version isn't being parsed properly as it's currently parsed like the rest of the 848xx phys. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22bnx2x: Fix 84833 RX CRCYuval Mintz
There's a problem in current 84833 phy configuration - in case 1Gb link is configured and jumbo-sized packets are being used, device will experience RX crc errors. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22bnx2x: Fix link-forcing for KR2Yuval Mintz
Currently, when link is using KR2 it cannot be forced to any speed other than 20g. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.om> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2016-02-20 Here's an important patch for 4.5 which fixes potential invalid pointer access when processing completed Bluetooth HCI commands. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22net: ethernet: davicom: fix devicetree irq resourceRobert Jarzmik
The dm9000 driver doesn't work in at least one device-tree configuration, spitting an error message on irq resource : [    1.062495] dm9000 8000000.ethernet: insufficient resources [    1.068439] dm9000 8000000.ethernet: not found (-2). [    1.073451] dm9000: probe of 8000000.ethernet failed with error -2 The reason behind is that the interrupt might be provided by a gpio controller, not probed when dm9000 is probed, and needing the probe deferral mechanism to apply. Currently, the interrupt is directly taken from resources. This patch changes this to use the more generic platform_get_irq(), which handles the deferral. Moreover, since commit Fixes: 7085a7401ba5 ("drivers: platform: parse IRQ flags from resources"), the interrupt trigger flags are honored in platform_get_irq(), so remove the needless code in dm9000. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Acked-by: Marcel Ziswiler <marcel@ziswiler.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Tested-by: Sergei Ianovich <ynvich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22fmvj18x_cs: fix incorrect indexing of dev->dev_addr[] when copying the MAC ↵Ken Kawasaki
address fix incorrect indexing of dev->dev_addr[] when copying the MAC address of FMV-J182 at buf[5]. Signed-off-by: Ken Kawasaki <ken_kawasaki@nifty.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22Driver: Vmxnet3: Update Rx ring 2 max sizeShrikrishna Khare
Device emulation supports max size of 4096. Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22Merge branch 'netcp-fixes'David S. Miller
Murali Karicheri says: ==================== net: ti: netcp: restore get/set_pad_info() functionality This series fixes a regression and add some improvements for the ease of maintainance. Incorporated comments against v1. Changelogs: v2 : combined 2-3 into one patch as this involves a header change fixed a parse warning in 3/4 per comment from Arnd. Removed Sign-off from Arnd against 1/4 added comments in 3/3 to alert on the usage of sw data per review comments v1 : added 2-4 to accomodate feedback received from review v0 : initial version to fix the regression (From Grygorii) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22net: netcp: rework the code for get/set sw_data in dma descKaricheri, Muralidharan
SW data field in descriptor can be used by software to hold private data for the driver. As there are 4 words available for this purpose, use separate macros to place it or retrieve the same to/from descriptors. Also do type cast of data types accordingly. Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: Arnd Bergmann <arnd@arndb.de> CC: Grygorii Strashko <grygorii.strashko@ti.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22soc: ti: knav_dma: rename pad in struct knav_dma_desc to sw_dataKaricheri, Muralidharan
Rename the pad to sw_data as per description of this field in the hardware spec(refer sprugr9 from www.ti.com). Latest version of the document is at http://www.ti.com/lit/ug/sprugr9h/sprugr9h.pdf and section 3.1 Host Packet Descriptor describes this field. Define and use a constant for the size of sw_data field similar to other fields in the struct for desc and document the sw_data field in the header. As the sw_data is not touched by hw, it's type can be changed to u32. Rename the helpers to match with the updated dma desc field sw_data. Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: Arnd Bergmann <arnd@arndb.de> CC: Grygorii Strashko <grygorii.strashko@ti.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22net: ti: netcp: restore get/set_pad_info() functionalityKaricheri, Muralidharan
The commit 899077791403 ("netcp: try to reduce type confusion in descriptors") introduces a regression in Kernel 4.5-rc1 and it breaks get/set_pad_info() functionality. The TI NETCP driver uses pad0 and pad1 fields of knav_dma_desc to store DMA/MEM buffer pointer and buffer size respectively. And in both cases for Keystone 2 the pointer type size is 32 bit regardless of LAPE enabled or not, because CONFIG_ARCH_DMA_ADDR_T_64BIT originally is not expected to be defined. Unfortunately, above commit changed buffer's pointers save/restore code (get/set_pad_info()) and added intermediate conversation to u64 which works incorrectly on 32bit Keystone 2 and causes TI NETCP driver crash in RX/TX path due to "Unable to handle kernel NULL pointer" exception. This issue was reported and discussed in [1]. Hence, fix it by partially reverting above commit and restoring get/set_pad_info() functionality as it was before. [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg95361.html Cc: Wingman Kwok <w-kwok2@ti.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> CC: David Laight <David.Laight@ACULAB.COM> CC: Arnd Bergmann <arnd@arndb.de> Reported-by: Franklin S Cooper Jr <fcooper@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22MAINTAINERS: Drop myself as xen netback maintainerIan Campbell
Wei has been picking this up for quite a while now. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22sctp: Fix port hash table size computationNeil Horman
Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in its size computation, observing that the current method never guaranteed that the hashsize (measured in number of entries) would be a power of two, which the input hash function for that table requires. The root cause of the problem is that two values need to be computed (one, the allocation order of the storage requries, as passed to __get_free_pages, and two the number of entries for the hash table). Both need to be ^2, but for different reasons, and the existing code is simply computing one order value, and using it as the basis for both, which is wrong (i.e. it assumes that ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still ^2 when its not). To fix this, we change the logic slightly. We start by computing a goal allocation order (which is limited by the maximum size hash table we want to support. Then we attempt to allocate that size table, decreasing the order until a successful allocation is made. Then, with the resultant successful order we compute the number of buckets that hash table supports, which we then round down to the nearest power of two, giving us the number of entries the table actually supports. I've tested this locally here, using non-debug and spinlock-debug kernels, and the number of entries in the hashtable consistently work out to be powers of two in all cases. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> CC: Dmitry Vyukov <dvyukov@google.com> CC: Vladislav Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-22dm: fix dm_rq_target_io leak on faults with .request_fn DM w/ blk-mq pathsMike Snitzer
Using request-based DM mpath configured with the following stacking (.request_fn DM mpath ontop of scsi-mq paths): echo Y > /sys/module/scsi_mod/parameters/use_blk_mq echo N > /sys/module/dm_mod/parameters/use_blk_mq 'struct dm_rq_target_io' would leak if a request is requeued before a blk-mq clone is allocated (or fails to allocate). free_rq_tio() wasn't being called. kmemleak reported: unreferenced object 0xffff8800b90b98c0 (size 112): comm "kworker/7:1H", pid 5692, jiffies 4295056109 (age 78.589s) hex dump (first 32 bytes): 00 d0 5c 2c 03 88 ff ff 40 00 bf 01 00 c9 ff ff ..\,....@....... e0 d9 b1 34 00 88 ff ff 00 00 00 00 00 00 00 00 ...4............ backtrace: [<ffffffff81672b6e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811dbb63>] kmem_cache_alloc+0xc3/0x1e0 [<ffffffff8117eae5>] mempool_alloc_slab+0x15/0x20 [<ffffffff8117ec1e>] mempool_alloc+0x6e/0x170 [<ffffffffa00029ac>] dm_old_prep_fn+0x3c/0x180 [dm_mod] [<ffffffff812fbd78>] blk_peek_request+0x168/0x290 [<ffffffffa0003e62>] dm_request_fn+0xb2/0x1b0 [dm_mod] [<ffffffff812f66e3>] __blk_run_queue+0x33/0x40 [<ffffffff812f9585>] blk_delay_work+0x25/0x40 [<ffffffff81096fff>] process_one_work+0x14f/0x3d0 [<ffffffff81097715>] worker_thread+0x125/0x4b0 [<ffffffff8109ce88>] kthread+0xd8/0xf0 [<ffffffff8167cb8f>] ret_from_fork+0x3f/0x70 [<ffffffffffffffff>] 0xffffffffffffffff crash> struct -o dm_rq_target_io struct dm_rq_target_io { ... } SIZE: 112 Fixes: e5863d9ad7 ("dm: allocate requests in target when stacking on blk-mq devices") Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2016-02-21can: ems_usb: Fix possible tx overflowGerhard Uttenthaler
This patch fixes the problem that more CAN messages could be sent to the interface as could be send on the CAN bus. This was more likely for slow baud rates. The sleeping _start_xmit was woken up in the _write_bulk_callback. Under heavy TX load this produced another bulk transfer without checking the free_slots variable and hence caused the overflow in the interface. Signed-off-by: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>