From f48cfddc6729ef133933062320039808bafa6f45 Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Fri, 8 Nov 2013 16:31:29 -0800 Subject: vfs: In d_path don't call d_dname on a mount point Aditya Kali (adityakali@google.com) wrote: > Commit bf056bfa80596a5d14b26b17276a56a0dcb080e5: > "proc: Fix the namespace inode permission checks." converted > the namespace files into symlinks. The same commit changed > the way namespace bind mounts appear in /proc/mounts: > $ mount --bind /proc/self/ns/ipc /mnt/ipc > Originally: > $ cat /proc/mounts | grep ipc > proc /mnt/ipc proc rw,nosuid,nodev,noexec 0 0 > > After commit bf056bfa80596a5d14b26b17276a56a0dcb080e5: > $ cat /proc/mounts | grep ipc > proc ipc:[4026531839] proc rw,nosuid,nodev,noexec 0 0 > > This breaks userspace which expects the 2nd field in > /proc/mounts to be a valid path. The symlink /proc//ns/{ipc,mnt,net,pid,user,uts} point to dentries allocated with d_alloc_pseudo that we can mount, and that have interesting names printed out with d_dname. When these files are bind mounted /proc/mounts is not currently displaying the mount point correctly because d_dname is called instead of just displaying the path where the file is mounted. Solve this by adding an explicit check to distinguish mounted pseudo inodes and unmounted pseudo inodes. Unmounted pseudo inodes always use mount of their filesstem as the mnt_root in their path making these two cases easy to distinguish. CC: stable@vger.kernel.org Acked-by: Serge Hallyn Reported-by: Aditya Kali Signed-off-by: "Eric W. Biederman" diff --git a/fs/dcache.c b/fs/dcache.c index 4bdb300..f7282eb 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -3061,8 +3061,13 @@ char *d_path(const struct path *path, char *buf, int buflen) * thus don't need to be hashed. They also don't need a name until a * user wants to identify the object in /proc/pid/fd/. The little hack * below allows us to generate a name for these objects on demand: + * + * Some pseudo inodes are mountable. When they are mounted + * path->dentry == path->mnt->mnt_root. In that case don't call d_dname + * and instead have d_path return the mounted path. */ - if (path->dentry->d_op && path->dentry->d_op->d_dname) + if (path->dentry->d_op && path->dentry->d_op->d_dname && + (!IS_ROOT(path->dentry) || path->dentry != path->mnt->mnt_root)) return path->dentry->d_op->d_dname(path->dentry, buf, buflen); rcu_read_lock(); -- cgit v0.10.2 From 1f7f4dde5c945f41a7abc2285be43d918029ecc5 Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Thu, 14 Nov 2013 21:10:16 -0800 Subject: fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) Serge Hallyn writes: > Hi Oleg, > > commit 40a0d32d1eaffe6aac7324ca92604b6b3977eb0e : > "fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks" > breaks lxc-attach in 3.12. That code forks a child which does > setns() and then does a clone(CLONE_PARENT). That way the > grandchild can be in the right namespaces (which the child was > not) and be a child of the original task, which is the monitor. > > lxc-attach in 3.11 was working fine with no side effects that I > could see. Is there a real danger in allowing CLONE_PARENT > when current->nsproxy->pidns_for_children is not our pidns, > or was this done out of an "over-abundance of caution"? Can we > safely revert that new extra check? The two fundamental things I know we can not allow are: - A shared signal queue aka CLONE_THREAD. Because we compute the pid and uid of the signal when we place it in the queue. - Changing the pid and by extention pid_namespace of an existing process. From a parents perspective there is nothing special about the pid namespace, to deny CLONE_PARENT, because the parent simply won't know or care. From the childs perspective all that is special really are shared signal queues. User mode threading with CLONE_PARENT|CLONE_VM|CLONE_SIGHAND and tasks in different pid namespaces is almost certainly going to break because it is complicated. But shared signal handlers can look at per thread information to know which pid namespace a process is in, so I don't know of any reason not to support CLONE_PARENT|CLONE_VM|CLONE_SIGHAND threads at the kernel level. It would be absolutely stupid to implement but that is a different thing. So hmm. Because it can do no harm, and because it is a regression let's remove the CLONE_PARENT check and send it stable. Cc: stable@vger.kernel.org Acked-by: Oleg Nesterov Acked-by: Andy Lutomirski Acked-by: Serge E. Hallyn Signed-off-by: "Eric W. Biederman" diff --git a/kernel/fork.c b/kernel/fork.c index 728d5be..f82fa2e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1171,7 +1171,7 @@ static struct task_struct *copy_process(unsigned long clone_flags, * do not allow it to share a thread group or signal handlers or * parent with the forking task. */ - if (clone_flags & (CLONE_SIGHAND | CLONE_PARENT)) { + if (clone_flags & CLONE_SIGHAND) { if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) || (task_active_pid_ns(current) != current->nsproxy->pid_ns_for_children)) -- cgit v0.10.2 From 41301ae78a99ead04ea42672a1ab72c6f44cc81d Mon Sep 17 00:00:00 2001 From: "Eric W. Biederman" Date: Thu, 14 Nov 2013 21:22:25 -0800 Subject: vfs: Fix a regression in mounting proc Gao feng reported that commit e51db73532955dc5eaba4235e62b74b460709d5b userns: Better restrictions on when proc and sysfs can be mounted caused a regression on mounting a new instance of proc in a mount namespace created with user namespace privileges, when binfmt_misc is mounted on /proc/sys/fs/binfmt_misc. This is an unintended regression caused by the absolutely bogus empty directory check in fs_fully_visible. The check fs_fully_visible replaced didn't even bother to attempt to verify proc was fully visible and hiding proc files with any kind of mount is rare. So for now fix the userspace regression by allowing directory with nlink == 1 as /proc/sys/fs/binfmt_misc has. I will have a better patch but it is not stable material, or last minute kernel material. So it will have to wait. Cc: stable@vger.kernel.org Acked-by: Serge Hallyn Acked-by: Gao feng Tested-by: Gao feng Signed-off-by: "Eric W. Biederman" diff --git a/fs/namespace.c b/fs/namespace.c index ac2ce8a..be32ebc 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2886,7 +2886,7 @@ bool fs_fully_visible(struct file_system_type *type) struct inode *inode = child->mnt_mountpoint->d_inode; if (!S_ISDIR(inode->i_mode)) goto next; - if (inode->i_nlink != 2) + if (inode->i_nlink > 2) goto next; } visible = true; -- cgit v0.10.2 From f9b0e058cbd04ada76b13afffa7e1df830543c24 Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Sat, 14 Dec 2013 04:21:26 +0800 Subject: writeback: Fix data corruption on NFS Commit 4f8ad655dbc8 "writeback: Refactor writeback_single_inode()" added a condition to skip clean inode. However this is wrong in WB_SYNC_ALL mode because there we also want to wait for outstanding writeback on possibly clean inode. This was causing occasional data corruption issues on NFS because it uses sync_inode() to make sure all outstanding writes are flushed to the server before truncating the inode and with sync_inode() returning prematurely file was sometimes extended back by an outstanding write after it was truncated. So modify the test to also check for pages under writeback in WB_SYNC_ALL mode. CC: stable@vger.kernel.org # >= 3.5 Fixes: 4f8ad655dbc82cf05d2edc11e66b78a42d38bf93 Reported-and-tested-by: Dan Duval Signed-off-by: Jan Kara diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 1f4a10e..e0259a1 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -516,13 +516,16 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb, } WARN_ON(inode->i_state & I_SYNC); /* - * Skip inode if it is clean. We don't want to mess with writeback - * lists in this function since flusher thread may be doing for example - * sync in parallel and if we move the inode, it could get skipped. So - * here we make sure inode is on some writeback list and leave it there - * unless we have completely cleaned the inode. + * Skip inode if it is clean and we have no outstanding writeback in + * WB_SYNC_ALL mode. We don't want to mess with writeback lists in this + * function since flusher thread may be doing for example sync in + * parallel and if we move the inode, it could get skipped. So here we + * make sure inode is on some writeback list and leave it there unless + * we have completely cleaned the inode. */ - if (!(inode->i_state & I_DIRTY)) + if (!(inode->i_state & I_DIRTY) && + (wbc->sync_mode != WB_SYNC_ALL || + !mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK))) goto out; inode->i_state |= I_SYNC; spin_unlock(&inode->i_lock); -- cgit v0.10.2 From c1dcc927dae01dfd4904ee82ce2c00b50eab6dc3 Mon Sep 17 00:00:00 2001 From: Soren Brinkmann Date: Tue, 26 Nov 2013 17:04:50 -0800 Subject: clocksource: cadence_ttc: Fix mutex taken inside interrupt context When the kernel is compiled with: CONFIG_HIGH_RES_TIMERS=no CONFIG_HZ_PERIODIC=yes CONFIG_DEBUG_ATOMIC_SLEEP=yes The following WARN appears: WARNING: CPU: 1 PID: 0 at linux/kernel/mutex.c:856 mutex_trylock+0x70/0x1fc() DEBUG_LOCKS_WARN_ON(in_interrupt()) Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.12.0-xilinx-dirty #93 [] (unwind_backtrace+0x0/0x11c) from [] (show_stack+0x10/0x14) [] (show_stack+0x10/0x14) from [] (dump_stack+0x7c/0xc0) [] (dump_stack+0x7c/0xc0) from [] (warn_slowpath_common+0x60/0x84) [] (warn_slowpath_common+0x60/0x84) from [] (warn_slowpath_fmt+0x2c/0x3c) [] (warn_slowpath_fmt+0x2c/0x3c) from [] (mutex_trylock+0x70/0x1fc) [] (mutex_trylock+0x70/0x1fc) from [] (clk_prepare_lock+0xc/0xe4) [] (clk_prepare_lock+0xc/0xe4) from [] (clk_get_rate+0xc/0x44) [] (clk_get_rate+0xc/0x44) from [] (ttc_set_mode+0x34/0x78) [] (ttc_set_mode+0x34/0x78) from [] (clockevents_set_mode+0x28/0x5c) [] (clockevents_set_mode+0x28/0x5c) from [] (tick_broadcast_on_off+0x190/0x1c0) [] (tick_broadcast_on_off+0x190/0x1c0) from [] (clockevents_notify+0x58/0x1ac) [] (clockevents_notify+0x58/0x1ac) from [] (cpuidle_setup_broadcast_timer+0x20/0x24) [] (cpuidle_setup_broadcast_timer+0x20/0x24) from [] (generic_smp_call_function_single_interrupt+0) [] (generic_smp_call_function_single_interrupt+0xe0/0x130) from [] (handle_IPI+0x88/0x118) [] (handle_IPI+0x88/0x118) from [] (gic_handle_irq+0x58/0x60) [] (gic_handle_irq+0x58/0x60) from [] (__irq_svc+0x44/0x78) Exception stack(0xef099fa0 to 0xef099fe8) 9fa0: 00000001 ef092100 00000000 ef092100 ef098000 00000015 c0399f2c c0579d74 9fc0: 0000406a 413fc090 00000000 00000000 00000000 ef099fe8 c00666ec c000f46c 9fe0: 20000113 ffffffff [] (__irq_svc+0x44/0x78) from [] (arch_cpu_idle+0x34/0x3c) [] (arch_cpu_idle+0x34/0x3c) from [] (cpu_startup_entry+0xa8/0x10c) [] (cpu_startup_entry+0xa8/0x10c) from [<000085a4>] (0x85a4) We are in an interrupt context (IPI) and we are calling clk_get_rate in the set_mode function which in turn ends up by getting a mutex... Even if that does not hang, it is a potential kernel deadlock. It is not allowed to call clk_get_rate() from interrupt context. To avoid such calls the timer input frequency is stored in the driver's data struct which makes it accessible to the driver in any context. [dlezcano] completed the changelog with the WARN trace and added a more detailed description. Tested on zync zc702. Acked-by: Daniel Lezcano Tested-by: Daniel Lezcano Signed-off-by: Soren Brinkmann Signed-off-by: Daniel Lezcano diff --git a/drivers/clocksource/cadence_ttc_timer.c b/drivers/clocksource/cadence_ttc_timer.c index b2bb3a4b..a92350b 100644 --- a/drivers/clocksource/cadence_ttc_timer.c +++ b/drivers/clocksource/cadence_ttc_timer.c @@ -67,11 +67,13 @@ * struct ttc_timer - This definition defines local timer structure * * @base_addr: Base address of timer + * @freq: Timer input clock frequency * @clk: Associated clock source * @clk_rate_change_nb Notifier block for clock rate changes */ struct ttc_timer { void __iomem *base_addr; + unsigned long freq; struct clk *clk; struct notifier_block clk_rate_change_nb; }; @@ -196,9 +198,8 @@ static void ttc_set_mode(enum clock_event_mode mode, switch (mode) { case CLOCK_EVT_MODE_PERIODIC: - ttc_set_interval(timer, - DIV_ROUND_CLOSEST(clk_get_rate(ttce->ttc.clk), - PRESCALE * HZ)); + ttc_set_interval(timer, DIV_ROUND_CLOSEST(ttce->ttc.freq, + PRESCALE * HZ)); break; case CLOCK_EVT_MODE_ONESHOT: case CLOCK_EVT_MODE_UNUSED: @@ -273,6 +274,8 @@ static void __init ttc_setup_clocksource(struct clk *clk, void __iomem *base) return; } + ttccs->ttc.freq = clk_get_rate(ttccs->ttc.clk); + ttccs->ttc.clk_rate_change_nb.notifier_call = ttc_rate_change_clocksource_cb; ttccs->ttc.clk_rate_change_nb.next = NULL; @@ -298,16 +301,14 @@ static void __init ttc_setup_clocksource(struct clk *clk, void __iomem *base) __raw_writel(CNT_CNTRL_RESET, ttccs->ttc.base_addr + TTC_CNT_CNTRL_OFFSET); - err = clocksource_register_hz(&ttccs->cs, - clk_get_rate(ttccs->ttc.clk) / PRESCALE); + err = clocksource_register_hz(&ttccs->cs, ttccs->ttc.freq / PRESCALE); if (WARN_ON(err)) { kfree(ttccs); return; } ttc_sched_clock_val_reg = base + TTC_COUNT_VAL_OFFSET; - setup_sched_clock(ttc_sched_clock_read, 16, - clk_get_rate(ttccs->ttc.clk) / PRESCALE); + setup_sched_clock(ttc_sched_clock_read, 16, ttccs->ttc.freq / PRESCALE); } static int ttc_rate_change_clockevent_cb(struct notifier_block *nb, @@ -334,6 +335,9 @@ static int ttc_rate_change_clockevent_cb(struct notifier_block *nb, ndata->new_rate / PRESCALE); local_irq_restore(flags); + /* update cached frequency */ + ttc->freq = ndata->new_rate; + /* fall through */ } case PRE_RATE_CHANGE: @@ -367,6 +371,7 @@ static void __init ttc_setup_clockevent(struct clk *clk, if (clk_notifier_register(ttcce->ttc.clk, &ttcce->ttc.clk_rate_change_nb)) pr_warn("Unable to register clock notifier.\n"); + ttcce->ttc.freq = clk_get_rate(ttcce->ttc.clk); ttcce->ttc.base_addr = base; ttcce->ce.name = "ttc_clockevent"; @@ -396,7 +401,7 @@ static void __init ttc_setup_clockevent(struct clk *clk, } clockevents_config_and_register(&ttcce->ce, - clk_get_rate(ttcce->ttc.clk) / PRESCALE, 1, 0xfffe); + ttcce->ttc.freq / PRESCALE, 1, 0xfffe); } /** -- cgit v0.10.2 From 6bcac805bacb3b637911244d95368512ae389abc Mon Sep 17 00:00:00 2001 From: Russell King Date: Tue, 7 Jan 2014 17:53:54 +0000 Subject: Revert "ARM: 7908/1: mm: Fix the arm_dma_limit calculation" This reverts commit 787b0d5c1ca7ff24feb6f92e4c7f4410ee7d81a8 since it is no longer required after 7909/1 was applied, and it causes build regressions when ARM_PATCH_PHYS_VIRT is disabled and DMA_ZONE is enabled. Signed-off-by: Russell King diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 1f7b19a..3e8f106e 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -229,7 +229,7 @@ void __init setup_dma_zone(const struct machine_desc *mdesc) #ifdef CONFIG_ZONE_DMA if (mdesc->dma_zone_size) { arm_dma_zone_size = mdesc->dma_zone_size; - arm_dma_limit = __pv_phys_offset + arm_dma_zone_size - 1; + arm_dma_limit = PHYS_OFFSET + arm_dma_zone_size - 1; } else arm_dma_limit = 0xffffffff; arm_dma_pfn_limit = arm_dma_limit >> PAGE_SHIFT; -- cgit v0.10.2 From 0882dae983707455e97479e5e904e37673517ebc Mon Sep 17 00:00:00 2001 From: Paulo Zanoni Date: Wed, 8 Jan 2014 11:12:27 -0200 Subject: drm/i915: fix DDI PLLs HW state readout code Properly zero the refcounts and crtc->ddi_pll_set so the previous HW state doesn't affect the result of reading the current HW state. This fixes WARNs about WRPLL refcount if we have an HDMI monitor on HSW and then suspend/resume. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64379 Tested-by: Qingshuai Tian Signed-off-by: Paulo Zanoni Signed-off-by: Daniel Vetter diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c index 526c8de..b69dc3e 100644 --- a/drivers/gpu/drm/i915/intel_ddi.c +++ b/drivers/gpu/drm/i915/intel_ddi.c @@ -1057,12 +1057,18 @@ void intel_ddi_setup_hw_pll_state(struct drm_device *dev) enum pipe pipe; struct intel_crtc *intel_crtc; + dev_priv->ddi_plls.spll_refcount = 0; + dev_priv->ddi_plls.wrpll1_refcount = 0; + dev_priv->ddi_plls.wrpll2_refcount = 0; + for_each_pipe(pipe) { intel_crtc = to_intel_crtc(dev_priv->pipe_to_crtc_mapping[pipe]); - if (!intel_crtc->active) + if (!intel_crtc->active) { + intel_crtc->ddi_pll_sel = PORT_CLK_SEL_NONE; continue; + } intel_crtc->ddi_pll_sel = intel_ddi_get_crtc_pll(dev_priv, pipe); -- cgit v0.10.2 From 1739f09e33d8f66bf48ddbc3eca615574da6c4f6 Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Wed, 13 Nov 2013 15:20:04 -0500 Subject: ftrace/x86: Load ftrace_ops in parameter not the variable holding it Function tracing callbacks expect to have the ftrace_ops that registered it passed to them, not the address of the variable that holds the ftrace_ops that registered it. Use a mov instead of a lea to store the ftrace_ops into the parameter of the function tracing callback. Signed-off-by: Steven Rostedt Reviewed-by: Masami Hiramatsu Link: http://lkml.kernel.org/r/20131113152004.459787f9@gandalf.local.home Signed-off-by: H. Peter Anvin Cc: # v3.8+ diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index 51e2988..a2a4f46 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -1082,7 +1082,7 @@ ENTRY(ftrace_caller) pushl $0 /* Pass NULL as regs pointer */ movl 4*4(%esp), %eax movl 0x4(%ebp), %edx - leal function_trace_op, %ecx + movl function_trace_op, %ecx subl $MCOUNT_INSN_SIZE, %eax .globl ftrace_call @@ -1140,7 +1140,7 @@ ENTRY(ftrace_regs_caller) movl 12*4(%esp), %eax /* Load ip (1st parameter) */ subl $MCOUNT_INSN_SIZE, %eax /* Adjust ip */ movl 0x4(%ebp), %edx /* Load parent ip (2nd parameter) */ - leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */ + movl function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */ pushl %esp /* Save pt_regs as 4th parameter */ GLOBAL(ftrace_regs_call) diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index e21b078..1e96c36 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -88,7 +88,7 @@ END(function_hook) MCOUNT_SAVE_FRAME \skip /* Load the ftrace_ops into the 3rd parameter */ - leaq function_trace_op, %rdx + movq function_trace_op(%rip), %rdx /* Load ip into the first parameter */ movq RIP(%rsp), %rdi -- cgit v0.10.2 From 7ad228b11ec26a820291c9f5a1168d6176580dc1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= Date: Tue, 7 Jan 2014 16:15:36 +0200 Subject: drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When the pipe A force quirk is applied the code will attempt to grab a crtc mutex during intel_modeset_setup_hw_state(). If we're already holding all crtc mutexes this will obviously deadlock every time. So instead of using drm_modeset_lock_all() just grab the mode_config.mutex. This is enough to avoid the unlocked mutex warnings from certain lower level functions. The regression was introduced in: commit 027476642811f8559cbe00ef6cc54db230e48a20 Author: Ville Syrjälä Date: Mon Dec 2 11:08:06 2013 +0200 drm/i915: Take modeset locks around intel_modeset_setup_hw_state() Signed-off-by: Ville Syrjälä Cc: stable@vger.kernel.org [danvet: Add cc: stable since the offending commit has that, too.] Signed-off-by: Daniel Vetter diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 769b864..2bde35d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -11053,10 +11053,10 @@ void intel_modeset_gem_init(struct drm_device *dev) intel_setup_overlay(dev); - drm_modeset_lock_all(dev); + mutex_lock(&dev->mode_config.mutex); drm_mode_config_reset(dev); intel_modeset_setup_hw_state(dev, false); - drm_modeset_unlock_all(dev); + mutex_unlock(&dev->mode_config.mutex); } void intel_modeset_cleanup(struct drm_device *dev) -- cgit v0.10.2 From 09f2344d0896eb458a639b922224c7a202c11599 Mon Sep 17 00:00:00 2001 From: Jesse Barnes Date: Fri, 10 Jan 2014 13:13:09 -0800 Subject: drm/i915/bdw: make sure south port interrupts are enabled properly v2 We were apparently relying on the defaults on BDW, which resulted in no hotplug or AUX interrupts. So be sure to call the ibx_irq_preinstall to enable all interrupts. v2: use preinstall instead of redundant SDIER write References: https://bugs.freedesktop.org/show_bug.cgi?id=72834 References: https://bugs.freedesktop.org/show_bug.cgi?id=72833 Signed-off-by: Jesse Barnes Signed-off-by: Daniel Vetter diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 5d1dedc..f13d5ed 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2713,6 +2713,8 @@ static void gen8_irq_preinstall(struct drm_device *dev) #undef GEN8_IRQ_INIT_NDX POSTING_READ(GEN8_PCU_IIR); + + ibx_irq_preinstall(dev); } static void ibx_hpd_irq_setup(struct drm_device *dev) -- cgit v0.10.2 From e44ef891e9e68b6ce7d3fd3bac73b7d5433050ae Mon Sep 17 00:00:00 2001 From: Sudeep Holla Date: Wed, 8 Jan 2014 18:24:21 +0100 Subject: ARM: 7934/1: DT/kernel: fix arch_match_cpu_phys_id to avoid erroneous match The MPIDR contains specific bitfields(MPIDR.Aff{2..0}) which uniquely identify a CPU, in addition to some non-identifying information and reserved bits. The ARM cpu binding defines the 'reg' property to only contain the affinity bits, and any cpu nodes with other bits set in their 'reg' entry are skipped. As such it is not necessary to mask the phys_id with MPIDR_HWID_BITMASK, and doing so could lead to matching erroneous CPU nodes in the device tree. This patch removes the masking of the physical identifier. Acked-by: Mark Rutland Signed-off-by: Sudeep Holla Signed-off-by: Russell King diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c index 739c3df..34d5fd5 100644 --- a/arch/arm/kernel/devtree.c +++ b/arch/arm/kernel/devtree.c @@ -171,7 +171,7 @@ void __init arm_dt_init_cpu_maps(void) bool arch_match_cpu_phys_id(int cpu, u64 phys_id) { - return (phys_id & MPIDR_HWID_BITMASK) == cpu_logical_map(cpu); + return phys_id == cpu_logical_map(cpu); } static const void * __init arch_get_next_mach(const char *const **match) -- cgit v0.10.2 From 261521f142fb426b58ed460d2476049dc8fa4fb9 Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Fri, 10 Jan 2014 00:57:06 +0100 Subject: ARM: 7937/1: perf_event: Silence sparse warning arch/arm/kernel/perf_event_cpu.c:274:25: warning: incorrect type in assignment (different modifiers) arch/arm/kernel/perf_event_cpu.c:274:25: expected int ( *init_fn )( ... ) arch/arm/kernel/perf_event_cpu.c:274:25: got void const *const data Acked-by: Will Deacon Signed-off-by: Stephen Boyd Signed-off-by: Russell King diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index d85055c..20d553c 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -254,7 +254,7 @@ static int probe_current_pmu(struct arm_pmu *pmu) static int cpu_pmu_device_probe(struct platform_device *pdev) { const struct of_device_id *of_id; - int (*init_fn)(struct arm_pmu *); + const int (*init_fn)(struct arm_pmu *); struct device_node *node = pdev->dev.of_node; struct arm_pmu *pmu; int ret = -ENODEV; -- cgit v0.10.2 From d6cd989477e2fee29ccda257614ef7b2621d0601 Mon Sep 17 00:00:00 2001 From: Taras Kondratiuk Date: Fri, 10 Jan 2014 01:36:00 +0100 Subject: ARM: 7939/1: traps: fix opcode endianness when read from user memory Currently code has an inverted logic: opcode from user memory is swapped to a proper endianness only in case of read error. While normally opcode should be swapped only if it was read correctly from user memory. Reviewed-by: Victor Kamensky Signed-off-by: Ben Dooks Signed-off-by: Taras Kondratiuk Signed-off-by: Russell King diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c index 6eda3bf..4636d56 100644 --- a/arch/arm/kernel/traps.c +++ b/arch/arm/kernel/traps.c @@ -431,9 +431,10 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs) instr2 = __mem_to_opcode_thumb16(instr2); instr = __opcode_thumb32_compose(instr, instr2); } - } else if (get_user(instr, (u32 __user *)pc)) { + } else { + if (get_user(instr, (u32 __user *)pc)) + goto die_sig; instr = __mem_to_opcode_arm(instr); - goto die_sig; } if (call_undef_hook(regs, instr) == 0) -- cgit v0.10.2 From 26bef1318adc1b3a530ecc807ef99346db2aa8b0 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sat, 11 Jan 2014 19:15:52 -0800 Subject: x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround Before we do an EMMS in the AMD FXSAVE information leak workaround we need to clear any pending exceptions, otherwise we trap with a floating-point exception inside this code. Reported-by: halfdog Tested-by: Borislav Petkov Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com Signed-off-by: H. Peter Anvin diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h index c49a613..cea1c76 100644 --- a/arch/x86/include/asm/fpu-internal.h +++ b/arch/x86/include/asm/fpu-internal.h @@ -293,12 +293,13 @@ static inline int restore_fpu_checking(struct task_struct *tsk) /* AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is pending. Clear the x87 state here by setting it to fixed values. "m" is a random variable that should be in L1 */ - alternative_input( - ASM_NOP8 ASM_NOP2, - "emms\n\t" /* clear stack tags */ - "fildl %P[addr]", /* set F?P to defined value */ - X86_FEATURE_FXSAVE_LEAK, - [addr] "m" (tsk->thread.fpu.has_fpu)); + if (unlikely(static_cpu_has(X86_FEATURE_FXSAVE_LEAK))) { + asm volatile( + "fnclex\n\t" + "emms\n\t" + "fildl %P[addr]" /* set F?P to defined value */ + : : [addr] "m" (tsk->thread.fpu.has_fpu)); + } return fpu_restore_checking(&tsk->thread.fpu); } -- cgit v0.10.2 From 9722c2dac708e9468cc0dc30218ef76946ffbc9d Mon Sep 17 00:00:00 2001 From: Rik van Riel Date: Mon, 6 Jan 2014 11:39:12 +0000 Subject: sched: Calculate effective load even if local weight is 0 Thomas Hellstrom bisected a regression where erratic 3D performance is experienced on virtual machines as measured by glxgears. It identified commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA node") as the problem which had modified the behaviour of effective_load. Effective load calculates the difference to the system-wide load if a scheduling entity was moved to another CPU. The task group is not heavier as a result of the move but overall system load can increase/decrease as a result of the change. Commit 58d081b5 ("sched/numa: Avoid overloading CPUs on a preferred NUMA node") changed effective_load to make it suitable for calculating if a particular NUMA node was compute overloaded. To reduce the cost of the function, it assumed that a current sched entity weight of 0 was uninteresting but that is not the case. wake_affine() uses a weight of 0 for sync wakeups on the grounds that it is assuming the waking task will sleep and not contribute to load in the near future. In this case, we still want to calculate the effective load of the sched entity hierarchy. As effective_load is no longer used by task_numa_compare since commit fb13c7ee (sched/numa: Use a system-wide search to find swap/migration candidates), this patch simply restores the historical behaviour. Reported-and-tested-by: Thomas Hellstrom Signed-off-by: Rik van Riel [ Wrote changelog] Signed-off-by: Mel Gorman Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/20140106113912.GC6178@suse.de Signed-off-by: Ingo Molnar diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c7395d9..e64b079 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3923,7 +3923,7 @@ static long effective_load(struct task_group *tg, int cpu, long wl, long wg) { struct sched_entity *se = tg->se[cpu]; - if (!tg->parent || !wl) /* the trivial, non-cgroup case */ + if (!tg->parent) /* the trivial, non-cgroup case */ return wl; for_each_sched_entity(se) { -- cgit v0.10.2 From 0c3351d451ae2fa438d5d1ed719fc43354fbffbb Mon Sep 17 00:00:00 2001 From: John Stultz Date: Thu, 2 Jan 2014 15:11:13 -0800 Subject: seqlock: Use raw_ prefix instead of _no_lockdep MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Linus disliked the _no_lockdep() naming, so instead use the more-consistent raw_* prefix to the non-lockdep enabled seqcount methods. This also adds raw_ methods for the write operations as well, which will be utilized in a following patch. Acked-by: Linus Torvalds Reviewed-by: Stephen Boyd Signed-off-by: John Stultz Signed-off-by: Peter Zijlstra Cc: Krzysztof Hałasa Cc: Uwe Kleine-König Cc: Willy Tarreau Link: http://lkml.kernel.org/r/1388704274-5278-1-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c index 2ada505..eb5d7a5 100644 --- a/arch/x86/vdso/vclock_gettime.c +++ b/arch/x86/vdso/vclock_gettime.c @@ -178,7 +178,7 @@ notrace static int __always_inline do_realtime(struct timespec *ts) ts->tv_nsec = 0; do { - seq = read_seqcount_begin_no_lockdep(>od->seq); + seq = raw_read_seqcount_begin(>od->seq); mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->wall_time_sec; ns = gtod->wall_time_snsec; @@ -198,7 +198,7 @@ notrace static int do_monotonic(struct timespec *ts) ts->tv_nsec = 0; do { - seq = read_seqcount_begin_no_lockdep(>od->seq); + seq = raw_read_seqcount_begin(>od->seq); mode = gtod->clock.vclock_mode; ts->tv_sec = gtod->monotonic_time_sec; ns = gtod->monotonic_time_snsec; @@ -214,7 +214,7 @@ notrace static int do_realtime_coarse(struct timespec *ts) { unsigned long seq; do { - seq = read_seqcount_begin_no_lockdep(>od->seq); + seq = raw_read_seqcount_begin(>od->seq); ts->tv_sec = gtod->wall_time_coarse.tv_sec; ts->tv_nsec = gtod->wall_time_coarse.tv_nsec; } while (unlikely(read_seqcount_retry(>od->seq, seq))); @@ -225,7 +225,7 @@ notrace static int do_monotonic_coarse(struct timespec *ts) { unsigned long seq; do { - seq = read_seqcount_begin_no_lockdep(>od->seq); + seq = raw_read_seqcount_begin(>od->seq); ts->tv_sec = gtod->monotonic_time_coarse.tv_sec; ts->tv_nsec = gtod->monotonic_time_coarse.tv_nsec; } while (unlikely(read_seqcount_retry(>od->seq, seq))); diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h index cf87a24..535f158 100644 --- a/include/linux/seqlock.h +++ b/include/linux/seqlock.h @@ -117,15 +117,15 @@ repeat: } /** - * read_seqcount_begin_no_lockdep - start seq-read critical section w/o lockdep + * raw_read_seqcount_begin - start seq-read critical section w/o lockdep * @s: pointer to seqcount_t * Returns: count to be passed to read_seqcount_retry * - * read_seqcount_begin_no_lockdep opens a read critical section of the given + * raw_read_seqcount_begin opens a read critical section of the given * seqcount, but without any lockdep checking. Validity of the critical * section is tested by checking read_seqcount_retry function. */ -static inline unsigned read_seqcount_begin_no_lockdep(const seqcount_t *s) +static inline unsigned raw_read_seqcount_begin(const seqcount_t *s) { unsigned ret = __read_seqcount_begin(s); smp_rmb(); @@ -144,7 +144,7 @@ static inline unsigned read_seqcount_begin_no_lockdep(const seqcount_t *s) static inline unsigned read_seqcount_begin(const seqcount_t *s) { seqcount_lockdep_reader_access(s); - return read_seqcount_begin_no_lockdep(s); + return raw_read_seqcount_begin(s); } /** @@ -206,14 +206,26 @@ static inline int read_seqcount_retry(const seqcount_t *s, unsigned start) } + +static inline void raw_write_seqcount_begin(seqcount_t *s) +{ + s->sequence++; + smp_wmb(); +} + +static inline void raw_write_seqcount_end(seqcount_t *s) +{ + smp_wmb(); + s->sequence++; +} + /* * Sequence counter only version assumes that callers are using their * own mutexing. */ static inline void write_seqcount_begin_nested(seqcount_t *s, int subclass) { - s->sequence++; - smp_wmb(); + raw_write_seqcount_begin(s); seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_); } @@ -225,8 +237,7 @@ static inline void write_seqcount_begin(seqcount_t *s) static inline void write_seqcount_end(seqcount_t *s) { seqcount_release(&s->dep_map, 1, _RET_IP_); - smp_wmb(); - s->sequence++; + raw_write_seqcount_end(s); } /** -- cgit v0.10.2 From 7a06c41cbec33c6dbe7eec575c61986122617408 Mon Sep 17 00:00:00 2001 From: John Stultz Date: Thu, 2 Jan 2014 15:11:14 -0800 Subject: sched_clock: Disable seqlock lockdep usage in sched_clock() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Unfortunately the seqlock lockdep enablement can't be used in sched_clock(), since the lockdep infrastructure eventually calls into sched_clock(), which causes a deadlock. Thus, this patch changes all generic sched_clock() usage to use the raw_* methods. Acked-by: Linus Torvalds Reviewed-by: Stephen Boyd Reported-by: Krzysztof Hałasa Signed-off-by: John Stultz Cc: Uwe Kleine-König Cc: Willy Tarreau Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1388704274-5278-2-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c index 68b7993..0abb364 100644 --- a/kernel/time/sched_clock.c +++ b/kernel/time/sched_clock.c @@ -74,7 +74,7 @@ unsigned long long notrace sched_clock(void) return cd.epoch_ns; do { - seq = read_seqcount_begin(&cd.seq); + seq = raw_read_seqcount_begin(&cd.seq); epoch_cyc = cd.epoch_cyc; epoch_ns = cd.epoch_ns; } while (read_seqcount_retry(&cd.seq, seq)); @@ -99,10 +99,10 @@ static void notrace update_sched_clock(void) cd.mult, cd.shift); raw_local_irq_save(flags); - write_seqcount_begin(&cd.seq); + raw_write_seqcount_begin(&cd.seq); cd.epoch_ns = ns; cd.epoch_cyc = cyc; - write_seqcount_end(&cd.seq); + raw_write_seqcount_end(&cd.seq); raw_local_irq_restore(flags); } -- cgit v0.10.2 From 518d00b7498c5894be94545848d55e5b9c55749e Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Thu, 26 Dec 2013 21:31:37 +0800 Subject: block: null_blk: fix queue leak inside removing device When queue_mode is NULL_Q_MQ and null_blk is being removed, blk_cleanup_queue() isn't called to cleanup queue, so the queue allocated won't be freed. This patch calls blk_cleanup_queue() for MQ to drain all pending requests first and release the reference counter of queue kobject, then blk_mq_free_queue() will be called in queue kobject's release handler when queue kobject's reference counter drops to zero. Signed-off-by: Ming Lei Signed-off-by: Jens Axboe Signed-off-by: Linus Torvalds diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c index a2e69d2..83a598e 100644 --- a/drivers/block/null_blk.c +++ b/drivers/block/null_blk.c @@ -425,10 +425,7 @@ static void null_del_dev(struct nullb *nullb) list_del_init(&nullb->list); del_gendisk(nullb->disk); - if (queue_mode == NULL_Q_MQ) - blk_mq_free_queue(nullb->q); - else - blk_cleanup_queue(nullb->q); + blk_cleanup_queue(nullb->q); put_disk(nullb->disk); kfree(nullb); } @@ -578,10 +575,7 @@ static int null_add_dev(void) disk = nullb->disk = alloc_disk_node(1, home_node); if (!disk) { queue_fail: - if (queue_mode == NULL_Q_MQ) - blk_mq_free_queue(nullb->q); - else - blk_cleanup_queue(nullb->q); + blk_cleanup_queue(nullb->q); cleanup_queues(nullb); err: kfree(nullb); -- cgit v0.10.2 From eecc1e426d681351a6026a7d3e7d225f38955b6c Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Sun, 12 Jan 2014 01:25:21 -0800 Subject: thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only We see General Protection Fault on RSI in copy_page_rep: that RSI is what you get from a NULL struct page pointer. RIP: 0010:[] [] copy_page_rep+0x5/0x10 RSP: 0000:ffff880136e15c00 EFLAGS: 00010286 RAX: ffff880000000000 RBX: ffff880136e14000 RCX: 0000000000000200 RDX: 6db6db6db6db6db7 RSI: db73880000000000 RDI: ffff880dd0c00000 RBP: ffff880136e15c18 R08: 0000000000000200 R09: 000000000005987c R10: 000000000005987c R11: 0000000000000200 R12: 0000000000000001 R13: ffffea00305aa000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f195752f700(0000) GS:ffff880c7fc20000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000093010000 CR3: 00000001458e1000 CR4: 00000000000027e0 Call Trace: copy_user_huge_page+0x93/0xab do_huge_pmd_wp_page+0x710/0x815 handle_mm_fault+0x15d8/0x1d70 __do_page_fault+0x14d/0x840 do_page_fault+0x2f/0x90 page_fault+0x22/0x30 do_huge_pmd_wp_page() tests is_huge_zero_pmd(orig_pmd) four times: but since shrink_huge_zero_page() can free the huge_zero_page, and we have no hold of our own on it here (except where the fourth test holds page_table_lock and has checked pmd_same), it's possible for it to answer yes the first time, but no to the second or third test. Change all those last three to tests for NULL page. (Note: this is not the same issue as trinity's DEBUG_PAGEALLOC BUG in copy_page_rep with RSI: ffff88009c422000, reported by Sasha Levin in https://lkml.org/lkml/2013/3/29/103. I believe that one is due to the source page being split, and a tail page freed, while copy is in progress; and not a problem without DEBUG_PAGEALLOC, since the pmd_same check will prevent a miscopy from being made visible.) Fixes: 97ae17497e99 ("thp: implement refcounting for huge zero page") Signed-off-by: Hugh Dickins Cc: stable@vger.kernel.org # v3.10 v3.11 v3.12 Signed-off-by: Linus Torvalds diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9c0b172..95d1acb 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1154,7 +1154,7 @@ alloc: new_page = NULL; if (unlikely(!new_page)) { - if (is_huge_zero_pmd(orig_pmd)) { + if (!page) { ret = do_huge_pmd_wp_zero_page_fallback(mm, vma, address, pmd, orig_pmd, haddr); } else { @@ -1181,7 +1181,7 @@ alloc: count_vm_event(THP_FAULT_ALLOC); - if (is_huge_zero_pmd(orig_pmd)) + if (!page) clear_huge_page(new_page, haddr, HPAGE_PMD_NR); else copy_user_huge_page(new_page, page, haddr, vma, HPAGE_PMD_NR); @@ -1207,7 +1207,7 @@ alloc: page_add_new_anon_rmap(new_page, vma, haddr); set_pmd_at(mm, haddr, pmd, entry); update_mmu_cache_pmd(vma, address, pmd); - if (is_huge_zero_pmd(orig_pmd)) { + if (!page) { add_mm_counter(mm, MM_ANONPAGES, HPAGE_PMD_NR); put_huge_zero_page(); } else { -- cgit v0.10.2 From 3dc91d4338d698ce77832985f9cb183d8eeaf6be Mon Sep 17 00:00:00 2001 From: Steven Rostedt Date: Thu, 9 Jan 2014 21:46:34 -0500 Subject: SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() While running stress tests on adding and deleting ftrace instances I hit this bug: BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 IP: selinux_inode_permission+0x85/0x160 PGD 63681067 PUD 7ddbe067 PMD 0 Oops: 0000 [#1] PREEMPT CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20 Hardware name: /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006 task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000 RIP: 0010:[] [] selinux_inode_permission+0x85/0x160 RSP: 0018:ffff88007ddb1c48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840 RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000 RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54 R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000 R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000 FS: 00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0 Call Trace: security_inode_permission+0x1c/0x30 __inode_permission+0x41/0xa0 inode_permission+0x18/0x50 link_path_walk+0x66/0x920 path_openat+0xa6/0x6c0 do_filp_open+0x43/0xa0 do_sys_open+0x146/0x240 SyS_open+0x1e/0x20 system_call_fastpath+0x16/0x1b Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff RIP selinux_inode_permission+0x85/0x160 CR2: 0000000000000020 Investigating, I found that the inode->i_security was NULL, and the dereference of it caused the oops. in selinux_inode_permission(): isec = inode->i_security; rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd); Note, the crash came from stressing the deletion and reading of debugfs files. I was not able to recreate this via normal files. But I'm not sure they are safe. It may just be that the race window is much harder to hit. What seems to have happened (and what I have traced), is the file is being opened at the same time the file or directory is being deleted. As the dentry and inode locks are not held during the path walk, nor is the inodes ref counts being incremented, there is nothing saving these structures from being discarded except for an rcu_read_lock(). The rcu_read_lock() protects against freeing of the inode, but it does not protect freeing of the inode_security_struct. Now if the freeing of the i_security happens with a call_rcu(), and the i_security field of the inode is not changed (it gets freed as the inode gets freed) then there will be no issue here. (Linus Torvalds suggested not setting the field to NULL such that we do not need to check if it is NULL in the permission check). Note, this is a hack, but it fixes the problem at hand. A real fix is to restructure the destroy_inode() to call all the destructor handlers from the RCU callback. But that is a major job to do, and requires a lot of work. For now, we just band-aid this bug with this fix (it works), and work on a more maintainable solution in the future. Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home Cc: stable@vger.kernel.org Signed-off-by: Steven Rostedt Signed-off-by: Linus Torvalds diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 6625699..57b0b49 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -234,6 +234,14 @@ static int inode_alloc_security(struct inode *inode) return 0; } +static void inode_free_rcu(struct rcu_head *head) +{ + struct inode_security_struct *isec; + + isec = container_of(head, struct inode_security_struct, rcu); + kmem_cache_free(sel_inode_cache, isec); +} + static void inode_free_security(struct inode *inode) { struct inode_security_struct *isec = inode->i_security; @@ -244,8 +252,16 @@ static void inode_free_security(struct inode *inode) list_del_init(&isec->list); spin_unlock(&sbsec->isec_lock); - inode->i_security = NULL; - kmem_cache_free(sel_inode_cache, isec); + /* + * The inode may still be referenced in a path walk and + * a call to selinux_inode_permission() can be made + * after inode_free_security() is called. Ideally, the VFS + * wouldn't do this, but fixing that is a much harder + * job. For now, simply free the i_security via RCU, and + * leave the current inode->i_security pointer intact. + * The inode will be freed after the RCU grace period too. + */ + call_rcu(&isec->rcu, inode_free_rcu); } static int file_alloc_security(struct file *file) diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h index b1dfe10..078e553 100644 --- a/security/selinux/include/objsec.h +++ b/security/selinux/include/objsec.h @@ -38,7 +38,10 @@ struct task_security_struct { struct inode_security_struct { struct inode *inode; /* back pointer to inode object */ - struct list_head list; /* list of inode_security_struct */ + union { + struct list_head list; /* list of inode_security_struct */ + struct rcu_head rcu; /* for freeing the inode_security_struct */ + }; u32 task_sid; /* SID of creating task */ u32 sid; /* SID of this object */ u16 sclass; /* security class of this object */ -- cgit v0.10.2 From 7e22e91102c6b9df7c4ae2168910e19d2bb14cd6 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 12 Jan 2014 17:04:18 +0700 Subject: Linux 3.13-rc8 diff --git a/Makefile b/Makefile index ae1a55a..eeec740 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 3 PATCHLEVEL = 13 SUBLEVEL = 0 -EXTRAVERSION = -rc7 +EXTRAVERSION = -rc8 NAME = One Giant Leap for Frogkind # *DOCUMENTATION* -- cgit v0.10.2 From b25f3e1c358434bf850220e04f28eebfc45eb634 Mon Sep 17 00:00:00 2001 From: Taras Kondratiuk Date: Fri, 10 Jan 2014 01:27:08 +0100 Subject: ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling Kexec disables outer cache before jumping to reboot code, but it doesn't flush it explicitly. Flush is done implicitly inside of l2x0_disable(). But some SoC's override default .disable handler and don't flush cache. This may lead to a corrupted memory during Kexec reboot on these platforms. This patch adds cache flush inside of OMAP4 and Highbank outer_cache.disable() handlers to make it consistent with default l2x0_disable(). Acked-by: Rob Herring Acked-by: Santosh Shilimkar Acked-by: Tony Lindgren Signed-off-by: Taras Kondratiuk Signed-off-by: Russell King diff --git a/arch/arm/mach-highbank/highbank.c b/arch/arm/mach-highbank/highbank.c index b3d7e56..ae17150 100644 --- a/arch/arm/mach-highbank/highbank.c +++ b/arch/arm/mach-highbank/highbank.c @@ -50,6 +50,7 @@ static void __init highbank_scu_map_io(void) static void highbank_l2x0_disable(void) { + outer_flush_all(); /* Disable PL310 L2 Cache controller */ highbank_smc1(0x102, 0x0); } diff --git a/arch/arm/mach-omap2/omap4-common.c b/arch/arm/mach-omap2/omap4-common.c index 5791143..3f44b16 100644 --- a/arch/arm/mach-omap2/omap4-common.c +++ b/arch/arm/mach-omap2/omap4-common.c @@ -163,6 +163,7 @@ void __iomem *omap4_get_l2cache_base(void) static void omap4_l2x0_disable(void) { + outer_flush_all(); /* Disable PL310 L2 Cache controller */ omap_smc1(0x102, 0x0); } -- cgit v0.10.2 From 10348f5976830e5d8f74e8abb04a9a057a5e8478 Mon Sep 17 00:00:00 2001 From: Benjamin Herrenschmidt Date: Mon, 13 Jan 2014 09:49:17 +1100 Subject: powerpc: Check return value of instance-to-package OF call On PA-Semi firmware, the instance-to-package callback doesn't seem to be implemented. We didn't check for error, however, thus subsequently passed the -1 value returned into stdout_node to thins like prom_getprop etc... Thus caused the firmware to load values around 0 (physical) internally as node structures. It somewhat "worked" as long as we had a NULL in the right place (address 8) at the beginning of the kernel, we didn't "see" the bug. But commit 5c0484e25ec03243d4c2f2d4416d4a13efc77f6a "powerpc: Endian safe trampoline" changed the kernel entry point causing that old bug to now cause a crash early during boot. This fixes booting on PA-Semi board by properly checking the return value from instance-to-package. Signed-off-by: Benjamin Herrenschmidt Tested-by: Olof Johansson --- diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index cb64a6e..078145a 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -1986,19 +1986,23 @@ static void __init prom_init_stdout(void) /* Get the full OF pathname of the stdout device */ memset(path, 0, 256); call_prom("instance-to-path", 3, 1, prom.stdout, path, 255); - stdout_node = call_prom("instance-to-package", 1, 1, prom.stdout); - val = cpu_to_be32(stdout_node); - prom_setprop(prom.chosen, "/chosen", "linux,stdout-package", - &val, sizeof(val)); prom_printf("OF stdout device is: %s\n", of_stdout_device); prom_setprop(prom.chosen, "/chosen", "linux,stdout-path", path, strlen(path) + 1); - /* If it's a display, note it */ - memset(type, 0, sizeof(type)); - prom_getprop(stdout_node, "device_type", type, sizeof(type)); - if (strcmp(type, "display") == 0) - prom_setprop(stdout_node, path, "linux,boot-display", NULL, 0); + /* instance-to-package fails on PA-Semi */ + stdout_node = call_prom("instance-to-package", 1, 1, prom.stdout); + if (stdout_node != PROM_ERROR) { + val = cpu_to_be32(stdout_node); + prom_setprop(prom.chosen, "/chosen", "linux,stdout-package", + &val, sizeof(val)); + + /* If it's a display, note it */ + memset(type, 0, sizeof(type)); + prom_getprop(stdout_node, "device_type", type, sizeof(type)); + if (strcmp(type, "display") == 0) + prom_setprop(stdout_node, path, "linux,boot-display", NULL, 0); + } } static int __init prom_find_machine_type(void) -- cgit v0.10.2 From abce1ec9b08a4f318f431e6b9a12524227ae7109 Mon Sep 17 00:00:00 2001 From: Dave Airlie Date: Tue, 14 Jan 2014 12:50:49 +1000 Subject: Revert "drm: copy mode type in drm_mode_connector_list_update()" This reverts commit 3fbd6439e4639ecaeaae6c079e0aa497a1ac3482. This caused some strange booting lockup issues on an Intel G33 belonging to Daniel Vetter, very unusual, I was hoping Daniel would track this down, but it looks like instead I'll have to hack a different fix for -next. Signed-off-by: Dave Airlie diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c index 85071a1..b073315 100644 --- a/drivers/gpu/drm/drm_modes.c +++ b/drivers/gpu/drm/drm_modes.c @@ -1041,7 +1041,7 @@ void drm_mode_connector_list_update(struct drm_connector *connector) /* if equal delete the probed mode */ mode->status = pmode->status; /* Merge type bits together */ - mode->type = pmode->type; + mode->type |= pmode->type; list_del(&pmode->head); drm_mode_destroy(connector->dev, pmode); break; -- cgit v0.10.2 From 1cc03eb93245e63b0b7a7832165efdc52e25b4e6 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 6 Jan 2014 13:19:42 +1100 Subject: md/raid5: Fix possible confusion when multiple write errors occur. commit 5d8c71f9e5fbdd95650be00294d238e27a363b5c md: raid5 crash during degradation Fixed a crash in an overly simplistic way which could leave R5_WriteError or R5_MadeGood set in the stripe cache for devices for which it is no longer relevant. When those devices are removed and spares added the flags are still set and can cause incorrect behaviour. commit 14a75d3e07c784c004b4b44b34af996b8e4ac453 md/raid5: preferentially read from replacement device if possible. Fixed the same bug if a more effective way, so we can now revert the original commit. Reported-and-tested-by: Alexander Lyakas Cc: stable@vger.kernel.org (3.2+ - 3.2 will need a different fix though) Fixes: 5d8c71f9e5fbdd95650be00294d238e27a363b5c Signed-off-by: NeilBrown diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index cc055da..9168173 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -3608,7 +3608,7 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s) */ set_bit(R5_Insync, &dev->flags); - if (rdev && test_bit(R5_WriteError, &dev->flags)) { + if (test_bit(R5_WriteError, &dev->flags)) { /* This flag does not apply to '.replacement' * only to .rdev, so make sure to check that*/ struct md_rdev *rdev2 = rcu_dereference( @@ -3621,7 +3621,7 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s) } else clear_bit(R5_WriteError, &dev->flags); } - if (rdev && test_bit(R5_MadeGood, &dev->flags)) { + if (test_bit(R5_MadeGood, &dev->flags)) { /* This flag does not apply to '.replacement' * only to .rdev, so make sure to check that*/ struct md_rdev *rdev2 = rcu_dereference( -- cgit v0.10.2 From b50c259e25d9260b9108dc0c2964c26e5ecbe1c1 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Tue, 14 Jan 2014 10:38:09 +1100 Subject: md/raid10: fix two bugs in handling of known-bad-blocks. If we discover a bad block when reading we split the request and potentially read some of it from a different device. The code path of this has two bugs in RAID10. 1/ we get a spin_lock with _irq, but unlock without _irq!! 2/ The calculation of 'sectors_handled' is wrong, as can be clearly seen by comparison with raid1.c This leads to at least 2 warnings and a probable crash is a RAID10 ever had known bad blocks. Cc: stable@vger.kernel.org (v3.1+) Fixes: 856e08e23762dfb92ffc68fd0a8d228f9e152160 Reported-by: Damian Nowak URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181 Signed-off-by: NeilBrown diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index c504e83..6528521 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1319,7 +1319,7 @@ read_again: /* Could not read all from this device, so we will * need another r10_bio. */ - sectors_handled = (r10_bio->sectors + max_sectors + sectors_handled = (r10_bio->sector + max_sectors - bio->bi_sector); r10_bio->sectors = max_sectors; spin_lock_irq(&conf->device_lock); @@ -1327,7 +1327,7 @@ read_again: bio->bi_phys_segments = 2; else bio->bi_phys_segments++; - spin_unlock(&conf->device_lock); + spin_unlock_irq(&conf->device_lock); /* Cannot call generic_make_request directly * as that will be queued in __generic_make_request * and subsequent mempool_alloc might block -- cgit v0.10.2 From 41a336e011887f73e7c879b60e1e3544045435cb Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Tue, 14 Jan 2014 11:56:14 +1100 Subject: md/raid1: fix request counting bug in new 'barrier' code. The new iobarrier implementation in raid1 (which keeps normal writes and resync activity separate) counts every request what is not before the current resync point in either next_window_requests or current_window_requests. It flags that the request is counted by setting ->start_next_window. allow_barrier follows this model exactly and decrements one of the *_window_requests if and only if ->start_next_window is set. However wait_barrier(), which increments *_window_requests uses a slightly different test for setting -.start_next_window (which is set from the return value of this function). So there is a possibility of the counts getting out of sync, and this leads to the resync hanging. So change wait_barrier() to return a non-zero value in exactly the same cases that it increments *_window_requests. But was introduced in 3.13-rc1. Reported-by: Bruno Wolff III URL: https://bugzilla.kernel.org/show_bug.cgi?id=68061 Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761 Cc: majianpeng Signed-off-by: NeilBrown diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 1e5a540..a49cfcc 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -924,9 +924,8 @@ static sector_t wait_barrier(struct r1conf *conf, struct bio *bio) conf->next_window_requests++; else conf->current_window_requests++; - } - if (bio->bi_sector >= conf->start_next_window) sector = conf->start_next_window; + } } conf->nr_pending++; -- cgit v0.10.2 From 5af9bef72c074dbe946da8b74eabd79cd5a9ff19 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Tue, 14 Jan 2014 15:16:10 +1100 Subject: md/raid5: fix a recently broken BUG_ON(). commit 6d183de4077191d1201283a9035ce57a9b05254d md/raid5: fix newly-broken locking in get_active_stripe. simplified a BUG_ON, but removed too much so now it sometimes fires when it shouldn't. When the STRIPE_EXPANDING flag is set, the stripe_head might be on a special list while multiple stripe_heads are collected, or it might not be on any list, even a 'free' list when the refcount is zero. As long as STRIPE_EXPANDING is set, it will be found and added back to a list eventually. So both of the BUG_ONs which test for the ->lru being empty or not need to avoid the case where STRIPE_EXPANDING is set. The patch which broke this was marked for -stable, so this patch needs to be applied to any branch that received 6d183de4 Fixes: 6d183de4077191d1201283a9035ce57a9b05254d Cc: stable@vger.kernel.org (any release to which above was applied) Signed-off-by: NeilBrown diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 9168173..cbb1571 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -687,7 +687,8 @@ get_active_stripe(struct r5conf *conf, sector_t sector, } else { if (!test_bit(STRIPE_HANDLE, &sh->state)) atomic_inc(&conf->active_stripes); - BUG_ON(list_empty(&sh->lru)); + BUG_ON(list_empty(&sh->lru) && + !test_bit(STRIPE_EXPANDING, &sh->state)); list_del_init(&sh->lru); if (sh->group) { sh->group->stripes_cnt--; -- cgit v0.10.2 From e8b849158508565e0cd6bc80061124afc5879160 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Mon, 6 Jan 2014 10:35:34 +1100 Subject: md/raid10: fix bug when raid10 recovery fails to recover a block. commit e875ecea266a543e643b19e44cf472f1412708f9 md/raid10 record bad blocks as needed during recovery. added code to the "cannot recover this block" path to record a bad block rather than fail the whole recovery. Unfortunately this new case was placed *after* r10bio was freed rather than *before*, yet it still uses r10bio. This is will crash with a null dereference. So move the freeing of r10bio down where it is safe. Cc: stable@vger.kernel.org (v3.1+) Fixes: e875ecea266a543e643b19e44cf472f1412708f9 Reported-by: Damian Nowak URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181 Signed-off-by: NeilBrown diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 6528521..06eeb99 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -3218,10 +3218,6 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, if (j == conf->copies) { /* Cannot recover, so abort the recovery or * record a bad block */ - put_buf(r10_bio); - if (rb2) - atomic_dec(&rb2->remaining); - r10_bio = rb2; if (any_working) { /* problem is that there are bad blocks * on other device(s) @@ -3253,6 +3249,10 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, mirror->recovery_disabled = mddev->recovery_disabled; } + put_buf(r10_bio); + if (rb2) + atomic_dec(&rb2->remaining); + r10_bio = rb2; break; } } -- cgit v0.10.2 From 8313b8e57f55b15e5b7f7fc5d1630bbf686a9a97 Mon Sep 17 00:00:00 2001 From: NeilBrown Date: Thu, 12 Dec 2013 10:13:33 +1100 Subject: md: fix problem when adding device to read-only array with bitmap. If an array is started degraded, and then the missing device is found it can be re-added and a minimal bitmap-based recovery will bring it fully up-to-date. If the array is read-only a recovery would not be allowed. But also if the array is read-only and the missing device was present very recently, then there could be no need for any recovery at all, so we simply include the device in the read-only array without any recovery. However... if the missing device was removed a little longer ago it could be missing some updates, but if a bitmap is present it will be conditionally accepted pending a bitmap-based update. We don't currently detect this case properly and will include that old device into the read-only array with no recovery even though it really needs a recovery. This patch keeps track of whether a bitmap-based-recovery is really needed or not in the new Bitmap_sync rdev flag. If that is set, then the device will not be added to a read-only array. Cc: Andrei Warkentin Fixes: d70ed2e4fafdbef0800e73942482bb075c21578b Cc: stable@vger.kernel.org (3.2+) Signed-off-by: NeilBrown diff --git a/drivers/md/md.c b/drivers/md/md.c index e60cebf..2a456a5 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -1087,6 +1087,7 @@ static int super_90_validate(struct mddev *mddev, struct md_rdev *rdev) rdev->raid_disk = -1; clear_bit(Faulty, &rdev->flags); clear_bit(In_sync, &rdev->flags); + clear_bit(Bitmap_sync, &rdev->flags); clear_bit(WriteMostly, &rdev->flags); if (mddev->raid_disks == 0) { @@ -1165,6 +1166,8 @@ static int super_90_validate(struct mddev *mddev, struct md_rdev *rdev) */ if (ev1 < mddev->bitmap->events_cleared) return 0; + if (ev1 < mddev->events) + set_bit(Bitmap_sync, &rdev->flags); } else { if (ev1 < mddev->events) /* just a hot-add of a new device, leave raid_disk at -1 */ @@ -1573,6 +1576,7 @@ static int super_1_validate(struct mddev *mddev, struct md_rdev *rdev) rdev->raid_disk = -1; clear_bit(Faulty, &rdev->flags); clear_bit(In_sync, &rdev->flags); + clear_bit(Bitmap_sync, &rdev->flags); clear_bit(WriteMostly, &rdev->flags); if (mddev->raid_disks == 0) { @@ -1655,6 +1659,8 @@ static int super_1_validate(struct mddev *mddev, struct md_rdev *rdev) */ if (ev1 < mddev->bitmap->events_cleared) return 0; + if (ev1 < mddev->events) + set_bit(Bitmap_sync, &rdev->flags); } else { if (ev1 < mddev->events) /* just a hot-add of a new device, leave raid_disk at -1 */ @@ -2798,6 +2804,7 @@ slot_store(struct md_rdev *rdev, const char *buf, size_t len) else rdev->saved_raid_disk = -1; clear_bit(In_sync, &rdev->flags); + clear_bit(Bitmap_sync, &rdev->flags); err = rdev->mddev->pers-> hot_add_disk(rdev->mddev, rdev); if (err) { @@ -5770,6 +5777,7 @@ static int add_new_disk(struct mddev * mddev, mdu_disk_info_t *info) info->raid_disk < mddev->raid_disks) { rdev->raid_disk = info->raid_disk; set_bit(In_sync, &rdev->flags); + clear_bit(Bitmap_sync, &rdev->flags); } else rdev->raid_disk = -1; } else @@ -7716,7 +7724,8 @@ static int remove_and_add_spares(struct mddev *mddev, if (test_bit(Faulty, &rdev->flags)) continue; if (mddev->ro && - rdev->saved_raid_disk < 0) + ! (rdev->saved_raid_disk >= 0 && + !test_bit(Bitmap_sync, &rdev->flags))) continue; rdev->recovery_offset = 0; @@ -7797,9 +7806,12 @@ void md_check_recovery(struct mddev *mddev) * As we only add devices that are already in-sync, * we can activate the spares immediately. */ - clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery); remove_and_add_spares(mddev, NULL); - mddev->pers->spare_active(mddev); + /* There is no thread, but we need to call + * ->spare_active and clear saved_raid_disk + */ + md_reap_sync_thread(mddev); + clear_bit(MD_RECOVERY_NEEDED, &mddev->recovery); goto unlock; } diff --git a/drivers/md/md.h b/drivers/md/md.h index 2f5cc8a..0095ec8 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -129,6 +129,9 @@ struct md_rdev { enum flag_bits { Faulty, /* device is known to have a fault */ In_sync, /* device is in_sync with rest of array */ + Bitmap_sync, /* ..actually, not quite In_sync. Need a + * bitmap-based recovery to get fully in sync + */ Unmerged, /* device is being added to array and should * be considerred for bvec_merge_fn but not * yet for actual IO -- cgit v0.10.2 From 2fac2b891f287691c27ee8d2eeecf39571b27fea Mon Sep 17 00:00:00 2001 From: Stephen Warren Date: Mon, 13 Jan 2014 14:29:04 -0700 Subject: i2c: Re-instate body of i2c_parent_is_i2c_adapter() The body of i2c_parent_is_i2c_adapter() is currently guarded by I2C_MUX. It should be CONFIG_I2C_MUX instead. Among potentially other problems, this resulted in i2c_lock_adapter() only locking I2C mux child adapters, and not the parent adapter. In turn, this could allow inter-mingling of mux child selection and I2C transactions, which could result in I2C transactions being directed to the wrong I2C bus, and possibly even switching between busses in the middle of a transaction. One concrete issue caused by this bug was corrupted HDMI EDID reads during boot on the NVIDIA Tegra Seaboard system, although this only became apparent in recent linux-next, when the boot timing was changed just enough to trigger the race condition. Fixes: 3923172b3d70 ("i2c: reduce parent checking to a NOOP in non-I2C_MUX case") Cc: Phil Carmody Cc: Signed-off-by: Stephen Warren Signed-off-by: Wolfram Sang diff --git a/include/linux/i2c.h b/include/linux/i2c.h index eff50e0..d9c8dbd3 100644 --- a/include/linux/i2c.h +++ b/include/linux/i2c.h @@ -445,7 +445,7 @@ static inline void i2c_set_adapdata(struct i2c_adapter *dev, void *data) static inline struct i2c_adapter * i2c_parent_is_i2c_adapter(const struct i2c_adapter *adapter) { -#if IS_ENABLED(I2C_MUX) +#if IS_ENABLED(CONFIG_I2C_MUX) struct device *parent = adapter->dev.parent; if (parent != NULL && parent->type == &i2c_adapter_type) -- cgit v0.10.2 From 3f9aec7610b39521c7c69d754de7265f6994c194 Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Tue, 14 Jan 2014 15:59:55 +0100 Subject: hwmon: (coretemp) Fix truncated name of alarm attributes When the core number exceeds 9, the size of the buffer storing the alarm attribute name is insufficient and the attribute name is truncated. This causes libsensors to skip these attributes as the truncated name is not recognized. Reported-by: Andreas Hollmann Signed-off-by: Jean Delvare Cc: stable@vger.kernel.org Signed-off-by: Guenter Roeck diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c index 78be661..9425098 100644 --- a/drivers/hwmon/coretemp.c +++ b/drivers/hwmon/coretemp.c @@ -52,7 +52,7 @@ MODULE_PARM_DESC(tjmax, "TjMax value in degrees Celsius"); #define BASE_SYSFS_ATTR_NO 2 /* Sysfs Base attr no for coretemp */ #define NUM_REAL_CORES 32 /* Number of Real cores per cpu */ -#define CORETEMP_NAME_LENGTH 17 /* String Length of attrs */ +#define CORETEMP_NAME_LENGTH 19 /* String Length of attrs */ #define MAX_CORE_ATTRS 4 /* Maximum no of basic attrs */ #define TOTAL_ATTRS (MAX_CORE_ATTRS + 1) #define MAX_CORE_DATA (NUM_REAL_CORES + BASE_SYSFS_ATTR_NO) -- cgit v0.10.2 From 267d29a69c6af39445f36102a832b25ed483f299 Mon Sep 17 00:00:00 2001 From: Christian Engelmayer Date: Sat, 11 Jan 2014 22:19:30 +0100 Subject: ieee802154: Fix memory leak in ieee802154_add_iface() Fix a memory leak in the ieee802154_add_iface() error handling path. Detected by Coverity: CID 710490. Signed-off-by: Christian Engelmayer Signed-off-by: David S. Miller diff --git a/net/ieee802154/nl-phy.c b/net/ieee802154/nl-phy.c index d08c7a4..89b265a 100644 --- a/net/ieee802154/nl-phy.c +++ b/net/ieee802154/nl-phy.c @@ -221,8 +221,10 @@ int ieee802154_add_iface(struct sk_buff *skb, struct genl_info *info) if (info->attrs[IEEE802154_ATTR_DEV_TYPE]) { type = nla_get_u8(info->attrs[IEEE802154_ATTR_DEV_TYPE]); - if (type >= __IEEE802154_DEV_MAX) - return -EINVAL; + if (type >= __IEEE802154_DEV_MAX) { + rc = -EINVAL; + goto nla_put_failure; + } } dev = phy->add_iface(phy, devname, type); -- cgit v0.10.2 From a02bbb1ccfe81d3e859ab2ddde6a2eff9c8b39fd Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Sun, 12 Jan 2014 13:37:56 +0200 Subject: MAINTAINERS: add virtio-dev ML for virtio Since virtio is an OASIS standard draft now, virtio implementation discussions are taking place on the virtio-dev OASIS mailing list. Update MAINTAINERS. Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller diff --git a/MAINTAINERS b/MAINTAINERS index 31a0462..6a6e4ac 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -9231,6 +9231,7 @@ F: include/media/videobuf2-* VIRTIO CONSOLE DRIVER M: Amit Shah +L: virtio-dev@lists.oasis-open.org L: virtualization@lists.linux-foundation.org S: Maintained F: drivers/char/virtio_console.c @@ -9240,6 +9241,7 @@ F: include/uapi/linux/virtio_console.h VIRTIO CORE, NET AND BLOCK DRIVERS M: Rusty Russell M: "Michael S. Tsirkin" +L: virtio-dev@lists.oasis-open.org L: virtualization@lists.linux-foundation.org S: Maintained F: drivers/virtio/ @@ -9252,6 +9254,7 @@ F: include/uapi/linux/virtio_*.h VIRTIO HOST (VHOST) M: "Michael S. Tsirkin" L: kvm@vger.kernel.org +L: virtio-dev@lists.oasis-open.org L: virtualization@lists.linux-foundation.org L: netdev@vger.kernel.org S: Maintained -- cgit v0.10.2 From 7c4b5175f65f31e0cd9867a6ddc3171007dfc110 Mon Sep 17 00:00:00 2001 From: Peter Korsgaard Date: Sun, 12 Jan 2014 23:15:51 +0100 Subject: dm9601: add USB IDs for new dm96xx variants A number of new dm96xx variants now exist. Reported-by: Joseph Chang Signed-off-by: Peter Korsgaard Signed-off-by: David S. Miller diff --git a/drivers/net/usb/dm9601.c b/drivers/net/usb/dm9601.c index 14aa48f..e802198 100644 --- a/drivers/net/usb/dm9601.c +++ b/drivers/net/usb/dm9601.c @@ -614,6 +614,18 @@ static const struct usb_device_id products[] = { USB_DEVICE(0x0a46, 0x9621), /* DM9621A USB to Fast Ethernet Adapter */ .driver_info = (unsigned long)&dm9601_info, }, + { + USB_DEVICE(0x0a46, 0x9622), /* DM9622 USB to Fast Ethernet Adapter */ + .driver_info = (unsigned long)&dm9601_info, + }, + { + USB_DEVICE(0x0a46, 0x0269), /* DM962OA USB to Fast Ethernet Adapter */ + .driver_info = (unsigned long)&dm9601_info, + }, + { + USB_DEVICE(0x0a46, 0x1269), /* DM9621A USB to Fast Ethernet Adapter */ + .driver_info = (unsigned long)&dm9601_info, + }, {}, // END }; -- cgit v0.10.2 From 95f4a45de1a0f172b35451fc52283290adb21f6e Mon Sep 17 00:00:00 2001 From: Hannes Frederic Sowa Date: Mon, 13 Jan 2014 02:45:22 +0100 Subject: net: avoid reference counter overflows on fib_rules in multicast forwarding Bob Falken reported that after 4G packets, multicast forwarding stopped working. This was because of a rule reference counter overflow which freed the rule as soon as the overflow happend. This patch solves this by adding the FIB_LOOKUP_NOREF flag to fib_rules_lookup calls. This is safe even from non-rcu locked sections as in this case the flag only implies not taking a reference to the rule, which we don't need at all. Rules only hold references to the namespace, which are guaranteed to be available during the call of the non-rcu protected function reg_vif_xmit because of the interface reference which itself holds a reference to the net namespace. Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables") Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables") Reported-by: Bob Falken Cc: Patrick McHardy Cc: Thomas Graf Cc: Julian Anastasov Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Acked-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 62212c7..1672409 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -157,9 +157,12 @@ static struct mr_table *ipmr_get_table(struct net *net, u32 id) static int ipmr_fib_lookup(struct net *net, struct flowi4 *flp4, struct mr_table **mrt) { - struct ipmr_result res; - struct fib_lookup_arg arg = { .result = &res, }; int err; + struct ipmr_result res; + struct fib_lookup_arg arg = { + .result = &res, + .flags = FIB_LOOKUP_NOREF, + }; err = fib_rules_lookup(net->ipv4.mr_rules_ops, flowi4_to_flowi(flp4), 0, &arg); diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c index f365310..0eb4038 100644 --- a/net/ipv6/ip6mr.c +++ b/net/ipv6/ip6mr.c @@ -141,9 +141,12 @@ static struct mr6_table *ip6mr_get_table(struct net *net, u32 id) static int ip6mr_fib_lookup(struct net *net, struct flowi6 *flp6, struct mr6_table **mrt) { - struct ip6mr_result res; - struct fib_lookup_arg arg = { .result = &res, }; int err; + struct ip6mr_result res; + struct fib_lookup_arg arg = { + .result = &res, + .flags = FIB_LOOKUP_NOREF, + }; err = fib_rules_lookup(net->ipv6.mr6_rules_ops, flowi6_to_flowi(flp6), 0, &arg); -- cgit v0.10.2 From 51bb352f15595f2dee42b599680809de3d08999d Mon Sep 17 00:00:00 2001 From: Jitendra Kalsaria Date: Tue, 14 Jan 2014 13:57:25 -0500 Subject: qlge: Fix vlan netdev features. vlan gets the same netdev features except vlan filter. Signed-off-by: Jitendra Kalsaria Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_main.c b/drivers/net/ethernet/qlogic/qlge/qlge_main.c index 449f506..f705aee 100644 --- a/drivers/net/ethernet/qlogic/qlge/qlge_main.c +++ b/drivers/net/ethernet/qlogic/qlge/qlge_main.c @@ -4765,6 +4765,8 @@ static int qlge_probe(struct pci_dev *pdev, NETIF_F_RXCSUM; ndev->features = ndev->hw_features; ndev->vlan_features = ndev->hw_features; + /* vlan gets same features (except vlan filter) */ + ndev->vlan_features &= ~NETIF_F_HW_VLAN_CTAG_FILTER; if (test_bit(QL_DMA64, &qdev->flags)) ndev->features |= NETIF_F_HIGHDMA; -- cgit v0.10.2 From fdd239ac99a0cc298b382c5ab5e7bcd09e8933d7 Mon Sep 17 00:00:00 2001 From: Ben Skeggs Date: Tue, 14 Jan 2014 14:56:22 +1000 Subject: drm/nouveau: fix null ptr dereferences on some boards Regression from "device: populate master subdev pointer only when fully constructed" Reported-by: Bob Gleitsmann Signed-off-by: Ben Skeggs diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/i2c.h b/drivers/gpu/drm/nouveau/core/include/subdev/i2c.h index 9fa5da7..7f50a85 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/i2c.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/i2c.h @@ -73,7 +73,7 @@ struct nouveau_i2c { int (*identify)(struct nouveau_i2c *, int index, const char *what, struct nouveau_i2c_board_info *, bool (*match)(struct nouveau_i2c_port *, - struct i2c_board_info *)); + struct i2c_board_info *, void *), void *); struct list_head ports; }; diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/instmem.h b/drivers/gpu/drm/nouveau/core/include/subdev/instmem.h index ec7a54e..4aca338 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/instmem.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/instmem.h @@ -50,6 +50,13 @@ struct nouveau_instmem { static inline struct nouveau_instmem * nouveau_instmem(void *obj) { + /* nv04/nv40 impls need to create objects in their constructor, + * which is before the subdev pointer is valid + */ + if (nv_iclass(obj, NV_SUBDEV_CLASS) && + nv_subidx(obj) == NVDEV_SUBDEV_INSTMEM) + return obj; + return (void *)nv_device(obj)->subdev[NVDEV_SUBDEV_INSTMEM]; } diff --git a/drivers/gpu/drm/nouveau/core/subdev/i2c/base.c b/drivers/gpu/drm/nouveau/core/subdev/i2c/base.c index 041fd5e..c33c03d 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/i2c/base.c +++ b/drivers/gpu/drm/nouveau/core/subdev/i2c/base.c @@ -197,7 +197,7 @@ static int nouveau_i2c_identify(struct nouveau_i2c *i2c, int index, const char *what, struct nouveau_i2c_board_info *info, bool (*match)(struct nouveau_i2c_port *, - struct i2c_board_info *)) + struct i2c_board_info *, void *), void *data) { struct nouveau_i2c_port *port = nouveau_i2c_find(i2c, index); int i; @@ -221,7 +221,7 @@ nouveau_i2c_identify(struct nouveau_i2c *i2c, int index, const char *what, } if (nv_probe_i2c(port, info[i].dev.addr) && - (!match || match(port, &info[i].dev))) { + (!match || match(port, &info[i].dev, data))) { nv_info(i2c, "detected %s: %s\n", what, info[i].dev.type); return i; diff --git a/drivers/gpu/drm/nouveau/core/subdev/therm/ic.c b/drivers/gpu/drm/nouveau/core/subdev/therm/ic.c index 13b8500..30c384e 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/therm/ic.c +++ b/drivers/gpu/drm/nouveau/core/subdev/therm/ic.c @@ -29,9 +29,9 @@ static bool probe_monitoring_device(struct nouveau_i2c_port *i2c, - struct i2c_board_info *info) + struct i2c_board_info *info, void *data) { - struct nouveau_therm_priv *priv = (void *)nouveau_therm(i2c); + struct nouveau_therm_priv *priv = data; struct nvbios_therm_sensor *sensor = &priv->bios_sensor; struct i2c_client *client; @@ -95,7 +95,7 @@ nouveau_therm_ic_ctor(struct nouveau_therm *therm) }; i2c->identify(i2c, NV_I2C_DEFAULT(0), "monitoring device", - board, probe_monitoring_device); + board, probe_monitoring_device, therm); if (priv->ic) return; } @@ -107,7 +107,7 @@ nouveau_therm_ic_ctor(struct nouveau_therm *therm) }; i2c->identify(i2c, NV_I2C_DEFAULT(0), "monitoring device", - board, probe_monitoring_device); + board, probe_monitoring_device, therm); if (priv->ic) return; } @@ -116,5 +116,5 @@ nouveau_therm_ic_ctor(struct nouveau_therm *therm) device. Let's try our static list. */ i2c->identify(i2c, NV_I2C_DEFAULT(0), "monitoring device", - nv_board_infos, probe_monitoring_device); + nv_board_infos, probe_monitoring_device, therm); } diff --git a/drivers/gpu/drm/nouveau/dispnv04/dfp.c b/drivers/gpu/drm/nouveau/dispnv04/dfp.c index 936a71c..7fdc51e 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/dfp.c +++ b/drivers/gpu/drm/nouveau/dispnv04/dfp.c @@ -643,7 +643,7 @@ static void nv04_tmds_slave_init(struct drm_encoder *encoder) get_tmds_slave(encoder)) return; - type = i2c->identify(i2c, 2, "TMDS transmitter", info, NULL); + type = i2c->identify(i2c, 2, "TMDS transmitter", info, NULL, NULL); if (type < 0) return; diff --git a/drivers/gpu/drm/nouveau/dispnv04/tvnv04.c b/drivers/gpu/drm/nouveau/dispnv04/tvnv04.c index cc4b208..244822d 100644 --- a/drivers/gpu/drm/nouveau/dispnv04/tvnv04.c +++ b/drivers/gpu/drm/nouveau/dispnv04/tvnv04.c @@ -59,7 +59,7 @@ int nv04_tv_identify(struct drm_device *dev, int i2c_index) struct nouveau_i2c *i2c = nouveau_i2c(drm->device); return i2c->identify(i2c, i2c_index, "TV encoder", - nv04_tv_encoder_info, NULL); + nv04_tv_encoder_info, NULL, NULL); } -- cgit v0.10.2 From 70f2fe3a26248724d8a5019681a869abdaf3e89a Mon Sep 17 00:00:00 2001 From: Andreas Rohner Date: Tue, 14 Jan 2014 17:56:36 -0800 Subject: nilfs2: fix segctor bug that causes file system corruption There is a bug in the function nilfs_segctor_collect, which results in active data being written to a segment, that is marked as clean. It is possible, that this segment is selected for a later segment construction, whereby the old data is overwritten. The problem shows itself with the following kernel log message: nilfs_sufile_do_cancel_free: segment 6533 must be clean Usually a few hours later the file system gets corrupted: NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0 NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660) The issue can be reproduced with a file system that is nearly full and with the cleaner running, while some IO intensive task is running. Although it is quite hard to reproduce. This is what happens: 1. The cleaner starts the segment construction 2. nilfs_segctor_collect is called 3. sc_stage is on NILFS_ST_SUFILE and segments are freed 4. sc_stage is on NILFS_ST_DAT current segment is full 5. nilfs_segctor_extend_segments is called, which allocates a new segment 6. The new segment is one of the segments freed in step 3 7. nilfs_sufile_cancel_freev is called and produces an error message 8. Loop around and the collection starts again 9. sc_stage is on NILFS_ST_SUFILE and segments are freed including the newly allocated segment, which will contain active data and can be allocated at a later time 10. A few hours later another segment construction allocates the segment and causes file system corruption This can be prevented by simply reordering the statements. If nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments the freed segments are marked as dirty and cannot be allocated any more. Signed-off-by: Andreas Rohner Reviewed-by: Ryusuke Konishi Tested-by: Andreas Rohner Signed-off-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c index 9f6b486..a1a1916 100644 --- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1440,17 +1440,19 @@ static int nilfs_segctor_collect(struct nilfs_sc_info *sci, nilfs_clear_logs(&sci->sc_segbufs); - err = nilfs_segctor_extend_segments(sci, nilfs, nadd); - if (unlikely(err)) - return err; - if (sci->sc_stage.flags & NILFS_CF_SUFREED) { err = nilfs_sufile_cancel_freev(nilfs->ns_sufile, sci->sc_freesegs, sci->sc_nfreesegs, NULL); WARN_ON(err); /* do not happen */ + sci->sc_stage.flags &= ~NILFS_CF_SUFREED; } + + err = nilfs_segctor_extend_segments(sci, nilfs, nadd); + if (unlikely(err)) + return err; + nadd = min_t(int, nadd << 1, SC_MAX_SEGDELTA); sci->sc_stage = prev_stage; } -- cgit v0.10.2 From bad009fe354a00e6b2bf87328995ec76e59ab970 Mon Sep 17 00:00:00 2001 From: Huacai Chen Date: Tue, 14 Jan 2014 17:56:37 -0800 Subject: MIPS: fix case mismatch in local_r4k_flush_icache_range() Currently, Loongson-2 call protected_blast_icache_range() and others call protected_loongson23_blast_icache_range(), but I think the correct behavior should be the opposite. BTW, Loongson-3's cache-ops is compatible with MIPS64, but not compatible with Loongson-2. So, rename xxx_loongson23_yyy things to xxx_loongson2_yyy. The patch fixes early boot hang with 3.13-rc1, introduced in commit 14bd8c082016 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over arch/mips"). Signed-off-by: Huacai Chen Signed-off-by: Aaro Koskinen Reviewed-by: Aurelien Jarno Acked-by: John Crispin Cc: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/arch/mips/include/asm/cacheops.h b/arch/mips/include/asm/cacheops.h index c75025f..06b9bc7 100644 --- a/arch/mips/include/asm/cacheops.h +++ b/arch/mips/include/asm/cacheops.h @@ -83,6 +83,6 @@ /* * Loongson2-specific cacheops */ -#define Hit_Invalidate_I_Loongson23 0x00 +#define Hit_Invalidate_I_Loongson2 0x00 #endif /* __ASM_CACHEOPS_H */ diff --git a/arch/mips/include/asm/r4kcache.h b/arch/mips/include/asm/r4kcache.h index 34d1a19..91d20b0 100644 --- a/arch/mips/include/asm/r4kcache.h +++ b/arch/mips/include/asm/r4kcache.h @@ -165,7 +165,7 @@ static inline void flush_icache_line(unsigned long addr) __iflush_prologue switch (boot_cpu_type()) { case CPU_LOONGSON2: - cache_op(Hit_Invalidate_I_Loongson23, addr); + cache_op(Hit_Invalidate_I_Loongson2, addr); break; default: @@ -219,7 +219,7 @@ static inline void protected_flush_icache_line(unsigned long addr) { switch (boot_cpu_type()) { case CPU_LOONGSON2: - protected_cache_op(Hit_Invalidate_I_Loongson23, addr); + protected_cache_op(Hit_Invalidate_I_Loongson2, addr); break; default: @@ -452,8 +452,8 @@ static inline void prot##extra##blast_##pfx##cache##_range(unsigned long start, __BUILD_BLAST_CACHE_RANGE(d, dcache, Hit_Writeback_Inv_D, protected_, ) __BUILD_BLAST_CACHE_RANGE(s, scache, Hit_Writeback_Inv_SD, protected_, ) __BUILD_BLAST_CACHE_RANGE(i, icache, Hit_Invalidate_I, protected_, ) -__BUILD_BLAST_CACHE_RANGE(i, icache, Hit_Invalidate_I_Loongson23, \ - protected_, loongson23_) +__BUILD_BLAST_CACHE_RANGE(i, icache, Hit_Invalidate_I_Loongson2, \ + protected_, loongson2_) __BUILD_BLAST_CACHE_RANGE(d, dcache, Hit_Writeback_Inv_D, , ) __BUILD_BLAST_CACHE_RANGE(s, scache, Hit_Writeback_Inv_SD, , ) /* blast_inv_dcache_range */ diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c index 62ffd20..73f02da6 100644 --- a/arch/mips/mm/c-r4k.c +++ b/arch/mips/mm/c-r4k.c @@ -580,11 +580,11 @@ static inline void local_r4k_flush_icache_range(unsigned long start, unsigned lo else { switch (boot_cpu_type()) { case CPU_LOONGSON2: - protected_blast_icache_range(start, end); + protected_loongson2_blast_icache_range(start, end); break; default: - protected_loongson23_blast_icache_range(start, end); + protected_blast_icache_range(start, end); break; } } -- cgit v0.10.2 From 43a06847b9d277e9f2c3bf8052b44b74e17526c7 Mon Sep 17 00:00:00 2001 From: Aaro Koskinen Date: Tue, 14 Jan 2014 17:56:38 -0800 Subject: MIPS: fix blast_icache32 on loongson2 Commit 14bd8c082016 ("MIPS: Loongson: Get rid of Loongson 2 #ifdefery all over arch/mips") failed to add Loongson2 specific blast_icache32 functions. Fix that. The patch fixes the following crash seen with 3.13-rc1: Reserved instruction in kernel code[#1]: [...] Call Trace: blast_icache32_page+0x8/0xb0 r4k_flush_cache_page+0x19c/0x200 do_wp_page.isra.97+0x47c/0xe08 handle_mm_fault+0x938/0x1118 __do_page_fault+0x140/0x540 resume_userspace_check+0x0/0x10 Code: 00200825 64834000 00200825 bc900020 bc900040 bc900060 bc900080 bc9000a0 Signed-off-by: Aaro Koskinen Reviewed-by: Aurelien Jarno Acked-by: John Crispin Cc: Ralf Baechle Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/arch/mips/include/asm/r4kcache.h b/arch/mips/include/asm/r4kcache.h index 91d20b0..c84cadd 100644 --- a/arch/mips/include/asm/r4kcache.h +++ b/arch/mips/include/asm/r4kcache.h @@ -357,8 +357,8 @@ static inline void invalidate_tcache_page(unsigned long addr) "i" (op)); /* build blast_xxx, blast_xxx_page, blast_xxx_page_indexed */ -#define __BUILD_BLAST_CACHE(pfx, desc, indexop, hitop, lsize) \ -static inline void blast_##pfx##cache##lsize(void) \ +#define __BUILD_BLAST_CACHE(pfx, desc, indexop, hitop, lsize, extra) \ +static inline void extra##blast_##pfx##cache##lsize(void) \ { \ unsigned long start = INDEX_BASE; \ unsigned long end = start + current_cpu_data.desc.waysize; \ @@ -376,7 +376,7 @@ static inline void blast_##pfx##cache##lsize(void) \ __##pfx##flush_epilogue \ } \ \ -static inline void blast_##pfx##cache##lsize##_page(unsigned long page) \ +static inline void extra##blast_##pfx##cache##lsize##_page(unsigned long page) \ { \ unsigned long start = page; \ unsigned long end = page + PAGE_SIZE; \ @@ -391,7 +391,7 @@ static inline void blast_##pfx##cache##lsize##_page(unsigned long page) \ __##pfx##flush_epilogue \ } \ \ -static inline void blast_##pfx##cache##lsize##_page_indexed(unsigned long page) \ +static inline void extra##blast_##pfx##cache##lsize##_page_indexed(unsigned long page) \ { \ unsigned long indexmask = current_cpu_data.desc.waysize - 1; \ unsigned long start = INDEX_BASE + (page & indexmask); \ @@ -410,23 +410,24 @@ static inline void blast_##pfx##cache##lsize##_page_indexed(unsigned long page) __##pfx##flush_epilogue \ } -__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 16) -__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 16) -__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 16) -__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 32) -__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 32) -__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 32) -__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 64) -__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 64) -__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 64) -__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 128) - -__BUILD_BLAST_CACHE(inv_d, dcache, Index_Writeback_Inv_D, Hit_Invalidate_D, 16) -__BUILD_BLAST_CACHE(inv_d, dcache, Index_Writeback_Inv_D, Hit_Invalidate_D, 32) -__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 16) -__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 32) -__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 64) -__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 128) +__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 16, ) +__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 16, ) +__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 16, ) +__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 32, ) +__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 32, ) +__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I_Loongson2, 32, loongson2_) +__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 32, ) +__BUILD_BLAST_CACHE(d, dcache, Index_Writeback_Inv_D, Hit_Writeback_Inv_D, 64, ) +__BUILD_BLAST_CACHE(i, icache, Index_Invalidate_I, Hit_Invalidate_I, 64, ) +__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 64, ) +__BUILD_BLAST_CACHE(s, scache, Index_Writeback_Inv_SD, Hit_Writeback_Inv_SD, 128, ) + +__BUILD_BLAST_CACHE(inv_d, dcache, Index_Writeback_Inv_D, Hit_Invalidate_D, 16, ) +__BUILD_BLAST_CACHE(inv_d, dcache, Index_Writeback_Inv_D, Hit_Invalidate_D, 32, ) +__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 16, ) +__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 32, ) +__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 64, ) +__BUILD_BLAST_CACHE(inv_s, scache, Index_Writeback_Inv_SD, Hit_Invalidate_SD, 128, ) /* build blast_xxx_range, protected_blast_xxx_range */ #define __BUILD_BLAST_CACHE_RANGE(pfx, desc, hitop, prot, extra) \ diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c index 73f02da6..49e572d 100644 --- a/arch/mips/mm/c-r4k.c +++ b/arch/mips/mm/c-r4k.c @@ -237,6 +237,8 @@ static void r4k_blast_icache_page_setup(void) r4k_blast_icache_page = (void *)cache_noop; else if (ic_lsize == 16) r4k_blast_icache_page = blast_icache16_page; + else if (ic_lsize == 32 && current_cpu_type() == CPU_LOONGSON2) + r4k_blast_icache_page = loongson2_blast_icache32_page; else if (ic_lsize == 32) r4k_blast_icache_page = blast_icache32_page; else if (ic_lsize == 64) @@ -261,6 +263,9 @@ static void r4k_blast_icache_page_indexed_setup(void) else if (TX49XX_ICACHE_INDEX_INV_WAR) r4k_blast_icache_page_indexed = tx49_blast_icache32_page_indexed; + else if (current_cpu_type() == CPU_LOONGSON2) + r4k_blast_icache_page_indexed = + loongson2_blast_icache32_page_indexed; else r4k_blast_icache_page_indexed = blast_icache32_page_indexed; @@ -284,6 +289,8 @@ static void r4k_blast_icache_setup(void) r4k_blast_icache = blast_r4600_v1_icache32; else if (TX49XX_ICACHE_INDEX_INV_WAR) r4k_blast_icache = tx49_blast_icache32; + else if (current_cpu_type() == CPU_LOONGSON2) + r4k_blast_icache = loongson2_blast_icache32; else r4k_blast_icache = blast_icache32; } else if (ic_lsize == 64) -- cgit v0.10.2 From 03e5ac2fc3bf6f4140db0371e8bb4243b24e3e02 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Tue, 14 Jan 2014 17:56:40 -0800 Subject: mm: fix crash when using XFS on loopback Commit 8456a648cf44 ("slab: use struct page for slab management") causes a crash in the LVM2 testsuite on PA-RISC (the crashing test is fsadm.sh). The testsuite doesn't crash on 3.12, crashes on 3.13-rc1 and later. Bad Address (null pointer deref?): Code=15 regs=000000413edd89a0 (Addr=000006202224647d) CPU: 3 PID: 24008 Comm: loop0 Not tainted 3.13.0-rc6 #5 task: 00000001bf3c0048 ti: 000000413edd8000 task.ti: 000000413edd8000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111100100001110 Not tainted r00-03 000000ff0806f90e 00000000405c8de0 000000004013e6c0 000000413edd83f0 r04-07 00000000405a95e0 0000000000000200 00000001414735f0 00000001bf349e40 r08-11 0000000010fe3d10 0000000000000001 00000040829c7778 000000413efd9000 r12-15 0000000000000000 000000004060d800 0000000010fe3000 0000000010fe3000 r16-19 000000413edd82a0 00000041078ddbc0 0000000000000010 0000000000000001 r20-23 0008f3d0d83a8000 0000000000000000 00000040829c7778 0000000000000080 r24-27 00000001bf349e40 00000001bf349e40 202d66202224640d 00000000405a95e0 r28-31 202d662022246465 000000413edd88f0 000000413edd89a0 0000000000000001 sr00-03 000000000532c000 0000000000000000 0000000000000000 000000000532c000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401fe42c 00000000401fe430 IIR: 539c0030 ISR: 00000000202d6000 IOR: 000006202224647d CPU: 3 CR30: 000000413edd8000 CR31: 0000000000000000 ORIG_R28: 00000000405a95e0 IAOQ[0]: vma_interval_tree_iter_first+0x14/0x48 IAOQ[1]: vma_interval_tree_iter_first+0x18/0x48 RP(r2): flush_dcache_page+0x128/0x388 Backtrace: flush_dcache_page+0x128/0x388 lo_splice_actor+0x90/0x148 [loop] splice_from_pipe_feed+0xc0/0x1d0 __splice_from_pipe+0xac/0xc0 lo_direct_splice_actor+0x1c/0x70 [loop] splice_direct_to_actor+0xec/0x228 lo_receive+0xe4/0x298 [loop] loop_thread+0x478/0x640 [loop] kthread+0x134/0x168 end_fault_vector+0x20/0x28 xfs_setsize_buftarg+0x0/0x90 [xfs] Kernel panic - not syncing: Bad Address (null pointer deref?) Commit 8456a648cf44 changes the page structure so that the slab subsystem reuses the page->mapping field. The crash happens in the following way: * XFS allocates some memory from slab and issues a bio to read data into it. * the bio is sent to the loopback device. * lo_receive creates an actor and calls splice_direct_to_actor. * lo_splice_actor copies data to the target page. * lo_splice_actor calls flush_dcache_page because the page may be mapped by userspace. In that case we need to flush the kernel cache. * flush_dcache_page asks for the list of userspace mappings, however that page->mapping field is reused by the slab subsystem for a different purpose. This causes the crash. Note that other architectures without coherent caches (sparc, arm, mips) also call page_mapping from flush_dcache_page, so they may crash in the same way. This patch fixes this bug by testing if the page is a slab page in page_mapping and returning NULL if it is. The patch also fixes VM_BUG_ON(PageSlab(page)) that could happen in earlier kernels in the same scenario on architectures without cache coherence when CONFIG_DEBUG_VM is enabled - so it should be backported to stable kernels. In the old kernels, the function page_mapping is placed in include/linux/mm.h, so you should modify the patch accordingly when backporting it. Signed-off-by: Mikulas Patocka Cc: John David Anglin ] Cc: Andi Kleen Cc: Christoph Lameter Acked-by: Pekka Enberg Reviewed-by: Joonsoo Kim Cc: Helge Deller Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/mm/util.c b/mm/util.c index f7bc209..808f375 100644 --- a/mm/util.c +++ b/mm/util.c @@ -390,7 +390,10 @@ struct address_space *page_mapping(struct page *page) { struct address_space *mapping = page->mapping; - VM_BUG_ON(PageSlab(page)); + /* This happens if someone calls flush_dcache_page on slab page */ + if (unlikely(PageSlab(page))) + return NULL; + if (unlikely(PageSwapCache(page))) { swp_entry_t entry; -- cgit v0.10.2 From 5a610fcc7390ee60308deaf09426ada87a1eeec2 Mon Sep 17 00:00:00 2001 From: Qais Yousef Date: Tue, 14 Jan 2014 17:56:41 -0800 Subject: crash_dump: fix compilation error (on MIPS at least) In file included from kernel/crash_dump.c:2:0: include/linux/crash_dump.h:22:27: error: unknown type name `pgprot_t' when CONFIG_CRASH_DUMP=y The error was traced back to commit 9cb218131de1 ("vmcore: introduce remap_oldmem_pfn_range()") include to get the missing definition Signed-off-by: Qais Yousef Reviewed-by: James Hogan Cc: Michael Holzheu Acked-by: Vivek Goyal Cc: [3.12+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h index fe68a5a..7032518 100644 --- a/include/linux/crash_dump.h +++ b/include/linux/crash_dump.h @@ -6,6 +6,8 @@ #include #include +#include /* for pgprot_t */ + #define ELFCORE_ADDR_MAX (-1ULL) #define ELFCORE_ADDR_ERR (-2ULL) -- cgit v0.10.2 From 74e72f894d56eb9d2e1218530c658e7d297e002b Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 14 Jan 2014 17:56:42 -0800 Subject: lib/percpu_counter.c: fix __percpu_counter_add() __percpu_counter_add() may be called in softirq/hardirq handler (such as, blk_mq_queue_exit() is typically called in hardirq/softirq handler), so we need to call this_cpu_add()(irq safe helper) to update percpu counter, otherwise counts may be lost. This fixes the problem that 'rmmod null_blk' hangs in blk_cleanup_queue() because of miscounting of request_queue->mq_usage_counter. This patch is the v1 of previous one of "lib/percpu_counter.c: disable local irq when updating percpu couter", and takes Andrew's approach which may be more efficient for ARCHs(x86, s390) that have optimized this_cpu_add(). Signed-off-by: Ming Lei Cc: Paul Gortmaker Cc: Shaohua Li Cc: Jens Axboe Cc: Fan Du Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index 7473ee3..1da85bb 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -82,10 +82,10 @@ void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch) unsigned long flags; raw_spin_lock_irqsave(&fbc->lock, flags); fbc->count += count; + __this_cpu_sub(*fbc->counters, count); raw_spin_unlock_irqrestore(&fbc->lock, flags); - __this_cpu_write(*fbc->counters, 0); } else { - __this_cpu_write(*fbc->counters, count); + this_cpu_add(*fbc->counters, amount); } preempt_enable(); } -- cgit v0.10.2 From 0dce7cd67fd9055c4a2ff278f8af1431e646d346 Mon Sep 17 00:00:00 2001 From: Andrew Jones Date: Wed, 15 Jan 2014 13:39:59 +0100 Subject: kvm: x86: fix apic_base enable check Commit e66d2ae7c67bd moved the assignment vcpu->arch.apic_base = value above a condition with (vcpu->arch.apic_base ^ value), causing that check to always fail. Use old_value, vcpu->arch.apic_base's old value, in the condition instead. Signed-off-by: Andrew Jones Signed-off-by: Paolo Bonzini diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 1673940..775702f 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1355,7 +1355,7 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value) vcpu->arch.apic_base = value; /* update jump label if enable bit changes */ - if ((vcpu->arch.apic_base ^ value) & MSR_IA32_APICBASE_ENABLE) { + if ((old_value ^ value) & MSR_IA32_APICBASE_ENABLE) { if (value & MSR_IA32_APICBASE_ENABLE) static_key_slow_dec_deferred(&apic_hw_disabled); else -- cgit v0.10.2 From 1df0cbd509bc21b0c331358c1f9d9a6fc94bada8 Mon Sep 17 00:00:00 2001 From: Marek Lindner Date: Wed, 15 Jan 2014 20:31:18 +0800 Subject: batman-adv: fix batman-adv header overhead calculation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Batman-adv prepends a full ethernet header in addition to its own header. This has to be reflected in the MTU calculation, especially since the value is used to set dev->hard_header_len. Introduced by 411d6ed93a5d0601980d3e5ce75de07c98e3a7de ("batman-adv: consider network coding overhead when calculating required mtu") Reported-by: cmsv Reported-by: Martin Hundebøll Signed-off-by: Marek Lindner Signed-off-by: Antonio Quartulli diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c index 1511f64..faba0f6 100644 --- a/net/batman-adv/main.c +++ b/net/batman-adv/main.c @@ -277,7 +277,7 @@ int batadv_max_header_len(void) sizeof(struct batadv_coded_packet)); #endif - return header_len; + return header_len + ETH_HLEN; } /** -- cgit v0.10.2 From a926592f5e4e900f3fa903298c4619a131e60963 Mon Sep 17 00:00:00 2001 From: Richard Weinberger Date: Tue, 14 Jan 2014 22:46:36 +0100 Subject: net,via-rhine: Fix tx_timeout handling rhine_reset_task() misses to disable the tx scheduler upon reset, this can lead to a crash if work is still scheduled while we're resetting the tx queue. Fixes: [ 93.591707] BUG: unable to handle kernel NULL pointer dereference at 0000004c [ 93.595514] IP: [] rhine_napipoll+0x491/0x6 Signed-off-by: Richard Weinberger Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c index cce6c4b..ef312bc 100644 --- a/drivers/net/ethernet/via/via-rhine.c +++ b/drivers/net/ethernet/via/via-rhine.c @@ -1618,6 +1618,7 @@ static void rhine_reset_task(struct work_struct *work) goto out_unlock; napi_disable(&rp->napi); + netif_tx_disable(dev); spin_lock_bh(&rp->lock); /* clear all descriptors */ -- cgit v0.10.2 From d9aee591b0f06bd44cd577b757d3f267bc35fe4d Mon Sep 17 00:00:00 2001 From: Yuval Mintz Date: Wed, 15 Jan 2014 12:05:30 +0200 Subject: bnx2x: Don't release PCI bars on shutdown The bnx2x driver in its pci shutdown() callback releases its pci bars (in the same manner it does during its pci remove() callback). During a system reboot while VFs are enabled, its possible for the VF's remove to be called (as a result of pci_disable_sriov()) after its shutdown callback has already finished running; This will cause a paging request fault as the VF tries to access the pci bar which it has previously released, crashing the system. This patch further differentiates the shutdown and remove callbacks, preventing the pci release procedures from being called during shutdown. Signed-off-by: Yuval Mintz Signed-off-by: Ariel Elior Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c index 8b3107b..0067b97 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c @@ -12942,25 +12942,26 @@ static void __bnx2x_remove(struct pci_dev *pdev, pci_set_power_state(pdev, PCI_D3hot); } - if (bp->regview) - iounmap(bp->regview); + if (remove_netdev) { + if (bp->regview) + iounmap(bp->regview); - /* for vf doorbells are part of the regview and were unmapped along with - * it. FW is only loaded by PF. - */ - if (IS_PF(bp)) { - if (bp->doorbells) - iounmap(bp->doorbells); + /* For vfs, doorbells are part of the regview and were unmapped + * along with it. FW is only loaded by PF. + */ + if (IS_PF(bp)) { + if (bp->doorbells) + iounmap(bp->doorbells); - bnx2x_release_firmware(bp); - } - bnx2x_free_mem_bp(bp); + bnx2x_release_firmware(bp); + } + bnx2x_free_mem_bp(bp); - if (remove_netdev) free_netdev(dev); - if (atomic_read(&pdev->enable_cnt) == 1) - pci_release_regions(pdev); + if (atomic_read(&pdev->enable_cnt) == 1) + pci_release_regions(pdev); + } pci_disable_device(pdev); } -- cgit v0.10.2 From ba42fad0964a41f0830e80c1b6be49c1e6bfcc01 Mon Sep 17 00:00:00 2001 From: Ivan Vecera Date: Wed, 15 Jan 2014 11:11:34 +0100 Subject: be2net: add dma_mapping_error() check for dma_map_page() The driver does not check value returned by dma_map_page. The patch fixes this. v2: Removed the bugfix for non-bug ;-) (thanks Sathya) Signed-off-by: Ivan Vecera Acked-by: Sathya Perla Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c index bf40fda..a37039d 100644 --- a/drivers/net/ethernet/emulex/benet/be_main.c +++ b/drivers/net/ethernet/emulex/benet/be_main.c @@ -1776,6 +1776,7 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp) struct be_rx_page_info *page_info = NULL, *prev_page_info = NULL; struct be_queue_info *rxq = &rxo->q; struct page *pagep = NULL; + struct device *dev = &adapter->pdev->dev; struct be_eth_rx_d *rxd; u64 page_dmaaddr = 0, frag_dmaaddr; u32 posted, page_offset = 0; @@ -1788,9 +1789,15 @@ static void be_post_rx_frags(struct be_rx_obj *rxo, gfp_t gfp) rx_stats(rxo)->rx_post_fail++; break; } - page_dmaaddr = dma_map_page(&adapter->pdev->dev, pagep, - 0, adapter->big_page_size, + page_dmaaddr = dma_map_page(dev, pagep, 0, + adapter->big_page_size, DMA_FROM_DEVICE); + if (dma_mapping_error(dev, page_dmaaddr)) { + put_page(pagep); + pagep = NULL; + rx_stats(rxo)->rx_post_fail++; + break; + } page_info->page_offset = 0; } else { get_page(pagep); -- cgit v0.10.2 From aee636c4809fa54848ff07a899b326eb1f9987a2 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Wed, 15 Jan 2014 06:50:07 -0800 Subject: bpf: do not use reciprocal divide At first Jakub Zawadzki noticed that some divisions by reciprocal_divide were not correct. (off by one in some cases) http://www.wireshark.org/~darkjames/reciprocal-buggy.c He could also show this with BPF: http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c The reciprocal divide in linux kernel is not generic enough, lets remove its use in BPF, as it is not worth the pain with current cpus. Signed-off-by: Eric Dumazet Reported-by: Jakub Zawadzki Cc: Mircea Gherzan Cc: Daniel Borkmann Cc: Hannes Frederic Sowa Cc: Matt Evans Cc: Martin Schwidefsky Cc: Heiko Carstens Cc: David S. Miller Signed-off-by: David S. Miller diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c index 9ed155a..271b5e9 100644 --- a/arch/arm/net/bpf_jit_32.c +++ b/arch/arm/net/bpf_jit_32.c @@ -641,10 +641,10 @@ load_ind: emit(ARM_MUL(r_A, r_A, r_X), ctx); break; case BPF_S_ALU_DIV_K: - /* current k == reciprocal_value(userspace k) */ + if (k == 1) + break; emit_mov_i(r_scratch, k, ctx); - /* A = top 32 bits of the product */ - emit(ARM_UMULL(r_scratch, r_A, r_A, r_scratch), ctx); + emit_udiv(r_A, r_A, r_scratch, ctx); break; case BPF_S_ALU_DIV_X: update_on_xread(ctx); diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index ac3c2a1..555034f 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -223,10 +223,11 @@ static int bpf_jit_build_body(struct sk_filter *fp, u32 *image, } PPC_DIVWU(r_A, r_A, r_X); break; - case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */ + case BPF_S_ALU_DIV_K: /* A /= K */ + if (K == 1) + break; PPC_LI32(r_scratch1, K); - /* Top 32 bits of 64bit result -> A */ - PPC_MULHWU(r_A, r_A, r_scratch1); + PPC_DIVWU(r_A, r_A, r_scratch1); break; case BPF_S_ALU_AND_X: ctx->seen |= SEEN_XREG; diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index 16871da..fc0fa77 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -371,11 +371,13 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter, /* dr %r4,%r12 */ EMIT2(0x1d4c); break; - case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K) */ - /* m %r4,(%r13) */ - EMIT4_DISP(0x5c40d000, EMIT_CONST(K)); - /* lr %r5,%r4 */ - EMIT2(0x1854); + case BPF_S_ALU_DIV_K: /* A /= K */ + if (K == 1) + break; + /* lhi %r4,0 */ + EMIT4(0xa7480000); + /* d %r4,(%r13) */ + EMIT4_DISP(0x5d40d000, EMIT_CONST(K)); break; case BPF_S_ALU_MOD_X: /* A %= X */ jit->seen |= SEEN_XREG | SEEN_RET0; @@ -391,6 +393,11 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter, EMIT2(0x1854); break; case BPF_S_ALU_MOD_K: /* A %= K */ + if (K == 1) { + /* lhi %r5,0 */ + EMIT4(0xa7580000); + break; + } /* lhi %r4,0 */ EMIT4(0xa7480000); /* d %r4,(%r13) */ diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c index 218b6b2..01fe994 100644 --- a/arch/sparc/net/bpf_jit_comp.c +++ b/arch/sparc/net/bpf_jit_comp.c @@ -497,9 +497,20 @@ void bpf_jit_compile(struct sk_filter *fp) case BPF_S_ALU_MUL_K: /* A *= K */ emit_alu_K(MUL, K); break; - case BPF_S_ALU_DIV_K: /* A /= K */ - emit_alu_K(MUL, K); - emit_read_y(r_A); + case BPF_S_ALU_DIV_K: /* A /= K with K != 0*/ + if (K == 1) + break; + emit_write_y(G0); +#ifdef CONFIG_SPARC32 + /* The Sparc v8 architecture requires + * three instructions between a %y + * register write and the first use. + */ + emit_nop(); + emit_nop(); + emit_nop(); +#endif + emit_alu_K(DIV, K); break; case BPF_S_ALU_DIV_X: /* A /= X; */ emit_cmpi(r_X, 0); diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 26328e8..4ed75dd 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -359,15 +359,21 @@ void bpf_jit_compile(struct sk_filter *fp) EMIT2(0x89, 0xd0); /* mov %edx,%eax */ break; case BPF_S_ALU_MOD_K: /* A %= K; */ + if (K == 1) { + CLEAR_A(); + break; + } EMIT2(0x31, 0xd2); /* xor %edx,%edx */ EMIT1(0xb9);EMIT(K, 4); /* mov imm32,%ecx */ EMIT2(0xf7, 0xf1); /* div %ecx */ EMIT2(0x89, 0xd0); /* mov %edx,%eax */ break; - case BPF_S_ALU_DIV_K: /* A = reciprocal_divide(A, K); */ - EMIT3(0x48, 0x69, 0xc0); /* imul imm32,%rax,%rax */ - EMIT(K, 4); - EMIT4(0x48, 0xc1, 0xe8, 0x20); /* shr $0x20,%rax */ + case BPF_S_ALU_DIV_K: /* A /= K */ + if (K == 1) + break; + EMIT2(0x31, 0xd2); /* xor %edx,%edx */ + EMIT1(0xb9);EMIT(K, 4); /* mov imm32,%ecx */ + EMIT2(0xf7, 0xf1); /* div %ecx */ break; case BPF_S_ALU_AND_X: seen |= SEEN_XREG; diff --git a/net/core/filter.c b/net/core/filter.c index 01b7808..ad30d62 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -36,7 +36,6 @@ #include #include #include -#include #include #include #include @@ -166,7 +165,7 @@ unsigned int sk_run_filter(const struct sk_buff *skb, A /= X; continue; case BPF_S_ALU_DIV_K: - A = reciprocal_divide(A, K); + A /= K; continue; case BPF_S_ALU_MOD_X: if (X == 0) @@ -553,11 +552,6 @@ int sk_chk_filter(struct sock_filter *filter, unsigned int flen) /* Some instructions need special checks */ switch (code) { case BPF_S_ALU_DIV_K: - /* check for division by zero */ - if (ftest->k == 0) - return -EINVAL; - ftest->k = reciprocal_value(ftest->k); - break; case BPF_S_ALU_MOD_K: /* check for division by zero */ if (ftest->k == 0) @@ -853,27 +847,7 @@ void sk_decode_filter(struct sock_filter *filt, struct sock_filter *to) to->code = decodes[code]; to->jt = filt->jt; to->jf = filt->jf; - - if (code == BPF_S_ALU_DIV_K) { - /* - * When loaded this rule user gave us X, which was - * translated into R = r(X). Now we calculate the - * RR = r(R) and report it back. If next time this - * value is loaded and RRR = r(RR) is calculated - * then the R == RRR will be true. - * - * One exception. X == 1 translates into R == 0 and - * we can't calculate RR out of it with r(). - */ - - if (filt->k == 0) - to->k = 1; - else - to->k = reciprocal_value(filt->k); - - BUG_ON(reciprocal_value(to->k) != filt->k); - } else - to->k = filt->k; + to->k = filt->k; } int sk_get_filter(struct sock *sk, struct sock_filter __user *ubuf, unsigned int len) -- cgit v0.10.2 From 4ce00dfcf19c473f3dbf23d5b1372639f0c334f6 Mon Sep 17 00:00:00 2001 From: Catalin Marinas Date: Thu, 16 Jan 2014 18:32:25 +0000 Subject: Revert "arm64: Fix memory shareability attribute for ioremap_wc/cache" This reverts commit 2f7dc6027522499582a520807cb9ffda589de47e. The above commit breaks the mapping type for Device memory because pgprot_default already contains a Normal memory type. pgprot_default is also not initialised early enough for earlyprintk resulting in an inconsistent memory mapping with 64K PAGE_SIZE configuration. Signed-off-by: Catalin Marinas Reported-by: Will Deacon Acked-by: Will Deacon diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h index 5727697..4cc813e 100644 --- a/arch/arm64/include/asm/io.h +++ b/arch/arm64/include/asm/io.h @@ -229,7 +229,7 @@ extern void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot extern void __iounmap(volatile void __iomem *addr); extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size); -#define PROT_DEFAULT (pgprot_default | PTE_DIRTY) +#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_DIRTY) #define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_ATTRINDX(MT_DEVICE_nGnRE)) #define PROT_NORMAL_NC (PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL_NC)) #define PROT_NORMAL (PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL)) -- cgit v0.10.2 From 38a529b5d42e4cfc5ac94844e61335a00eb2d320 Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Thu, 16 Jan 2014 14:39:39 +0200 Subject: e1000e: Fix compilation warning when !CONFIG_PM_SLEEP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 7509963c703b (e1000e: Fix a compile flag mis-match for suspend/resume) moved suspend and resume hooks to be available when CONFIG_PM is set. However, it can be set even if CONFIG_PM_SLEEP is not set causing following warnings to be emitted: drivers/net/ethernet/intel/e1000e/netdev.c:6178:12: warning: ‘e1000_suspend’ defined but not used [-Wunused-function] drivers/net/ethernet/intel/e1000e/netdev.c:6185:12: warning: ‘e1000_resume’ defined but not used [-Wunused-function] To fix this make the hooks to be available only when CONFIG_PM_SLEEP is set and remove CONFIG_PM wrapping from driver ops because this is already handled by SET_SYSTEM_SLEEP_PM_OPS() and SET_RUNTIME_PM_OPS(). Signed-off-by: Mika Westerberg Cc: Dave Ertman Cc: Aaron Brown Cc: Jeff Kirsher Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index c30d41d..6d14eea 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -6174,7 +6174,7 @@ static int __e1000_resume(struct pci_dev *pdev) return 0; } -#ifdef CONFIG_PM +#ifdef CONFIG_PM_SLEEP static int e1000_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); @@ -6193,7 +6193,7 @@ static int e1000_resume(struct device *dev) return __e1000_resume(pdev); } -#endif /* CONFIG_PM */ +#endif /* CONFIG_PM_SLEEP */ #ifdef CONFIG_PM_RUNTIME static int e1000_runtime_suspend(struct device *dev) @@ -7015,13 +7015,11 @@ static DEFINE_PCI_DEVICE_TABLE(e1000_pci_tbl) = { }; MODULE_DEVICE_TABLE(pci, e1000_pci_tbl); -#ifdef CONFIG_PM static const struct dev_pm_ops e1000_pm_ops = { SET_SYSTEM_SLEEP_PM_OPS(e1000_suspend, e1000_resume) SET_RUNTIME_PM_OPS(e1000_runtime_suspend, e1000_runtime_resume, e1000_idle) }; -#endif /* PCI Device API Driver */ static struct pci_driver e1000_driver = { @@ -7029,11 +7027,9 @@ static struct pci_driver e1000_driver = { .id_table = e1000_pci_tbl, .probe = e1000_probe, .remove = e1000_remove, -#ifdef CONFIG_PM .driver = { .pm = &e1000_pm_ops, }, -#endif .shutdown = e1000_shutdown, .err_handler = &e1000_err_handler }; -- cgit v0.10.2 From d1969a84dd6a44d375aa82bba7d6c38713a429c3 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 16 Jan 2014 15:26:48 -0800 Subject: percpu_counter: unbreak __percpu_counter_add() Commit 74e72f894d56 ("lib/percpu_counter.c: fix __percpu_counter_add()") looked very plausible, but its arithmetic was badly wrong: obvious once you see the fix, but maddening to get there from the weird tmpfs ENOSPCs Signed-off-by: Hugh Dickins Cc: Ming Lei Cc: Paul Gortmaker Cc: Shaohua Li Cc: Jens Axboe Cc: Fan Du Cc: Andrew Morton Signed-off-by: Linus Torvalds diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index 1da85bb..8280a5d 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -82,7 +82,7 @@ void __percpu_counter_add(struct percpu_counter *fbc, s64 amount, s32 batch) unsigned long flags; raw_spin_lock_irqsave(&fbc->lock, flags); fbc->count += count; - __this_cpu_sub(*fbc->counters, count); + __this_cpu_sub(*fbc->counters, count - amount); raw_spin_unlock_irqrestore(&fbc->lock, flags); } else { this_cpu_add(*fbc->counters, amount); -- cgit v0.10.2 From c196403b79aa241c3fefb3ee5bb328aa7c5cc860 Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Thu, 16 Jan 2014 16:54:48 +0100 Subject: net: rds: fix per-cpu helper usage commit ae4b46e9d "net: rds: use this_cpu_* per-cpu helper" broke per-cpu handling for rds. chpfirst is the result of __this_cpu_read(), so it is an absolute pointer and not __percpu. Therefore, __this_cpu_write() should not operate on chpfirst, but rather on cache->percpu->first, just like __this_cpu_read() did before. Cc: # 3.8+ Signed-off-byd Gerald Schaefer Signed-off-by: David S. Miller diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c index 8eb9501..b7ebe23 100644 --- a/net/rds/ib_recv.c +++ b/net/rds/ib_recv.c @@ -421,8 +421,7 @@ static void rds_ib_recv_cache_put(struct list_head *new_item, struct rds_ib_refill_cache *cache) { unsigned long flags; - struct list_head *old; - struct list_head __percpu *chpfirst; + struct list_head *old, *chpfirst; local_irq_save(flags); @@ -432,7 +431,7 @@ static void rds_ib_recv_cache_put(struct list_head *new_item, else /* put on front */ list_add_tail(new_item, chpfirst); - __this_cpu_write(chpfirst, new_item); + __this_cpu_write(cache->percpu->first, new_item); __this_cpu_inc(cache->percpu->count); if (__this_cpu_read(cache->percpu->count) < RDS_IB_RECYCLE_BATCH_COUNT) @@ -452,7 +451,7 @@ static void rds_ib_recv_cache_put(struct list_head *new_item, } while (old); - __this_cpu_write(chpfirst, NULL); + __this_cpu_write(cache->percpu->first, NULL); __this_cpu_write(cache->percpu->count, 0); end: local_irq_restore(flags); -- cgit v0.10.2 From 77f99ad16a07aa062c2d30fae57b1fee456f6ef6 Mon Sep 17 00:00:00 2001 From: Christoph Paasch Date: Thu, 16 Jan 2014 20:01:21 +0100 Subject: tcp: metrics: Avoid duplicate entries with the same destination-IP Because the tcp-metrics is an RCU-list, it may be that two soft-interrupts are inside __tcp_get_metrics() for the same destination-IP at the same time. If this destination-IP is not yet part of the tcp-metrics, both soft-interrupts will end up in tcpm_new and create a new entry for this IP. So, we will have two tcp-metrics with the same destination-IP in the list. This patch checks twice __tcp_get_metrics(). First without holding the lock, then while holding the lock. The second one is there to confirm that the entry has not been added by another soft-irq while waiting for the spin-lock. Fixes: 51c5d0c4b169b (tcp: Maintain dynamic metrics in local cache.) Signed-off-by: Christoph Paasch Reviewed-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_metrics.c b/net/ipv4/tcp_metrics.c index 0649373..098b3a2 100644 --- a/net/ipv4/tcp_metrics.c +++ b/net/ipv4/tcp_metrics.c @@ -22,6 +22,9 @@ int sysctl_tcp_nometrics_save __read_mostly; +static struct tcp_metrics_block *__tcp_get_metrics(const struct inetpeer_addr *addr, + struct net *net, unsigned int hash); + struct tcp_fastopen_metrics { u16 mss; u16 syn_loss:10; /* Recurring Fast Open SYN losses */ @@ -130,16 +133,41 @@ static void tcpm_suck_dst(struct tcp_metrics_block *tm, struct dst_entry *dst, } } +#define TCP_METRICS_TIMEOUT (60 * 60 * HZ) + +static void tcpm_check_stamp(struct tcp_metrics_block *tm, struct dst_entry *dst) +{ + if (tm && unlikely(time_after(jiffies, tm->tcpm_stamp + TCP_METRICS_TIMEOUT))) + tcpm_suck_dst(tm, dst, false); +} + +#define TCP_METRICS_RECLAIM_DEPTH 5 +#define TCP_METRICS_RECLAIM_PTR (struct tcp_metrics_block *) 0x1UL + static struct tcp_metrics_block *tcpm_new(struct dst_entry *dst, struct inetpeer_addr *addr, - unsigned int hash, - bool reclaim) + unsigned int hash) { struct tcp_metrics_block *tm; struct net *net; + bool reclaim = false; spin_lock_bh(&tcp_metrics_lock); net = dev_net(dst->dev); + + /* While waiting for the spin-lock the cache might have been populated + * with this entry and so we have to check again. + */ + tm = __tcp_get_metrics(addr, net, hash); + if (tm == TCP_METRICS_RECLAIM_PTR) { + reclaim = true; + tm = NULL; + } + if (tm) { + tcpm_check_stamp(tm, dst); + goto out_unlock; + } + if (unlikely(reclaim)) { struct tcp_metrics_block *oldest; @@ -169,17 +197,6 @@ out_unlock: return tm; } -#define TCP_METRICS_TIMEOUT (60 * 60 * HZ) - -static void tcpm_check_stamp(struct tcp_metrics_block *tm, struct dst_entry *dst) -{ - if (tm && unlikely(time_after(jiffies, tm->tcpm_stamp + TCP_METRICS_TIMEOUT))) - tcpm_suck_dst(tm, dst, false); -} - -#define TCP_METRICS_RECLAIM_DEPTH 5 -#define TCP_METRICS_RECLAIM_PTR (struct tcp_metrics_block *) 0x1UL - static struct tcp_metrics_block *tcp_get_encode(struct tcp_metrics_block *tm, int depth) { if (tm) @@ -282,7 +299,6 @@ static struct tcp_metrics_block *tcp_get_metrics(struct sock *sk, struct inetpeer_addr addr; unsigned int hash; struct net *net; - bool reclaim; addr.family = sk->sk_family; switch (addr.family) { @@ -304,13 +320,10 @@ static struct tcp_metrics_block *tcp_get_metrics(struct sock *sk, hash = hash_32(hash, net->ipv4.tcp_metrics_hash_log); tm = __tcp_get_metrics(&addr, net, hash); - reclaim = false; - if (tm == TCP_METRICS_RECLAIM_PTR) { - reclaim = true; + if (tm == TCP_METRICS_RECLAIM_PTR) tm = NULL; - } if (!tm && create) - tm = tcpm_new(dst, &addr, hash, reclaim); + tm = tcpm_new(dst, &addr, hash); else tcpm_check_stamp(tm, dst); -- cgit v0.10.2 From 11ffff752c6a5adc86f7dd397b2f75af8f917c51 Mon Sep 17 00:00:00 2001 From: Hannes Frederic Sowa Date: Thu, 16 Jan 2014 20:13:04 +0100 Subject: ipv6: simplify detection of first operational link-local address on interface In commit 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses") I build the detection of the first operational link-local address much to complex. Additionally this code now has a race condition. Replace it with a much simpler variant, which just scans the address list when duplicate address detection completes, to check if this is the first valid link local address and send RS and MLD reports then. Fixes: 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses") Reported-by: Jiri Pirko Cc: Flavio Leitner Signed-off-by: Hannes Frederic Sowa Acked-by: Flavio Leitner Acked-by: Jiri Pirko Signed-off-by: David S. Miller diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 76d5427..65bb130 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -165,7 +165,6 @@ struct inet6_dev { struct net_device *dev; struct list_head addr_list; - int valid_ll_addr_cnt; struct ifmcaddr6 *mc_list; struct ifmcaddr6 *mc_tomb; diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index abe46a4..4b6b720 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3189,6 +3189,22 @@ out: in6_ifa_put(ifp); } +/* ifp->idev must be at least read locked */ +static bool ipv6_lonely_lladdr(struct inet6_ifaddr *ifp) +{ + struct inet6_ifaddr *ifpiter; + struct inet6_dev *idev = ifp->idev; + + list_for_each_entry(ifpiter, &idev->addr_list, if_list) { + if (ifp != ifpiter && ifpiter->scope == IFA_LINK && + (ifpiter->flags & (IFA_F_PERMANENT|IFA_F_TENTATIVE| + IFA_F_OPTIMISTIC|IFA_F_DADFAILED)) == + IFA_F_PERMANENT) + return false; + } + return true; +} + static void addrconf_dad_completed(struct inet6_ifaddr *ifp) { struct net_device *dev = ifp->idev->dev; @@ -3208,14 +3224,11 @@ static void addrconf_dad_completed(struct inet6_ifaddr *ifp) */ read_lock_bh(&ifp->idev->lock); - spin_lock(&ifp->lock); - send_mld = ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL && - ifp->idev->valid_ll_addr_cnt == 1; + send_mld = ifp->scope == IFA_LINK && ipv6_lonely_lladdr(ifp); send_rs = send_mld && ipv6_accept_ra(ifp->idev) && ifp->idev->cnf.rtr_solicits > 0 && (dev->flags&IFF_LOOPBACK) == 0; - spin_unlock(&ifp->lock); read_unlock_bh(&ifp->idev->lock); /* While dad is in progress mld report's source address is in6_addrany. @@ -4512,19 +4525,6 @@ errout: rtnl_set_sk_err(net, RTNLGRP_IPV6_PREFIX, err); } -static void update_valid_ll_addr_cnt(struct inet6_ifaddr *ifp, int count) -{ - write_lock_bh(&ifp->idev->lock); - spin_lock(&ifp->lock); - if (((ifp->flags & (IFA_F_PERMANENT|IFA_F_TENTATIVE|IFA_F_OPTIMISTIC| - IFA_F_DADFAILED)) == IFA_F_PERMANENT) && - (ipv6_addr_type(&ifp->addr) & IPV6_ADDR_LINKLOCAL)) - ifp->idev->valid_ll_addr_cnt += count; - WARN_ON(ifp->idev->valid_ll_addr_cnt < 0); - spin_unlock(&ifp->lock); - write_unlock_bh(&ifp->idev->lock); -} - static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) { struct net *net = dev_net(ifp->idev->dev); @@ -4533,8 +4533,6 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) switch (event) { case RTM_NEWADDR: - update_valid_ll_addr_cnt(ifp, 1); - /* * If the address was optimistic * we inserted the route at the start of @@ -4550,8 +4548,6 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) ifp->idev->dev, 0, 0); break; case RTM_DELADDR: - update_valid_ll_addr_cnt(ifp, -1); - if (ifp->idev->cnf.forwarding) addrconf_leave_anycast(ifp); addrconf_leave_solict(ifp->idev, &ifp->addr); -- cgit v0.10.2 From 75b99dbd634f9caf7b4e48ec5c2dcec8c1837372 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Thu, 16 Jan 2014 11:15:12 -0800 Subject: parisc: fix SO_MAX_PACING_RATE typo SO_MAX_PACING_RATE definition on parisc got a typo. Its not too late to fix it, before 3.13 is official. Fixes: 62748f32d501 ("net: introduce SO_MAX_PACING_RATE") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h index f33113a..70b3674 100644 --- a/arch/parisc/include/uapi/asm/socket.h +++ b/arch/parisc/include/uapi/asm/socket.h @@ -75,6 +75,6 @@ #define SO_BUSY_POLL 0x4027 -#define SO_MAX_PACING_RATE 0x4048 +#define SO_MAX_PACING_RATE 0x4028 #endif /* _UAPI_ASM_SOCKET_H */ -- cgit v0.10.2 From 3af57f78c38131b7a66e2b01e06fdacae01992a3 Mon Sep 17 00:00:00 2001 From: Heiko Carstens Date: Fri, 17 Jan 2014 09:37:15 +0100 Subject: s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions The s390 bpf jit compiler emits the signed divide instructions "dr" and "d" for unsigned divisions. This can cause problems: the dividend will be zero extended to a 64 bit value and the divisor is the 32 bit signed value as specified A or X accumulator, even though A and X are supposed to be treated as unsigned values. The divide instrunctions will generate an exception if the result cannot be expressed with a 32 bit signed value. This is the case if e.g. the dividend is 0xffffffff and the divisor either 1 or also 0xffffffff (signed: -1). To avoid all these issues simply use unsigned divide instructions. Signed-off-by: Heiko Carstens Signed-off-by: David S. Miller diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index fc0fa77..708d60e 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -368,16 +368,16 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter, EMIT4_PCREL(0xa7840000, (jit->ret0_ip - jit->prg)); /* lhi %r4,0 */ EMIT4(0xa7480000); - /* dr %r4,%r12 */ - EMIT2(0x1d4c); + /* dlr %r4,%r12 */ + EMIT4(0xb997004c); break; case BPF_S_ALU_DIV_K: /* A /= K */ if (K == 1) break; /* lhi %r4,0 */ EMIT4(0xa7480000); - /* d %r4,(%r13) */ - EMIT4_DISP(0x5d40d000, EMIT_CONST(K)); + /* dl %r4,(%r13) */ + EMIT6_DISP(0xe340d000, 0x0097, EMIT_CONST(K)); break; case BPF_S_ALU_MOD_X: /* A %= X */ jit->seen |= SEEN_XREG | SEEN_RET0; @@ -387,8 +387,8 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter, EMIT4_PCREL(0xa7840000, (jit->ret0_ip - jit->prg)); /* lhi %r4,0 */ EMIT4(0xa7480000); - /* dr %r4,%r12 */ - EMIT2(0x1d4c); + /* dlr %r4,%r12 */ + EMIT4(0xb997004c); /* lr %r5,%r4 */ EMIT2(0x1854); break; @@ -400,8 +400,8 @@ static int bpf_jit_insn(struct bpf_jit *jit, struct sock_filter *filter, } /* lhi %r4,0 */ EMIT4(0xa7480000); - /* d %r4,(%r13) */ - EMIT4_DISP(0x5d40d000, EMIT_CONST(K)); + /* dl %r4,(%r13) */ + EMIT6_DISP(0xe340d000, 0x0097, EMIT_CONST(K)); /* lr %r5,%r4 */ EMIT2(0x1854); break; -- cgit v0.10.2