Age | Commit message (Collapse) | Author |
|
nsown_capable is a special case of ns_capable essentially for just CAP_SETUID and
CAP_SETGID. For the existing users it doesn't noticably simplify things and
from the suggested patches I have seen it encourages people to do the wrong
thing. So remove nsown_capable.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
We allow task A to change B's nice level if it has a supserset of
B's privileges, or of it has CAP_SYS_NICE. Also allow it if A has
CAP_SYS_NICE with respect to B - meaning it is root in the same
namespace, or it created B's namespace.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
I goofed when I made unshare(CLONE_NEWPID) only work in a
single-threaded process. There is no need for that requirement and in
fact I analyzied things right for setns. The hard requirement
is for tasks that share a VM to all be in the pid namespace and
we properly prevent that in do_fork.
Just to be certain I took a look through do_wait and
forget_original_parent and there are no cases that make it any harder
for children to be in the multiple pid namespaces than it is for
children to be in the same pid namespace. I also performed a check to
see if there were in uses of task->nsproxy_pid_ns I was not familiar
with, but it is only used when allocating a new pid for a new task,
and in checks to prevent craziness from happening.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
As the capabilites and capability bounding set are per user namespace
properties it is safe to allow changing them with just CAP_SETPCAP
permission in the user namespace.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Remove the test for the impossible case where tsk->nsproxy == NULL. Fork
will never be called with tsk->nsproxy == NULL.
Only call get_nsproxy when we don't need to generate a new_nsproxy,
and mark the case where we don't generate a new nsproxy as likely.
Remove the code to drop an unnecessarily acquired nsproxy value.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Serge Hallyn <serge.hallyn@ubuntu.com> writes:
> Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
> possible to get into a situation where a pidns reaper is
> <defunct>, reparented to host pid 1, but never reaped. How to
> reproduce this is documented at
>
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
> (and see
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
> In short, run repeated starts of a container whose init is
>
> Process.exit(0);
>
> sysrq-t when such a task is playing zombie shows:
>
> [ 131.132978] init x ffff88011fc14580 0 2084 2039 0x00000000
> [ 131.132978] ffff880116e89ea8 0000000000000002 ffff880116e89fd8 0000000000014580
> [ 131.132978] ffff880116e89fd8 0000000000014580 ffff8801172a0000 ffff8801172a0000
> [ 131.132978] ffff8801172a0630 ffff88011729fff0 ffff880116e14650 ffff88011729fff0
> [ 131.132978] Call Trace:
> [ 131.132978] [<ffffffff816f6159>] schedule+0x29/0x70
> [ 131.132978] [<ffffffff81064591>] do_exit+0x6e1/0xa40
> [ 131.132978] [<ffffffff81071eae>] ? signal_wake_up_state+0x1e/0x30
> [ 131.132978] [<ffffffff8106496f>] do_group_exit+0x3f/0xa0
> [ 131.132978] [<ffffffff810649e4>] SyS_exit_group+0x14/0x20
> [ 131.132978] [<ffffffff8170102f>] tracesys+0xe1/0xe6
>
> Further debugging showed that every time this happened, zap_pid_ns_processes()
> started with nr_hashed being 3, while we were expecting it to drop to 2.
> Any time it didn't happen, nr_hashed was 1 or 2. So the reaper was
> waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
> if nr_hashed hits 1.
The issue is that when the task group leader of an init process exits
before other tasks of the init process when the init process finally
exits it will be a secondary task sleeping in zap_pid_ns_processes and
waiting to wake up when the number of hashed pids drops to two. This
case waits forever as free_pid only sends a wake up when the number of
hashed pids drops to 1.
To correct this the simple strategy of sending a possibly unncessary
wake up when the number of hashed pids drops to 2 is adopted.
Sending one extraneous wake up is relatively harmless, at worst we
waste a little cpu time in the rare case when a pid namespace
appropaches exiting.
We can detect the case when the pid namespace drops to just two pids
hashed race free in free_pid.
Dereferencing pid_ns->child_reaper with the pidmap_lock held is safe
without out the tasklist_lock because it is guaranteed that the
detach_pid will be called on the child_reaper before it is freed and
detach_pid calls __change_pid which calls free_pid which takes the
pidmap_lock. __change_pid only calls free_pid if this is the
last use of the pid. For a thread that is not the thread group leader
the threads pid will only ever have one user because a threads pid
is not allowed to be the pid of a process, of a process group or
a session. For a thread that is a thread group leader all of
the other threads of that process will be reaped before it is allowed
for the thread group leader to be reaped ensuring there will only
be one user of the threads pid as a process pid. Furthermore
because the thread is the init process of a pid namespace all of the
other processes in the pid namespace will have also been already freed
leading to the fact that the pid will not be used as a session pid or
a process group pid for any other running process.
CC: stable@vger.kernel.org
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
over the net namespace. The principle here is if you create or have
capabilities over it you can mount it, otherwise you get to live with
what other people have mounted.
Instead of testing this with a straight forward ns_capable call,
perform this check the long and torturous way with kobject helpers,
this keeps direct knowledge of namespaces out of sysfs, and preserves
the existing sysfs abstractions.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Rely on the fact that another flavor of the filesystem is already
mounted and do not rely on state in the user namespace.
Verify that the mounted filesystem is not covered in any significant
way. I would love to verify that the previously mounted filesystem
has no mounts on top but there are at least the directories
/proc/sys/fs/binfmt_misc and /sys/fs/cgroup/ that exist explicitly
for other filesystems to mount on top of.
Refactor the test into a function named fs_fully_visible and call that
function from the mount routines of proc and sysfs. This makes this
test local to the filesystems involved and the results current of when
the mounts take place, removing a weird threading of the user
namespace, the mount namespace and the filesystems themselves.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
Don't copy bind mounts of /proc/<pid>/ns/mnt between namespaces.
These files hold references to a mount namespace and copying them
between namespaces could result in a reference counting loop.
The current mnt_ns_loop test prevents loops on the assumption that
mounts don't cross between namespaces. Unfortunately unsharing a
mount namespace and shared substrees can both cause mounts to
propogate between mount namespaces.
Add two flags CL_COPY_UNBINDABLE and CL_COPY_MNT_NS_FILE are added to
control this behavior, and CL_COPY_ALL is redefined as both of them.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
It seems GCC generates a better code in that way, so I changed that statement.
Btw, they have the same semantic, so I'm sending this patch due to performance issues.
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Raphael S.Carvalho <raphael.scarv@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Don't allow mounting the proc filesystem unless the caller has
CAP_SYS_ADMIN rights over the pid namespace. The principle here is if
you create or have capabilities over it you can mount it, otherwise
you get to live with what other people have mounted.
Andy pointed out that this is needed to prevent users in a user
namespace from remounting proc and specifying different hidepid and gid
options on already existing proc mounts.
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
When creating a less privileged mount namespace or propogating mounts
from a more privileged to a less privileged mount namespace lock the
submounts so they may not be unmounted individually in the child mount
namespace revealing what is under them.
This enforces the reasonable expectation that it is not possible to
see under a mount point. Most of the time mounts are on empty
directories and revealing that does not matter, however I have seen an
occassionaly sloppy configuration where there were interesting things
concealed under a mount point that probably should not be revealed.
Expirable submounts are not locked because they will eventually
unmount automatically so whatever is under them already needs
to be safe for unprivileged users to access.
From a practical standpoint these restrictions do not appear to be
significant for unprivileged users of the mount namespace. Recursive
bind mounts and pivot_root continues to work, and mounts that are
created in a mount namespace may be unmounted there. All of which
means that the common idiom of keeping a directory of interesting
files and using pivot_root to throw everything else away continues to
work just fine.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
Pull slab update from Pekka Enberg:
"Highlights:
- Fix for boot-time problems on some architectures due to
init_lock_keys() not respecting kmalloc_caches boundaries
(Christoph Lameter)
- CONFIG_SLUB_CPU_PARTIAL requested by RT folks (Joonsoo Kim)
- Fix for excessive slab freelist draining (Wanpeng Li)
- SLUB and SLOB cleanups and fixes (various people)"
I ended up editing the branch, and this avoids two commits at the end
that were immediately reverted, and I instead just applied the oneliner
fix in between myself.
* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux
slub: Check for page NULL before doing the node_match check
mm/slab: Give s_next and s_stop slab-specific names
slob: Check for NULL pointer before calling ctor()
slub: Make cpu partial slab support configurable
slab: add kmalloc() to kernel API documentation
slab: fix init_lock_keys
slob: use DIV_ROUND_UP where possible
slub: do not put a slab to cpu partial list when cpu_partial is 0
mm/slub: Use node_nr_slabs and node_nr_objs in get_slabinfo
mm/slub: Drop unnecessary nr_partials
mm/slab: Fix /proc/slabinfo unwriteable for slab
mm/slab: Sharing s_next and s_stop between slab and slub
mm/slab: Fix drain freelist excessively
slob: Rework #ifdeffery in slab.h
mm, slab: moved kmem_cache_alloc_node comment to correct place
|
|
In the -rt kernel (mrg), we hit the following dump:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
PGD a2d39067 PUD b1641067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: sunrpc cpufreq_ondemand ipv6 tg3 joydev sg serio_raw pcspkr k8temp amd64_edac_mod edac_core i2c_piix4 e100 mii shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom sata_svw ata_generic pata_acpi pata_serverworks radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU 3
Pid: 20878, comm: hackbench Not tainted 3.6.11-rt25.14.el6rt.x86_64 #1 empty empty/Tyan Transport GT24-B3992
RIP: 0010:[<ffffffff811573f1>] [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
RSP: 0018:ffff8800a9b17d70 EFLAGS: 00010213
RAX: 0000000000000000 RBX: 0000000001200011 RCX: ffff8800a06d8000
RDX: 0000000004d92a03 RSI: 00000000000000d0 RDI: ffff88013b805500
RBP: ffff8800a9b17dc0 R08: ffff88023fd14d10 R09: ffffffff81041cbd
R10: 00007f4e3f06e9d0 R11: 0000000000000246 R12: ffff88013b805500
R13: ffff8801ff46af40 R14: 0000000000000001 R15: 0000000000000000
FS: 00007f4e3f06e700(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000a2d3a000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process hackbench (pid: 20878, threadinfo ffff8800a9b16000, task ffff8800a06d8000)
Stack:
ffff8800a9b17da0 ffffffff81202e08 ffff8800a9b17de0 000000d001200011
0000000001200011 0000000001200011 0000000000000000 0000000000000000
00007f4e3f06e9d0 0000000000000000 ffff8800a9b17e60 ffffffff81041cbd
Call Trace:
[<ffffffff81202e08>] ? current_has_perm+0x68/0x80
[<ffffffff81041cbd>] copy_process+0xdd/0x15b0
[<ffffffff810a2125>] ? rt_up_read+0x25/0x30
[<ffffffff8104369a>] do_fork+0x5a/0x360
[<ffffffff8107c66b>] ? migrate_enable+0xeb/0x220
[<ffffffff8100b068>] sys_clone+0x28/0x30
[<ffffffff81527423>] stub_clone+0x13/0x20
[<ffffffff81527152>] ? system_call_fastpath+0x16/0x1b
Code: 89 fc 89 75 cc 41 89 d6 4d 8b 04 24 65 4c 03 04 25 48 ae 00 00 49 8b 50 08 4d 8b 28 49 8b 40 10 4d 85 ed 74 12 41 83 fe ff 74 27 <48> 8b 00 48 c1 e8 3a 41 39 c6 74 1b 8b 75 cc 4c 89 c9 44 89 f2
RIP [<ffffffff811573f1>] kmem_cache_alloc_node+0x51/0x180
RSP <ffff8800a9b17d70>
CR2: 0000000000000000
---[ end trace 0000000000000002 ]---
Now, this uses SLUB pretty much unmodified, but as it is the -rt kernel
with CONFIG_PREEMPT_RT set, spinlocks are mutexes, although they do
disable migration. But the SLUB code is relatively lockless, and the
spin_locks there are raw_spin_locks (not converted to mutexes), thus I
believe this bug can happen in mainline without -rt features. The -rt
patch is just good at triggering mainline bugs ;-)
Anyway, looking at where this crashed, it seems that the page variable
can be NULL when passed to the node_match() function (which does not
check if it is NULL). When this happens we get the above panic.
As page is only used in slab_alloc() to check if the node matches, if
it's NULL I'm assuming that we can say it doesn't and call the
__slab_alloc() code. Is this a correct assumption?
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs stuff from Al Viro:
"O_TMPFILE ABI changes, Oleg's fput() series, misc cleanups, including
making simple_lookup() usable for filesystems with non-NULL s_d_op,
which allows us to get rid of quite a bit of ugliness"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sunrpc: now we can just set ->s_d_op
cgroup: we can use simple_lookup() now
efivarfs: we can use simple_lookup() now
make simple_lookup() usable for filesystems that set ->s_d_op
configfs: don't open-code d_alloc_name()
__rpc_lookup_create_exclusive: pass string instead of qstr
rpc_create_*_dir: don't bother with qstr
llist: llist_add() can use llist_add_batch()
llist: fix/simplify llist_add() and llist_add_batch()
fput: turn "list_head delayed_fput_list" into llist_head
fs/file_table.c:fput(): add comment
Safer ABI for O_TMPFILE
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
... and use d_hash_and_lookup() instead of open-coding it, for fsck sake...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
just pass the name
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Pull x86 platform driver updates from Matthew Garrett:
"Nothing overly exciting here - a couple of new drivers that don't do a
great deal, along with some miscellaneous fixes and a couple of small
feature enablement patches"
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
x86 platform drivers: fix gpio leak
toshiba_acpi: Add dependency on SERIO_I8042
asus-nb-wmi: set wapf=4 for ASUSTeK COMPUTER INC. 1015E/U
Add trivial driver to disable Intel Smart Connect
Add support driver for Intel Rapid Start Technology
hp-wmi: add supports for POST code error
asus-wmi: control wlan-led only if wapf == 4
drivers/platform/x86/intel_ips: Convert to module_pci_driver
asus-nb-wmi: ignore ALS notification key code
asus-wmi: append newline to messages
x86: asus-laptop: fix invalid point access
x86: msi-laptop: fix memleak
amilo-rfkill: Add dependency on SERIO_I8042
dell-laptop: fix error return code in dell_init()
hp-wmi: Enable hotkeys on some systems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull second round of input updates from Dmitry Torokhov:
"An update to Elantech driver to support hardware v7, fix to the new
cyttsp4 driver to use proper addressing, ads7846 device tree support
and nspire-keypad got a small cleanup."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: nspire-keypad - replace magic offset with define
Input: elantech - fix for newer hardware versions (v7)
Input: cyttsp4 - use 16bit address for I2C/SPI communication
Input: ads7846 - add device tree bindings
Input: ads7846 - make sure we do not change platform data
|
|
Pull networking fixes from David Miller:
"Just a bunch of small fixes and tidy ups:
1) Finish the "busy_poll" renames, from Eliezer Tamir.
2) Fix RCU stalls in IFB driver, from Ding Tianhong.
3) Linearize buffers properly in tun/macvtap zerocopy code.
4) Don't crash on rmmod in vxlan, from Pravin B Shelar.
5) Spinlock used before init in alx driver, from Maarten Lankhorst.
6) A sparse warning fix in bnx2x broke TSO checksums, fix from Dmitry
Kravkov.
7) Dummy and ifb driver load failure paths can oops, fixes from Tan
Xiaojun and Ding Tianhong.
8) Correct MTU calculations in IP tunnels, from Alexander Duyck.
9) Account all TCP retransmits in SNMP stats properly, from Yuchung
Cheng.
10) atl1e and via-rhine do not handle DMA mapping failures properly,
from Neil Horman.
11) Various equal-cost multipath route fixes in ipv6 from Hannes
Frederic Sowa"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
ipv6: only static routes qualify for equal cost multipathing
via-rhine: fix dma mapping errors
atl1e: fix dma mapping warnings
tcp: account all retransmit failures
usb/net/r815x: fix cast to restricted __le32
usb/net/r8152: fix integer overflow in expression
net: access page->private by using page_private
net: strict_strtoul is obsolete, use kstrtoul instead
drivers/net/ieee802154: don't use devm_pinctrl_get_select_default() in probe
drivers/net/ethernet/cadence: don't use devm_pinctrl_get_select_default() in probe
drivers/net/can/c_can: don't use devm_pinctrl_get_select_default() in probe
net/usb: add relative mii functions for r815x
net/tipc: use %*phC to dump small buffers in hex form
qlcnic: Adding Maintainers.
gre: Fix MTU sizing check for gretap tunnels
pkt_sched: sch_qfq: remove forward declaration of qfq_update_agg_ts
pkt_sched: sch_qfq: improve efficiency of make_eligible
gso: Update tunnel segmentation to support Tx checksum offload
inet: fix spacing in assignment
ifb: fix oops when loading the ifb failed
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull final round of SCSI updates from James Bottomley:
"This is the remaining set of SCSI patches for the merge window. It's
mostly driver updates (scsi_debug, qla2xxx, storvsc, mp3sas). There
are also several bug fixes in fcoe, libfc, and megaraid_sas. We also
have a couple of core changes to try to make device destruction more
deterministic"
* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (46 commits)
[SCSI] scsi constants: command, sense key + additional sense strings
fcoe: Reduce number of sparse warnings
fcoe: Stop fc_rport_priv structure leak
libfcoe: Fix meaningless log statement
libfc: Differentiate echange timer cancellation debug statements
libfc: Remove extra space in fc_exch_timer_cancel definition
fcoe: fix the link error status block sparse warnings
fcoe: Fix smatch warning in fcoe_fdmi_info function
libfc: Reject PLOGI from nodes with incompatible role
[SCSI] enable destruction of blocked devices which fail LUN scanning
[SCSI] Fix race between starved list and device removal
[SCSI] megaraid_sas: fix a bug for 64 bit arches
[SCSI] scsi_debug: reduce duplication between prot_verify_read and prot_verify_write
[SCSI] scsi_debug: simplify offset calculation for dif_storep
[SCSI] scsi_debug: invalidate protection info for unmapped region
[SCSI] scsi_debug: fix NULL pointer dereference with parameters dif=0 dix=1
[SCSI] scsi_debug: fix incorrectly nested kmap_atomic()
[SCSI] scsi_debug: fix invalid address passed to kunmap_atomic()
[SCSI] mpt3sas: Bump driver version to v02.100.00.00
[SCSI] mpt3sas: when async scanning is enabled then while scanning, devices are removed but their transport layer entries are not removed
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"Fix a potential deadlock versus hrtimers"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix HRTICK
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
- core fix for missing round up in the generic irq chip implementation
- new irq chip for MOXA SoCs
- a few fixes and cleanups in the irqchip drivers
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Add support for MOXA ART SoCs
genirq: generic chip: Use DIV_ROUND_UP to calculate numchips
irqchip: nvic: Fix wrong num_ct argument for irq_alloc_domain_generic_chips()
irqchip: sun4i: Staticize sun4i_irq_ack()
irqchip: vt8500: Staticize local symbols
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
- watchdog fixes for full dynticks
- improved debug output for full dynticks
- remove an obsolete full dynticks check
- two ARM SoC clocksource drivers for sharing across SoCs
- tick broadcast fix for CPU hotplug
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick: broadcast: Check broadcast mode on CPU hotplug
clocksource: arm_global_timer: Add ARM global timer support
clocksource: Add Marvell Orion SoC timer
nohz: Remove obsolete check for full dynticks CPUs to be RCU nocbs
watchdog: Boot-disable by default on full dynticks
watchdog: Rename confusing state variable
watchdog: Register / unregister watchdog kthreads on sysctl control
nohz: Warn if the machine can not perform nohz_full
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
- fix for do_div() abuse on x86
- locking fix in perf core
- a pile of (build) fixes and cleanups in perf tools
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
perf/x86: Fix incorrect use of do_div() in NMI warning
perf: Fix perf_lock_task_context() vs RCU
perf: Remove WARN_ON_ONCE() check in __perf_event_enable() for valid scenario
perf: Clone child context from parent context pmu
perf script: Fix broken include in Context.xs
perf tools: Fix -ldw/-lelf link test when static linking
perf tools: Revert regression in configuration of Python support
perf tools: Fix perf version generation
perf stat: Fix per-socket output bug for uncore events
perf symbols: Fix vdso list searching
perf evsel: Fix missing increment in sample parsing
perf tools: Update symbol_conf.nr_events when processing attribute events
perf tools: Fix new_term() missing free on error path
perf tools: Fix parse_events_terms() segfault on error path
perf evsel: Fix count parameter to read call in event_format__new
perf tools: fix a typo of a Power7 event name
perf tools: Fix -x/--exclude-other option for report command
perf evlist: Enhance perf_evlist__start_workload()
perf record: Remove -f/--force option
perf record: Remove -A/--append option
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Thomas Gleixner:
"Header cleanup as requested by Linus"
(This is the "don't include support for ww_mutex in a header file that
everybody wants, when almost nobody wants the ww part" change)
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mutex: Move ww_mutex definitions to ww_mutex.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"This is our first set of fixes from arm-soc for 3.11.
- A handful of build and warning fixes from Arnd
- A collection of OMAP fixes
- defconfig updates to make the default configs more useful for real
use (and testing) out of the box on hardware
And a couple of other small fixes. Some of these have been recently
applied but it's normally how we deal with fixes, with less bake time
in -next needed"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (32 commits)
arm: multi_v7_defconfig: Tweaks for omap and sunxi
arm: multi_v7_defconfig: add i.MX options and NFS root
ARM: omap2: add select of TI_PRIV_EDMA
ARM: exynos: select PM_GENERIC_DOMAINS only when used
ARM: ixp4xx: avoid circular header dependency
ARM: OMAP: omap_common_late_init may be unused
ARM: sti: move DEBUG_STI_UART into alphabetical order
ARM: OMAP: build mach-omap code only if needed
ARM: zynq: use DT_MACHINE_START
ARM: omap5: omap5 has SCU and TWD
ARM: OMAP2+: omap2plus_defconfig: Enable appended DTB support
ARM: OMAP2+: Enable TI_EDMA in omap2plus_defconfig
ARM: OMAP2+: omap2plus_defconfig: enable DRA752 thermal support by default
ARM: OMAP2+: omap2plus_defconfig: enable TI bandgap driver
ARM: OMAP2+: devices: remove duplicated include from devices.c
ARM: OMAP3: igep0020: Set DSS pins in correct mux mode.
ARM: OMAP2+: N900: enable N900-specific drivers even if device tree is enabled
ARM: OMAP2+: Cocci spatch "ptr_ret.spatch"
ARM: OMAP2+: Remove obsolete Makefile line
ARM: OMAP5: Enable Cortex A15 errata 798181
...
|
|
Pull ARM fixes from Russell King:
"A few fixes for ARM, mostly just one liners with the exception of the
missing section specification. We decided not to rely on .previous to
fix this but to explicitly state the section we want the code to be
in."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7778/1: smp_twd: twd_update_frequency need be run on all online CPUs
ARM: 7782/1: Kconfig: Let ARM_ERRATA_364296 not depend on CONFIG_SMP
ARM: mm: fix boot on SA1110 Assabet
ARM: 7781/1: mmu: Add debug_ll_io_init() mappings to early mappings
ARM: 7780/1: add missing linker section markup to head-common.S
|
|
Pull MIPS updates from Ralf Baechle:
"MIPS updates:
- All the things that didn't make 3.10.
- Removes the Windriver PPMC platform. Nobody will miss it.
- Remove a workaround from kernel/irq/irqdomain.c which was there
exclusivly for MIPS. Patch by Grant Likely.
- More small improvments for the SEAD 3 platform
- Improvments on the BMIPS / SMP support for the BCM63xx series.
- Various cleanups of dead leftovers.
- Platform support for the Cavium Octeon-based EdgeRouter Lite.
Two large KVM patchsets didn't make it for this pull request because
their respective authors are vacationing"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (124 commits)
MIPS: Kconfig: Add missing MODULES dependency to VPE_LOADER
MIPS: BCM63xx: CLK: Add dummy clk_{set,round}_rate() functions
MIPS: SEAD3: Disable L2 cache on SEAD-3.
MIPS: BCM63xx: Enable second core SMP on BCM6328 if available
MIPS: BCM63xx: Add SMP support to prom.c
MIPS: define write{b,w,l,q}_relaxed
MIPS: Expose missing pci_io{map,unmap} declarations
MIPS: Malta: Update GCMP detection.
Revert "MIPS: make CAC_ADDR and UNCAC_ADDR account for PHYS_OFFSET"
MIPS: APSP: Remove <asm/kspd.h>
SSB: Kconfig: Amend SSB_EMBEDDED dependencies
MIPS: microMIPS: Fix improper definition of ISA exception bit.
MIPS: Don't try to decode microMIPS branch instructions where they cannot exist.
MIPS: Declare emulate_load_store_microMIPS as a static function.
MIPS: Fix typos and cleanup comment
MIPS: Cleanup indentation and whitespace
MIPS: BMIPS: support booting from physical CPU other than 0
MIPS: Only set cpu_has_mmips if SYS_SUPPORTS_MICROMIPS
MIPS: GIC: Fix gic_set_affinity infinite loop
MIPS: Don't save/restore OCTEON wide multiplier state on syscalls.
...
|
|
Pull watchdog updates from Wim Van Sebroeck:
- lots of devm_ conversions and cleanup
- platform_set_drvdata cleanups
- s3c2410: dev_err/dev_info + dev_pm_ops
- watchdog_core: don't try to stop device if not running fix
- wdrtas: use print_hex_dump
- xilinx cleanups
- orion_wdt fixes
- softdog cleanup
- hpwdt: check on UEFI bits
- deletion of mpcore_wdt driver
- addition of broadcom BCM2835 watchdog timer driver
- addition of MEN A21 watcdog devices
* git://www.linux-watchdog.org/linux-watchdog: (38 commits)
watchdog: hpwdt: Add check for UEFI bits
watchdog: softdog: remove replaceable ping operation
watchdog: New watchdog driver for MEN A21 watchdogs
Watchdog: fix clearing of the watchdog interrupt
Watchdog: allow orion_wdt to be built for Dove
watchdog: Add Broadcom BCM2835 watchdog timer driver
watchdog: delete mpcore_wdt driver
watchdog: xilinx: Setup the origin compatible string
watchdog: xilinx: Fix driver header
watchdog: wdrtas: don't use custom version of print_hex_dump
watchdog: core: don't try to stop device if not running
watchdog: jz4740: Pass device to clk_get
watchdog: twl4030: Remove redundant platform_set_drvdata()
watchdog: mpcore: Remove redundant platform_set_drvdata()
watchdog: da9055: use platform_{get,set}_drvdata()
watchdog: da9052: use platform_{get,set}_drvdata()
watchdog: cpwd: use platform_{get,set}_drvdata()
watchdog: s3c2410_wdt: convert s3c2410wdt to dev_pm_ops
watchdog: s3c2410_wdt: use dev_err()/dev_info() instead of pr_err()/pr_info()
watchdog: wm831x: use platform_{get,set}_drvdata()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA changes from Roland Dreier:
- AF_IB (native IB addressing) for CMA from Sean Hefty
- new mlx5 driver for Mellanox Connect-IB adapters (including post
merge request fixes)
- SRP fixes from Bart Van Assche (including fix to first merge request)
- qib HW driver updates
- resurrection of ocrdma HW driver development
- uverbs conversion to create fds with O_CLOEXEC set
- other small changes and fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
mlx5: Return -EFAULT instead of -EPERM
IB/qib: Log all SDMA errors unconditionally
IB/qib: Fix module-level leak
mlx5_core: Adjust hca_cap.uar_page_sz to conform to Connect-IB spec
IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline
IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
mlx5_core: Fixes for sparse warnings
IB/mlx5: Make profile[] static in main.c
mlx5: Fix parameter type of health_handler_t
mlx5: Add driver for Mellanox Connect-IB adapters
IB/core: Add reserved values to enums for low-level driver use
IB/srp: Bump driver version and release date
IB/srp: Make HCA completion vector configurable
IB/srp: Maintain a single connection per I_T nexus
IB/srp: Fail I/O fast if target offline
IB/srp: Skip host settle delay
IB/srp: Avoid skipping srp_reset_host() after a transport error
IB/srp: Fix remove_one crash due to resource exhaustion
IB/qib: New transmitter tunning settings for Dell 1.1 backplane
IB/core: Fix error return code in add_port()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
"This series contain:
- new i2c video drivers: ml86v7667 (video decoder),
ths8200 (video encoder)
- a new video driver for EasyCap cards based on Fushicai USBTV007
- Improved support for OF and embedded systems, with V4L2 async
initialization and a better support for clocks
- API cleanups on the ioctls used by the v4l2 debug tool
- Lots of cleanups
- As usual, several driver improvements and new cards additions
- Revert two changesets that change the minimal symbol rate for
stv0399, as request by Manu
- Update MAINTAINERS and other files to point to my new e-mail"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (378 commits)
MAINTAINERS & ABI: Update to point to my new email
[media] stb0899: restore minimal rate to 5Mbauds
[media] exynos4-is: Correct colorspace handling at FIMC-LITE
[media] exynos4-is: Set valid initial format on FIMC.n subdevs
[media] exynos4-is: Set valid initial format on FIMC-IS-ISP subdev pads
[media] exynos4-is: Fix format propagation on FIMC-IS-ISP subdev
[media] exynos4-is: Set valid initial format at FIMC-LITE
[media] exynos4-is: Fix format propagation on FIMC-LITE.n subdevs
[media] MAINTAINERS: Update S5P/Exynos FIMC driver entry
[media] Documentation: Update driver's directory in video4linux/fimc.txt
[media] exynos4-is: Change fimc-is firmware file names
[media] exynos4-is: Add support for Exynos5250 MIPI-CSIS
[media] exynos4-is: Add Exynos5250 SoC support to fimc-lite driver
[media] exynos4-is: Drop drvdata handling in fimc-lite for non-dt platforms
[media] media: i2c: tvp514x: remove manual setting of subdev name
[media] media: i2c: tvp7002: remove manual setting of subdev name
[media] mem2mem: set missing v4l2_dev pointer
[media] wl128x: add missing struct v4l2_device
[media] tvp514x: Fix init seqeunce
[media] saa7134: Fix sparse warnings by adding __user annotation
...
|
|
Pull more xfs updates from Ben Myers:
"Here are a fix for xfs_fsr, a cleanup in bulkstat, a cleanup in
xfs_open_by_handle, updated mount options documentation, a cleanup in
xfs_bmapi_write, a fix for the size of dquot log reservations, a fix
for sgid inheritance when acls are in use, a fix for cleaning up
quotainfo structures, and some more of the work which allows group and
project quotas to be used together.
We had a few more in this last quota category that we might have liked
to get in, but it looks there are still a few items that need to be
addressed.
- fix for xfs_fsr returning -EINVAL
- cleanup in xfs_bulkstat
- cleanup in xfs_open_by_handle
- update mount options documentation
- clean up local format handling in xfs_bmapi_write
- fix dquot log reservations which were too small
- fix sgid inheritance for subdirectories when default acls are in use
- add project quota fields to various structures
- fix teardown of quotainfo structures when quotas are turned off"
* tag 'for-linus-v3.11-rc1-2' of git://oss.sgi.com/xfs/xfs:
xfs: Fix the logic check for all quotas being turned off
xfs: Add pquota fields where gquota is used.
xfs: fix sgid inheritance for subdirectories inheriting default acls [V3]
xfs: dquot log reservations are too small
xfs: remove local fork format handling from xfs_bmapi_write()
xfs: update mount options documentation
xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
xfs: clean up unused codes at xfs_bulkstat()
xfs: use XFS_BMAP_BMDR_SPACE vs. XFS_BROOT_SIZE_ADJ
|
|
Pull cifs fixes from Steve French:
"Fixes for 4 cifs bugs, including a reconnect problem, a problem
parsing responses to SMB2 open request, and setting nlink incorrectly
to some servers which don't report it properly on the wire. Also
improves data integrity on reconnect with series from Pavel which adds
durable handle support for SMB2."
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix a deadlock when a file is reopened
CIFS: Reopen the file if reconnect durable handle failed
[CIFS] Fix minor endian error in durable handle patch series
CIFS: Reconnect durable handles for SMB2
CIFS: Make SMB2_open use cifs_open_parms struct
CIFS: Introduce cifs_open_parms struct
CIFS: Request durable open for SMB2 opens
CIFS: Simplify SMB2 create context handling
CIFS: Simplify SMB2_open code path
CIFS: Respect create_options in smb2_open_file
CIFS: Fix lease context buffer parsing
[CIFS] use sensible file nlink values if unprovided
Limit allocation of crypto mechanisms to dialect which requires
|
|
llist_add(new, head) can simply use llist_add_batch(new, new, head),
no need to duplicate the code.
This obviously uninlines llist_add() and to me this is a win. But we
can make llist_add_batch() inline if this is desirable, in this case
gcc can notice that new_first == new_last if the caller is llist_add().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
1. This is mostly theoretical, but llist_add*() need ACCESS_ONCE().
Otherwise it is not guaranteed that the first cmpxchg() uses the
same value for old_entry and new_last->next.
2. These helpers cache the result of cmpxchg() and read the initial
value of head->first before the main loop. I do not think this
makes sense. In the likely case cmpxchg() succeeds, otherwise
it doesn't hurt to reload head->first.
I think it would be better to simplify the code and simply read
->first before cmpxchg().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
fput() and delayed_fput() can use llist and avoid the locking.
This is unlikely path, it is not that this change can improve
the performance, but this way the code looks simpler.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
A missed update to "fput: task_work_add() can fail if the caller has
passed exit_task_work()".
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
OMAP recently changed how the platforms are configured, so OMAP2/3/4 SoC
support is no longer enabled by default. Add them back.
Enable new ethernet driver for sunxi.
The i.MX console options moved due to resorting, no functional change.
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Bring in second round of updates for 3.11.
|
|
A short series of fixes to libfc, libfcoe and fcoe.
Most patches fix formatting problems, one changes
the behavior of which discovered ports can/will be
logged into and another fixes a memory leak.
|
|
Static routes in this case are non-expiring routes which did not get
configured by autoconf or by icmpv6 redirects.
To make sure we actually get an ecmp route while searching for the first
one in this fib6_node's leafs, also make sure it matches the ecmp route
assumptions.
v2:
a) Removed RTF_EXPIRE check in dst.from chain. The check of RTF_ADDRCONF
already ensures that this route, even if added again without
RTF_EXPIRES (in case of a RA announcement with infinite timeout),
does not cause the rt6i_nsiblings logic to go wrong if a later RA
updates the expiration time later.
v3:
a) Allow RTF_EXPIRES routes to enter the ecmp route set. We have to do so,
because an pmtu event could update the RTF_EXPIRES flag and we would
not count this route, if another route joins this set. We now filter
only for RTF_GATEWAY|RTF_ADDRCONF|RTF_DYNAMIC, which are flags that
don't get changed after rt6_info construction.
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=951695
Reported a dma debug backtrace:
WARNING: at lib/dma-debug.c:937 check_unmap+0x47d/0x930()
Hardware name: To Be Filled By O.E.M.
via-rhine 0000:00:12.0: DMA-API: device driver failed to check map error[device
address=0x0000000075a837b2] [size=90 bytes] [mapped as single]
Modules linked in: ip6_tables gspca_spca561 gspca_main videodev media
snd_hda_codec_realtek snd_hda_intel i2c_viapro snd_hda_codec snd_hwdep snd_seq
ppdev mperf via_rhine coretemp snd_pcm mii microcode snd_page_alloc snd_timer
snd_mpu401 snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore parport_pc
parport shpchp ata_generic pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm
pata_via sata_via i2c_core uinput
Pid: 295, comm: systemd-journal Not tainted 3.9.0-0.rc6.git2.1.fc20.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81068dd0>] warn_slowpath_common+0x70/0xa0
[<ffffffff81068e4c>] warn_slowpath_fmt+0x4c/0x50
[<ffffffff8137ec6d>] check_unmap+0x47d/0x930
[<ffffffff810ace9f>] ? local_clock+0x5f/0x70
[<ffffffff8137f17f>] debug_dma_unmap_page+0x5f/0x70
[<ffffffffa0225edc>] ? rhine_ack_events.isra.14+0x3c/0x50 [via_rhine]
[<ffffffffa02275f8>] rhine_napipoll+0x1d8/0xd80 [via_rhine]
[<ffffffff815d3d51>] ? net_rx_action+0xa1/0x380
[<ffffffff815d3e22>] net_rx_action+0x172/0x380
[<ffffffff8107345f>] __do_softirq+0xff/0x400
[<ffffffff81073925>] irq_exit+0xb5/0xc0
[<ffffffff81724cd6>] do_IRQ+0x56/0xc0
[<ffffffff81719ff2>] common_interrupt+0x72/0x72
<EOI> [<ffffffff8170ff57>] ? __slab_alloc+0x4c2/0x526
[<ffffffff811992e0>] ? mmap_region+0x2b0/0x5a0
[<ffffffff810d5807>] ? __lock_is_held+0x57/0x80
[<ffffffff811992e0>] ? mmap_region+0x2b0/0x5a0
[<ffffffff811bf1bf>] kmem_cache_alloc+0x2df/0x360
[<ffffffff811992e0>] mmap_region+0x2b0/0x5a0
[<ffffffff811998e6>] do_mmap_pgoff+0x316/0x3d0
[<ffffffff81183ca0>] vm_mmap_pgoff+0x90/0xc0
[<ffffffff81197d6c>] sys_mmap_pgoff+0x4c/0x190
[<ffffffff81367d7e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8101eb42>] sys_mmap+0x22/0x30
[<ffffffff81722fd9>] system_call_fastpath+0x16/0x1b
Usual problem with the usual fix, add the appropriate calls to dma_mapping_error
where appropriate
Untested, as I don't have hardware, but its pretty straightforward
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: David S. Miller <davem@davemloft.net>
CC: Roger Luethi <rl@hellgate.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|