Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 1 from Al Viro:
"Unfortunately, this merge window it'll have a be a lot of small piles -
my fault, actually, for not keeping #for-next in anything that would
resemble a sane shape ;-/
This pile: assorted fixes (the first 3 are -stable fodder, IMO) and
cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last
components) + several long-standing patches from various folks.
There definitely will be a lot more (starting with Miklos'
check_submount_and_drop() series)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
direct-io: Handle O_(D)SYNC AIO
direct-io: Implement generic deferred AIO completions
add formats for dentry/file pathnames
kvm eventfd: switch to fdget
powerpc kvm: use fdget
switch fchmod() to fdget
switch epoll_ctl() to fdget
switch copy_module_from_fd() to fdget
git simplify nilfs check for busy subtree
ibmasmfs: don't bother passing superblock when not needed
don't pass superblock to hypfs_{mkdir,create*}
don't pass superblock to hypfs_diag_create_files
don't pass superblock to hypfs_vm_create_files()
oprofile: get rid of pointless forward declarations of struct super_block
oprofilefs_create_...() do not need superblock argument
oprofilefs_mkdir() doesn't need superblock argument
don't bother with passing superblock to oprofile_create_stats_files()
oprofile: don't bother with passing superblock to ->create_files()
don't bother passing sb to oprofile_create_files()
coh901318: don't open-code simple_read_from_buffer()
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
DEBUG_KOBJECT_RELEASE helps to find the issue attached below.
After some investigation, it seems the reason is:
The mod->mkobj.kobj(ffffffffa01600d0 below) is freed together with mod
itself in free_module(). However, its children still hold references to
it, as the delay caused by DEBUG_KOBJECT_RELEASE. So when the
child(holders below) tries to decrease the reference count to its parent
in kobject_del(), BUG happens as it tries to access already freed memory.
This patch tries to fix it by waiting for the mod->mkobj.kobj to be
really released in the module removing process (and some error code
paths).
[ 1844.175287] kobject: 'holders' (ffff88007c1f1600): kobject_release, parent ffffffffa01600d0 (delayed)
[ 1844.178991] kobject: 'notes' (ffff8800370b2a00): kobject_release, parent ffffffffa01600d0 (delayed)
[ 1845.180118] kobject: 'holders' (ffff88007c1f1600): kobject_cleanup, parent ffffffffa01600d0
[ 1845.182130] kobject: 'holders' (ffff88007c1f1600): auto cleanup kobject_del
[ 1845.184120] BUG: unable to handle kernel paging request at ffffffffa01601d0
[ 1845.185026] IP: [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026] PGD 1a13067 PUD 1a14063 PMD 7bd30067 PTE 0
[ 1845.185026] Oops: 0000 [#1] PREEMPT
[ 1845.185026] Modules linked in: xfs libcrc32c [last unloaded: kprobe_example]
[ 1845.185026] CPU: 0 PID: 18 Comm: kworker/0:1 Tainted: G O 3.11.0-rc6-next-20130819+ #1
[ 1845.185026] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1845.185026] Workqueue: events kobject_delayed_cleanup
[ 1845.185026] task: ffff88007ca51f00 ti: ffff88007ca5c000 task.ti: ffff88007ca5c000
[ 1845.185026] RIP: 0010:[<ffffffff812cda81>] [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026] RSP: 0018:ffff88007ca5dd08 EFLAGS: 00010282
[ 1845.185026] RAX: 0000000000002000 RBX: ffffffffa01600d0 RCX: ffffffff8177d638
[ 1845.185026] RDX: ffff88007ca5dc18 RSI: 0000000000000000 RDI: ffffffffa01600d0
[ 1845.185026] RBP: ffff88007ca5dd18 R08: ffffffff824e9810 R09: ffffffffffffffff
[ 1845.185026] R10: ffff8800ffffffff R11: dead4ead00000001 R12: ffffffff81a95040
[ 1845.185026] R13: ffff88007b27a960 R14: ffff88007c1f1600 R15: 0000000000000000
[ 1845.185026] FS: 0000000000000000(0000) GS:ffffffff81a23000(0000) knlGS:0000000000000000
[ 1845.185026] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1845.185026] CR2: ffffffffa01601d0 CR3: 0000000037207000 CR4: 00000000000006b0
[ 1845.185026] Stack:
[ 1845.185026] ffff88007c1f1600 ffff88007c1f1600 ffff88007ca5dd38 ffffffff812cdb7e
[ 1845.185026] 0000000000000000 ffff88007c1f1640 ffff88007ca5dd68 ffffffff812cdbfe
[ 1845.185026] ffff88007c974800 ffff88007c1f1640 ffff88007ff61a00 0000000000000000
[ 1845.185026] Call Trace:
[ 1845.185026] [<ffffffff812cdb7e>] kobject_del+0x2e/0x40
[ 1845.185026] [<ffffffff812cdbfe>] kobject_delayed_cleanup+0x6e/0x1d0
[ 1845.185026] [<ffffffff81063a45>] process_one_work+0x1e5/0x670
[ 1845.185026] [<ffffffff810639e3>] ? process_one_work+0x183/0x670
[ 1845.185026] [<ffffffff810642b3>] worker_thread+0x113/0x370
[ 1845.185026] [<ffffffff810641a0>] ? rescuer_thread+0x290/0x290
[ 1845.185026] [<ffffffff8106bfba>] kthread+0xda/0xe0
[ 1845.185026] [<ffffffff814ff0f0>] ? _raw_spin_unlock_irq+0x30/0x60
[ 1845.185026] [<ffffffff8106bee0>] ? kthread_create_on_node+0x130/0x130
[ 1845.185026] [<ffffffff8150751a>] ret_from_fork+0x7a/0xb0
[ 1845.185026] [<ffffffff8106bee0>] ? kthread_create_on_node+0x130/0x130
[ 1845.185026] Code: 81 48 c7 c7 28 95 ad 81 31 c0 e8 9b da 01 00 e9 4f ff ff ff 66 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 85 ff 74 1d <f6> 87 00 01 00 00 01 74 1e 48 8d 7b 38 83 6b 38 01 0f 94 c0 84
[ 1845.185026] RIP [<ffffffff812cda81>] kobject_put+0x11/0x60
[ 1845.185026] RSP <ffff88007ca5dd08>
[ 1845.185026] CR2: ffffffffa01601d0
[ 1845.185026] ---[ end trace 49a70afd109f5653 ]---
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
For some strings, they are permitted to be larger than PAGE_SIZE, so
need use scnprintf() instead of sprintf(), or it will cause issue.
One case is:
if a module version is crazy defined (length more than PAGE_SIZE),
'modinfo' command is still OK (print full contents),
but for "cat /sys/modules/'modname'/version", will cause issue in kernel.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The ops that uses param_set_bool_enable_only() as its set function can
easily handle being used without an argument. There's no reason to
fail the loading of the module if it does not have one.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"Nothing interesting. Except the most embarrassing bugfix ever. But
let's ignore that"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: cleanup call chain.
module: do percpu allocation after uniqueness check. No, really!
modules: don't fail to load on unknown parameters.
ABI: Clarify when /sys/module/MODULENAME is created
There is no /sys/parameters
module: don't modify argument of module_kallsyms_lookup_name()
|
|
Fold alloc_module_percpu into percpu_modalloc().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
v3.8-rc1-5-g1fb9341 was supposed to stop parallel kvm loads exhausting
percpu memory on large machines:
Now we have a new state MODULE_STATE_UNFORMED, we can insert the
module into the list (and thus guarantee its uniqueness) before we
allocate the per-cpu region.
In my defence, it didn't actually say the patch did this. Just that
we "can".
This patch actually *does* it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Jim Hull <jim.hull@hp.com>
Cc: stable@kernel.org # 3.8
|
|
Although parameters are supposed to be part of the kernel API, experimental
parameters are often removed. In addition, downgrading a kernel might cause
previously-working modules to fail to load.
On balance, it's probably better to warn, and load the module anyway.
This may let through a typo, but at least the logs will show it.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
If we pass a pointer to a const string in the form "module:symbol"
module_kallsyms_lookup_name() will try to split the string at the colon,
i.e., will try to modify r/o data. That will, in fact, fail on a kernel
with enabled CONFIG_DEBUG_RODATA.
Avoid modifying the passed string in module_kallsyms_lookup_name(),
modify find_module_all() instead to pass it the module name length.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
As kmemleak now scans all module sections that are allocated, writable
and non executable, there's no need to scan individual sections that
might reference data.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Instead of just picking data sections by name (names that start
with .data, .bss or .ref.data), use the section flags and scan all
sections that are allocated, writable and not executable. Which should
cover all sections of a module that might reference data.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[catalin.marinas@arm.com: removed unused 'name' variable]
[catalin.marinas@arm.com: collapsed 'if' blocks]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Otherwise we get a race between unload and reload of the same module:
the new module doesn't see the old one in the list, but then fails because
it can't register over the still-extant entries in sysfs:
[ 103.981925] ------------[ cut here ]------------
[ 103.986902] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xab/0xd0()
[ 103.993606] Hardware name: CrownBay Platform
[ 103.998075] sysfs: cannot create duplicate filename '/module/pch_gbe'
[ 104.004784] Modules linked in: pch_gbe(+) [last unloaded: pch_gbe]
[ 104.011362] Pid: 3021, comm: modprobe Tainted: G W 3.9.0-rc5+ #5
[ 104.018662] Call Trace:
[ 104.021286] [<c103599d>] warn_slowpath_common+0x6d/0xa0
[ 104.026933] [<c1168c8b>] ? sysfs_add_one+0xab/0xd0
[ 104.031986] [<c1168c8b>] ? sysfs_add_one+0xab/0xd0
[ 104.037000] [<c1035a4e>] warn_slowpath_fmt+0x2e/0x30
[ 104.042188] [<c1168c8b>] sysfs_add_one+0xab/0xd0
[ 104.046982] [<c1168dbe>] create_dir+0x5e/0xa0
[ 104.051633] [<c1168e78>] sysfs_create_dir+0x78/0xd0
[ 104.056774] [<c1262bc3>] kobject_add_internal+0x83/0x1f0
[ 104.062351] [<c126daf6>] ? kvasprintf+0x46/0x60
[ 104.067231] [<c1262ebd>] kobject_add_varg+0x2d/0x50
[ 104.072450] [<c1262f07>] kobject_init_and_add+0x27/0x30
[ 104.078075] [<c1089240>] mod_sysfs_setup+0x80/0x540
[ 104.083207] [<c1260851>] ? module_bug_finalize+0x51/0xc0
[ 104.088720] [<c108ab29>] load_module+0x1429/0x18b0
We can teardown sysfs first, then to be sure, put the state in
MODULE_STATE_UNFORMED so it's ignored while we deconstruct it.
Reported-by: Veaceslav Falico <vfalico@redhat.com>
Tested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Fix symbol versioning on architectures with symbol prefixes. Although
the build was free from warnings the actual modules still wouldn't load
as the ____versions table contained unprefixed symbol names, which were
being compared against the prefixed symbol names when checking the
symbol versions.
This is fixed by modifying modpost to add the symbol prefix to the
____versions table it outputs (Modules.symvers still contains unprefixed
symbol names). The check_modstruct_version() function is also fixed as
it checks the version of the unprefixed "module_layout" symbol which
would no longer work.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Kliegman <kliegs@chromium.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use VMLINUX_SYMBOL_STR)
|
|
We have CONFIG_SYMBOL_PREFIX, which three archs define to the string
"_". But Al Viro broke this in "consolidate cond_syscall and
SYSCALL_ALIAS declarations" (in linux-next), and he's not the first to
do so.
Using CONFIG_SYMBOL_PREFIX is awkward, since we usually just want to
prefix it so something. So various places define helpers which are
defined to nothing if CONFIG_SYMBOL_PREFIX isn't set:
1) include/asm-generic/unistd.h defines __SYMBOL_PREFIX.
2) include/asm-generic/vmlinux.lds.h defines VMLINUX_SYMBOL(sym)
3) include/linux/export.h defines MODULE_SYMBOL_PREFIX.
4) include/linux/kernel.h defines SYMBOL_PREFIX (which differs from #7)
5) kernel/modsign_certificate.S defines ASM_SYMBOL(sym)
6) scripts/modpost.c defines MODULE_SYMBOL_PREFIX
7) scripts/Makefile.lib defines SYMBOL_PREFIX on the commandline if
CONFIG_SYMBOL_PREFIX is set, so that we have a non-string version
for pasting.
(arch/h8300/include/asm/linkage.h defines SYMBOL_NAME(), too).
Let's solve this properly:
1) No more generic prefix, just CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX.
2) Make linux/export.h usable from asm.
3) Define VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR().
4) Make everyone use them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com> (metag)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
1fb9341ac34825aa40354e74d9a2c69df7d2c304 made our locking in
load_module more complicated: we grab the mutex once to insert the
module in the list, then again to upgrade it once it's formed.
Since the locking is self-contained, it's neater to do this in
separate functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Fix up all callers as they were before, with make one change: an
unsigned module taints the kernel, but doesn't turn off lockdep.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Reported-by: Chris Samuel <chris@csamuel.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Commit 1fb9341ac348 ("module: put modules in list much earlier") moved
some of the module initialization code around, and in the process
changed the exit paths too. But for the duplicate export symbol error
case the change made the ddebug_cleanup path jump to after the module
mutex unlock, even though it happens with the mutex held.
Rusty has some patches to split this function up into some helper
functions, hopefully the mess of complex goto targets will go away
eventually.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module fixes and a virtio block fix from Rusty Russell:
"Various minor fixes, but a slightly more complex one to fix the
per-cpu overload problem introduced recently by kvm id changes."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: put modules in list much earlier.
module: add new state MODULE_STATE_UNFORMED.
module: prevent warning when finit_module a 0 sized file
virtio-blk: Don't free ida when disk is in use
|
|
If the default iosched is built as module, the kernel may deadlock
while trying to load the iosched module on device probe if the probing
was running off async. This is because async_synchronize_full() at
the end of module init ends up waiting for the async job which
initiated the module loading.
async A modprobe
1. finds a device
2. registers the block device
3. request_module(default iosched)
4. modprobe in userland
5. load and init module
6. async_synchronize_full()
Async A waits for modprobe to finish in request_module() and modprobe
waits for async A to finish in async_synchronize_full().
Because there's no easy to track dependency once control goes out to
userland, implementing properly nested flushing is difficult. For
now, make module init perform async_synchronize_full() iff module init
has queued async jobs as suggested by Linus.
This avoids the described deadlock because iosched module doesn't use
async and thus wouldn't invoke async_synchronize_full(). This is
hacky and incomplete. It will deadlock if async module loading nests;
however, this works around the known problem case and seems to be the
best of bad options.
For more details, please refer to the following thread.
http://thread.gmane.org/gmane.linux.kernel/1420814
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Alex Riesen <raa.lkml@gmail.com>
Tested-by: Ming Lei <ming.lei@canonical.com>
Tested-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Prarit's excellent bug report:
> In recent Fedora releases (F17 & F18) some users have reported seeing
> messages similar to
>
> [ 15.478160] kvm: Could not allocate 304 bytes percpu data
> [ 15.478174] PERCPU: allocation failed, size=304 align=32, alloc from
> reserved chunk failed
>
> during system boot. In some cases, users have also reported seeing this
> message along with a failed load of other modules.
>
> What is happening is systemd is loading an instance of the kvm module for
> each cpu found (see commit e9bda3b). When the module load occurs the kernel
> currently allocates the modules percpu data area prior to checking to see
> if the module is already loaded or is in the process of being loaded. If
> the module is already loaded, or finishes load, the module loading code
> releases the current instance's module's percpu data.
Now we have a new state MODULE_STATE_UNFORMED, we can insert the
module into the list (and thus guarantee its uniqueness) before we
allocate the per-cpu region.
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Prarit Bhargava <prarit@redhat.com>
|
|
You should never look at such a module, so it's excised from all paths
which traverse the modules list.
We add the state at the end, to avoid gratuitous ABI break (ksplice).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
If we try to finit_module on a file sized 0 bytes vmalloc will
scream and spit out a warning.
Since modules have to be bigger than 0 bytes anyways we can just
check that beforehand and avoid the warning.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"Nothing all that exciting; a new module-from-fd syscall for those who
want to verify the source of the module (ChromeOS) and/or use standard
IMA on it or other security hooks."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Fix kbuild output when using default extra_certificates
MODSIGN: Avoid using .incbin in C source
modules: don't hand 0 to vmalloc.
module: Remove a extra null character at the top of module->strtab.
ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
ASN.1: Define indefinite length marker constant
moduleparam: use __UNIQUE_ID()
__UNIQUE_ID()
MODSIGN: Add modules_sign make target
powerpc: add finit_module syscall.
ima: support new kernel module syscall
add finit_module syscall to asm-generic
ARM: add finit_module syscall to ARM
security: introduce kernel_module_from_file hook
module: add flags arg to sys_finit_module()
module: add syscall to load module from fd
|
|
In commit 9c0ece069b32 ("Get rid of Documentation/feature-removal.txt"),
Linus removed feature-removal-schedule.txt from Documentation, but there
is still some reference to this file. So remove them.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In commit d0a21265dfb5fa8a David Rientjes unified various archs'
module_alloc implementation (including x86) and removed the graduitous
shortcut for size == 0.
Then, in commit de7d2b567d040e3b, Joe Perches added a warning for
zero-length vmallocs, which can happen without kallsyms on modules
with no init sections (eg. zlib_deflate).
Fix this once and for all; the module code has to handle zero length
anyway, so get it right at the caller and remove the now-gratuitous
checks within the arch-specific module_alloc implementations.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42608
Reported-by: Conrad Kostecki <ConiKost@gmx.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
There is a extra null character('\0') at the top of module->strtab for
each module. Commit 59ef28b introduced this bug and this patch fixes it.
Live dump log of the current linus git kernel(HEAD is 2844a4870):
============================================================================
crash> mod | grep loop
ffffffffa01db0a0 loop 16689 (not loaded) [CONFIG_KALLSYMS]
crash> module.core_symtab ffffffffa01db0a0
core_symtab = 0xffffffffa01db320crash> rd 0xffffffffa01db320 12
ffffffffa01db320: 0000005500000001 0000000000000000 ....U...........
ffffffffa01db330: 0000000000000000 0002007400000002 ............t...
ffffffffa01db340: ffffffffa01d8000 0000000000000038 ........8.......
ffffffffa01db350: 001a00640000000e ffffffffa01daeb0 ....d...........
ffffffffa01db360: 00000000000000a0 0002007400000019 ............t...
ffffffffa01db370: ffffffffa01d8068 000000000000001b h...............
crash> module.core_strtab ffffffffa01db0a0
core_strtab = 0xffffffffa01dbb30 ""
crash> rd 0xffffffffa01dbb30 4
ffffffffa01dbb30: 615f70616d6b0000 66780063696d6f74 ..kmap_atomic.xf
ffffffffa01dbb40: 73636e75665f7265 72665f646e696600 er_funcs.find_fr
============================================================================
We expect Just first one byte of '\0', but actually first two bytes
are '\0'. Here is The relationship between symtab and strtab.
symtab_idx strtab_idx symbol
-----------------------------------------------
0 0x1 "\0" # startab_idx should be 0
1 0x2 "kmap_atomic"
2 0xe "xfer_funcs"
3 0x19 "find_fr..."
By applying this patch, it becomes as follows.
symtab_idx strtab_idx symbol
-----------------------------------------------
0 0x0 "\0" # extra byte is removed
1 0x1 "kmap_atomic"
2 0xd "xfer_funcs"
3 0x18 "find_fr..."
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Cc: Masaki Kimura <masaki.kimura.kz@hitachi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Now that kernel module origins can be reasoned about, provide a hook to
the LSMs to make policy decisions about the module file. This will let
Chrome OS enforce that loadable kernel modules can only come from its
read-only hash-verified root filesystem. Other LSMs can, for example,
read extended attributes for signatures, etc.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Thanks to Michael Kerrisk for keeping us honest. These flags are actually
useful for eliminating the only case where kmod has to mangle a module's
internals: for overriding module versioning.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Acked-by: Kees Cook <keescook@chromium.org>
|
|
As part of the effort to create a stronger boundary between root and
kernel, Chrome OS wants to be able to enforce that kernel modules are
being loaded only from our read-only crypto-hash verified (dm_verity)
root filesystem. Since the init_module syscall hands the kernel a module
as a memory blob, no reasoning about the origin of the blob can be made.
Earlier proposals for appending signatures to kernel modules would not be
useful in Chrome OS, since it would involve adding an additional set of
keys to our kernel and builds for no good reason: we already trust the
contents of our root filesystem. We don't need to verify those kernel
modules a second time. Having to do signature checking on module loading
would slow us down and be redundant. All we need to know is where a
module is coming from so we can say yes/no to loading it.
If a file descriptor is used as the source of a kernel module, many more
things can be reasoned about. In Chrome OS's case, we could enforce that
the module lives on the filesystem we expect it to live on. In the case
of IMA (or other LSMs), it would be possible, for example, to examine
extended attributes that may contain signatures over the contents of
the module.
This introduces a new syscall (on x86), similar to init_module, that has
only two arguments. The first argument is used as a file descriptor to
the module and the second argument is a pointer to the NULL terminated
string of module arguments.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (merge fixes)
|
|
Masaki found and patched a kallsyms issue: the last symbol in a
module's symtab wasn't transferred. This is because we manually copy
the zero'th entry (which is always empty) then copy the rest in a loop
starting at 1, though from src[0]. His fix was minimal, I prefer to
rewrite the loops in more standard form.
There are two loops: one to get the size, and one to copy. Make these
identical: always count entry 0 and any defined symbol in an allocated
non-init section.
This bug exists since the following commit was introduced.
module: reduce symbol table for loaded modules (v2)
commit: 4a4962263f07d14660849ec134ee42b63e95ea9a
LKML: http://lkml.org/lkml/2012/10/24/27
Reported-by: Masaki Kimura <masaki.kimura.kz@hitachi.com>
Cc: stable@kernel.org
|
|
Emit the magic string that indicates a module has a signature after the
signature data instead of before it. This allows module_sig_check() to
be made simpler and faster by the elimination of the search for the
magic string. Instead we just need to do a single memcmp().
This works because at the end of the signature data there is the
fixed-length signature information block. This block then falls
immediately prior to the magic number.
From the contents of the information block, it is trivial to calculate
the size of the signature data and thus the size of the actual module
data.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If we're in FIPS mode, we should panic if we fail to verify the signature on a
module or we're asked to load an unsigned module in signature enforcing mode.
Possibly FIPS mode should automatically enable enforcing mode.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
We do a very simple search for a particular string appended to the module
(which is cache-hot and about to be SHA'd anyway). There's both a config
option and a boot parameter which control whether we accept or fail with
unsigned modules and modules that are signed with an unknown key.
If module signing is enabled, the kernel will be tainted if a module is
loaded that is unsigned or has a signature for which we don't have the
key.
(Useful feedback and tweaks by David Howells <dhowells@redhat.com>)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The original module-init-tools module loader used a fnctl lock on the
.ko file to avoid attempts to simultaneously load a module.
Unfortunately, you can't get an exclusive fcntl lock on a read-only
fd, making this not work for read-only mounted filesystems.
module-init-tools has a hacky sleep-and-loop for this now.
It's not that hard to wait in the kernel, and only return -EEXIST once
the first module has finished loading (or continue loading the module
if the first one failed to initialize for some reason). It's also
consistent with what we do for dependent modules which are still loading.
Suggested-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
We use resolve_symbol_wait(), which blocks if the module containing
the symbol is still loading. However:
1) The module_wq we use is only woken after calling the modules' init
function, but there are other failure paths after the module is
placed in the linked list where we need to do the same thing.
2) wake_up() only wakes one waiter, and our waitqueue is shared by all
modules, so we need to wake them all.
3) wake_up_all() doesn't imply a memory barrier: I feel happier calling
it after we've grabbed and dropped the module_mutex, not just after
the state assignment.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Use the mapping of Elf_[SPE]hdr, Elf_Addr, Elf_Sym, Elf_Dyn, Elf_Rel/Rela,
ELF_R_TYPE() and ELF_R_SYM() to either the 32-bit version or the 64-bit version
into asm-generic/module.h for all arches bar MIPS.
Also, use the generic definition mod_arch_specific where possible.
To this end, I've defined three new config bools:
(*) HAVE_MOD_ARCH_SPECIFIC
Arches define this if they don't want to use the empty generic
mod_arch_specific struct.
(*) MODULES_USE_ELF_RELA
Arches define this if their modules can contain RELA records. This causes
the Elf_Rela mapping to be emitted and allows apply_relocate_add() to be
defined by the arch rather than have the core emit an error message.
(*) MODULES_USE_ELF_REL
Arches define this if their modules can contain REL records. This causes
the Elf_Rel mapping to be emitted and allows apply_relocate() to be
defined by the arch rather than have the core emit an error message.
Note that it is possible to allow both REL and RELA records: m68k and mips are
two arches that do this.
With this, some arch asm/module.h files can be deleted entirely and replaced
with a generic-y marker in the arch Kbuild file.
Additionally, I have removed the bits from m32r and score that handle the
unsupported type of relocation record as that's now handled centrally.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Cloudlinux have a product called lve that includes a kernel module. This
was previously GPLed but is now under a proprietary license, but the
module continues to declare MODULE_LICENSE("GPL") and makes use of some
EXPORT_SYMBOL_GPL symbols. Forcibly taint it in order to avoid this.
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Alex Lyashkov <umka@cloudlinux.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
|
|
The check:
if (len < hdr->e_shoff + hdr->e_shnum * sizeof(Elf_Shdr))
may not work if there's an overflow in the right-hand side of the condition.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
This introduces a fake module param $module.dyndbg. Its based upon
Thomas Renninger's $module.ddebug boot-time debugging patch from
https://lkml.org/lkml/2010/9/15/397
The 'fake' module parameter is provided for all modules, whether or
not they need it. It is not explicitly added to each module, but is
implemented in callbacks invoked from parse_args.
For builtin modules, dynamic_debug_init() now directly calls
parse_args(..., &ddebug_dyndbg_boot_params_cb), to process the params
undeclared in the modules, just after the ddebug tables are processed.
While its slightly weird to reprocess the boot params, parse_args() is
already called repeatedly by do_initcall_levels(). More importantly,
the dyndbg queries (given in ddebug_query or dyndbg params) cannot be
activated until after the ddebug tables are ready, and reusing
parse_args is cleaner than doing an ad-hoc parse. This reparse would
break options like inc_verbosity, but they probably should be params,
like verbosity=3.
ddebug_dyndbg_boot_params_cb() handles both bare dyndbg (aka:
ddebug_query) and module-prefixed dyndbg params, and ignores all other
parameters. For example, the following will enable pr_debug()s in 4
builtin modules, in the order given:
dyndbg="module params +p; module aio +p" module.dyndbg=+p pci.dyndbg
For loadable modules, parse_args() in load_module() calls
ddebug_dyndbg_module_params_cb(). This handles bare dyndbg params as
passed from modprobe, and errors on other unknown params.
Note that modprobe reads /proc/cmdline, so "modprobe foo" grabs all
foo.params, strips the "foo.", and passes these to the kernel.
ddebug_dyndbg_module_params_cb() is again called for the unknown
params; it handles dyndbg, and errors on others. The "doing" arg
added previously contains the module name.
For non CONFIG_DYNAMIC_DEBUG builds, the stub function accepts
and ignores $module.dyndbg params, other unknowns get -ENOENT.
If no param value is given (as in pci.dyndbg example above), "+p" is
assumed, which enables all pr_debug callsites in the module.
The dyndbg fake parameter is not shown in /sys/module/*/parameters,
thus it does not use any resources. Changes to it are made via the
control file.
Also change pr_info in ddebug_exec_queries to vpr_info,
no need to see it all the time.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
CC: Thomas Renninger <trenn@suse.de>
CC: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Module size was limited to 64MB, this was legacy limitation due to vmalloc()
which was removed a while ago.
Limiting module size to 64MB is both pointless and affects real world use
cases.
Cc: Tim Abbott <tim.abbott@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
With the preempt, tracepoint and everything, it's getting a bit
chubby. For an Ubuntu-based config:
Before:
$ size -t `find * -name '*.ko'` | grep TOTAL
56199906 3870760 1606616 61677282 3ad1ee2 (TOTALS)
$ size vmlinux
text data bss dec hex filename
8509342 850368 3358720 12718430 c2115e vmlinux
After:
$ size -t `find * -name '*.ko'` | grep TOTAL
56183760 3867892 1606616 61658268 3acd49c (TOTALS)
$ size vmlinux
text data bss dec hex filename
8501842 849088 3358720 12709650 c1ef12 vmlinux
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (made all out-of-line)
|
|
This patch adds a set of macros that can be used to declare
kernel parameters to be parsed _before_ initcalls at a chosen
level are executed. We rename the now-unused "flags" field of
struct kernel_param as the level. It's signed, for when we
use this for early params as well, in future.
Linker macro collating init calls had to be modified in order
to add additional symbols between levels that are later used
by the init code to split the calls into blocks.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Sometimes we need to test a kernel of same version with code or config
option changes.
We already have sysctl to disable module load, but add a kernel
parameter will be more convenient.
Since modules_disabled is int, so here use bint type in core_param.
TODO: make sysctl accept bool and change modules_disabled to bool
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Recent changes to kernel/module.c caused the following compile
error:
kernel/module.c: In function ‘show_taint’:
kernel/module.c:1024:2: error: implicit declaration of function ‘module_flags_taint’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
Correct this error by moving the definition of module_flags_taint
outside of the #ifdef CONFIG_MODULE_UNLOAD section.
Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Recent tools do not want to use /proc to retrieve module information. A few
values are currently missing from sysfs to replace the information available
in /proc/modules.
This adds /sys/module/*/{coresize,initsize,taint} attributes.
TAINT_PROPRIETARY_MODULE (P) and TAINT_OOT_MODULE (O) flags are both always
shown now, and do no longer exclude each other, also in /proc/modules.
Replace the open-coded sysfs attribute initializers with the __ATTR() macro.
Add the new attributes to Documentation/ABI.
Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Use more flexible pr_debug. This allows:
echo "module module +p" > /dbg/dynamic_debug/control
to turn on debug messages when needed.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|