Age | Commit message (Collapse) | Author |
|
Block layer allows selecting an elevator which is built as a module to
be selected as system default via kernel param "elevator=". This is
achieved by automatically invoking request_module() whenever a new
block device is initialized and the elevator is not available.
This led to an interesting deadlock problem involving async and module
init. Block device probing running off an async job invokes
request_module(). While the module is being loaded, it performs
async_synchronize_full() which ends up waiting for the async job which
is already waiting for request_module() to finish, leading to
deadlock.
Invoking request_module() from deep in block device init path is
already nasty in itself. It seems best to avoid these situations from
the beginning by moving on-demand module loading out of block init
path.
The previous patch made sure that the default elevator module is
loaded early during boot if available. This patch removes on-demand
loading of the default elevator from elevator init path. As the
module would have been loaded during boot, userland-visible behavior
difference should be minimal.
For more details, please refer to the following thread.
http://thread.gmane.org/gmane.linux.kernel/1420814
v2: The bool parameter was named @request_module which conflicted with
request_module(). This built okay w/ CONFIG_MODULES because
request_module() was defined as a macro. W/o CONFIG_MODULES, it
causes build breakage. Rename the parameter to @try_loading.
Reported by Fengguang.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alex Riesen <raa.lkml@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
|
|
This patch adds default module loading and uses it to load the default
block elevator. During boot, it's called right after initramfs or
initrd is made available and right before control is passed to
userland. This ensures that as long as the modules are available in
the usual places in initramfs, initrd or the root filesystem, the
default modules are loaded as soon as possible.
This will replace the on-demand elevator module loading from elevator
init path.
v2: Fixed build breakage when !CONFIG_BLOCK. Reported by kbuild test
robot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alex Riesen <raa.lkml@gmail.com>
Cc: Fengguang We <fengguang.wu@intel.com>
|
|
This function queries whether %current is an async worker executing an
async item. This will be used to implement warning on synchronous
request_module() from async workers.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
This will be used to implement an inline function to query whether
%current is a workqueue worker and, if so, allow determining which
work item it's executing.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Workqueue wants to expose more interface internal to kernel/. Instead
of adding a new header file, repurpose kernel/workqueue_sched.h.
Rename it to workqueue_internal.h and add include protector.
This patch doesn't introduce any functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
|
|
PF_WQ_WORKER is used to tell scheduler that the task is a workqueue
worker and needs wq_worker_sleeping/waking_up() invoked on it for
concurrency management. As rescuers never participate in concurrency
management, PF_WQ_WORKER wasn't set on them.
There's a need for an interface which can query whether %current is
executing a work item and if so which. Such interface requires a way
to identify all tasks which may execute work items and PF_WQ_WORKER
will be used for that. As all normal workers always have PF_WQ_WORKER
set, we only need to add it to rescuers.
As rescuers start with WORKER_PREP but never clear it, it's always
NOT_RUNNING and there's no need to worry about it interfering with
concurrency management even if PF_WQ_WORKER is set; however, unlike
normal workers, rescuers currently don't have its worker struct as
kthread_data(). It uses the associated workqueue_struct instead.
This is problematic as wq_worker_sleeping/waking_up() expect struct
worker at kthread_data().
This patch adds worker->rescue_wq and start rescuer kthreads with
worker struct as kthread_data and sets PF_WQ_WORKER on rescuers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
|
|
42f8570f43 ("workqueue: use new hashtable implementation") incorrectly
made busy workers hashed by the pointer value of worker instead of
work. This broke find_worker_executing_work() which in turn broke a
lot of fundamental operations of workqueue - non-reentrancy and
flushing among others. The flush malfunction triggered warning in
disk event code in Fengguang's automated test.
write_dev_root_ (3265) used greatest stack depth: 2704 bytes left
------------[ cut here ]------------
WARNING: at /c/kernel-tests/src/stable/block/genhd.c:1574 disk_clear_events+0x\
cf/0x108()
Hardware name: Bochs
Modules linked in:
Pid: 3328, comm: ata_id Not tainted 3.7.0-01930-gbff6343 #1167
Call Trace:
[<ffffffff810997c4>] warn_slowpath_common+0x83/0x9c
[<ffffffff810997f7>] warn_slowpath_null+0x1a/0x1c
[<ffffffff816aea77>] disk_clear_events+0xcf/0x108
[<ffffffff811bd8be>] check_disk_change+0x27/0x59
[<ffffffff822e48e2>] cdrom_open+0x49/0x68b
[<ffffffff81ab0291>] idecd_open+0x88/0xb7
[<ffffffff811be58f>] __blkdev_get+0x102/0x3ec
[<ffffffff811bea08>] blkdev_get+0x18f/0x30f
[<ffffffff811bebfd>] blkdev_open+0x75/0x80
[<ffffffff8118f510>] do_dentry_open+0x1ea/0x295
[<ffffffff8118f5f0>] finish_open+0x35/0x41
[<ffffffff8119c720>] do_last+0x878/0xa25
[<ffffffff8119c993>] path_openat+0xc6/0x333
[<ffffffff8119cf37>] do_filp_open+0x38/0x86
[<ffffffff81190170>] do_sys_open+0x6c/0xf9
[<ffffffff8119021e>] sys_open+0x21/0x23
[<ffffffff82c1c3d9>] system_call_fastpath+0x16/0x1b
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
|
|
To avoid executing the same work item concurrenlty, workqueue hashes
currently busy workers according to their current work items and looks
up the the table when it wants to execute a new work item. If there
already is a worker which is executing the new work item, the new item
is queued to the found worker so that it gets executed only after the
current execution finishes.
Unfortunately, a work item may be freed while being executed and thus
recycled for different purposes. If it gets recycled for a different
work item and queued while the previous execution is still in
progress, workqueue may make the new work item wait for the old one
although the two aren't really related in any way.
In extreme cases, this false dependency may lead to deadlock although
it's extremely unlikely given that there aren't too many self-freeing
work item users and they usually don't wait for other work items.
To alleviate the problem, record the current work function in each
busy worker and match it together with the work item address in
find_worker_executing_work(). While this isn't complete, it ensures
that unrelated work items don't interact with each other and in the
very unlikely case where a twisted wq user triggers it, it's always
onto itself making the culprit easy to spot.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andrey Isakov <andy51@gmx.ru>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
Cc: stable@vger.kernel.org
|
|
Switch workqueues to use the new hashtable implementation. This reduces the
amount of generic unrelated code in the workqueues.
This patch depends on d9b482c ("hashtable: introduce a small and naive
hashtable") which was merged in v3.6.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Merge misc patches from Andrew Morton:
"Incoming:
- lots of misc stuff
- backlight tree updates
- lib/ updates
- Oleg's percpu-rwsem changes
- checkpatch
- rtc
- aoe
- more checkpoint/restart support
I still have a pile of MM stuff pending - Pekka should be merging
later today after which that is good to go. A number of other things
are twiddling thumbs awaiting maintainer merges."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits)
scatterlist: don't BUG when we can trivially return a proper error.
docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output
fs, fanotify: add @mflags field to fanotify output
docs: add documentation about /proc/<pid>/fdinfo/<fd> output
fs, notify: add procfs fdinfo helper
fs, exportfs: add exportfs_encode_inode_fh() helper
fs, exportfs: escape nil dereference if no s_export_op present
fs, epoll: add procfs fdinfo helper
fs, eventfd: add procfs fdinfo helper
procfs: add ability to plug in auxiliary fdinfo providers
tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
breakpoint selftests: print failure status instead of cause make error
kcmp selftests: print fail status instead of cause make error
kcmp selftests: make run_tests fix
mem-hotplug selftests: print failure status instead of cause make error
cpu-hotplug selftests: print failure status instead of cause make error
mqueue selftests: print failure status instead of cause make error
vm selftests: print failure status instead of cause make error
ubifs: use prandom_bytes
mtd: nandsim: use prandom_bytes
...
|
|
When compiling efivars.c the build fails with:
CC drivers/firmware/efivars.o
drivers/firmware/efivars.c: In function ‘efivarfs_get_inode’:
drivers/firmware/efivars.c:886:31: error: incompatible types when assigning to type ‘kgid_t’ from type ‘int’
make[2]: *** [drivers/firmware/efivars.o] Error 1
make[1]: *** [drivers/firmware/efivars.o] Error 2
Fix the build error by removing the duplicate initialization of i_uid and
i_gid inode_init_always has already initialized them to 0.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This build error is currently hidden by the fact that the x86
implementation of 'update_mmu_cache_pmd()' is a macro that doesn't use
its last argument, but commit b32967ff101a ("mm: numa: Add THP migration
for the NUMA working set scanning fault case") introduced a call with
the wrong third argument.
In the akpm tree, it causes this build error:
mm/migrate.c: In function 'migrate_misplaced_transhuge_page_put':
mm/migrate.c:1666:2: error: incompatible type for argument 3 of 'update_mmu_cache_pmd'
arch/x86/include/asm/pgtable.h:792:20: note: expected 'struct pmd_t *' but argument is of type 'pmd_t'
Fix it.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is absolutely no reason to crash the kernel when we have a
perfectly good return value already available to use for conveying
failure status.
Let's return an error code instead of crashing the kernel: that sounds
like a much better plan.
[akpm@linux-foundation.org: s/E2BIG/EINVAL/]
Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The kernel keeps FAN_MARK_IGNORED_SURV_MODIFY bit separately from
fsnotify_mark::mask|ignored_mask thus put it in @mflags (mark flags)
field so the user-space reader will be able to detect if such bit were
used on mark creation procedure.
| pos: 0
| flags: 04002
| fanotify flags:10 event-flags:0
| fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
| fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
[akpm@linux-foundation.org: tweak documentation]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This allow us to print out fsnotify details such as watchee inode, device,
mask and optionally a file handle.
For inotify objects if kernel compiled with exportfs support the output
will be
| pos: 0
| flags: 02000000
| inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
| inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:11a1000020542153
| inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:49b1060023552153
If kernel compiled without exportfs support, the file handle
won't be provided but inode and device only.
| pos: 0
| flags: 02000000
| inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0
| inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0
| inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0
For fanotify the output is like
| pos: 0
| flags: 04002
| fanotify flags:10 event-flags:0
| fanotify mnt_id:12 mask:3b ignored_mask:0
| fanotify ino:50205 sdev:800013 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:05020500fb1d47e7
To minimize impact on general fsnotify code the new functionality
is gathered in fs/notify/fdinfo.c file.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We will need this helper in the next patch to provide a file handle for
inotify marks in /proc/pid/fdinfo output.
The patch is rather providing the way to use inodes directly when dentry
is not available (like in case of inotify system).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This routine will be used to generate a file handle in fdinfo output for
inotify subsystem, where if no s_export_op present the general
export_encode_fh should be used. Thus add a test if s_export_op present
inside exportfs_encode_fh itself.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This allows us to print out eventpoll target file descriptor, events and
data, the /proc/pid/fdinfo/fd consists of
| pos: 0
| flags: 02
| tfd: 5 events: 1d data: ffffffffffffffff enabled: 1
[avagin@: fix for unitialized ret variable]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This allows us to print out raw counter value. The /proc/pid/fdinfo/fd
output is
| pos: 0
| flags: 04002
| eventfd-count: 5a
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch brings ability to print out auxiliary data associated with
file in procfs interface /proc/pid/fdinfo/fd.
In particular further patches make eventfd, evenpoll, signalfd and
fsnotify to print additional information complete enough to restore
these objects after checkpoint.
To simplify the code we add show_fdinfo callback inside struct
file_operations (as Al and Pavel are proposing).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I was curious why sys_kcmp wasn't working, which led me to the testcase.
It turned out I hadn't enabled CHECKPOINT_RESTORE in the kernel I was
testing. Add a decoding of errno to the testcase to make that obvious.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In case breakpoint test exit non zero value it will cause make error.
Better way is just print the test failure status.
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In case kcmp_test exit non zero value it will cause make error.
Better way is just print the test failure status.
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
make run_tests need the target is run_tests instead of run-tests
Also gcc output should be kcmp_test. Fix these two issues.
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Original behavior:
bash-4.1$ make -C memory-hotplug run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/memory-hotplug'
./on-off-test.sh
make: execvp: ./on-off-test.sh: Permission denied
make: *** [run_tests] Error 127
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/memory-hotplug'
After applying the patch:
bash-4.1$ make -C memory-hotplug run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/memory-hotplug'
/bin/sh: ./on-off-test.sh: Permission denied
memory-hotplug selftests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/memory-hotplug'
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Original behavior:
bash-4.1$ make -C cpu-hotplug run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/cpu-hotplug'
./on-off-test.sh
make: execvp: ./on-off-test.sh: Permission denied
make: *** [run_tests] Error 127
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/cpu-hotplug'
After applying the patch:
bash-4.1$ make -C cpu-hotplug run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/cpu-hotplug'
/bin/sh: ./on-off-test.sh: Permission denied
cpu-hotplug selftests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/cpu-hotplug'
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Original behavior:
bash-4.1$ make -C mqueue run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
./mq_open_tests /test1
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
make: *** [run_tests] Error 1
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
After applying the patch:
bash-4.1$ make -C mqueue run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
mq_open_tests: [FAIL]
Not running as root, but almost all tests require root in order to modify
system settings. Exiting.
mq_perf_tests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/mqueue'
Signed-off-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Original behavior:
bash-4.1$ make -C vm run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
/bin/sh ./run_vmtests
./run_vmtests: line 24: /proc/sys/vm/nr_hugepages: Permission denied
Please run this test as root
make: *** [run_tests] Error 1
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
After applying the patch:
bash-4.1$ make -C vm run_tests
make: Entering directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
./run_vmtests: line 24: /proc/sys/vm/nr_hugepages: Permission denied
Please run this test as root
vmtests: [FAIL]
make: Leaving directory `/home/dave/git/linux-2.6/tools/testing/selftests/vm'
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This also converts filling memory loop to use memset.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: David Laight <david.laight@aculab.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This also removes unnecessary memset call which is immediately overwritten
with random bytes.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use prandom_bytes() to fill rss key with pseudo-random bytes.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Laight <david.laight@aculab.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add functions to get the requested number of pseudo-random bytes.
The difference from get_random_bytes() is that it generates pseudo-random
numbers by prandom_u32(). It doesn't consume the entropy pool, and the
sequence is reproducible if the same rnd_state is used. So it is suitable
for generating random bytes for testing.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This renames all random32 functions to have 'prandom_' prefix as follows:
void prandom_seed(u32 seed); /* rename from srandom32() */
u32 prandom_u32(void); /* rename from random32() */
void prandom_seed_state(struct rnd_state *state, u64 seed);
/* rename from prandom32_seed() */
u32 prandom_u32_state(struct rnd_state *state);
/* rename from prandom32() */
The purpose of this renaming is to prevent some kernel developers from
assuming that prandom32() and random32() might imply that only
prandom32() was the one using a pseudo-random number generator by
prandom32's "p", and the result may be a very embarassing security
exposure. This concern was expressed by Theodore Ts'o.
And furthermore, I'm going to introduce new functions for getting the
requested number of pseudo-random bytes. If I continue to use both
prandom32 and random32 prefixes for these functions, the confusion
is getting worse.
As a result of this renaming, "prandom_" is the common prefix for
pseudo-random number library.
Currently, srandom32() and random32() are preserved because it is
difficult to rename too many users at once.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: David Laight <david.laight@aculab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We should return NULL on failure instead of returning a freed pointer.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This version number is printed to the console on module initialization
and is available in sysfs, which is where the userland aoe-version tool
looks for it.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This change only affects experimental AoE storage networks.
It modifies the console message about runt packets detected so that the
AoE major and minor addresses of the AoE target that generated the runt
are mentioned.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
By default, the aoe driver uses any ethernet interface for AoE, but the
aoe_iflist module parameter provides a convenient way to limit AoE
traffic to a specific list of local network interfaces.
This change allows a list to be specified using the comma character as a
separator. For example,
modprobe aoe aoe_iflist=eth2,eth3
Before, it was inconvenient to get the quoting right in shell scripts
when setting aoe_iflist to have more than one network interface.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With this change, the aoe driver treats the value zero as special for
the aoe_deadsecs module parameter. Normally, this value specifies the
number of seconds during which the driver will continue to attempt
retransmits to an unresponsive AoE target. After aoe_deadsecs has
elapsed, the aoe driver marks the aoe device as "down" and fails all
I/O.
The new meaning of an aoe_deadsecs of zero is for the driver to
retransmit commands indefinitely.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Many AoE targets have four or fewer network ports, but some existing
storage devices have many, and the AoE protocol sets no limit.
This patch allows the use of more than eight remote MAC addresses per AoE
target, while reducing the amount of memory used by the aoe driver in
cases where there are many AoE targets with fewer than eight MAC addresses
each.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This change avoids a race that could result in a NULL pointer derference
following a WARNing from kobject_add_internal, "don't try to register
things with the same name in the same directory."
The problem was found with a test that forgets and discovers an
aoe device in a loop:
while test ! -r /tmp/stop; do
aoe-flush -a
aoe-discover
done
The race was between aoedev_flush taking aoedevs out of the devlist,
allowing a new discovery of the same AoE target to take place before the
driver gets around to calling sysfs_remove_group. Fixing that one
revealed another race between do_open and add_disk, and this patch avoids
that, too.
The fix required some care, because for flushing (forgetting) an aoedev,
some of the steps must be performed under lock and some must be able to
sleep. Also, for discovering a new aoedev, some steps might sleep.
The check for a bad aoedev pointer remains from a time when about half of
this patch was done, and it was possible for the
bdev->bd_disk->private_data to become corrupted. The check should be
removed eventually, but it is not expected to add significant overhead,
occurring in the aoeblk_open routine.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
An AoE target can have multiple network ports used for AoE, and in the
aoe driver, those are tracked by the aoetgt struct. These changes allow
the aoe driver to handle network paths, or aoetgts, that are not working
well, compared to the others.
Paths that do not get responses despite the retransmission of AoE
commands are marked as "tainted", and non-tainted paths are preferred.
Meanwhile, the aoe driver attempts to "probe" the tainted path in the
background by issuing reads of LBA 0 that are padded out to full
(possibly jumbo-frame) size. If the probes get responses, then the path
is "redeemed", and its taint is removed.
This mechanism has been shown to be helpful in transparently handling
and recovering from real-world network "brown outs" in ways that the
earlier "shoot the help-needing target in the head" mechanism could not.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The value returned by the static minor device number number allocator is
the real minor number, so it must be multiplied by the supported number
of partitions per aoedev.
Without this fix the support for systems without udev is incomplete, and
the few users of aoe on such systems will have surprising results when
device nodes names do not match the AoE target.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Because the minor_get and related functions use the return values for
errors, the compiler doesn't know that sysminor will always either 1) be
initialized in aoedev_by_aoeaddr by the call to minor_get, or 2) be
unused as the "goto out" is executed.
This patch avoids the compiler warning.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
For some special-purpose systems where udev isn't present, static
allocation of minor numbers is desirable. This update distinguishes
different failure scenarios, to help the user understand what went
wrong.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no need to call the request handler function in the I/O
completion routine. The user impact of not doing it is a more "nice" aoe
driver that is less susceptible to causing soft lockups.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A misplaced comment was attached to the nout member of the aoetgt. This
change corrects the comment.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The aoe driver will never be waiting for more than aoe_maxout AoE
commands from a given remote network port on an AoE target. Increasing
the cap increases performance. Users can tighten the setting to reduce
the amount of memory used for handling AoE traffic or the network
bandwidth used for AoE.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Before the aoe driver was an I/O request handler, it was a
make_request-style block driver. Even so, there was a problem where
sysfs expected a request queue to exist, so one was provided in commit
7135a71b19be ("aoe: allocate unused request_queue for sysfs").
During the transition to the request-handler style, a patch was merged
that was based on a driver without the noop queue, and the noop queue
remained in place after the patch was merged, even though a new
functional queue was introduced by the patch, allocated through
blk_init_queue.
The user impact is a memory leak proportional to the number of AoE
targets discovered. This patch removes the memory leak and cleans up
vestiges of the old do-nothing queue from the aoeblk_gdalloc function.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|