Age | Commit message | Author |
|
de_thread() checks if the old leader was the ->child_reaper, this is not
possible any longer. With the previous patch ->group_leader itself will
change ->child_reaper on exit.
Henceforth find_new_reaper() is the only function (apart from
initialization) which plays with ->child_reaper.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Move it into sysrq.c, along with the rest of the sysrq implementation.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We currently blindly trust whatever the partition table claims about the
disk, and let the kernel create block devices which cannot be accessed.
Trying to identify the device leads to kernel logs full of:
sdb: rw=0, want=73392, limit=28800
attempt to access beyond end of device
Here is an example of a broken partition table, where sdb2 starts
behind the end of the disk, and sdb3 is larger than the entire disk:
Disk /dev/sdb: 14 MB, 14745600 bytes
1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              29        7800        3886   83  Linux
/dev/sdb2           37801       45601        3900+  83  Linux
/dev/sdb3           15602       73402       28900+  83  Linux
/dev/sdb4           23403       28796        2697   83  Linux
The kernel creates these completely invalid devices, which can not be
accessed, or may lead to other unpredictable failures:
grep . /sys/class/block/sdb*/{start,size}
/sys/class/block/sdb/size:28800
/sys/class/block/sdb1/start:29
/sys/class/block/sdb1/size:7772
/sys/class/block/sdb2/start:37801
/sys/class/block/sdb2/size:7801
/sys/class/block/sdb3/start:15602
/sys/class/block/sdb3/size:57801
/sys/class/block/sdb4/start:23403
/sys/class/block/sdb4/size:5394
With this patch, we ignore partitions which start behind the end of the disk,
and limit partitions to the end of the disk if they pretend to be larger:
grep . /sys/class/block/sdb*/{start,size}
/sys/class/block/sdb/size:28800
/sys/class/block/sdb1/start:29
/sys/class/block/sdb1/size:7772
/sys/class/block/sdb3/start:15602
/sys/class/block/sdb3/size:13198
/sys/class/block/sdb4/start:23403
/sys/class/block/sdb4/size:5394
These warnings are printed to the kernel log:
sdb: p2 ignored, start 37801 is behind the end of the disk
sdb: p3 size 57801 limited to end of disk
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Cc: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I missed this when I did the arm26 removal.
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Don't repeat BINFMT_ELF definition, simply multiply COMPAT and BINFMT_ELF.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
These auxvec entries are the only ones left unhandled out of the current
base implementation. This syncs up binfmt_elf_fdpic with linux/auxvec.h
and current binfmt_elf.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
binfmt_elf_fdpic seems to have grabbed a hard-coded hack from an ancient
version of binfmt_elf in order to try and fix up initial stack alignment
on multi-threaded x86. In addition to being unused, the hack was also
pushed down beyond the first set of operations on the stack pointer,
negating the entire purpose.
These days, we have an architecture independent arch_align_stack(), so we
switch to using that instead. Move the initial alignment up before the
initial stores while we're at it.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 483fad1c3fa1060d7e6710e84a065ad514571739 ("ELF loader support for
auxvec base platform string") introduced AT_BASE_PLATFORM, but only
implemented it for binfmt_elf.
Given that AT_VECTOR_SIZE_BASE is unconditionally enlarged for us, and
it's only optionally added in for the platforms that set
ELF_BASE_PLATFORM, wire it up for binfmt_elf_fdpic, too.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove CVS keywords that weren't updated for a long time from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In case of error, the function open_xa_dir returns an ERR pointer, but
never returns a NULL pointer. So a NULL test that comes after an IS_ERR
test should be deleted.
The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = open_xa_dir(...)
... when != x = E
(
* if (x == NULL && ...) S1 else S2
|
* if (x == NULL || ...) S1 else S2
)
// </smpl>
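Why the NULL test is redundant can be shown with a small user-space model of the error-pointer convention. The helpers below imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are re-implemented here for illustration; `open_dir_sketch` and `lookup_sketch` are hypothetical stand-ins, not reiserfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel helpers (assumption: 4095 errnos). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical opener: returns a valid pointer or an ERR pointer, never NULL. */
static void *open_dir_sketch(void *backing)
{
    return backing != NULL ? backing : ERR_PTR(-ENOENT);
}

static int lookup_sketch(bool fail)
{
    static int dummy_dir;
    void *dir = open_dir_sketch(fail ? NULL : &dummy_dir);

    if (IS_ERR(dir))
        return (int)PTR_ERR(dir);

    /* No "dir == NULL" test here: the opener cannot return NULL. */
    return 0;
}
```

Note that NULL itself is not an error pointer, so an IS_ERR() test and a NULL test are only both needed for functions that can return either.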
Signed-off-by: Julien Brunel <brunel@diku.dk>
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove CVS keywords that weren't updated for a long time from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Fix a stack corruption caused by a corrupted hfs filesystem. If the
catalog name length is corrupted the memcpy overwrites the catalog btree
structure. Since the field is limited to HFS_NAMELEN bytes in the
structure and the file format, we throw an error if it is too long.
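A minimal sketch of this kind of length check, in user-space C. The structure, the limit `SKETCH_NAMELEN` (chosen as 31 for illustration), the function name and the -EIO error code are all assumptions, not the actual hfs patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NAMELEN 31   /* illustrative stand-in for HFS_NAMELEN */

struct name_sketch {
    uint8_t len;
    uint8_t name[SKETCH_NAMELEN];
};

/* Reject an on-disk length that cannot fit before copying anything. */
static int copy_name_checked(struct name_sketch *dst,
                             const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    if (len > SKETCH_NAMELEN)
        return -EIO;            /* corrupted length: refuse, do not copy */

    dst->len = (uint8_t)len;
    memcpy(dst->name, src, len);
    return 0;
}
```

The point is ordering: the untrusted length is validated against the destination size before memcpy() runs, so a corrupted image yields an error instead of an overwrite.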
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
While testing more corrupted images with hfsplus, I came across
one which triggered the following bug:
[15840.675016] BUG: unable to handle kernel paging request at fffffffb
[15840.675016] IP: [<c0116a4f>] kmap+0x15/0x56
[15840.675016] *pde = 00008067 *pte = 00000000
[15840.675016] Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[15840.675016] Modules linked in:
[15840.675016]
[15840.675016] Pid: 11575, comm: ln Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #29)
[15840.675016] EIP: 0060:[<c0116a4f>] EFLAGS: 00010202 CPU: 0
[15840.675016] EIP is at kmap+0x15/0x56
[15840.675016] EAX: 00000246 EBX: fffffffb ECX: 00000000 EDX: cab919c0
[15840.675016] ESI: 000007dd EDI: cab0bcf4 EBP: cab0bc98 ESP: cab0bc94
[15840.675016] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[15840.675016] Process ln (pid: 11575, ti=cab0b000 task=cab919c0 task.ti=cab0b000)
[15840.675016] Stack: 00000000 cab0bcdc c0231cfb 00000000 cab0bce0 00000800 ca9290c0 fffffffb
[15840.675016] cab145d0 cab919c0 cab15998 22222222 22222222 22222222 00000001 cab15960
[15840.675016] 000007dd cab0bcf4 cab0bd04 c022cb3a cab0bcf4 cab15a6c ca9290c0 00000000
[15840.675016] Call Trace:
[15840.675016] [<c0231cfb>] ? hfsplus_block_allocate+0x6f/0x2d3
[15840.675016] [<c022cb3a>] ? hfsplus_file_extend+0xc4/0x1db
[15840.675016] [<c022ce41>] ? hfsplus_get_block+0x8c/0x19d
[15840.675016] [<c06adde4>] ? sub_preempt_count+0x9d/0xab
[15840.675016] [<c019ece6>] ? __block_prepare_write+0x147/0x311
[15840.675016] [<c0161934>] ? __grab_cache_page+0x52/0x73
[15840.675016] [<c019ef4f>] ? block_write_begin+0x79/0xd5
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c019f22a>] ? cont_write_begin+0x27f/0x2af
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c0139ebe>] ? tick_program_event+0x28/0x4c
[15840.675016] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[15840.675016] [<c022b723>] ? hfsplus_write_begin+0x2d/0x32
[15840.675016] [<c022cdb5>] ? hfsplus_get_block+0x0/0x19d
[15840.675016] [<c0161988>] ? pagecache_write_begin+0x33/0x107
[15840.675016] [<c01879e5>] ? __page_symlink+0x3c/0xae
[15840.675016] [<c019ad34>] ? __mark_inode_dirty+0x12f/0x137
[15840.675016] [<c0187a70>] ? page_symlink+0x19/0x1e
[15840.675016] [<c022e6eb>] ? hfsplus_symlink+0x41/0xa6
[15840.675016] [<c01886a9>] ? vfs_symlink+0x99/0x101
[15840.675016] [<c018a2f6>] ? sys_symlinkat+0x6b/0xad
[15840.675016] [<c018a348>] ? sys_symlink+0x10/0x12
[15840.675016] [<c01038bd>] ? sysenter_do_call+0x12/0x31
[15840.675016] =======================
[15840.675016] Code: 00 00 75 10 83 3d 88 2f ec c0 02 75 07 89 d0 e8 12 56 05 00 5d c3 55 ba 06 00 00 00 89 e5 53 89 c3 b8 3d eb 7e c0 e8 16 74 00 00 <8b> 03 c1 e8 1e 69 c0 d8 02 00 00 05 b8 69 8e c0 2b 80 c4 02 00
[15840.675016] EIP: [<c0116a4f>] kmap+0x15/0x56 SS:ESP 0068:cab0bc94
[15840.675016] ---[ end trace 4fea40dad6b70e5f ]---
This happens because the return value of read_mapping_page() is passed on
to kmap unchecked. The bug is triggered after the first
read_mapping_page() in hfsplus_block_allocate(). This patch fixes all
three usages in this function but leaves the ones further down in the
file unchanged.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When an hfsplus image gets corrupted, it might happen that the catalog
namelength field ends up holding a bogus value. If we mount such an image,
the memcpy() in hfsplus_cat_build_key_uni() writes more than the 255 that
fit in the name field. Depending on the size of the overwritten data, we
either only get memory corruption or also trigger an oops like this:
[ 221.628020] BUG: unable to handle kernel paging request at c82b0000
[ 221.629066] IP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151
[ 221.629066] *pde = 0ea29163 *pte = 082b0160
[ 221.629066] Oops: 0002 [#1] PREEMPT DEBUG_PAGEALLOC
[ 221.629066] Modules linked in:
[ 221.629066]
[ 221.629066] Pid: 4845, comm: mount Not tainted (2.6.27-rc4-00123-gd3ee1b4-dirty #28)
[ 221.629066] EIP: 0060:[<c022d4b1>] EFLAGS: 00010206 CPU: 0
[ 221.629066] EIP is at hfsplus_find_cat+0x10d/0x151
[ 221.629066] EAX: 00000029 EBX: 00016210 ECX: 000042c2 EDX: 00000002
[ 221.629066] ESI: c82d70ca EDI: c82b0000 EBP: c82d1bcc ESP: c82d199c
[ 221.629066] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[ 221.629066] Process mount (pid: 4845, ti=c82d1000 task=c8224060 task.ti=c82d1000)
[ 221.629066] Stack: c080b3c4 c82aa8f8 c82d19c2 00016210 c080b3be c82d1bd4 c82aa8f0 00000300
[ 221.629066] 01000000 750008b1 74006e00 74006900 65006c00 c82d6400 c013bd35 c8224060
[ 221.629066] 00000036 00000046 c82d19f0 00000082 c8224548 c8224060 00000036 c0d653cc
[ 221.629066] Call Trace:
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c01302d2>] ? __kernel_text_address+0x1b/0x27
[ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6
[ 221.629066] [<c0109e32>] ? save_stack_address+0x0/0x2c
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d
[ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d
[ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
[ 221.629066] [<c013553d>] ? down+0xc/0x2f
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c013da5d>] ? mark_held_locks+0x43/0x5a
[ 221.629066] [<c013dc3a>] ? trace_hardirqs_on+0xb/0xd
[ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
[ 221.629066] [<c06abec8>] ? _spin_unlock_irqrestore+0x42/0x58
[ 221.629066] [<c013555c>] ? down+0x2b/0x2f
[ 221.629066] [<c022aa68>] ? hfsplus_iget+0xa0/0x154
[ 221.629066] [<c022b0b9>] ? hfsplus_fill_super+0x280/0x447
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c041c9e4>] ? string+0x2b/0x74
[ 221.629066] [<c041cd16>] ? vsnprintf+0x2e9/0x512
[ 221.629066] [<c010487a>] ? dump_trace+0xca/0xd6
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c0109eaf>] ? save_stack_trace+0x1c/0x3a
[ 221.629066] [<c013b571>] ? save_trace+0x37/0x8d
[ 221.629066] [<c013b62e>] ? add_lock_to_list+0x67/0x8d
[ 221.629066] [<c013ea1c>] ? validate_chain+0x8a4/0x9f4
[ 221.629066] [<c01354d3>] ? up+0xc/0x2f
[ 221.629066] [<c013f1f6>] ? __lock_acquire+0x68a/0x6e0
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c013bca3>] ? trace_hardirqs_off_caller+0x14/0x9b
[ 221.629066] [<c013bd35>] ? trace_hardirqs_off+0xb/0xd
[ 221.629066] [<c0107aa3>] ? native_sched_clock+0x82/0x96
[ 221.629066] [<c041cfb7>] ? snprintf+0x1b/0x1d
[ 221.629066] [<c01ba466>] ? disk_name+0x25/0x67
[ 221.629066] [<c0183960>] ? get_sb_bdev+0xcd/0x10b
[ 221.629066] [<c016ad92>] ? kstrdup+0x2a/0x4c
[ 221.629066] [<c022a7b3>] ? hfsplus_get_sb+0x13/0x15
[ 221.629066] [<c022ae39>] ? hfsplus_fill_super+0x0/0x447
[ 221.629066] [<c0183583>] ? vfs_kern_mount+0x3b/0x76
[ 221.629066] [<c0183602>] ? do_kern_mount+0x32/0xba
[ 221.629066] [<c01960d4>] ? do_new_mount+0x46/0x74
[ 221.629066] [<c0196277>] ? do_mount+0x175/0x193
[ 221.629066] [<c013dbf4>] ? trace_hardirqs_on_caller+0xf4/0x12f
[ 221.629066] [<c01663b2>] ? __get_free_pages+0x1e/0x24
[ 221.629066] [<c06ac07b>] ? lock_kernel+0x19/0x8c
[ 221.629066] [<c01962e6>] ? sys_mount+0x51/0x9b
[ 221.629066] [<c01962f9>] ? sys_mount+0x64/0x9b
[ 221.629066] [<c01038bd>] ? sysenter_do_call+0x12/0x31
[ 221.629066] =======================
[ 221.629066] Code: 89 c2 c1 e2 08 c1 e8 08 09 c2 8b 85 e8 fd ff ff 66 89 50 06 89 c7 53 83 c7 08 56 57 68 c4 b3 80 c0 e8 8c 5c ef ff 89 d9 c1 e9 02 <f3> a5 89 d9 83 e1 03 74 02 f3 a4 83 c3 06 8b 95 e8 fd ff ff 0f
[ 221.629066] EIP: [<c022d4b1>] hfsplus_find_cat+0x10d/0x151 SS:ESP 0068:c82d199c
[ 221.629066] ---[ end trace e417a1d67f0d0066 ]---
Since hfsplus_cat_build_key_uni() returns void and only has one callsite,
the check is performed at the callsite.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Check whether the file system was to be mounted read only anyway before
warning about changing the mount to read only.
Signed-off-by: Mike Crowe <mac@mcrowe.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Does compile-time byteswapping rather than runtime.
Noticed by sparse:
fs/befs/super.c:29:6: warning: cast to restricted __le32
fs/befs/super.c:29:6: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/super.c:31:11: warning: cast to restricted __be32
fs/befs/super.c:31:11: warning: cast from restricted fs32
fs/befs/linuxvfs.c:811:7: warning: cast to restricted __le32
fs/befs/linuxvfs.c:811:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
fs/befs/linuxvfs.c:812:7: warning: cast to restricted __be32
fs/befs/linuxvfs.c:812:7: warning: cast from restricted fs32
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A very large directory with many read failures (either due to storage
problems, or due to invalid size & blocks from corruption) will generate a
printk storm as the filesystem continues to try to read all the blocks.
This flood of messages can tie up the box until it is complete - which may
be a very long time, especially for very large corrupted values.
This is fixed by only reporting the corruption once each time we try to
read the directory.
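The "report once per directory read" idea can be sketched as follows. This is a user-space illustration under assumptions: `read_dir_sketch`, the `block_ok` array and the message text are hypothetical, and the real fix lives in the filesystem's readdir path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Walk all blocks of a directory; log the first read failure only.
 * Returns the number of messages written, so the caller can see that
 * the log is bounded no matter how many blocks fail.
 */
static size_t read_dir_sketch(const bool *block_ok, size_t nblocks, FILE *log)
{
    bool reported = false;      /* reset on every directory read */
    size_t messages = 0;

    assert(block_ok != NULL && log != NULL);

    for (size_t i = 0; i < nblocks; i++) {
        if (block_ok[i])
            continue;
        if (!reported) {
            fprintf(log, "directory contains a hole at block %zu\n", i);
            reported = true;
            messages++;
        }
        /* later failures are skipped silently */
    }
    return messages;
}
```

Because the flag is local to one call, a new attempt to read the directory reports the problem again, but a single pass never floods the log.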
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We could run into an ENOSPC error on ext2, even when there are free blocks
on the filesystem.
The problem is triggered when the goal block group has 0 free blocks and
the remaining block groups are skipped due to the check of
"free_blocks < windowsz/2". The current code could fall back to
non-reservation allocation to prevent an early ENOSPC after examining all
the block groups with reservation on, but this code was bypassed if the
reservation window was already turned off, which is true in this case.
This patch fixes two issues:
1) We don't need to turn off block reservation if the goal block group has
0 free blocks left; we should continue searching the rest of the block
groups. The intention of the current code is to turn off block reservation
if the goal allocation group has a few free blocks left (not enough to
make the desired reservation window) and to try to allocate in the goal
block group, to get better locality. But if the goal block group has 0
free blocks, it should leave block reservation on and continue searching
the next block groups, rather than turning off block reservation completely.
2) We don't need to check the window size if block reservation is off.
The problem was originally found and fixed in ext4.
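The two decisions can be written as small predicates. This is a sketch under assumptions, not the ext2 allocator: the function names are hypothetical, the thresholds are taken from the wording above, and skipping a group with no free blocks at all is an added assumption.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Issue 1: drop the reservation only when the goal group has some free
 * blocks, but too few for the desired window. With 0 free blocks the
 * reservation stays on and the search moves to the next groups.
 */
static bool keep_reservation(bool have_rsv, unsigned int free_blocks,
                             unsigned int windowsz)
{
    if (!have_rsv)
        return false;
    if (free_blocks > 0 && free_blocks < windowsz)
        return false;
    return true;
}

/*
 * Issue 2: the "free_blocks < windowsz/2" test only applies while a
 * reservation is in use. (Assumption: an empty group is always skipped.)
 */
static bool skip_group(bool have_rsv, unsigned int free_blocks,
                       unsigned int windowsz)
{
    if (free_blocks == 0)
        return true;
    return have_rsv && free_blocks < windowsz / 2;
}
```

With a window of 8 blocks, a group holding 3 free blocks is skipped while reserving but is usable once reservation is off, which is the fallback that avoids the early ENOSPC.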
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add a miscellaneous device to the autofs4 module for routing ioctls. This
provides the ability to obtain an ioctl file handle for an autofs mount
point that is possibly covered by another mount.
The actual problem with autofs is that it can't reconnect to existing
mounts. One immediately thinks that just adding the ability to remount
autofs file systems would solve it, but alas, that can't work. This is
because autofs direct mounts and the implementation of "on demand mount
and expire" of nested mount trees have the file system mounted on top of
the mount trigger dentry.
To resolve this a miscellaneous device node for routing ioctl commands to
these mount points has been implemented in the autofs4 kernel module and a
library added to autofs. This provides the ability to open a file
descriptor for these over mounted autofs mount points.
Please refer to Documentation/filesystems/autofs4-mount-control.txt for a
discussion of the problem, implementation alternatives considered and a
description of the interface.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Track the uid and gid of the last process to request a mount on an
autofs dentry.
[akpm@linux-foundation.org: fix tpyo in comment]
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Usage of the AUTOFS_TYPE_* defines is a little confusing and appears
inconsistent.
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The netlink transport code has not worked for a while and the miscdev
transport is a simpler solution. This patch removes the netlink code and
makes the miscdev transport the only eCryptfs kernel to userspace
transport.
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert ecryptfs to use write_begin/write_end
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The retry block in ecryptfs_readdir() has been in the eCryptfs code base
for a while, apparently for no good reason. This loop could potentially
run without terminating. This patch removes the loop, instead erroring
out if vfs_readdir() on the lower file fails.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Reported-by: Al Viro <viro@ZinIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
binfmt_script and binfmt_misc disallow recursion to avoid stack overflow
using sh_bang and misc_bang. This causes problems in some cases:
$ echo '#!/bin/ls' > /tmp/t0
$ echo '#!/tmp/t0' > /tmp/t1
$ echo '#!/tmp/t1' > /tmp/t2
$ chmod +x /tmp/t*
$ /tmp/t2
zsh: exec format error: /tmp/t2
Similar problem with binfmt_misc.
This patch introduces the field 'recursion_depth' into struct linux_binprm
to track the recursion level in binfmt_misc and binfmt_script. If the
recursion level is more than BINPRM_MAX_RECURSION, it generates -ENOEXEC.
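A depth counter of this kind can be sketched as below. The structure, the limit `SKETCH_MAX_RECURSION` (4 is an illustrative value) and `exec_chain_sketch` are hypothetical; the loop stands in for the interpreter handlers calling each other.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_MAX_RECURSION 4  /* illustrative stand-in for BINPRM_MAX_RECURSION */

struct binprm_sketch {
    unsigned int recursion_depth;
};

/*
 * Model a chain of script_levels nested interpreters ("#!" files that
 * point at other "#!" files). Each level performs the same check a
 * script handler would: refuse when too deep, otherwise go one deeper.
 */
static int exec_chain_sketch(unsigned int script_levels)
{
    struct binprm_sketch bprm = { .recursion_depth = 0 };

    for (unsigned int level = 0; level < script_levels; level++) {
        if (bprm.recursion_depth > SKETCH_MAX_RECURSION)
            return -ENOEXEC;
        bprm.recursion_depth++;
    }
    return 0;   /* reached a real binary */
}
```

Unlike a single boolean flag, the counter lets the three-level /tmp/t2 example through while still bounding runaway chains.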
[akpm@linux-foundation.org: make linux_binprm.recursion_depth a uint]
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This change is Alpha-specific. It adds field 'taso' into struct
linux_binprm to remember if the application is TASO. Previously, field
sh_bang was used for this purpose.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add the missing MODULE_LICENSE("GPL").
Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
struct stat / compat_stat is the same on all architectures, so
cp_compat_stat should be, too.
Turns out it is, except that various architectures differ slightly, and
some use high2lowuid/high2lowgid or direct assignment instead of the
SET_UID/SET_GID that expands to the correct one anyway.
This patch replaces the arch-specific cp_compat_stat implementations with
a common one based on the x86-64 one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
People can use the real name as an index into MAINTAINERS to find the
current email address.
Signed-off-by: Francois Cami <francois.cami@free.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Thomas found that there is an unnecessary (always true) test in
ep_send_events(). The callback never inserts into ->rdllink while the
send loop is performed, and also does the ~EP_PRIVATE_BITS test. Given
we're holding the mutex during this time, the conditions tested inside the
loop are always true. This patch drops the test done inside the
re-insertion loop.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
With MAX_ARG_STRINGS set to 0x7FFFFFFF, and being passed to 'count()' and
compat_count(), it would appear that the current max bounds check of
fs/exec.c:394:
if(++i > max)
return -E2BIG;
would never trigger. Since 'i' is of type int, its value would wrap and the
function would continue looping.
A simple fix seems to be changing ++i to i++ and checking for '>='.
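The difference between the two checks can be sketched in user-space C. `count_sketch` is a hypothetical simplification of the counting loop (no user-space access, no rescheduling); it compares before incrementing, which has the same effect as `i++ >= max` but never increments past INT_MAX.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Count the entries of a NULL-terminated string vector, allowing at
 * most max of them. The old form "++i > max" can never be true when
 * max == INT_MAX, because an int cannot exceed INT_MAX.
 */
static int count_sketch(const char *const *argv, int max)
{
    int i = 0;

    assert(max >= 0);

    if (argv != NULL) {
        for (;;) {
            const char *p = *argv;

            if (p == NULL)
                break;
            argv++;
            if (i >= max)       /* same outcome as "i++ >= max" */
                return -E2BIG;
            i++;
        }
    }
    return i;
}
```

The test is made on the old value of i, so it can still fire when max is the largest representable int.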
Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Ollie Wild" <aaw@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There are off-by-one errors in decompress_exec() when calculating the length of
the optional "original file name" and "comment" fields: the "ret" index is not
incremented when the terminating '\0' character is reached. The buffer
overflow check (after an "extra-field" length is taken into account) is also fixed.
I encountered this off-by-one error when I tried to reuse the
gzip-header-parsing part of the decompress_exec() function. There was an
"original file name" field in the payload (with miscalculated length) and
zlib_inflate() returned Z_DATA_ERROR. But after a fix similar to this
one, all worked fine.
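The off-by-one can be illustrated with a helper that skips one NUL-terminated header field. This is a sketch, not the decompress_exec() code: `skip_zstring` is a hypothetical name, and returning -1 for an unterminated field is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Skip a NUL-terminated field that starts at index ret. The index is
 * advanced past the terminating '\0' as well, so the returned value
 * points at the first byte after the field. Returns -1 if the buffer
 * ends before a terminator is found.
 */
static long skip_zstring(const unsigned char *buf, size_t len, size_t ret)
{
    assert(buf != NULL);

    while (ret < len) {
        if (buf[ret++] == 0)
            return (long)ret;
    }
    return -1;
}
```

If the index stopped on the terminator instead of one past it, every following field, and finally the compressed payload, would be read one byte too early.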
Signed-off-by: Volodymyr G Lukiianyk <volodymyrgl@gmail.com>
Acked-by: Greg Ungerer <gerg@snapgear.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we skip unrecognized options in xfs_fs_remount we should just break
out of the switch and not return because otherwise we may skip clearing
the xfs-internal read-only flag. This will only show up on some
operations like touch because most read-only checks are done by the VFS
which thinks this filesystem is r/w. Eventually we should replace the
XFS read-only flag with a helper that always checks the VFS flag to make
sure they can never get out of sync.
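The control-flow problem can be sketched as follows. Everything here is hypothetical (the option enum, `struct mount_sketch`, `remount_sketch`); it only models why returning from inside the switch skips work that has to happen after option parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum opt_sketch { OPT_BARRIER, OPT_NOBARRIER, OPT_UNKNOWN };

struct mount_sketch {
    bool barrier;
    bool internal_rdonly;   /* filesystem-private copy of the r/o state */
};

static int remount_sketch(struct mount_sketch *mp,
                          const enum opt_sketch *opts, size_t nopts,
                          bool want_rdonly)
{
    assert(mp != NULL);

    for (size_t i = 0; i < nopts; i++) {
        switch (opts[i]) {
        case OPT_BARRIER:
            mp->barrier = true;
            break;
        case OPT_NOBARRIER:
            mp->barrier = false;
            break;
        default:
            /*
             * Skip the unrecognized option. A "return 0" here would
             * leave internal_rdonly stale on a ro -> rw remount.
             */
            break;
        }
    }

    /* Must run on every successful remount. */
    mp->internal_rdonly = want_rdonly;
    return 0;
}
```

A helper that derives the read-only state from a single flag, as the message suggests for the future, would remove the second copy and with it this class of bug.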
Bug reported and fix verified by Marcel Beister on #xfs.
Bug fix verified by updated xfstests/189.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
I merged the latest ocfs2_read_blocks() changes in xattr.c wrong. This makes
Ocfs2 compile again.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (56 commits)
ocfs2: Make cached block reads the common case.
ocfs2: Kill the last naked wait_on_buffer() for cached reads.
ocfs2: Move ocfs2_bread() into dir.c
ocfs2: Simplify ocfs2_read_block()
ocfs2: Require an inode for ocfs2_read_block(s)().
ocfs2: Separate out sync reads from ocfs2_read_blocks()
ocfs2: Refactor xattr list and remove ocfs2_xattr_handler().
ocfs2: Calculate EA hash only by its suffix.
ocfs2: Move trusted and user attribute support into xattr.c
ocfs2: Uninline ocfs2_xattr_name_hash()
ocfs2: Don't check for NULL before brelse()
ocfs2: use smaller counters in ocfs2_remove_xattr_clusters_from_cache
ocfs2: Documentation update for user_xattr / nouser_xattr mount options
ocfs2: make la_debug_mutex static
ocfs2: Remove pointless !!
ocfs2: Add empty bucket support in xattr.
ocfs2/xattr.c: Fix a bug when inserting xattr.
ocfs2: Add xattr mount option in ocfs2_show_options()
ocfs2: Switch over to JBD2.
ocfs2: Add the 'inode64' mount option.
...
|
|
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits)
svcrdma: Fix IRD/ORD polarity
svcrdma: Update svc_rdma_send_error to use DMA LKEY
svcrdma: Modify the RPC reply path to use FRMR when available
svcrdma: Modify the RPC recv path to use FRMR when available
svcrdma: Add support to svc_rdma_send to handle chained WR
svcrdma: Modify post recv path to use local dma key
svcrdma: Add a service to register a Fast Reg MR with the device
svcrdma: Query device for Fast Reg support during connection setup
svcrdma: Add FRMR get/put services
NLM: Remove unused argument from svc_addsock() function
NLM: Remove "proto" argument from lockd_up()
NLM: Always start both UDP and TCP listeners
lockd: Remove unused fields in the nlm_reboot structure
lockd: Add helper to sanity check incoming NOTIFY requests
lockd: change nlmclnt_grant() to take a "struct sockaddr *"
lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
lockd: Support non-AF_INET addresses in nlm_lookup_host()
NLM: Convert nlm_lookup_host() to use a single argument
svcrdma: Add Fast Reg MR Data Types
...
|
|
ocfs2_read_blocks() currently requires the CACHED flag for cached I/O.
However, that's the common case. Let's flip it around and provide an
IGNORE_CACHE flag for the special users. This has the added benefit of
cleaning up the code some (ignore_cache takes on its special meaning
earlier in the loop).
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
ocfs2's cached buffer I/O goes through ocfs2_read_block(s)(). dir.c had
a naked wait_on_buffer() to wait for some readahead, but it should
use ocfs2_read_block() instead.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
dir.c is the only place using ocfs2_bread(), so let's make it static to
that file.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
Only six pass a different flag set. Rather than have every caller care,
let's make ocfs2_read_block() take no flags and always do a cached read.
The remaining six places can call ocfs2_read_blocks() directly.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
Now that synchronous readers are using ocfs2_read_blocks_sync(), all
callers of ocfs2_read_blocks() are passing an inode. Use it
unconditionally. Since it's there, we don't need to pass the
ocfs2_super either.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
The ocfs2_read_blocks() function currently handles sync reads, cached
reads, and sometimes-cached reads. We're going to add some
functionality to it, so first we should simplify it. The uncached,
synchronous reads are much easier to handle as a separate function, so we
introduce ocfs2_read_blocks_sync().
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
According to Christoph Hellwig's advice, we really don't need
a ->list to handle one xattr's list. Just a map from index to
xattr prefix is enough. I also refactored the old list method,
using fs/xfs/linux-2.6/xfs_xattr.c and the xattr list method in
btrfs as references.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
According to Christoph Hellwig's advice, the hash value of an EA
is calculated only from its suffix.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
Per Christoph Hellwig's suggestion - don't split these up. It's not like we
gained much by having the two tiny files around.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
This is too big to be inlined.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
This is pointless as brelse() already does the check.
Signed-off-by: Mark Fasheh
|
|
i and b_len don't really need to be u64's. Xattr extent lengths should be
limited by the VFS, and then the size of our on-disk length field.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
It can also be moved into ocfs2_la_debug_read().
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|
|
ocfs2_stack_supports_plocks() doesn't need this to properly return a zero or
one value.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
|