summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2014-10-10ocfs2: don't fire quorum before connection establishedJunxiao Bi
Firing quorum before connection established can cause unexpected node to reboot. Assume there are 3 nodes in the cluster, Node 1, 2, 3. Node 2 and 3 have wrong ip address of Node 1 in cluster.conf and global heartbeat is enabled in the cluster. After the heatbeats are started on these three nodes, Node 1 will reboot due to quorum fencing. It is similar case if Node 1's networking is not ready when starting the global heartbeat. The reboot is not friendly as customer is not fully ready for ocfs2 to work. Fix it by not allowing firing quorum before the connection is established. In this case, ocfs2 will wait until the wrong configuration is fixed or networking is up to continue. Also update the log to guide the user where to check when connection is not built for a long time. Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fs/ocfs2/dlmglue.c: use __seq_open_private() not seq_open()Rob Jones
Reduce boilerplate code by using seq_open_private() instead of seq_open() Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fs/ocfs2/cluster/netdebug.c: use seq_open_private() not seq_open()Rob Jones
Reduce boilerplate code by using seq_open_private() instead of seq_open() Note that the code in and using sc_common_open() has been quite extensively changed. Not least because there was a latent memory leak in the code as was: if sc_common_open() failed, the previously allocated buffer was not freed. Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fs/ocfs2/dlm/dlmdebug.c: use seq_open_private() not seq_open()Rob Jones
Reduce boilerplate code by using seq_open_private() instead of seq_open() Signed-off-by: Rob Jones <rob.jones@codethink.co.uk> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ocfs2: remove unused code in dlm_new_lockres()Xue jiufei
Remove the branch that free res->lockname.name because the condition is never satisfied when jump to label error. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ocfs2/dlm: call dlm_lockres_put without resource spinlockalex chen
dlm_lockres_put() should be called without &res->spinlock, otherwise a deadlock case may happen. spin_lock(&res->spinlock) ... dlm_lockres_put ->dlm_lockres_release ->dlm_print_one_lock_resource ->spin_lock(&res->spinlock) Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ocfs2: call o2quo_exit() if malloc failed in o2net_init()Joseph Qi
In o2net_init, if malloc failed, it directly returns -ENOMEM. Then o2quo_exit won't be called in init_o2nm. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ocfs2: fix shift left operations overflowJoseph Qi
ocfs2_inode_info->ip_clusters and ocfs2_dinode->id1.bitmap1.i_total are defined as type u32, so the shift left operations may overflow if volume size is large, for example, 2TB and cluster size is 1MB. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Alex Chen <alex.chen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ocfs2/dlm: refactor error handling in dlm_alloc_ctxtJoseph Qi
Refactoring error handling in dlm_alloc_ctxt to simplify code. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Alex Chen <alex.chen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fs/ocfs2/stack_user.c: fix typo in ocfs2_control_release()Andrew Morton
It is supposed to zero pv_minor. Reported-by: Himangi Saraogi <himangi774@gmail.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ntfs: remove bogus spaceAndrea Gelmini
fs/ntfs/debug.c:124: WARNING: space prohibited between function name and open parenthesis '(' Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10ntfs: use find_get_page_flags() to mark page accessed as it is no longer ↵Anton Altaparmakov
marked later on Mel Gorman's commit 2457aec63745 ("mm: non-atomically mark page accessed during page cache allocation where possible") removed mark_page_accessed() calls from NTFS without updating the matching find_lock_page() to find_get_page_flags(GFP_LOCK | FGP_ACCESSED) thus causing the page to never be marked accessed. This patch fixes that. Signed-off-by: Anton Altaparmakov <anton@tuxera.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fanotify: enable close-on-exec on events' fd when requested in fanotify_init()Yann Droneaud
According to commit 80af258867648 ("fanotify: groups can specify their f_flags for new fd"), file descriptors created as part of file access notification events inherit flags from the event_f_flags argument passed to syscall fanotify_init(2)[1]. Unfortunately O_CLOEXEC is currently silently ignored. Indeed, event_f_flags are only given to dentry_open(), which only seems to care about O_ACCMODE and O_PATH in do_dentry_open(), O_DIRECT in open_check_o_direct() and O_LARGEFILE in generic_file_open(). It's a pity, since, according to some lookup on various search engines and http://codesearch.debian.net/, there's already some userspace code which use O_CLOEXEC: - in systemd's readahead[2]: fanotify_fd = fanotify_init(FAN_CLOEXEC|FAN_NONBLOCK, O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_NOATIME); - in clsync[3]: #define FANOTIFY_EVFLAGS (O_LARGEFILE|O_RDONLY|O_CLOEXEC) int fanotify_d = fanotify_init(FANOTIFY_FLAGS, FANOTIFY_EVFLAGS); - in examples [4] from "Filesystem monitoring in the Linux kernel" article[5] by Aleksander Morgado: if ((fanotify_fd = fanotify_init (FAN_CLOEXEC, O_RDONLY | O_CLOEXEC | O_LARGEFILE)) < 0) Additionally, since commit 48149e9d3a7e ("fanotify: check file flags passed in fanotify_init"). having O_CLOEXEC as part of fanotify_init() second argument is expressly allowed. So it seems expected to set close-on-exec flag on the file descriptors if userspace is allowed to request it with O_CLOEXEC. But Andrew Morton raised[6] the concern that enabling now close-on-exec might break existing applications which ask for O_CLOEXEC but expect the file descriptor to be inherited across exec(). In the other hand, as reported by Mihai Dontu[7] close-on-exec on the file descriptor returned as part of file access notify can break applications due to deadlock. So close-on-exec is needed for most applications. More, applications asking for close-on-exec are likely expecting it to be enabled, relying on O_CLOEXEC being effective. If not, it might weaken their security, as noted by Jan Kara[8]. So this patch replaces call to macro get_unused_fd() by a call to function get_unused_fd_flags() with event_f_flags value as argument. This way O_CLOEXEC flag in the second argument of fanotify_init(2) syscall is interpreted and close-on-exec get enabled when requested. [1] http://man7.org/linux/man-pages/man2/fanotify_init.2.html [2] http://cgit.freedesktop.org/systemd/systemd/tree/src/readahead/readahead-collect.c?id=v208#n294 [3] https://github.com/xaionaro/clsync/blob/v0.2.1/sync.c#L1631 https://github.com/xaionaro/clsync/blob/v0.2.1/configuration.h#L38 [4] http://www.lanedo.com/~aleksander/fanotify/fanotify-example.c [5] http://www.lanedo.com/2013/filesystem-monitoring-linux-kernel/ [6] http://lkml.kernel.org/r/20141001153621.65e9258e65a6167bf2e4cb50@linux-foundation.org [7] http://lkml.kernel.org/r/20141002095046.3715eb69@mdontu-l [8] http://lkml.kernel.org/r/20141002104410.GB19748@quack.suse.cz Link: http://lkml.kernel.org/r/cover.1411562410.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed by: Heinrich Schuchardt <xypron.glpk@gmx.de> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Mihai Don\u021bu <mihai.dontu@gmail.com> Cc: Pádraig Brady <P@draigBrady.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jan Kara <jack@suse.cz> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Michael Kerrisk-manpages <mtk.manpages@gmail.com> Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Richard Guy Briggs <rgb@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fsnotify: don't put user context if it was never assignedSasha Levin
On some failure paths we may attempt to free user context even if it wasn't assigned yet. This will cause a NULL ptr deref and a kernel BUG. The path I was looking at is in inotify_new_group(): oevent = kmalloc(sizeof(struct inotify_event_info), GFP_KERNEL); if (unlikely(!oevent)) { fsnotify_destroy_group(group); return ERR_PTR(-ENOMEM); } fsnotify_destroy_group() would get called here, but group->inotify_data.user is only getting assigned later: group->inotify_data.user = get_current_user(); Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: John McCutchan <john@johnmccutchan.com> Cc: Robert Love <rlove@rlove.org> Cc: Eric Paris <eparis@parisplace.org> Reviewed-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-10fs/notify/group.c: make fsnotify_final_destroy_group() staticAndrew Morton
No callers outside this file. Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Nothing really exciting this time: - a few fixlets in the NOHZ code - a new ARM SoC timer abomination. One should expect that we have enough of them already, but they insist on inventing new ones. - the usual bunch of ARM SoC timer updates. That feels like herding cats" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: arm_arch_timer: Consolidate arch_timer_evtstrm_enable clocksource: arm_arch_timer: Enable counter access for 32-bit ARM clocksource: arm_arch_timer: Change clocksource name if CP15 unavailable clocksource: sirf: Disable counter before re-setting it clocksource: cadence_ttc: Add support for 32bit mode clocksource: tcb_clksrc: Sanitize IRQ request clocksource: arm_arch_timer: Discard unavailable timers correctly clocksource: vf_pit_timer: Support shutdown mode ARM: meson6: clocksource: Add Meson6 timer support ARM: meson: documentation: Add timer documentation clocksource: sh_tmu: Document r8a7779 binding clocksource: sh_mtu2: Document r7s72100 binding clocksource: sh_cmt: Document SoC specific bindings timerfd: Remove an always true check nohz: Avoid tick's double reprogramming in highres mode nohz: Fix spurious periodic tick behaviour in low-res dynticks mode
2014-10-08Merge tag 'f2fs-for-3.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set introduces a couple of new features such as large sector size, FITRIM, and atomic/volatile writes. Several patches enhance power-off recovery and checkpoint routines. The fsck.f2fs starts to support fixing corrupted partitions with recovery hints provided by this patch-set. Summary: - retain some recovery information for fsck.f2fs - enhance checkpoint speed - enhance flush command management - bug fix for lseek - tune in-place-update policies - enhance roll-forward speed - revisit all the roll-forward and fsync rules - support larget sector size - support FITRIM - support atomic and volatile writes And several clean-ups and bug fixes are included" * tag 'f2fs-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits) f2fs: support volatile operations for transient data f2fs: support atomic writes f2fs: remove unused return value f2fs: clean up f2fs_ioctl functions f2fs: potential shift wrapping buf in f2fs_trim_fs() f2fs: call f2fs_unlock_op after error was handled f2fs: check the use of macros on block counts and addresses f2fs: refactor flush_nat_entries to remove costly reorganizing ops f2fs: introduce FITRIM in f2fs_ioctl f2fs: introduce cp_control structure f2fs: use more free segments until SSR is activated f2fs: change the ipu_policy option to enable combinations f2fs: fix to search whole dirty segmap when get_victim f2fs: fix to clean previous mount option when remount_fs f2fs: skip punching hole in special condition f2fs: support large sector size f2fs: fix to truncate blocks past EOF in ->setattr f2fs: update i_size when __allocate_data_block f2fs: use MAX_BIO_BLOCKS(sbi) f2fs: remove redundant operation during roll-forward recovery ...
2014-10-08Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Highlights: - support the NFSv4.2 SEEK operation (allowing clients to support SEEK_HOLE/SEEK_DATA), thanks to Anna. - end the grace period early in a number of cases, mitigating a long-standing annoyance, thanks to Jeff - improve SMP scalability, thanks to Trond" * 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits) nfsd: eliminate "to_delegation" define NFSD: Implement SEEK NFSD: Add generic v4.2 infrastructure svcrdma: advertise the correct max payload nfsd: introduce nfsd4_callback_ops nfsd: split nfsd4_callback initialization and use nfsd: introduce a generic nfsd4_cb nfsd: remove nfsd4_callback.cb_op nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence nfsd: fix nfsd4_cb_recall_done error handling nfsd4: clarify how grace period ends nfsd4: stop grace_time update at end of grace period nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls nfsd: serialize nfsdcltrack upcalls for a particular client nfsd: pass extra info in env vars to upcalls to allow for early grace period end nfsd: add a v4_end_grace file to /proc/fs/nfsd lockd: add a /proc/fs/lockd/nlm_end_grace file nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE nfsd: remove redundant boot_time parm from grace_done client tracking op ...
2014-10-08Merge tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - fix an NFSv4.1 state renewal regression - fix open/lock state recovery error handling - fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails - fix statd when reconnection fails - don't wake tasks during connection abort - don't start reboot recovery if lease check fails - fix duplicate proc entries Features: - pNFS block driver fixes and clean ups from Christoph - More code cleanups from Anna - Improve mmap() writeback performance - Replace use of PF_TRANS with a more generic mechanism for avoiding deadlocks in nfs_release_page" * tag 'nfs-for-3.18-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (66 commits) NFSv4.1: Fix an NFSv4.1 state renewal regression NFSv4: fix open/lock state recovery error handling NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails NFS: Fabricate fscache server index key correctly SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT NFSv3: Fix missing includes of nfs3_fs.h NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page() NFS: avoid waiting at all in nfs_release_page when congested. NFS: avoid deadlocks with loop-back mounted NFS filesystems. MM: export page_wakeup functions SCHED: add some "wait..on_bit...timeout()" interfaces. NFS: don't use STABLE writes during writeback. NFSv4: use exponential retry on NFS4ERR_DELAY for async requests. rpc: Add -EPERM processing for xs_udp_send_request() rpc: return sent and err from xs_sendpages() lockd: Try to reconnect if statd has moved SUNRPC: Don't wake tasks during connection abort Fixing lease renewal nfs: fix duplicate proc entries pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe ...
2014-10-08Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull "trivial tree" updates from Jiri Kosina: "Usual pile from trivial tree everyone is so eagerly waiting for" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Remove MN10300_PROC_MN2WS0038 mei: fix comments treewide: Fix typos in Kconfig kprobes: update jprobe_example.c for do_fork() change Documentation: change "&" to "and" in Documentation/applying-patches.txt Documentation: remove obsolete pcmcia-cs from Changes Documentation: update links in Changes Documentation: Docbook: Fix generated DocBook/kernel-api.xml score: Remove GENERIC_HAS_IOMAP gpio: fix 'CONFIG_GPIO_IRQCHIP' comments tty: doc: Fix grammar in serial/tty dma-debug: modify check_for_stack output treewide: fix errors in printk genirq: fix reference in devm_request_threaded_irq comment treewide: fix synchronize_rcu() in comments checkstack.pl: port to AArch64 doc: queue-sysfs: minor fixes init/do_mounts: better syntax description MIPS: fix comment spelling powerpc/simpleboot: fix comment ...
2014-10-07f2fs: support volatile operations for transient dataJaegeuk Kim
This patch adds support for volatile writes which keep data pages in memory until f2fs_evict_inode is called by iput. For instance, we can use this feature for the sqlite database as follows. While supporting atomic writes for main database file, we can keep its journal data temporarily in the page cache by the following sequence. 1. open -> ioctl(F2FS_IOC_START_VOLATILE_WRITE); 2. writes : keep all the data in the page cache. 3. flush to the database file with atomic writes a. ioctl(F2FS_IOC_START_ATOMIC_WRITE); b. writes c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); 4. close -> drop the cached data Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-07f2fs: support atomic writesJaegeuk Kim
This patch introduces a very limited functionality for atomic write support. In order to support atomic write, this patch adds two ioctls: o F2FS_IOC_START_ATOMIC_WRITE o F2FS_IOC_COMMIT_ATOMIC_WRITE The database engine should be aware of the following sequence. 1. open -> ioctl(F2FS_IOC_START_ATOMIC_WRITE); 2. writes : all the written data will be treated as atomic pages. 3. commit -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); : this flushes all the data blocks to the disk, which will be shown all or nothing by f2fs recovery procedure. 4. repeat to #2. The IO pattens should be: ,- START_ATOMIC_WRITE ,- COMMIT_ATOMIC_WRITE CP | D D D D D D | FSYNC | D D D D | FSYNC ... `- COMMIT_ATOMIC_WRITE Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-06f2fs: remove unused return valueJaegeuk Kim
Don't return any value without any usage. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-03Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs/smb3 fixes from Steve French: "Fix for CIFS/SMB3 oops on reconnect during readpages (3.17 regression) and for incorrectly closing file handle in symlink error cases" * 'for-linus' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Fix readpages retrying on reconnects Fix problem recognizing symlinks
2014-10-02ocfs2/dlm: should put mle when goto kill in dlm_assert_master_handleralex chen
In dlm_assert_master_handler, the mle is get in dlm_find_mle, should be put when goto kill, otherwise, this mle will never be released. Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02CIFS: Fix readpages retrying on reconnectsPavel Shilovsky
If we got a reconnect error from async readv we re-add pages back to page_list and continue loop. That is wrong because these pages have been already added to the pagecache but page_list has pages that have not been added to the pagecache yet. This ends up with a general protection fault in put_pages after readpages. Fix it by not retrying the read of these pages and falling back to readpage instead. Fixes debian bug 762306 Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Tested-by: Arthur Marsh <arthur.marsh@internode.on.net>
2014-10-02Fix problem recognizing symlinksSteve French
Changeset eb85d94bd introduced a problem where if a cifs open fails during query info of a file we will still try to close the file (happens with certain types of reparse points) even though the file handle is not valid. In addition for SMB2/SMB3 we were not mapping the return code returned by Windows when trying to open a file (like a Windows NFS symlink) which is a reparse point. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org> CC: stable <stable@vger.kernel.org> #v3.13+
2014-10-01nfsd: eliminate "to_delegation" defineJeff Layton
We now have cb_to_delegation and to_delegation, which do the same thing and are defined separately in different .c files. Move the cb_to_delegation definition into a header file and eliminate the redundant to_delegation definition. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2014-09-30f2fs: clean up f2fs_ioctl functionsJaegeuk Kim
This patch cleans up f2fs_ioctl functions for better readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: potential shift wrapping buf in f2fs_trim_fs()Dan Carpenter
My static checker complains that segment is a u64 but only the lower 31 bits can be used before we hit a shift wrapping bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: call f2fs_unlock_op after error was handledJaegeuk Kim
This patch relocates f2fs_unlock_op in every directory operations to be called after any error was processed. Otherwise, the checkpoint can be entered with valid node ids without its dentry when -ENOSPC is occurred. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: check the use of macros on block counts and addressesJaegeuk Kim
This patch cleans up the existing and new macros for readability. Rule is like this. ,-----------------------------------------> MAX_BLKADDR -, | ,------------- TOTAL_BLKS ----------------------------, | | | | ,- seg0_blkaddr ,----- sit/nat/ssa/main blkaddress | block | | (SEG0_BLKADDR) | | | | (e.g., MAIN_BLKADDR) | address 0..x................ a b c d ............................. | | global seg# 0...................... m ............................. | | | | `------- MAIN_SEGS -----------' `-------------- TOTAL_SEGS ---------------------------' | | seg# 0..........xx.................. = Note = o GET_SEGNO_FROM_SEG0 : blk address -> global segno o GET_SEGNO : blk address -> segno o START_BLOCK : segno -> starting block address Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: refactor flush_nat_entries to remove costly reorganizing opsJaegeuk Kim
Previously, f2fs tries to reorganize the dirty nat entries into multiple sets according to its nid ranges. This can improve the flushing nat pages, however, if there are a lot of cached nat entries, it becomes a bottleneck. This patch introduces a new set management flow by removing dirty nat list and adding a series of set operations when the nat entry becomes dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: introduce FITRIM in f2fs_ioctlJaegeuk Kim
This patch introduces FITRIM in f2fs_ioctl. In this case, f2fs will issue small discards and prefree discards as many as possible for the given area. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: introduce cp_control structureJaegeuk Kim
This patch add a new data structure to control checkpoint parameters. Currently, it presents the reason of checkpoint such as is_umount and normal sync. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30Merge branch 'bugfixes' into linux-nextTrond Myklebust
* bugfixes: NFSv4.1: Fix an NFSv4.1 state renewal regression NFSv4: fix open/lock state recovery error handling NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails NFS: Fabricate fscache server index key correctly SUNRPC: Add missing support for RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT nfs: fix duplicate proc entries
2014-09-30NFSv4.1: Fix an NFSv4.1 state renewal regressionAndy Adamson
Commit 2f60ea6b8ced ("NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation") set the NFS4_RENEW_TIMEOUT flag in nfs4_renew_state, and does not put an nfs41_proc_async_sequence call, the NFSv4.1 lease renewal heartbeat call, on the wire to renew the NFSv4.1 state if the flag was not set. The NFS4_RENEW_TIMEOUT flag is set when "now" is after the last renewal (cl_last_renewal) plus the lease time divided by 3. This is arbitrary and sometimes does the following: In normal operation, the only way a future state renewal call is put on the wire is via a call to nfs4_schedule_state_renewal, which schedules a nfs4_renew_state workqueue task. nfs4_renew_state determines if the NFS4_RENEW_TIMEOUT should be set, and the calls nfs41_proc_async_sequence, which only gets sent if the NFS4_RENEW_TIMEOUT flag is set. Then the nfs41_proc_async_sequence rpc_release function schedules another state remewal via nfs4_schedule_state_renewal. Without this change we can get into a state where an application stops accessing the NFSv4.1 share, state renewal calls stop due to the NFS4_RENEW_TIMEOUT flag _not_ being set. The only way to recover from this situation is with a clientid re-establishment, once the application resumes and the server has timed out the lease and so returns NFS4ERR_BAD_SESSION on the subsequent SEQUENCE operation. An example application: open, lock, write a file. sleep for 6 * lease (could be less) ulock, close. In the above example with NFSv4.1 delegations enabled, without this change, there are no OP_SEQUENCE state renewal calls during the sleep, and the clientid is recovered due to lease expiration on the close. This issue does not occur with NFSv4.1 delegations disabled, nor with NFSv4.0, with or without delegations enabled. Signed-off-by: Andy Adamson <andros@netapp.com> Link: http://lkml.kernel.org/r/1411486536-23401-1-git-send-email-andros@netapp.com Fixes: 2f60ea6b8ced (NFSv4: The NFSv4.0 client must send RENEW calls...) Cc: stable@vger.kernel.org # 3.2.x Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-30nfsd4: fix corruption of NFSv4 read dataJ. Bruce Fields
The calculation of page_ptr here is wrong in the case the read doesn't start at an offset that is a multiple of a page. The result is that nfs4svc_encode_compoundres sets rq_next_page to a value one too small, and then the loop in svc_free_res_pages may incorrectly fail to clear a page pointer in rq_respages[]. Pages left in rq_respages[] are available for the next rpc request to use, so xdr data may be written to that page, which may hold data still waiting to be transmitted to the client or data in the page cache. The observed result was silent data corruption seen on an NFSv4 client. We tag this as "fixing" 05638dc73af2 because that commit exposed this bug, though the incorrect calculation predates it. Particular thanks to Andrea Arcangeli and David Gilbert for analysis and testing. Fixes: 05638dc73af2 "nfsd4: simplify server xdr->next_page use" Cc: stable@vger.kernel.org Reported-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-29NFSD: Implement SEEKAnna Schumaker
This patch adds server support for the NFS v4.2 operation SEEK, which returns the position of the next hole or data segment in a file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-29NFSD: Add generic v4.2 infrastructureAnna Schumaker
It's cleaner to introduce everything at once and have the server reply with "not supported" than it would be to introduce extra operations when implementing a specific one in the middle of the list. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-28NFSv4: fix open/lock state recovery error handlingTrond Myklebust
The current open/lock state recovery unfortunately does not handle errors such as NFS4ERR_CONN_NOT_BOUND_TO_SESSION correctly. Instead of looping, just proceeds as if the state manager is finished recovering. This patch ensures that we loop back, handle higher priority errors and complete the open/lock state recovery. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-28NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM failsTrond Myklebust
If a NFSv4.x server returns NFS4ERR_STALE_CLIENTID in response to a CREATE_SESSION or SETCLIENTID_CONFIRM in order to tell us that it rebooted a second time, then the client will currently take this to mean that it must declare all locks to be stale, and hence ineligible for reboot recovery. RFC3530 and RFC5661 both suggest that the client should instead rely on the server to respond to inelegible open share, lock and delegation reclaim requests with NFS4ERR_NO_GRACE in this situation. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2014-09-28Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Assorted fixes + unifying __d_move() and __d_materialise_dentry() + minimal regression fix for d_path() of victims of overwriting rename() ported on top of that" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Don't exchange "short" filenames unconditionally. fold swapping ->d_name.hash into switch_names() fold unlocking the children into dentry_unlock_parents_for_move() kill __d_materialise_dentry() __d_materialise_dentry(): flip the order of arguments __d_move(): fold manipulations with ->d_child/->d_subdirs don't open-code d_rehash() in d_materialise_unique() pull rehashing and unlocking the target dentry into __d_materialise_dentry() ufs: deal with nfsd/iget races fuse: honour max_read and max_write in direct_io mode shmem: fix nlink for rename overwrite directory
2014-09-27vfs: Don't exchange "short" filenames unconditionally.Mikhail Efremov
Only exchange source and destination filenames if flags contain RENAME_EXCHANGE. In case if executable file was running and replaced by other file /proc/PID/exe should still show correct file name, not the old name of the file by which it was replaced. The scenario when this bug manifests itself was like this: * ALT Linux uses rpm and start-stop-daemon; * during a package upgrade rpm creates a temporary file for an executable to rename it upon successful unpacking; * start-stop-daemon is run subsequently and it obtains the (nonexistant) temporary filename via /proc/PID/exe thus failing to identify the running process. Note that "long" filenames (> DNAiME_INLINE_LEN) are still exchanged without RENAME_EXCHANGE and this behaviour exists long enough (should be fixed too apparently). So this patch is just an interim workaround that restores behavior for "short" names as it was before changes introduced by commit da1ce0670c14 ("vfs: add cross-rename"). See https://lkml.org/lkml/2014/9/7/6 for details. AV: the comments about being more careful with ->d_name.hash than with ->d_name.name are from back in 2.3.40s; they became obsolete by 2.3.60s, when we started to unhash the target instead of swapping hash chain positions followed by d_delete() as we used to do when dcache was first introduced. Acked-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: da1ce0670c14 "vfs: add cross-rename" Signed-off-by: Mikhail Efremov <sem@altlinux.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27fold swapping ->d_name.hash into switch_names()Linus Torvalds
and do it along with ->d_name.len there Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27fold unlocking the children into dentry_unlock_parents_for_move()Al Viro
... renaming it into dentry_unlock_for_move() and making it more symmetric with dentry_lock_for_move(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27kill __d_materialise_dentry()Al Viro
it folds into __d_move() now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27__d_materialise_dentry(): flip the order of argumentsAl Viro
... thus making it much closer to (now unreachable, BTW) IS_ROOT(dentry) case in __d_move(). A bit more and it'll fold in. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27__d_move(): fold manipulations with ->d_child/->d_subdirsAl Viro
list_del() + list_add() is a slightly pessimised list_move() list_del() + INIT_LIST_HEAD() is a slightly pessimised list_del_init() Interleaving those makes the resulting code even worse. And harder to follow... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-27don't open-code d_rehash() in d_materialise_unique()Al Viro
... and get rid of duplicate BUG_ON() there Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>