path: root/fs/f2fs/f2fs.h
Age  Commit message  (Author)
2017-06-24  crypto: Work around deallocated stack frame reference gcc bug on sparc.  (David Miller)
commit d41519a69b35b10af7fda867fb9100df24fdf403 upstream.
On sparc, if we have an alloca() like situation, as is the case with SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack memory. The result can be that the value is clobbered if a trap or interrupt arrives at just the right instruction.
It only occurs if the function ends returning a value from that alloca() area and that value can be placed into the return value register using a single instruction.
For example, in lib/libcrc32c.c:crc32c() we end up with a return sequence like:
        return  %i7+8
         lduw   [%o5+16], %o0   ! MEM[(u32 *)__shash_desc.1_10 + 16B],
%o5 holds the base of the on-stack area allocated for the shash descriptor. But the return released the stack frame and the register window. So if an interrupt arrives between 'return' and 'lduw', then the value read at %o5+16 can be corrupted.
Add a data compiler barrier to work around this problem. This is exactly what the gcc fix will end up doing as well, and it absolutely should not change the code generated for other cpus (unless gcc on them has the same bug :-)
With crucial insight from Eric Sandeen.
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
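The "data compiler barrier" idea can be sketched in user space as follows. This is only an illustration, assuming GCC or Clang extended asm; the names data_barrier, sum_desc and sum_bytes are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro name. An empty asm statement that takes the pointer as
 * an input and clobbers memory, so the compiler has to complete its accesses
 * to the pointed-to object before this point. GCC/Clang extension. */
#define data_barrier(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

/* Hypothetical stand-in for an on-stack hash descriptor. */
struct sum_desc {
    unsigned int pad[4];
    unsigned int state;
};

/* Sum the bytes of a buffer using state kept in an on-stack object. */
unsigned int sum_bytes(const unsigned char *data, size_t len)
{
    struct sum_desc desc = { {0}, 0 };
    unsigned int ret;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++)
        desc.state += data[i];

    ret = desc.state;     /* copy the result out of the stack object */
    data_barrier(&desc);  /* keep that load ahead of the function return */
    return ret;
}
```

On compilers without the bug the barrier should cost nothing beyond pinning the order of the final load.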
2017-05-25  f2fs: check entire encrypted bigname when finding a dentry  (Jaegeuk Kim)
commit 6332cd32c8290a80e929fc044dc5bdba77396e33 upstream.
If a user has no key under an encrypted dir, fscrypt gives digested dentries. Previously, when looking up a dentry, f2fs only checked its hash value against the first 4 bytes of the digested dentry, which didn't fully handle hash collisions. This patch checks the entire dentry bytes, as ext4 does.
Eric reported how to reproduce this issue by:
 # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
 # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
 100000
 # sync
 # echo 3 > /proc/sys/vm/drop_caches
 # keyctl new_session
 # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
 99999
Cc: <stable@vger.kernel.org>
Reported-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
(fixed f2fs_dentry_hash() to work even when the hash is 0)
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
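The difference between a hash-only match and a match on the whole name can be sketched with a toy table. Everything here (toy_dentry, demo_tbl, the two lookup helpers) is hypothetical and only models the collision problem described above, not the f2fs directory format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy directory entry: a hash plus the stored name bytes. */
struct toy_dentry {
    unsigned int hash;
    const unsigned char *name;
    size_t len;
    unsigned long ino;
};

/* Two entries whose hashes collide but whose name bytes differ. */
const struct toy_dentry demo_tbl[2] = {
    { 0x1234, (const unsigned char *)"digest-A", 8, 11 },
    { 0x1234, (const unsigned char *)"digest-B", 8, 12 },
};

/* Hash-only lookup: the first entry with an equal hash wins. */
const struct toy_dentry *find_by_hash(const struct toy_dentry *tbl, size_t n,
                                      unsigned int hash)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tbl[i].hash == hash)
            return &tbl[i];
    return NULL;
}

/* Lookup that also compares every byte of the name. */
const struct toy_dentry *find_by_hash_and_name(const struct toy_dentry *tbl,
                                               size_t n, unsigned int hash,
                                               const unsigned char *name,
                                               size_t len)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tbl[i].hash == hash && tbl[i].len == len &&
            memcmp(tbl[i].name, name, len) == 0)
            return &tbl[i];
    return NULL;
}
```

With the hash-only lookup, both names resolve to inode 11, which mirrors the lost inode in the 100000 versus 99999 count above.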
2017-03-12  f2fs: add ovp valid_blocks check for bg gc victim to fg_gc  (Hou Pengyang)
commit e93b9865251a0503d83fd570e7d5a7c8bc351715 upstream. For foreground gc, the greedy algorithm should be used, which makes this formula work well: (2 * (100 / config.overprovision + 1) + 6). But currently, fg_gc gives priority to the victim segments already selected by bg_gc and collects them first. These victims are selected by the cost-benefit algorithm, so we can't guarantee that such segments have few valid blocks, which may break the f2fs rule and, in the worst case, consume all the free segments. This patch fixes this by adding a filter in check_bg_victims: if a segment's number of valid blocks is over the overprovision ratio, skip that segment. Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
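As a worked instance of the quoted expression, the sketch below evaluates it for a given overprovision percentage and shows the shape of a victim filter. The function names are hypothetical, the percentage is assumed to be a floating-point value, and the threshold is left to the caller because the commit message does not spell out how it is derived.

```c
#include <assert.h>

/* Evaluates (2 * (100 / overprovision + 1) + 6) for an overprovision
 * percentage. Assumption: the percentage is a floating-point value. */
double ovp_formula(double ovp_percent)
{
    assert(ovp_percent > 0.0);
    return 2.0 * (100.0 / ovp_percent + 1.0) + 6.0;
}

/* Hypothetical filter: a background-gc victim is skipped for foreground gc
 * when it holds more valid blocks than the caller-supplied threshold. */
int skip_bg_victim(unsigned int valid_blocks, unsigned int threshold)
{
    return valid_blocks > threshold;
}
```

For an overprovision of 5 percent the expression gives 2 * (20 + 1) + 6 = 48.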
2017-03-12  f2fs: fix multiple f2fs_add_link() calls having same name  (Jaegeuk Kim)
commit 88c5c13a5027b36d914536fdba23f069d7067204 upstream. It turns out a stackable filesystem like sdcardfs in AOSP can trigger multiple vfs_create() calls to the lower filesystem. In that case, f2fs will add multiple dentries having the same name, which breaks filesystem consistency. Until the upper layer is fixed, let's work around it in f2fs, which actually shows not much performance regression. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-12  f2fs: remove percpu_count due to performance regression  (Jaegeuk Kim)
commit 35782b233f37e48ecc469d9c7232f3f6a7fad41a upstream. This patch removes percpu_count usage due to performance regression in iozone. Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-06  f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack  (Jaegeuk Kim)
commit 8508e44ae98622f841f5ef29d0bf3d5db4e0c1cc upstream. We don't guarantee cp_addr is fixed by cp_version. This is to sync with f2fs-tools. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-06  Revert "f2fs: use percpu_counter for # of dirty pages in inode"  (Jaegeuk Kim)
commit 204706c7accfabb67b97eef9f9a28361b6201199 upstream. This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76. The percpu_counter doesn't provide atomicity on a single core and consumes more DRAM. That incurs fs_mark test failure due to ENOMEM. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-01  f2fs: support checkpoint error injection  (Chao Yu)
This patch adds support for checkpoint error injection in f2fs for testing fatal error tolerance. It is useful because f2fs itself can simulate an abnormal power-off, instead of running apps calling the godown ioctl. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-10-01  f2fs: support configuring fault injection per superblock  (Chao Yu)
Previously, we only supported global fault injection configuration, so configuring the type/rate of fault injection through sysfs or a mount option influenced every f2fs partition in use. That makes little sense: it is inconvenient for a developer who wants to test separate partitions with different fault injection rates/types simultaneously, and it is not possible to enable fault injection on one partition while disabling it on another. From now on, we move the global fault injection configuration in the module into the per-superblock structure, hence injection testing can be more flexible. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
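A minimal sketch of per-instance fault injection state, assuming a rate counter and a type bitmask held in each superblock-like object. All names (toy_fault_info, toy_sb, toy_time_to_inject, the TOY_FAULT_* bits) are hypothetical and do not reproduce the f2fs structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fault types, one bit each. */
enum { TOY_FAULT_ALLOC = 1 << 0, TOY_FAULT_IO = 1 << 1 };

/* Hypothetical per-instance injection state. */
struct toy_fault_info {
    unsigned int inject_ops;   /* eligible operations seen since last hit */
    unsigned int inject_rate;  /* inject every Nth operation; 0 disables */
    unsigned int inject_type;  /* bitmask of enabled fault types */
};

/* Hypothetical stand-in for a mounted filesystem instance. */
struct toy_sb {
    struct toy_fault_info fault;
};

/* Returns true when this instance should inject a fault of the given type. */
bool toy_time_to_inject(struct toy_sb *sb, unsigned int type)
{
    struct toy_fault_info *f;

    assert(sb != NULL);
    f = &sb->fault;
    if (f->inject_rate == 0 || !(f->inject_type & type))
        return false;
    if (++f->inject_ops >= f->inject_rate) {
        f->inject_ops = 0;
        return true;
    }
    return false;
}
```

Because the state lives in each toy_sb, one instance can inject IO faults on every second operation while another has injection switched off.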
2016-10-01  f2fs: add customized migrate_page callback  (Weichao Guo)
This patch improves the migration of dirty pages and allows migrating atomic written pages that F2FS uses in Page Cache. Instead of the fallback releasing page path, it provides better performance for memory compaction, CMA and other users of memory page migrating. For dirty pages, there is no need to write back first when migrating. For an atomic written page before committing, we can migrate the page and update the related 'inmem_pages' list at the same time. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix some coding style] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-10-01  f2fs: introduce cp_lock to protect updating of ckpt_flags  (Chao Yu)
This patch introduces a spinlock to protect updates of the ckpt_flags field in struct f2fs_checkpoint, which avoids incorrect updates under race conditions. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: add __is_set_ckpt_flags likewise __set_ckpt_flags] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-30  f2fs: fix to avoid race condition when updating sbi flag  (Chao Yu)
Make updates of the sbi flag atomic by using {test,set,clear}_bit; otherwise, in a concurrent scenario, the flag could be updated incorrectly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
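A user-space analogue of atomic flag bits, using C11 <stdatomic.h> in place of the kernel's {test,set,clear}_bit helpers. The type and function names are hypothetical; the point is only that each update is a single atomic read-modify-write, so concurrent updates of different bits cannot overwrite each other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag numbers for illustration. */
enum toy_sbi_flag { TOY_SBI_IS_DIRTY, TOY_SBI_IS_CLOSE, TOY_SBI_NEED_FSCK };

/* Hypothetical stand-in for the per-superblock info structure. */
struct toy_sbi {
    atomic_ulong s_flag;
};

/* Atomically set one bit. */
void toy_set_flag(struct toy_sbi *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < 8 * sizeof(unsigned long));
    atomic_fetch_or(&sbi->s_flag, 1UL << bit);
}

/* Atomically clear one bit. */
void toy_clear_flag(struct toy_sbi *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < 8 * sizeof(unsigned long));
    atomic_fetch_and(&sbi->s_flag, ~(1UL << bit));
}

/* Atomically read one bit. */
bool toy_test_flag(struct toy_sbi *sbi, unsigned int bit)
{
    assert(sbi != NULL && bit < 8 * sizeof(unsigned long));
    return (atomic_load(&sbi->s_flag) >> bit) & 1UL;
}
```

A plain "flags |= mask" on a shared word is a separate load and store, which is where a concurrent update can be lost.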
2016-09-30  f2fs: use crc and cp version to determine roll-forward recovery  (Jaegeuk Kim)
Previously, we used cp_version only to detect recoverable dnodes. In order to avoid same garbage cp_version, we needed to truncate the next dnode during checkpoint, resulting in additional discard or data write. If we can distinguish this by using crc in addition to cp_version, we can remove this overhead. There is backward compatibility concern where it changes node_footer layout. So, this patch introduces a new checkpoint flag, CP_CRC_RECOVERY_FLAG, to detect new layout. New layout will be activated only when this flag is set. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
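The decision described above can be sketched as a predicate over a toy footer. The struct layout and names are hypothetical and do not reproduce the on-disk node_footer; the sketch only shows how adding a crc comparison rejects stale nodes that happen to carry the same cp_version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical footer: the real node_footer layout is not shown here. */
struct toy_footer {
    uint64_t cp_ver;
    uint32_t crc;
};

/* A node is a roll-forward candidate if its version matches the current
 * checkpoint and, when the crc-recovery flag is set, its crc matches too. */
bool toy_is_recoverable(const struct toy_footer *f, uint64_t cur_cp_ver,
                        uint32_t cur_crc, bool crc_recovery)
{
    assert(f != NULL);
    if (f->cp_ver != cur_cp_ver)
        return false;
    if (!crc_recovery)
        return true;  /* old layout: version check only */
    return f->crc == cur_crc;
}
```

Gating the crc comparison on a flag is what keeps images written with the old layout readable.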
2016-09-22  f2fs: show dirty inode number  (Chao Yu)
This patch enables showing dirty inode number in procfs. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-22  f2fs: support IO error injection  (Chao Yu)
This patch adds support for IO error injection to test the IO error tolerance of f2fs. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-22  f2fs: make f2fs_filetype_table static  (Chao Yu)
There is no more user of f2fs_filetype_table outside of dir.c, make it static. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-13  f2fs: avoid ENOMEM during roll-forward recovery  (Jaegeuk Kim)
This patch gives roll-forward recovery another chance when it hits -ENOMEM. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-08  f2fs: add roll-forward recovery process for encrypted dentry  (Shuoran Liu)
Add roll-forward recovery process for encrypted dentry, so the first fsync issued to an encrypted file does not need writing checkpoint. This improves the performance of the following test at thousands of small files: open -> write -> fsync -> close Signed-off-by: Shuoran Liu <liushuoran@huawei.com> Acked-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: modify kernel message to show encrypted names] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-08  f2fs: fix lost xattrs of directories  (Jaegeuk Kim)
This patch enhances the xattr consistency of dirs against sudden power-cuts. Possible scenario would be:
 1. dir->setxattr used by per-file encryption
 2. file->setxattr goes into inline_xattr
 3. file->fsync
In that case, we should do checkpoint for #1. Otherwise we'd lose dir's key information for the file given #2.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-08  f2fs: support async discard  (Chao Yu)
Like most filesystems, f2fs issues discard commands synchronously, so when a user triggers fstrim through ioctl, multiple discard commands are issued serially in sync mode, which performs poorly. In this patch we add async discard support, so that all discard commands can be issued and then waited on for endio in a batch to improve performance. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-08  f2fs: fix to do security initialization of encrypted inode with original filename  (Chao Yu)
When creating a new inode, security_inode_init_security is called to initialize the security info related to the inode, and the filename is passed to the security module; it helps a security module such as SELinux know which rule or label could be applied to the inode with the specified name. Previously, if the new inode was created as an encrypted one, f2fs passed the encrypted filename to the security module, which may fail the security policy check for the inode. To fix this issue, pass the original unencrypted filename instead. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-30  f2fs: set dirty state for filesystem only when updating meta data  (Chao Yu)
We don't guarantee integrity of user data after checkpoint, since we only guarantee meta data integrity for the consistency of the filesystem. For that reason, we only need to set the fs as dirty when meta data is updated, so that we can skip writing a checkpoint in some cases where only non-meta data is updated. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-30  f2fs: add discard info to sys entry of f2fs status  (Yunlei He)
This patch adds the discard block count to the sys entry of f2fs status. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-30  f2fs: reduce batch size of fstrim  (Jaegeuk Kim)
This is to reduce the batch size of fstrim to avoid long latency. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-24  f2fs: do not use discard_map for hard disks  (Jaegeuk Kim)
We don't need to keep discard_map, if disk does not support discard command. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-19  Revert "f2fs: use percpu_rw_semaphore"  (Jaegeuk Kim)
LKP reported -36.3% regression of fsmark.files_per_sec due to this patch. I've confirmed that fxmark [1] has also slight regression for DWAL. [1] https://github.com/sslab-gatech/fxmark This reverts commit ec795418c41850056feb956534edf059dc1155d4.
2016-08-06  Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  (Linus Torvalds)
Pull qstr constification updates from Al Viro:
"Fairly self-contained bunch - surprising lot of places passes struct qstr * as an argument when const struct qstr * would suffice; it complicates analysis for no good reason. I'd prefer to feed that separately from the assorted fixes (those are in #for-linus and with somewhat trickier topology)"
* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qstr: constify instances in adfs
  qstr: constify instances in lustre
  qstr: constify instances in f2fs
  qstr: constify instances in ext2
  qstr: constify instances in vfat
  qstr: constify instances in procfs
  qstr: constify instances in fuse
  qstr constify instances in fs/dcache.c
  qstr: constify instances in nfs
  qstr: constify instances in ocfs2
  qstr: constify instances in autofs4
  qstr: constify instances in hfs
  qstr: constify instances in hfsplus
  qstr: constify instances in logfs
  qstr: constify dentry_init_security
2016-07-30  qstr: constify instances in f2fs  (Al Viro)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-27  Merge tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs  (Linus Torvalds)
Pull f2fs updates from Jaegeuk Kim:
"The major change in this version is mitigating cpu overheads on write paths by replacing redundant inode page updates with mark_inode_dirty calls. And we tried to reduce lock contentions as well to improve filesystem scalability. Other feature is setting F2FS automatically when detecting host-managed SMR.
Enhancements:
 - ioctl to move a range of data between files
 - inject orphan inode errors
 - avoid flush commands congestion
 - support lazytime
Bug fixes:
 - return proper results for some dentry operations
 - fix deadlock in add_link failure
 - disable extent_cache for fcollapse/finsert"
* tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
  f2fs: clean up coding style and redundancy
  f2fs: get victim segment again after new cp
  f2fs: handle error case with f2fs_bug_on
  f2fs: avoid data race when deciding checkpoin in f2fs_sync_file
  f2fs: support an ioctl to move a range of data blocks
  f2fs: fix to report error number of f2fs_find_entry
  f2fs: avoid memory allocation failure due to a long length
  f2fs: reset default idle interval value
  f2fs: use blk_plug in all the possible paths
  f2fs: fix to avoid data update racing between GC and DIO
  f2fs: add maximum prefree segments
  f2fs: disable extent_cache for fcollapse/finsert inodes
  f2fs: refactor __exchange_data_block for speed up
  f2fs: fix ERR_PTR returned by bio
  f2fs: avoid mark_inode_dirty
  f2fs: move i_size_write in f2fs_write_end
  f2fs: fix to avoid redundant discard during fstrim
  f2fs: avoid mismatching block range for discard
  f2fs: fix incorrect f_bfree calculation in ->statfs
  f2fs: use percpu_rw_semaphore
  ...
2016-07-20  f2fs: avoid data race when deciding checkpoin in f2fs_sync_file  (Jaegeuk Kim)
When fs utilization is almost full, f2fs_sync_file should do checkpoint if there is not enough space for roll-forward later. (i.e. space_for_roll_forward) So, currently we have no lock for sbi->alloc_valid_block_count, resulting in race condition. In rare case, we can get -ENOSPC when doing roll-forward which triggers
	if (is_valid_blkaddr(sbi, dest, META_POR)) {
		if (src == NULL_ADDR) {
			err = reserve_new_block(&dn);
			f2fs_bug_on(sbi, err);
			...
		}
		...
	}
in do_recover_data. So, this patch avoids that situation in advance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20  f2fs: support an ioctl to move a range of data blocks  (Jaegeuk Kim)
This patch implements moving a range of data blocks from source file to destination file. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20  f2fs: fix to report error number of f2fs_find_entry  (Chao Yu)
This patch reports the right error number of f2fs_find_entry to its caller. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: reset default idle interval value  (Chao Yu)
The default idle interval is 2 mins, but most of the time after the screen shuts down there are still operations within that 2 mins interval, and gc's sleep time is about 30 secs to 60 secs, so there is almost no chance for the GC thread to do garbage collection. Change the default idle interval from 2 mins to 5 secs to fix this. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: fix to avoid data update racing between GC and DIO  (Chao Yu)
Data in a file can be operated on by GC and DIO simultaneously, so we will face race cases as below:
For write case:
Thread A                                Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
                                        - f2fs_gc
                                        - do_garbage_collect
                                        - gc_data_segment
                                        - move_data_page
                                        - do_write_data_page
                                          migrate data block to new block address
- dio_bio_submit
  update user data to old block address
For read case:
Thread A                                Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
                                        - f2fs_balance_fs
                                        - f2fs_gc
                                        - do_garbage_collect
                                        - gc_data_segment
                                        - move_data_page
                                        - do_write_data_page
                                          migrate data block to new block address
                                        - write_checkpoint
                                        - do_checkpoint
                                        - clear_prefree_segments
                                        - f2fs_issue_discard
                                          discard old block address
- dio_bio_submit
  update user buffer from obsolete block address
In order to fix this, for one file, we should make DIO and GC mutually exclusive.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: disable extent_cache for fcollapse/finsert inodes  (Jaegeuk Kim)
This reduces the elapsed time to do xfstests/generic/017. Before: 458 s After: 390 s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: avoid mark_inode_dirty  (Jaegeuk Kim)
Let's check inode's dirtiness before calling mark_inode_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: use percpu_rw_semaphore  (Jaegeuk Kim)
This patch replaces rw_semaphore with percpu_rw_semaphore for: sbi->cp_rwsem nm_i->nat_tree_lock Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: shrink critical region in spin_lock  (Jaegeuk Kim)
This patch shrinks the critical region in spin_lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: introduce f2fs_set_page_dirty_nobuffer  (Jaegeuk Kim)
This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer. When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the overall performance. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: produce more nids and reduce readahead nats  (Jaegeuk Kim)
The readahead nat pages are more likely to be reclaimed quickly, so it'd be better to gather more free nids in advance. And, let's keep as many free nids as possible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: detect host-managed SMR by feature flag  (Jaegeuk Kim)
If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs by default. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: call update_inode_page for orphan inodes  (Jaegeuk Kim)
Let's store orphan inode pages right away. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-15  f2fs: find parent dentry correctly  (Sheng Yong)
If the dotdot directory is corrupted, its slot may be occupied by another file. In this case, dentry[1] is not the parent directory. Rename and cross-rename will update the inode in dentry[1] incorrectly. This patch finds the dotdot dentry by name. Signed-off-by: Sheng Yong <shengyong1@huawei.com> [Jaegeuk Kim: remove wrong bug_on] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13  f2fs: introduce mode=lfs mount option  (Jaegeuk Kim)
This mount option forcibly enables the original log-structured filesystem behavior, so there should be no random writes to the main area. In particular, this supports host-managed SMR devices. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08  f2fs: avoid reverse IO order for NODE and DATA  (Jaegeuk Kim)
There is a data race between allocate_data_block() and f2fs_submit_page_mbio(), which incurs unnecessary reversed bio submission. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: use bio op accessors  (Mike Christie)
Separate the op from the rq_flag_bits and have f2fs set/get the bio using bio_set_op_attrs/bio_op. Signed-off-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-07  f2fs: remove obsolete parameter in f2fs_truncate  (Jaegeuk Kim)
We don't need lock parameter, which is always true. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: remove deprecated parameter  (Jaegeuk Kim)
Remove deprecated parameter. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-03  f2fs: inject to produce some orphan inodes  (Jaegeuk Kim)
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-03  f2fs: remove writepages lock  (Jaegeuk Kim)
This patch removes writepages lock. We can improve multi-threading performance. tiobench, 32 threads, 4KB write per fsync on SSD Before: 25.88 MB/s After: 28.03 MB/s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>