Age  Commit message  Author
2016-07-25  f2fs: clean up coding style and redundancy  Jaegeuk Kim
This patch includes minor clean-ups. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22  f2fs: get victim segment again after new cp  Yunlei He
A previously selected segment may become free after write_checkpoint. If we garbage-collect this segment and new_curseg then happens to reuse it, f2fs_bug_on may trigger as below.
panic+0x154/0x29c
do_garbage_collect+0x15c/0xaf4
f2fs_gc+0x2dc/0x444
f2fs_balance_fs.part.22+0xcc/0x14c
f2fs_balance_fs+0x28/0x34
f2fs_map_blocks+0x5ec/0x790
f2fs_preallocate_blocks+0xe0/0x100
f2fs_file_write_iter+0x64/0x11c
new_sync_write+0xac/0x11c
vfs_write+0x144/0x1e4
SyS_write+0x60/0xc0
Here, we could perhaps check the sit and ssa type during reset_curseg. So, we check whether the segment is stale and select a new victim to avoid this.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
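The re-selection described in this commit can be modeled roughly as follows. This is a Python sketch, not kernel code: `Segment`, `pick_victim`, and `gc_once` are hypothetical names, the greedy victim policy is an assumption, and the checkpoint is reduced to a callback that may free segments.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    segno: int
    valid_blocks: int  # 0 models a segment that has become free


def pick_victim(segments):
    """Pick the non-free segment with the fewest valid blocks (assumed greedy policy)."""
    candidates = [s for s in segments if s.valid_blocks > 0]
    return min(candidates, key=lambda s: s.valid_blocks) if candidates else None


def gc_once(segments, checkpoint):
    """Select a victim, run a checkpoint, and re-select if the victim went stale."""
    victim = pick_victim(segments)
    if victim is None:
        return None
    checkpoint(segments)          # may free the segment chosen above
    if victim.valid_blocks == 0:  # stale victim: choose again instead of collecting it
        victim = pick_victim(segments)
    return victim
```

If the checkpoint callback frees the first choice, `gc_once` returns a different segment rather than collecting one that the allocator might reuse.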
2016-07-20  f2fs: handle error case with f2fs_bug_on  Jaegeuk Kim
It's enough to show BUG or WARN via f2fs_bug_on for the error case. Then we don't need to leave the filesystem corrupted. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20  f2fs: avoid data race when deciding checkpoint in f2fs_sync_file  Jaegeuk Kim
When fs utilization is almost full, f2fs_sync_file should do a checkpoint if there is not enough space for roll-forward later (i.e. space_for_roll_forward). However, we currently have no lock for sbi->alloc_valid_block_count, resulting in a race condition. In rare cases, we can get -ENOSPC when doing roll-forward, which triggers the f2fs_bug_on in do_recover_data:
    if (is_valid_blkaddr(sbi, dest, META_POR)) {
        if (src == NULL_ADDR) {
            err = reserve_new_block(&dn);
            f2fs_bug_on(sbi, err);
            ...
        }
        ...
    }
So, this patch avoids that situation in advance.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20  f2fs: support an ioctl to move a range of data blocks  Jaegeuk Kim
This patch implements moving a range of data blocks from source file to destination file. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-20  f2fs: fix to report error number of f2fs_find_entry  Chao Yu
This patch reports the correct error number of f2fs_find_entry to its caller. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-18  f2fs: avoid memory allocation failure due to a long length  Jaegeuk Kim
We need to avoid ENOMEM due to an unexpectedly long length. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: reset default idle interval value  Chao Yu
The default idle interval is 2 minutes, but most of the time after the screen shuts down there are still operations within that 2-minute interval, and the GC thread's sleep time is about 30 to 60 seconds, so the GC thread has almost no chance to collect garbage. Change the default idle interval from 2 minutes to 5 seconds to fix this. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
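The timing argument in this commit can be illustrated with a toy model. It assumes, purely for illustration, that the GC thread proceeds on wake-up only when no operation has occurred within the idle interval; `gc_may_run` and its parameters are hypothetical names, not kernel symbols.

```python
def gc_may_run(now, last_op_time, idle_interval):
    """Return True if the filesystem has been idle for at least idle_interval seconds."""
    return now - last_op_time >= idle_interval


# The GC thread wakes at t=45 s and the last operation happened at t=40 s.
old_default = gc_may_run(now=45, last_op_time=40, idle_interval=120)  # 2 minutes
new_default = gc_may_run(now=45, last_op_time=40, idle_interval=5)    # 5 seconds
```

With occasional background operations, the 2-minute interval is rarely satisfied at the moment the GC thread wakes, while the 5-second interval is.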
2016-07-15  f2fs: use blk_plug in all the possible paths  Jaegeuk Kim
This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging), and adds blk_plug in write paths additionally. The main reason is that blk_start_plug can be used to wake up from low-power mode before submitting further bios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: fix to avoid data update racing between GC and DIO  Chao Yu
Data in a file can be operated on by GC and DIO simultaneously, so we face the race cases below.
For write case:
Thread A                              Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
                                      - f2fs_gc
                                      - do_garbage_collect
                                      - gc_data_segment
                                      - move_data_page
                                      - do_write_data_page
                                        migrate data block to new block address
- dio_bio_submit
  update user data to old block address
For read case:
Thread A                              Thread B
- generic_file_direct_write
- invalidate_inode_pages2_range
- f2fs_direct_IO
- do_blockdev_direct_IO
- do_direct_IO
- get_more_blocks
                                      - f2fs_balance_fs
                                      - f2fs_gc
                                      - do_garbage_collect
                                      - gc_data_segment
                                      - move_data_page
                                      - do_write_data_page
                                        migrate data block to new block address
                                      - write_checkpoint
                                      - do_checkpoint
                                      - clear_prefree_segments
                                      - f2fs_issue_discard
                                        discard old block address
- dio_bio_submit
  update user buffer from obsolete block address
In order to fix this, for one file, we should make DIO and GC exclude each other.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
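A minimal sketch of the per-file exclusion idea, in Python rather than kernel C. `FileBlocks`, `dio_write`, and `gc_move` are hypothetical names, and a plain `threading.Lock` stands in for whatever lock the patch actually uses; the point is only that the address lookup and the write happen atomically with respect to block migration.

```python
import threading


class FileBlocks:
    """Toy model of one file whose single data block can be migrated by GC."""

    def __init__(self):
        self.lock = threading.Lock()  # stand-in for a per-file exclusion lock
        self.addr = 100               # current block address of the data
        self.storage = {100: b"old"}  # block address -> contents

    def dio_write(self, data):
        # Look up the block address and write to it under the lock, so GC
        # cannot migrate the block between the lookup and the write.
        with self.lock:
            addr = self.addr
            self.storage[addr] = data

    def gc_move(self, new_addr):
        # Migrate the block to a new address and drop the old one.
        with self.lock:
            self.storage[new_addr] = self.storage.pop(self.addr)
            self.addr = new_addr
```

Whichever thread takes the lock first, the data ends up at the current block address and nothing is written to the discarded one.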
2016-07-15  f2fs: add maximum prefree segments  Jaegeuk Kim
In 1TB storage, we need to admit 22841 prefree segments, which can consume too many segments. This patch caps prefree segments at 8GB in that case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
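The arithmetic behind the cap can be sketched as below, assuming the default f2fs segment size of 2 MiB (the commit message does not state it). `prefree_limit` is a hypothetical helper, not a kernel function. Under that assumption, 22841 segments correspond to about 44.6 GiB, and 8 GiB corresponds to 4096 segments.

```python
SEGMENT_SIZE = 2 * 1024**2       # assumed segment size: 2 MiB
MAX_PREFREE_BYTES = 8 * 1024**3  # 8 GB cap taken from the commit message


def prefree_limit(uncapped_segments):
    """Clamp the number of admitted prefree segments to the 8 GB cap."""
    cap = MAX_PREFREE_BYTES // SEGMENT_SIZE  # 4096 segments under the assumption above
    return min(uncapped_segments, cap)
```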
2016-07-15  f2fs: disable extent_cache for fcollapse/finsert inodes  Jaegeuk Kim
This reduces the elapsed time to do xfstests/generic/017. Before: 458 s After: 390 s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: refactor __exchange_data_block for speed up  Jaegeuk Kim
This reduces the elapsed time to do xfstests/generic/017. Before: 715 s After: 458 s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15  f2fs: fix ERR_PTR returned by bio  Jaegeuk Kim
This is to fix wrong error pointer handling flow reported by Dan. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: avoid mark_inode_dirty  Jaegeuk Kim
Let's check inode's dirtiness before calling mark_inode_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: move i_size_write in f2fs_write_end  Jaegeuk Kim
We don't need to do i_size_write under page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: fix to avoid redundant discard during fstrim  Chao Yu
With the test steps below, f2fs issues redundant discards when doing fstrim. The reason is that we issue discards both for prefree segments and for the consecutive freed region the user wants to trim, and the regions they cover partly overlap. Here, we change to not issue any discards for prefree segments in the trimmed range.
1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs
2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
3. dd if=/dev/zero of=/mnt/f2fs/a bs=2M count=1
4. dd if=/dev/zero of=/mnt/f2fs/b bs=1M count=1
5. sync
6. rm /mnt/f2fs/a /mnt/f2fs/b
7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
Before:
<...>-5428 [001] ...1 9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200
<...>-5428 [001] ...1 9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
After:
<...>-6764 [000] ...1 9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
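The redundancy in the "Before" trace can be checked with a small overlap calculation. This is an illustrative Python helper (`overlap_blocks` is a hypothetical name); the two ranges are the `blkstart`/`blklen` pairs from the trace in this commit message.

```python
def overlap_blocks(a, b):
    """Return how many blocks two (blkstart, blklen) ranges have in common."""
    a_start, a_len = a
    b_start, b_len = b
    lo = max(a_start, b_start)
    hi = min(a_start + a_len, b_start + b_len)
    return max(0, hi - lo)


# The two discards issued before the fix, taken from the trace.
prefree_discard = (0x2200, 0x200)
trim_discard = (0x2200, 0x300)
redundant = overlap_blocks(prefree_discard, trim_discard)
```

Here the first discard lies entirely inside the second, so all 0x200 of its blocks are discarded twice.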
2016-07-08  f2fs: avoid mismatching block range for discard  Yunlei He
This patch skips discard block ranges that are smaller than trim_minlen and cannot be merged with a neighbour. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
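One way to picture the rule, as a Python sketch under simplifying assumptions: ranges are `(start, length)` pairs in blocks, only exactly adjacent ranges are merged, and `discard_candidates` is a hypothetical name rather than a kernel function.

```python
def discard_candidates(ranges, trim_minlen):
    """Merge adjacent (start, length) ranges, then drop those shorter than trim_minlen."""
    merged = []
    for start, length in sorted(ranges):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # Contiguous with the previous range: extend it.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((start, length))
    return [r for r in merged if r[1] >= trim_minlen]
```

For example, with `trim_minlen = 512`, the ranges `(0, 100)` and `(100, 500)` merge into `(0, 600)` and are kept, while an isolated `(1000, 50)` is skipped.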
2016-07-08  f2fs: fix incorrect f_bfree calculation in ->statfs  Chao Yu
As the manual describes, f_bfree indicates the total free blocks in the fs. In f2fs, it includes two parts: visible free blocks and over-provision blocks. This patch corrects the calculation.
fsblkcnt_t f_bfree; /* free blocks in fs */
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
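The relationship stated in this commit can be written out as a short Python sketch. The function and parameter names are hypothetical, and treating the visible free blocks as `f_bavail` is an assumption made for illustration, not something the commit message says.

```python
def statfs_counts(user_block_count, valid_user_blocks, ovp_blocks):
    """Compute free-block counts: f_bfree = visible free blocks + over-provision blocks."""
    visible_free = user_block_count - valid_user_blocks
    return {
        "f_bfree": visible_free + ovp_blocks,  # total free blocks in the fs
        "f_bavail": visible_free,              # assumed: blocks visible to users
    }
```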
2016-07-08  f2fs: use percpu_rw_semaphore  Jaegeuk Kim
This patch replaces rw_semaphore with percpu_rw_semaphore for sbi->cp_rwsem and nm_i->nat_tree_lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: skip to check the block address of node page  Jaegeuk Kim
If the node page is up-to-date, it should be alive. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: shrink critical region in spin_lock  Jaegeuk Kim
This patch shrinks the critical region in spin_lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: call SetPageUptodate if needed  Jaegeuk Kim
SetPageUptodate() issues a memory barrier, resulting in performance degradation. Let's avoid that. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: introduce f2fs_set_page_dirty_nobuffer  Jaegeuk Kim
This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer. When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the overall performance. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: remove unnecessary goto statement  Tiezhu Yang
When base_addr is NULL, there is no need to call kzfree; it should return -ENOMEM directly. Additionally, it is better to initialize variable 'error' with 0. Signed-off-by: Tiezhu Yang <kernelpatch@126.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: add nodiscard mount option  Chao Yu
This patch adds 'nodiscard' mount option. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: fix to redirty page if fail to gc data page  Chao Yu
If we fail to move data page during foreground GC, we should give another chance to writeback that page which was set dirty previously by writer. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-08  f2fs: fix to detect truncation prior rather than EIO during read  Chao Yu
In the synchronized read procedure, after sending out the read request, the reader tries to lock the page in order to wait for the device to finish the read and unlock the page. Meanwhile, a truncater can race with the reader, so after the reader gets the page lock, it should check the page's mapping to detect whether someone has truncated the page in advance. The reader then has the chance to retry if truncation was done; otherwise the read can fail because of the earlier condition check. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
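The ordering of checks described in this commit can be sketched as a small decision function. This is an illustrative Python model, not the kernel code: `after_lock` and its arguments are hypothetical, and a truncated page is modeled as one whose mapping no longer matches the expected mapping.

```python
def after_lock(page_mapping, expected_mapping, uptodate):
    """Decide what a reader does once it holds the page lock."""
    # Check for truncation first: a truncated page is also not uptodate,
    # so testing uptodate first would report EIO instead of retrying.
    if page_mapping is not expected_mapping:
        return "retry"
    if not uptodate:
        return "EIO"
    return "ok"
```

A truncated page then leads to a retry, and EIO is reserved for pages that still belong to the file but could not be read.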
2016-07-08  f2fs: fix to avoid reading out encrypted data in page cache  Chao Yu
For an encrypted inode, if the user overwrites data of the inode, f2fs reads the encrypted data into the page cache and then does the decryption. However, a reader can race with the overwriter and see encrypted data that has not been decrypted by the overwriter yet. Fix it by moving the decrypting work to the background and keeping the page not uptodate until the data is decrypted.
Thread A                              Thread B
- f2fs_file_write_iter
- __generic_file_write_iter
- generic_perform_write
- f2fs_write_begin
- f2fs_submit_page_bio
                                      - generic_file_read_iter
                                      - do_generic_file_read
                                      - lock_page_killable
                                      - unlock_page
                                      - copy_page_to_iter
                                        hit the encrypted data in updated page
- lock_page
- fscrypt_decrypt_page
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: avoid latency-critical readahead of node pages  Jaegeuk Kim
f2fs_map_blocks is performance-critical, so let's avoid any latency from reading ahead node pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: avoid writing node/metapages during writes  Jaegeuk Kim
Let's keep more node/meta pages in run time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: produce more nids and reduce readahead nats  Jaegeuk Kim
The readahead nat pages are more likely to be reclaimed quickly, so it'd be better to gather more free nids in advance. Also, let's keep as many free nids as possible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: detect host-managed SMR by feature flag  Jaegeuk Kim
If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs by default. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: call update_inode_page for orphan inodes  Jaegeuk Kim
Let's store orphan inode pages right away. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06  f2fs: report error for f2fs_parent_dir  Jaegeuk Kim
If there is no dentry, we can report its error correctly. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-15  f2fs: find parent dentry correctly  Sheng Yong
If the dotdot directory entry is corrupted, its slot may be occupied by another file. In this case, dentry[1] is not the parent directory. Rename and cross-rename will update the inode in dentry[1] incorrectly. This patch finds the dotdot dentry by name. Signed-off-by: Sheng Yong <shengyong1@huawei.com> [Jaegeuk Kim: remove wrong bug_on] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
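The difference between positional and name-based lookup can be shown with a toy directory. This Python sketch models dentries as `(name, inode number)` pairs; `find_parent_ino` is a hypothetical helper, not the kernel function.

```python
def find_parent_ino(dentries):
    """Return the inode number of the '..' entry, searching by name instead of slot."""
    for name, ino in dentries:
        if name == "..":
            return ino
    return None  # no dotdot entry found


# A corrupted directory where slot 1 holds a regular file instead of "..".
corrupted = [(".", 5), ("file", 9), ("..", 2)]
by_position = corrupted[1][1]           # the file's inode, not the parent
by_name = find_parent_ino(corrupted)    # the parent's inode
```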
2016-06-13  f2fs: fix deadlock in add_link failure  Jaegeuk Kim
mkdir                                 sync_dirty_inode
- init_inode_metadata
- lock_page(node)
- make_empty_dir
                                      - filemap_fdatawrite()
                                      - do_writepages
                                      - lock_page(data)
                                      - write_page(data)
                                      - lock_page(node)
- f2fs_init_acl
- error
- truncate_inode_pages
- lock_page(data)
So, we don't need to truncate data pages in this error case, which will be done by f2fs_evict_inode.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13  f2fs: introduce mode=lfs mount option  Jaegeuk Kim
This mount option forcefully enables the original log-structured filesystem behavior. So, there should be no random writes to the main area. In particular, this supports host-managed SMR devices. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08  f2fs: skip clean segment for gc  Jaegeuk Kim
If a segment in a section is clean or prefreed, we don't need to get its summary and do gc. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08  f2fs: drop any block plugging  Jaegeuk Kim
In f2fs, we don't need to keep block plugging for NODE and DATA writes, since we already merged bios as much as possible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08  f2fs: avoid reverse IO order for NODE and DATA  Jaegeuk Kim
There is a data race between allocate_data_block() and f2fs_submit_page_mbio(), which incurs unnecessary reversed bio submission. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-08  f2fs: set mapping error for EIO  Jaegeuk Kim
If EIO occurred, we need to set all the mapping to avoid any further IOs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: control not to exceed # of cached nat entries  Jaegeuk Kim
This is to avoid cache entry management overhead including radix tree. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: fix wrong percentage  Jaegeuk Kim
This should be 1%, 10MB / 1GB. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: avoid data race between FI_DIRTY_INODE flag and update_inode  Jaegeuk Kim
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below.
Thread #1                             Thread #2
- lock_page(ipage)
- update i_fields
                                      - update i_size/i_blocks/and so on
                                      - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)
In this case, we can lose the latest i_field information.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
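The lost update can be replayed deterministically. The Python sketch below follows one reading of the interleaving in this commit message (thread #1 copies the fields and clears the flag, thread #2 updates the fields in between); the dictionaries and the name `replay_race` are illustrative stand-ins, not kernel structures.

```python
def replay_race():
    """Replay the interleaving step by step and return the final state."""
    inode = {"i_size": 0, "dirty": True}  # in-memory inode with its dirty flag
    ipage = {}                            # contents of the inode page

    # Thread #1: lock_page(ipage), then copy the in-memory fields into the page.
    ipage["i_size"] = inode["i_size"]
    # Thread #2: update i_size and set the flag, without taking the page lock.
    inode["i_size"] = 4096
    inode["dirty"] = True
    # Thread #1: reset the flag, then set_page_dirty(ipage).
    inode["dirty"] = False
    return inode, ipage
```

After the replay, the page still holds the old `i_size` while the flag says the inode is clean, so nothing schedules the newer value to be written.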
2016-06-07  f2fs: remove obsolete parameter in f2fs_truncate  Jaegeuk Kim
We don't need the lock parameter, which is always true. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: avoid wrong count on dirty inodes  Jaegeuk Kim
The number should be covered by spin_lock. Otherwise we can see a wrong count in f2fs_stat. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07  f2fs: remove deprecated parameter  Jaegeuk Kim
Remove deprecated parameter. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-03  f2fs: handle writepage correctly  Jaegeuk Kim
Previously, f2fs_write_data_pages() called __f2fs_writepage(), which called f2fs_write_data_page(). If f2fs_write_data_page() returned AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage() called mapping_set_error(). But this should not happen every time, since sometimes f2fs_write_data_page() tries to skip writing pages without error. For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed out. Reported-by: Shuoran Liu <liushuoran@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-03  f2fs: return error of f2fs_lookup  Jaegeuk Kim
Now we can report an error to f2fs_lookup given by f2fs_find_entry. Suggested-by: He YunLei <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>