summaryrefslogtreecommitdiff
path: root/fs/f2fs/f2fs.h
AgeCommit message (Collapse)Author
2014-12-08f2fs: avoid to ra unneeded blocks in recover flowChao Yu
To improve recovery speed, f2fs try to readahead many contiguous blocks in warm node segment, but for most time, abnormal power-off do not occur frequently, so when mount a normal power-off f2fs image, by contrary ra so many blocks and then invalid them will hurt the performance of mount. It's better to just ra the first next-block for normal condition. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08f2fs: use atomic for counting inode with inline_{dir,inode} flagChao Yu
As inline_{dir,inode} stat is increased/decreased concurrently by multi threads, so the value is not so accurate, let's use atomic type for counting accurately. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08f2fs: count the number of inmemory pagesJaegeuk Kim
This patch adds counting # of inmemory pages in the page cache. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-08f2fs: do retry operations with cond_reschedJaegeuk Kim
This patch revists retrial paths in f2fs. The basic idea is to use cond_resched instead of retrying from the very early stage. Suggested-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-12-04f2fs: use rw_semaphore for nat entry lockJaegeuk Kim
Previoulsy, we used rwlock for nat_entry lock. But, now we have a lot of complex operations in set_node_addr. (e.g., allocating kernel memories, handling radix_trees, and so on) So, this patches tries to change spinlock to rw_semaphore to give CPUs to other threads. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-24f2fs: introduce f2fs_dentry_kunmap to clean upJaegeuk Kim
This patch introduces f2fs_dentry_kunmap to clean up dirty codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-20f2fs: introduce struct inode_management to wrap inner fieldsChao Yu
Now in f2fs, we have three inode cache: ORPHAN_INO, APPEND_INO, UPDATE_INO, and we manage fields related to inode cache separately in struct f2fs_sb_info for each inode cache type. This makes codes a bit messy, so that this patch intorduce a new struct inode_management to wrap inner fields as following which make codes more neat. /* for inner inode cache management */ struct inode_management { struct radix_tree_root ino_root; /* ino entry array */ spinlock_t ino_lock; /* for ino entry lock */ struct list_head ino_list; /* inode list head */ unsigned long ino_num; /* number of entries */ }; struct f2fs_sb_info { ... struct inode_management im[MAX_INO_ENTRY]; /* manage inode cache */ ... } Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-06f2fs: introduce the number of inode entriesJaegeuk Kim
This patch adds to monitor the number of ino entries. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-05f2fs: introduce -o fastboot for reducing booting time onlyJaegeuk Kim
If a system wants to reduce the booting time as a top priority, now we can use a mount option, -o fastboot. With this option, f2fs conducts a little bit slow write_checkpoint, but it can avoid the node page reads during the next mount time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-05f2fs: avoid race condition in handling wait_ioJaegeuk Kim
__submit_merged_bio f2fs_write_end_io f2fs_write_end_io wait_io = X wait_io = x complete(X) complete(X) wait_io = NULL wait_for_completion() free(X) spin_lock(X) kernel panic In order to avoid this, this patch removes the wait_io facility. Instead, we can use wait_on_all_pages_writeback(sbi) to wait for end_ios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-05f2fs: revisit inline_data to avoid data races and potential bugsJaegeuk Kim
This patch simplifies the inline_data usage with the following rule. 1. inline_data is set during the file creation. 2. If new data is requested to be written ranges out of inline_data, f2fs converts that inode permanently. 3. There is no cases which converts non-inline_data inode to inline_data. 4. The inline_data flag should be changed under inode page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bitGu Zheng
Rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bit, which mean set/clear bit and return the old value, for better readability. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: introduce f2fs_change_bit to simplify the change bit logicGu Zheng
Introduce f2fs_change_bit to simplify the change bit logic in function set_to_next_nat{sit}. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: remove the redundant function cond_clear_inode_flagGu Zheng
Use clear_inode_flag to replace the redundant cond_clear_inode_flag. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: reuse make_empty_dir code for inline_dentryJaegeuk Kim
This patch introduces do_make_empty_dir to mitigate code redundancy for inline_dentry. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: introduce f2fs_dentry_ptr structure for code clean-upJaegeuk Kim
This patch introduces f2fs_dentry_ptr structure for the use of a function parameter in inline_dentry operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: reuse core function in f2fs_readdir for inline_dentryJaegeuk Kim
This patch introduces a core function, f2fs_fill_dentries, to remove redundant code in f2fs_readdir and f2fs_read_inline_dir. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: add stat info for inline_dentry inodesJaegeuk Kim
This patch adds status information for inline_dentry inodes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: avoid deadlock on init_inode_metadataJaegeuk Kim
Previously, init_inode_metadata does not hold any parent directory's inode page. So, f2fs_init_acl can grab its parent inode page without any problem. But, when we use inline_dentry, that page is grabbed during f2fs_add_link, so that we can fall into deadlock condition like below. INFO: task mknod:11006 blocked for more than 120 seconds. Tainted: G OE 3.17.0-rc1+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mknod D ffff88003fc94580 0 11006 11004 0x00000000 ffff880007717b10 0000000000000002 ffff88003c323220 ffff880007717fd8 0000000000014580 0000000000014580 ffff88003daecb30 ffff88003c323220 ffff88003fc94e80 ffff88003ffbb4e8 ffff880007717ba0 0000000000000002 Call Trace: [<ffffffff8173dc40>] ? bit_wait+0x50/0x50 [<ffffffff8173d4cd>] io_schedule+0x9d/0x130 [<ffffffff8173dc6c>] bit_wait_io+0x2c/0x50 [<ffffffff8173da3b>] __wait_on_bit_lock+0x4b/0xb0 [<ffffffff811640a7>] __lock_page+0x67/0x70 [<ffffffff810acf50>] ? autoremove_wake_function+0x40/0x40 [<ffffffff811652cc>] pagecache_get_page+0x14c/0x1e0 [<ffffffffa029afa9>] get_node_page+0x59/0x130 [f2fs] [<ffffffffa02a63ad>] read_all_xattrs+0x24d/0x430 [f2fs] [<ffffffffa02a6ca2>] f2fs_getxattr+0x52/0xe0 [f2fs] [<ffffffffa02a7481>] f2fs_get_acl+0x41/0x2d0 [f2fs] [<ffffffff8122d847>] get_acl+0x47/0x70 [<ffffffff8122db5a>] posix_acl_create+0x5a/0x150 [<ffffffffa02a7759>] f2fs_init_acl+0x29/0xcb [f2fs] [<ffffffffa0286a8d>] init_inode_metadata+0x5d/0x340 [f2fs] [<ffffffffa029253a>] f2fs_add_inline_entry+0x12a/0x2e0 [f2fs] [<ffffffffa0286ea5>] __f2fs_add_link+0x45/0x4a0 [f2fs] [<ffffffffa028b5b6>] ? f2fs_new_inode+0x146/0x220 [f2fs] [<ffffffffa028b816>] f2fs_mknod+0x86/0xf0 [f2fs] [<ffffffff811e3ec1>] vfs_mknod+0xe1/0x160 [<ffffffff811e4b26>] SyS_mknod+0x1f6/0x200 [<ffffffff81741d7f>] tracesys+0xe1/0xe6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: reuse find_in_block code for find_in_inline_dirJaegeuk Kim
This patch removes redundant copied code in find_in_inline_dir. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: reuse room_for_filename for inline dentry operationJaegeuk Kim
This patch introduces to reuse the existing room_for_filename for inline dentry operation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: add key function to handle inline dirChao Yu
Adds Functions to implement inline dir init/lookup/insert/delete/convert ops. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: remove needless reserved area copy, pointed by Dan Carpenter] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: export dir operations for inline dirChao Yu
This patch exports some dir operations for inline dir, additionally introduces f2fs_drop_nlink from f2fs_delete_entry for reusing by inline dir function. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: add infra struct and helper for inline dirChao Yu
This patch defines macro/inline dentry structure, and adds some helpers for inline dir infrastructure. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: invalidate inmemory pageJaegeuk Kim
If user truncates file's data, we should truncate inmemory pages too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-11-04f2fs: do not make dirty any inmemory pagesJaegeuk Kim
This patch let inmemory pages be clean all the time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-07f2fs: support volatile operations for transient dataJaegeuk Kim
This patch adds support for volatile writes which keep data pages in memory until f2fs_evict_inode is called by iput. For instance, we can use this feature for the sqlite database as follows. While supporting atomic writes for main database file, we can keep its journal data temporarily in the page cache by the following sequence. 1. open -> ioctl(F2FS_IOC_START_VOLATILE_WRITE); 2. writes : keep all the data in the page cache. 3. flush to the database file with atomic writes a. ioctl(F2FS_IOC_START_ATOMIC_WRITE); b. writes c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); 4. close -> drop the cached data Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-10-07f2fs: support atomic writesJaegeuk Kim
This patch introduces a very limited functionality for atomic write support. In order to support atomic write, this patch adds two ioctls: o F2FS_IOC_START_ATOMIC_WRITE o F2FS_IOC_COMMIT_ATOMIC_WRITE The database engine should be aware of the following sequence. 1. open -> ioctl(F2FS_IOC_START_ATOMIC_WRITE); 2. writes : all the written data will be treated as atomic pages. 3. commit -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); : this flushes all the data blocks to the disk, which will be shown all or nothing by f2fs recovery procedure. 4. repeat to #2. The IO pattens should be: ,- START_ATOMIC_WRITE ,- COMMIT_ATOMIC_WRITE CP | D D D D D D | FSYNC | D D D D | FSYNC ... `- COMMIT_ATOMIC_WRITE Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: call f2fs_unlock_op after error was handledJaegeuk Kim
This patch relocates f2fs_unlock_op in every directory operations to be called after any error was processed. Otherwise, the checkpoint can be entered with valid node ids without its dentry when -ENOSPC is occurred. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: refactor flush_nat_entries to remove costly reorganizing opsJaegeuk Kim
Previously, f2fs tries to reorganize the dirty nat entries into multiple sets according to its nid ranges. This can improve the flushing nat pages, however, if there are a lot of cached nat entries, it becomes a bottleneck. This patch introduces a new set management flow by removing dirty nat list and adding a series of set operations when the nat entry becomes dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: introduce FITRIM in f2fs_ioctlJaegeuk Kim
This patch introduces FITRIM in f2fs_ioctl. In this case, f2fs will issue small discards and prefree discards as many as possible for the given area. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-30f2fs: introduce cp_control structureJaegeuk Kim
This patch add a new data structure to control checkpoint parameters. Currently, it presents the reason of checkpoint such as is_umount and normal sync. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: remove redundant operation during roll-forward recoveryJaegeuk Kim
If same data is updated multiple times, we don't need to redo whole the operations. Let's just update the lastest one. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: fix conditions to remain recovery information in f2fs_sync_fileJaegeuk Kim
This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC, /* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a) x o o o o b) x x x x o c) x o o x o If f2fs_sync_file stops ------^, it should write inode(F) --------------^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP | alloc | dnode(F) | inode(x) | inode(DF) a) o x x x x b) x x x o c) o o x o If f2fs_sync_file stops -------^, it should write inode(DF) --------------^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we doesn't need to recover that inode. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-23f2fs: use meta_inode cache to improve roll-forward speedJaegeuk Kim
Previously, all the dnode pages should be read during the roll-forward recovery. Even worsely, whole the chain was traversed twice. This patch removes that redundant and costly read operations by using page cache of meta_inode and readahead function as well. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: give an option to enable in-place-updates during fsync to usersJaegeuk Kim
If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file only starts to try in-place-updates. And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it keeps out-of-order manner. Otherwise, it triggers in-place-updates. This may be used by storage showing very high random write performance. For example, it can be used when, Seq. writes (Data) + wait + Seq. writes (Node) is pretty much slower than, Rand. writes (Data) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-16f2fs: expand counting dirty pages in the inode page cacheJaegeuk Kim
Previously f2fs only counts dirty dentry pages, but there is no reason not to expand the scope. This patch changes the names on the management of dirty pages and to count dirty pages in each inode info as well. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09f2fs: use lock-less list(llist) to simplify the flush cmd managementGu Zheng
We use flush cmd control to collect many flush cmds, and flush them together. In this case, we use two list to manage the flush cmds (collect and dispatch), and one spin lock is used to protect this. In fact, the lock-less list(llist) is very suitable to this case, and we use simplify this routine. - v2: -use llist_for_each_entry_safe to fix possible use-after-free issue. -remove the unused field from struct flush_cmd. Thanks for Yu's suggestion. - Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09f2fs: refactor flush_sit_entries codes for reducing SIT writesChao Yu
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT writes"), we descripte the issue as below: "Although building NAT journal in cursum reduce the read/write work for NAT block, but previous design leave us lower performance when write checkpoint frequently for these cases: 1. if journal in cursum has already full, it's a bit of waste that we flush all nat entries to page for persistence, but not to cache any entries. 2. if journal in cursum is not full, we fill nat entries to journal util journal is full, then flush the left dirty entries to disk without merge journaled entries, so these journaled entries may be flushed to disk at next checkpoint but lost chance to flushed last time." Actually, we have the same problem in using SIT journal area. In this patch, firstly we will update sit journal with dirty entries as many as possible. Secondly if there is no space in sit journal, we will remove all entries in journal and walk through the whole dirty entry bitmap of sit, accounting dirty sit entries located in same SIT block to sit entry set. All entry sets are linked to list sit_entry_set in sm_info, sorted ascending order by count of entries in set. Later we flush entries in set which have fewest entries into journal as many as we can, and then flush dense set with merged entries to disk. In this way we can use sit journal area more effectively, also we will reduce SIT update, result in gaining in performance and saving lifetime of flash device. In my testing environment, it shows this patch can help to reduce SIT block update obviously. virtual machine + hard disk: fsstress -p 20 -n 400 -l 5 sit page num cp count sit pages/cp based 2006.50 1349.75 1.486 patched 1566.25 1463.25 1.070 Our latency of merging op is small when handling a great number of dirty SIT entries in flush_sit_entries: latency(ns) dirty sit count 36038 2151 49168 2123 37174 2232 Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09f2fs: need fsck.f2fs when f2fs_bug_on is triggeredJaegeuk Kim
If any f2fs_bug_on is triggered, fsck.f2fs is needed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-09f2fs: retain inconsistency information to initiate fsck.f2fsJaegeuk Kim
This patch adds sbi->need_fsck to conduct fsck.f2fs later. This flag can only be removed by fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SBJaegeuk Kim
This patch adds three inline functions to clean up dirty casting codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21f2fs: remove rewrite_node_pageJaegeuk Kim
I think we need to let the dirty node pages remain in the page cache instead of rewriting them in their places. So, after done with successful recovery, write_checkpoint will flush all of them through the normal write path. Through this, we can avoid potential error cases in terms of block allocation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21f2fs: avoid double lock in truncate_blocksJaegeuk Kim
The init_inode_metadata calls truncate_blocks when error is occurred. The callers holds f2fs_lock_op, so we should not call it again in truncate_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21f2fs: add WARN_ON in f2fs_bug_onJaegeuk Kim
This patch adds WARN_ON when f2fs_bug_on is disable to see kernel messages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21f2fs: introduce f2fs_cp_error for readabilityJaegeuk Kim
This patch adds f2fs_cp_error for readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-21f2fs: trigger release_dirty_inode in f2fs_put_superJaegeuk Kim
The generic_shutdown_super calls sync_filesystem, evict_inode, and then f2fs_put_super. In f2fs_evict_inode, we remain some dirty inode information so we should release them at f2fs_put_super. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-19f2fs: fix to recover inline_xattr/data and blocksJaegeuk Kim
This patch fixes not to skip xattr recovery and inline xattr/data recovery order. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-19f2fs: make clear on test condition and return typesJaegeuk Kim
This patch adds a parentheses to make clear for condition check. And also it changes the return type for better meanings. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-08-19f2fs: should convert inline_data during the mkwriteJaegeuk Kim
If mkwrite is called to an inode having inline_data, it can overwrite the data index space as NEW_ADDR. (e.g., the first 4 bytes are coincidently zero) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>