summaryrefslogtreecommitdiff
path: root/drivers/staging/ramster/tmem.c
AgeCommit message (Collapse)Author
2012-09-06staging: ramster: move to new zcache2 codebaseDan Magenheimer
[V2: rebased to apply to 20120905 staging-next, no other changes] The original zcache in staging is a "demo" version, and this is a massive rewrite. This was intended to result in a merged zcache and ramster, but that option has been blocked so, to continue forward progress on ramster and future related projects, only ramster moves to the new codebase. To differentiate between the old demo zcache and the rewrite, we refer to the latter as zcache2, config'd as CONFIG_ZCACHE2. Zcache and zcache2 cannot be built in the same kernel, so CONFIG_ZCACHE2 implies !CONFIG_ZCACHE. This developer still has hope that zcache and zcache2 will be merged into one codebase. Until then, zcache2 can be considered a one-node version of ramster. No history of changes was recorded during the zcache2 rewrite and recreating a sane one would be a Sisyphean task but, since ramster is still in staging and has been unchanged since it was merged, presumably this is acceptable. This commit also provides the hooks in zcache2 for ramster, but all ramster-specific code is provided in a separate commit. Some of the highlights of this rewritten codebase for zcache2: (Note: If you are not familiar with the tmem terminology, you can review it here: http://lwn.net/Articles/454795/ ) 1. Merge of "demo" zcache and the v1.1 version of zcache in ramster. Zcache and ramster had a great deal of duplicate code which is now merged. In essence, zcache2 *is* ramster but with no remote machine available, but !CONFIG_RAMSTER will avoid compiling lots of ramster-specific code. 2. Allocator. Previously, persistent pools used zsmalloc and ephemeral pools used zbud. Now a completely rewritten zbud is used for both. Notably this zbud maintains all persistent (frontswap) and ephemeral (cleancache) pageframes in separate queues in LRU order. 3. Interaction with page allocator. Zbud does no page allocation/freeing, it is done entirely in zcache2 where it can be tracked more effectively. 4. Better pre-allocation. Previously, on put, if a new pageframe could not be pre-allocated, the put would fail, even if the allocator had plenty of partial pages where the data could be stored; this is now fixed. 5. Ouroboros ("eating its own tail") allocation. If no pageframe can be allocated AND no partial pages are available, the least-recently-used ephemeral pageframe is reclaimed immediately (including flushing tmem pointers to it) and re-used. This ensures that most-recently-used cleancache pages are more likely to be retained than LRU pages and also that, as in the core mm subsystem, anonymous pages have a higher priority than clean page cache pages. 6. Zcache and zbud now use debugfs instead of sysfs. Ramster uses debugfs where possible and sysfs where necessary. (Some ramster configuration is done from userspace so some sysfs is necessary.) 7. Modularization. As some have observed, the monolithic zcache-main.c code included zbud code, which has now been separated into its own code module. Much ramster-specific code in the old ramster zcache-main.c has also been moved into ramster.c so that it does not get compiled with !CONFIG_RAMSTER. 8. Rebased to 3.5. This new codebase also provides hooks for several future new features: A. WasActive patch, requires some mm/frontswap changes previously posted. A new version of this patch will be provided separately. See ifdef __PG_WAS_ACTIVE B. Exclusive gets. It seems tmem _can_ support exclusive gets with a minor change to both zcache2 and a small backwards-compatible change to frontswap.c. Explanation and frontswap patch will be provided separately. See ifdef FRONTSWAP_HAS_EXCLUSIVE_GETS C. Ouroboros writeback. Since persistent (frontswap) pages may now also be reclaimed in LRU order, the foundation is in place to properly writeback these pages back into the swap cache and then the swap disk. This is still under development and requires some other mm changes which are prototyped. See ifdef FRONTSWAP_HAS_UNUSE. A new feature that desperately needs attention (if someone is looking for a way to contribute) is kernel module support. A preliminary version of a patch was posted by Erlangen University and needs to be integrated and tested for zcache2 and brought up to kernel standards. If anybody is interested on helping out with any of these, let me know! Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-06staging: ramster: remove old driver to prep for new baseDan Magenheimer
[V2: rebased to apply to 20120905 staging-next, no other changes] To prep for moving the ramster codebase on top of the new redesigned zcache2 codebase, we remove ramster (as well as its contained diverged v1.1 version of zcache) entirely. Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-15staging: ramster: ramster-specific changes to zcache/tmemDan Magenheimer
RAMster implements peer-to-peer transcendent memory, allowing a "cluster" of kernels to dynamically pool their RAM. This patch incorporates changes transforming zcache to work with a remote store. In tmem.[ch], new "repatriate" (provoke async get) and "localify" (handle incoming data resulting from an async get) routines combine with a handful of changes to existing pamops interfaces allow the generic tmem code to support asynchronous operations. Also, a new tmem_xhandle struct groups together key information that must be passed to remote tmem stores. Zcache-main.c is augmented with a large amount of ramster-specific code to handle remote operations and "foreign" pages on both ends of the "remotify" protocol. New "foreign" pools are auto-created on demand. A "selfshrinker" thread periodically repatriates remote persistent pages when local memory conditions allow. For certain operations, a queue is necessary to guarantee strict ordering as out-of-order puts/flushes can cause strange race conditions. Pampd pointers now either point to local memory OR describe a remote page; to allow the same 64-bits to describe either, the LSB is used to differentiate. Some acrobatics must be performed to ensure local memory is available to handle a remote persistent get, or deal with the data directly anyway if the malloc failed. Lots of ramster-specific statistics are available via sysfs. Note: Some debug ifdefs left in for now. Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-02-15staging: ramster: local compression + tmemDan Magenheimer
RAMster implements peer-to-peer transcendent memory, allowing a "cluster" of kernels to dynamically pool their RAM. This patch copies files from drivers/staging/zcache. RAMster compresses pages locally before transmitting them to another node, so we can leverage the zcache and tmem code directly. Note: there are no ramster-specific changes yet to these files. (Why copy? The ramster tmem.c/tmem.h changes are definitely shareable between zcache and ramster; the eventual destination for tmem.c is the linux lib directory. Ramster changes to zcache are more substantial and zcache is currently undergoing some significant unrelated changes (including a new allocator and breaking zcache-main.c into smaller files), so it seemed best to branch temporarily and merge later.) Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>