summaryrefslogtreecommitdiff
path: root/mm/memory-failure.c
diff options
context:
space:
mode:
authorNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>2010-05-28 00:29:20 (GMT)
committerAndi Kleen <ak@linux.intel.com>2010-08-11 07:22:46 (GMT)
commit93f70f900da36fbc19c13c2aa04b2e468c8d00fb (patch)
tree7868f891bca0ed18c9806771a68feac0b4010517 /mm/memory-failure.c
parentc9fbdd5f131440981b124883656ea21fb12cde4a (diff)
downloadlinux-93f70f900da36fbc19c13c2aa04b2e468c8d00fb.tar.xz
HWPOISON, hugetlb: isolate corrupted hugepage
If error hugepage is not in-use, we can fully recovery from error by dequeuing it from freelist, so return RECOVERY. Otherwise whether or not we can recovery depends on user processes, so return DELAYED. Dependency: "HWPOISON, hugetlb: enable error handling path for hugepage" Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Diffstat (limited to 'mm/memory-failure.c')
-rw-r--r--mm/memory-failure.c28
1 files changed, 20 insertions, 8 deletions
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 473f15a..d0b420a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -690,17 +690,29 @@ static int me_swapcache_clean(struct page *p, unsigned long pfn)
/*
* Huge pages. Needs work.
* Issues:
- * No rmap support so we cannot find the original mapper. In theory could walk
- * all MMs and look for the mappings, but that would be non atomic and racy.
- * Need rmap for hugepages for this. Alternatively we could employ a heuristic,
- * like just walking the current process and hoping it has it mapped (that
- * should be usually true for the common "shared database cache" case)
- * Should handle free huge pages and dequeue them too, but this needs to
- * handle huge page accounting correctly.
+ * - Error on hugepage is contained in hugepage unit (not in raw page unit.)
+ * To narrow down kill region to one page, we need to break up pmd.
+ * - To support soft-offlining for hugepage, we need to support hugepage
+ * migration.
*/
static int me_huge_page(struct page *p, unsigned long pfn)
{
- return FAILED;
+ struct page *hpage = compound_head(p);
+ /*
+ * We can safely recover from error on free or reserved (i.e.
+ * not in-use) hugepage by dequeuing it from freelist.
+ * To check whether a hugepage is in-use or not, we can't use
+ * page->lru because it can be used in other hugepage operations,
+ * such as __unmap_hugepage_range() and gather_surplus_pages().
+ * So instead we use page_mapping() and PageAnon().
+ * We assume that this function is called with page lock held,
+ * so there is no race between isolation and mapping/unmapping.
+ */
+ if (!(page_mapping(hpage) || PageAnon(hpage))) {
+ __isolate_hwpoisoned_huge_page(hpage);
+ return RECOVERED;
+ }
+ return DELAYED;
}
/*