From: Huang Ying <ying.huang(a)intel.com>
ANBZ: #80
commit bfe9d006c971a5daefe7a8b27819ccd497090fd8 upstream
When zone_watermark_ok() is called in migrate_balanced_pgdat() to check
the migration target node, the classzone_idx parameter (the requested
zone) is specified as 0 (ZONE_DMA). But when memory is allocated for
autonuma in alloc_misplaced_dst_page(), the requested zone derived from
the GFP flags is ZONE_MOVABLE. That is, the two requested zones differ.
Because the size of lowmem_reserve depends on the requested zone, the
check and the allocation apply different thresholds, which can cause
problems.
For example, consider the following zoneinfo from a test machine:
  Node 0, zone    DMA32
    pages free     61592
          min      29
          low      454
          high     879
          spanned  1044480
          present  442306
          managed  425921
          protection: (0, 0, 62457, 62457, 62457)
The number of free pages in ZONE_DMA32 (61592) is greater than "high
watermark + lowmem_reserve[ZONE_DMA]" (879 + 0), but less than "high
watermark + lowmem_reserve[ZONE_MOVABLE]" (879 + 62457 = 63336).
Although __alloc_pages_node() in alloc_misplaced_dst_page() requests
ZONE_MOVABLE, zone_watermark_ok() on ZONE_DMA32 in
migrate_balanced_pgdat() is called with ZONE_DMA and so may always
return true. As a result, autonuma may not stop even when memory
pressure on node 0 is heavy.
To fix the issue, pass ZONE_MOVABLE as the classzone_idx parameter of
zone_watermark_ok() in migrate_balanced_pgdat(). This matches the
requested zone in alloc_misplaced_dst_page(), so that
migrate_balanced_pgdat() returns false when memory pressure is heavy.
Link: http://lkml.kernel.org/r/20191101075727.26683-2-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Fengguang Wu <fengguang.wu(a)intel.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
---
Note: this patch fixes the problem that the DRAM node's kswapd
is not woken up in time, and improves performance by about 12% in
MySQL testing.
---
 mm/migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index 6d25ea0..e2dbf24 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1969,7 +1969,7 @@ static bool migrate_balanced_pgdat(struct pglist_data *pgdat, int order)
 		/* Avoid waking kswapd by allocating pages to migrate. */
 		if (!zone_watermark_ok(zone, order,
 				       high_wmark_pages(zone),
-				       0, 0))
+				       ZONE_MOVABLE, 0))
 			continue;
 		return true;
 	}
--
1.8.3.1