The MySQL benchmark produces too many promotions because the DRAM node is
not large enough, and frequent promotion results in a performance decrease.
Hence, we prefer remote access over frequent demote/promote traffic.
Huang Ying (3):
mm, migrate: use flags parameter for remove_migration_ptes()
memory tiering: measure whether demoted pages are hot
memory tiering: adjust promotion threshold based on hot pages demoted
include/linux/mmzone.h | 3 ++
include/linux/page-flags.h | 9 ++++++
include/linux/page_ext.h | 3 ++
include/linux/rmap.h | 8 ++++-
include/linux/sched/numa_balancing.h | 62 ++++++++++++++++++++++++++++++++++++
include/linux/sched/sysctl.h | 3 ++
include/trace/events/mmflags.h | 8 ++++-
kernel/sched/fair.c | 27 +++++++++++++---
kernel/sysctl.c | 16 ++++++++++
mm/huge_memory.c | 6 ++--
mm/mempolicy.c | 2 ++
mm/migrate.c | 60 ++++++++++++++++++++++++++++------
mm/vmstat.c | 1 +
13 files changed, 189 insertions(+), 19 deletions(-)
--
1.8.3.1
Currently, the MySQL test case shows that a large number of THPs are
migrated from the PMEM node to the toptier node, which brings in more
pgpromote_demoted events and migration failures. Because PMEM node memory
is marked as prot_none, a hot page will be migrated on CPU access as soon
as possible anyway, and it is unnecessary to migrate THPs to DRAM when
DRAM memory is not large enough, which only produces more demotions and
promotions.
Hence, this patch forbids THP allocation on the PMEM node. The result
shows about a 3% improvement. The relevant statistics are as follows.
before applying the patch:
mysql prepare:
pgpromote_demoted 908267
pgmigrate_fail_dst_node_fail 428223
pgmigrate_fail_numa_isolate_fail 460480
mysql run:
pgpromote_demoted 2901105
pgmigrate_fail_dst_node_fail 5653776
pgmigrate_fail_numa_isolate_fail 5686052
after applying the patch:
mysql prepare:
pgpromote_demoted 839297
pgmigrate_fail_dst_node_fail 36585
pgmigrate_fail_numa_isolate_fail 36585
mysql run:
pgpromote_demoted 913828
pgmigrate_fail_dst_node_fail 235863
pgmigrate_fail_numa_isolate_fail 235870
Signed-off-by: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com>
---
mm/page_alloc.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8cfce92..4fff3cd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -461,6 +461,17 @@ static __always_inline int get_pfnblock_migratetype(struct page *page, unsigned
return __get_pfnblock_flags_mask(page, pfn, PB_migrate_end, MIGRATETYPE_MASK);
}
+static inline bool allow_hugepage_allocation(int nid, unsigned int order)
+{
+ if (node_is_toptier(nid))
+ return true;
+
+ if (order != HPAGE_PMD_ORDER)
+ return true;
+
+ return false;
+}
+
/**
* set_pfnblock_flags_mask - Set the requested group of flags for a pageblock_nr_pages block of pages
* @page: The page within the block of interest
@@ -3689,6 +3700,9 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
}
}
+ if (!allow_hugepage_allocation(zone_to_nid(zone), order))
+ continue;
+
if (no_fallback && nr_online_nodes > 1 &&
zone != ac->preferred_zoneref->zone) {
int local_nid;
--
1.8.3.1
Currently, PGPROMOTE_SUCCESS only accounts for normal pages migrated from
the PMEM node to the toptier node, but a huge page can also trigger the
same operation via THP NUMA fault handling. Hence the count is missed in
migrate_misplaced_transhuge_page().
Signed-off-by: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com>
---
mm/migrate.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index e9adaa7..9d6cac9 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2138,7 +2138,7 @@ int migrate_misplaced_page(struct page *page, struct vm_area_struct *vma,
if (nr_succeeded) {
count_vm_numa_events(NUMA_PAGE_MIGRATE, nr_succeeded);
if (!node_is_toptier(page_to_nid(page)) && node_is_toptier(node))
- mod_node_page_state(NODE_DATA(node), PGPROMOTE_SUCCESS,
+ mod_node_page_state(pgdat, PGPROMOTE_SUCCESS,
nr_succeeded);
}
BUG_ON(!list_empty(&migratepages));
@@ -2264,6 +2264,9 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
count_vm_events(PGMIGRATE_SUCCESS, HPAGE_PMD_NR);
count_vm_numa_events(NUMA_PAGE_MIGRATE, HPAGE_PMD_NR);
+ if (!node_is_toptier(page_to_nid(page)) && node_is_toptier(node))
+ mod_node_page_state(pgdat, PGPROMOTE_SUCCESS,
+ HPAGE_PMD_NR);
mod_node_page_state(page_pgdat(page),
NR_ISOLATED_ANON + page_lru,
--
1.8.3.1