From: Huang Ying <ying.huang(a)intel.com>
ANBZ: #80
commit a818f5363a0eba04bcff986c64c919d3f44b8017 upstream
In the page table scanning of auto NUMA balancing, if pte_protnone() is
true, the PTE need not be changed because it is already in the target
state, so any further checking of the corresponding struct page is
unnecessary too.
So, if we check pte_protnone() first for each PTE, we can avoid the
unnecessary struct page access and thereby reduce the cache footprint of
NUMA balancing page table scanning.
In a performance test with the pmbench memory-accessing benchmark, using
an 80:20 read/write ratio and a normal access address distribution on a
2-socket Intel server with Optane DC Persistent Memory, perf profiling
shows that the autonuma page table scanning time is reduced from 1.23%
to 0.97% (that is, a 21% reduction) with this patch.
Link: http://lkml.kernel.org/r/20191101075727.26683-3-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang(a)intel.com>
Acked-by: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Fengguang Wu <fengguang.wu(a)intel.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
---
mm/mprotect.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6525e96..01681e3 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -84,6 +84,10 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
int nid;
bool toptier;
+ /* Avoid TLB flush if possible */
+ if (pte_protnone(oldpte))
+ continue;
+
page = vm_normal_page(vma, addr, oldpte);
if (!page || PageKsm(page))
continue;
@@ -93,10 +97,6 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
page_mapcount(page) != 1)
continue;
- /* Avoid TLB flush if possible */
- if (pte_protnone(oldpte))
- continue;
-
/*
* Don't mess with PTEs if page is already on the node
* a single-threaded process is running on.
--
1.8.3.1
Hi, All,
I have a patch, as below, to reduce TLB shootdown during page promotion.
If it has not been backported yet, you may consider backporting it to the Anolis kernel.
b99a342d4f11a5455d999b12f5fee42ab6acaf8c
Author: Huang Ying <ying.huang(a)intel.com>
AuthorDate: Thu Apr 29 22:57:41 2021 -0700
Commit: Linus Torvalds <torvalds(a)linux-foundation.org>
CommitDate: Fri Apr 30 11:20:39 2021 -0700
NUMA balancing: reduce TLB flush via delaying mapping on hint page fault
Best Regards,
Huang, Ying