Hi, Zhongjiang,
zhongjiang-ali <zhongjiang-ali(a)linux.alibaba.com> writes:
ANBZ: #80
When sysctl_numa_balancing_mode is set to NUMA_BALANCING_MEMORY_TIERING,
memory can be migrated between fast and slow nodes, and pages in slow
memory reuse the cpupid field. This causes a problem when
sysctl_numa_balancing_mode is turned off dynamically:
should_numa_migrate_memory() may be deciding whether a slow-memory page
should be migrated to fast memory at the same moment that
NUMA_BALANCING_MEMORY_TIERING is turned off. It then fails to obtain the
correct node from the cpupid field of the slow-memory page, which
triggers a panic.
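As a rough illustration of the failure mode described above, here is a
minimal user-space sketch. It is not kernel code: every toy_* name, the
bit layout, and the node topology are assumptions made up for the
example. It only shows why a reused field decodes to a bogus node id
once it is read as a cpupid, and how a check shaped like the one in the
patch skips slow-node pages when tiering mode is off.

```c
#include <assert.h>
#include <stdbool.h>

/* Every name here is a hypothetical stand-in, not a kernel definition. */
#define TOY_MODE_MEMORY_TIERING 0x2u /* plays the role of NUMA_BALANCING_MEMORY_TIERING */
#define TOY_NR_NODES 4               /* toy system: nodes 0..3 */
#define TOY_PID_BITS 8               /* assumed layout: (nid << 8) | pid bits */

static unsigned int toy_balancing_mode = TOY_MODE_MEMORY_TIERING;

/* Toy topology: nodes 0 and 1 are fast (top tier), 2 and 3 are slow. */
static bool toy_node_is_toptier(int nid)
{
	return nid < 2;
}

/* Decode a node id, trusting that the field really holds a cpupid. */
static int toy_cpupid_to_nid(unsigned int field)
{
	return (int)(field >> TOY_PID_BITS);
}

/*
 * Mirrors only the shape of the check added by the patch: with tiering
 * off, a page on a slow node is rejected before its field is decoded.
 */
static bool toy_guard_passes(int src_nid)
{
	if (!(toy_balancing_mode & TOY_MODE_MEMORY_TIERING) &&
	    !toy_node_is_toptier(src_nid))
		return false;
	return true;
}

static void toy_demo(void)
{
	unsigned int stale = 0x12345u; /* non-cpupid data left in a slow page */

	/* Decoded blindly, the stale value names a node that does not exist. */
	assert(toy_cpupid_to_nid(stale) >= TOY_NR_NODES);

	/* Tiering switched off at run time: slow node 2 is rejected early. */
	toy_balancing_mode = 0;
	assert(!toy_guard_passes(2));
	assert(toy_guard_passes(0));
}
```

Calling toy_demo() from a test harness exercises both halves: the
out-of-range decode and the early return for slow nodes. Whether the
real field contents decode the same way depends on the actual kernel
layout, which this sketch does not model.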
Thanks for catching this! Can you share the panic kernel log?
Best Regards,
Huang, Ying
Signed-off-by: zhongjiang-ali <zhongjiang-ali(a)linux.alibaba.com>
---
kernel/sched/fair.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0184145..6afa935 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3016,6 +3016,14 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
last_cpupid = page_cpupid_xchg_last(page, this_cpupid);
/*
+ * Migration between fast and slow memory nodes is turned off when
+ * sysctl_numa_balancing_mode disables the feature dynamically.
+ */
+ if (!(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
+ !node_is_toptier(src_nid))
+ return false;
+
+ /*
* Allow first faults or private faults to migrate immediately early in
* the lifetime of a task. The magic number 4 is based on waiting for
* two full passes of the "multi-stage node selection" test that is