mm: sparsemem: use page table lock to protect kernel pmd operations

Linux Kernel - Muchun Song [bytedance.com] - 22 March 2022 22:57 UTC

The init_mm.page_table_lock protects kernel page tables, so we can use it, instead of the mmap write lock, to serialize splitting of vmemmap PMD mappings. This increases the concurrency of vmemmap_remap_free().

In practice, this increases the concurrency between allocations of HugeTLB pages, but that is not the only benefit. There are many users of the mmap read lock of init_mm. The mmap write lock is currently held throughout vmemmap_remap_free(); removing that usage means it no longer affects the other users of the mmap read lock. The change makes nothing worse and is always a win.
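The locking pattern described above can be illustrated with a small userspace sketch. It is not the kernel code from this patch: struct toy_pmd, toy_split_huge_pmd() and the C11 spinlock standing in for init_mm.page_table_lock are all hypothetical names made up for illustration. The idea shown is that the new PTE table is prepared outside the lock, and the lock only covers the check that the entry is still a huge mapping plus the install, so concurrent splitters serialize on a short critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PTRS_PER_TABLE 512 /* hypothetical: entries in one PTE table */

/* Toy stand-in for init_mm.page_table_lock, built on a C11 atomic flag. */
static atomic_flag toy_page_table_lock = ATOMIC_FLAG_INIT;

static void toy_table_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&toy_page_table_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void toy_table_unlock(void)
{
	atomic_flag_clear_explicit(&toy_page_table_lock, memory_order_release);
}

/* Toy PMD entry: either a huge (leaf) mapping or a pointer to a PTE table. */
struct toy_pmd {
	bool leaf;
	unsigned long base;  /* first frame of the huge mapping, never changes */
	unsigned long *ptes; /* valid only once the entry has been split */
};

/*
 * Split a huge entry into a table of small entries.
 * Returns 1 if this caller did the split, 0 if the entry was already
 * split by someone else, and -1 if the table could not be allocated.
 */
static int toy_split_huge_pmd(struct toy_pmd *pmd)
{
	unsigned long *table;
	int did_split = 0;
	size_t i;

	assert(pmd != NULL);

	/* Allocate and fill the new table without holding the lock. */
	table = malloc(TOY_PTRS_PER_TABLE * sizeof(*table));
	if (!table)
		return -1;
	for (i = 0; i < TOY_PTRS_PER_TABLE; i++)
		table[i] = pmd->base + i;

	/* Only the re-check and the install are serialized. */
	toy_table_lock();
	if (pmd->leaf) {
		pmd->ptes = table;
		pmd->leaf = false;
		did_split = 1;
	}
	toy_table_unlock();

	/* Another caller won the race: drop the unused table. */
	if (!did_split)
		free(table);
	return did_split;
}
```

Calling toy_split_huge_pmd() twice on the same entry performs the split once; the second call finds the entry already split under the lock and frees its own table.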

Currently, the kernel page table walker does not hold the page_table_lock when walking pmd entries. This may cause a consistency issue for a pmd entry, because the entry might change from a huge pmd entry to a PTE page table. There is only one user of the kernel page table walker, namely ptdump. ptdump already accounts for this by using a local variable to cache the value of the pmd entry. But we also need to set ->action to ACTION_CONTINUE to make sure the walker does not walk every pte entry again when a concurrent thread has split the huge pmd.
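The walker-side reasoning can be sketched in userspace as follows. This is a simplified model, not the ptdump code: the toy_ names, the leaf bit encoding and the counters are assumptions made for illustration. The callback reads the entry once into a local variable and decides on that snapshot; when the snapshot is a huge entry it sets the action to "continue", so the walker does not descend even if the live entry has meanwhile become a PTE table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical analogues of the walker actions named in the text. */
enum toy_action { TOY_ACTION_SUBTREE, TOY_ACTION_CONTINUE };

#define TOY_LEAF_BIT ((uintptr_t)1) /* hypothetical: low bit marks a huge entry */

struct toy_walk {
	enum toy_action action;
	int leaves_noted;     /* huge entries reported by the callback */
	int tables_descended; /* times the walker went down to the PTE level */
};

/* Callback: take one snapshot of the entry and act only on that value. */
static void toy_pmd_entry(_Atomic uintptr_t *pmd, struct toy_walk *walk)
{
	uintptr_t val = atomic_load_explicit(pmd, memory_order_relaxed);

	if (val & TOY_LEAF_BIT) {
		walk->leaves_noted++;
		/* Already reported as one huge mapping: skip the PTE level. */
		walk->action = TOY_ACTION_CONTINUE;
	}
}

/* Walker step for a single entry. */
static void toy_walk_pmd(_Atomic uintptr_t *pmd, struct toy_walk *walk)
{
	walk->action = TOY_ACTION_SUBTREE;
	toy_pmd_entry(pmd, walk);

	/* Honor the callback's decision rather than re-reading the entry. */
	if (walk->action == TOY_ACTION_CONTINUE)
		return;

	walk->tables_descended++; /* stands in for walking each pte entry */
}
```

Without the explicit action, a walker that re-reads the entry after the callback could see the freshly installed PTE table and report the same range a second time, entry by entry.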

Link: https://lkml.kernel.org/r/20211101031651.75851-4-songmuchun@bytedance.com

d8d55f5616cf mm: sparsemem: use page table lock to protect kernel pmd operations
mm/ptdump.c | 16 ++++++++++++----
mm/sparse-vmemmap.c | 47 +++++++++++++++++++++++++++++++----------------
2 files changed, 43 insertions(+), 20 deletions(-)

Upstream: git.kernel.org