We questioned the hardware reality of “2 bits available”. We verified it with axiom_check.c.
alignof(struct address_space) = 8 bytes
alignof(struct anon_vma) = 8 bytes
log2(8) = 3 bits (Bits 0, 1, 2 are ALWAYS zero in a valid pointer)
PAGE_MAPPING_ANON = 0x1 (Bit 0 set)
PAGE_MAPPING_MOVABLE = 0x2 (Bit 1 set)
PAGE_MAPPING_KSM = 0x3 (Bits 0 and 1 set)

| Raw Bits [1:0] | Flag Name | Meaning | Target Struct Type |
|---|---|---|---|
| 00 | (None) | Page Cache | struct address_space * |
| 01 | PAGE_MAPPING_ANON | Anonymous | struct anon_vma * |
| 10 | PAGE_MAPPING_MOVABLE | Movable | struct movable_operations * |
| 11 | PAGE_MAPPING_KSM | KSM | struct ksm_stable_node *? (Check source) |
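
The alignment argument behind the table can be sketched in plain userspace C. This is only an illustration: `struct fake_mapping` is a hypothetical stand-in (the real `struct address_space` is only available inside the kernel), and `spare_low_bits` is a helper name invented here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical stand-in: the real struct address_space lives in kernel
 * headers. Any struct whose strictest member is a pointer gets pointer
 * alignment on a given ABI. */
struct fake_mapping {
    void *host;
    unsigned long flags;
};

/* Number of low address bits that are always zero for a power-of-two
 * alignment, i.e. log2(align). */
static unsigned spare_low_bits(size_t align)
{
    unsigned bits = 0;

    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    while (align > 1) {
        align >>= 1;
        bits++;
    }
    return bits;
}

/* Spare bits for the stand-in struct (3 where pointers are 8-byte aligned). */
static unsigned fake_mapping_spare_bits(void)
{
    return spare_low_bits(alignof(struct fake_mapping));
}
```

With 8-byte alignment this yields 3 spare bits, of which the kernel flags listed above use only the lowest two.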
Goal: Observe the raw page->mapping value changing state depending on how we allocate memory.
mapping_user.c)
We need to generate 2 (or 3) distinct types of pages within one process.
mmap() a file on disk (e.g., create a temp file, write data, map it).
page->mapping lower bits == 00.
page->mapping points to inode->i_mapping.
mmap(MAP_ANONYMOUS) or malloc().
page->mapping lower bits == 01.
page->mapping & ~3 points to a struct anon_vma.
madvise(MADV_MERGEABLE). All 0s is best candidate.
ksmd) must satisfy the merge.

mapping_hw.c)
Logic to decode the field safely:
unsigned long raw_val = (unsigned long)page->mapping;
unsigned long flags = raw_val & 3;
unsigned long ptr = raw_val & ~3UL;
flags:
ptr.
ptr.

CRITICAL: Do NOT dereference the pointer in the module yet. We just want to see the address and bits. Determining if it is a valid pointer is Step 3.
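
The decode logic above can be tried out in userspace as pure integer math, with no dereference anywhere. This is a sketch, not the module itself: `decode_mapping`, `struct decoded_mapping`, and the `TAG_*` names are invented for illustration, and the tag values are simply copied from the flag table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tag values copied from the flag table; names are hypothetical. */
#define TAG_NONE    UINT64_C(0x0)
#define TAG_ANON    UINT64_C(0x1)
#define TAG_MOVABLE UINT64_C(0x2)
#define TAG_KSM     UINT64_C(0x3)
#define TAG_MASK    UINT64_C(0x3)

struct decoded_mapping {
    uint64_t flags;   /* low two bits of the raw value */
    uint64_t ptr;     /* raw value with the tag cleared; never dereferenced */
    const char *kind; /* human-readable page type */
};

static struct decoded_mapping decode_mapping(uint64_t raw)
{
    struct decoded_mapping d;

    d.flags = raw & TAG_MASK;
    d.ptr = raw & ~TAG_MASK;
    switch (d.flags) {
    case TAG_NONE:    d.kind = "page cache"; break;
    case TAG_ANON:    d.kind = "anonymous";  break;
    case TAG_MOVABLE: d.kind = "movable";    break;
    default:          d.kind = "ksm";        break;
    }
    /* Whatever the tag was, the cleared value must be 4-byte aligned. */
    assert((d.ptr & TAG_MASK) == 0);
    return d;
}
```

Feeding it a raw value read from the module's log gives back the tag, the cleared address, and a label, which is enough to fill in the table by hand.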
mapping_user.c to create File + Anon mappings.
mapping_hw.c to inspect them.

1. WHAT: The 3-Bit Gap
0x1000 (4096 / 8 = 512.0) ✓
0x1008 (4104 / 8 = 513.0) ✓
0x1010 (4112 / 8 = 514.0) ✓
0x1001 (4097 / 8 = 512.125) ✗
0x1002 (4098 / 8 = 512.250) ✗
0x1003 (4099 / 8 = 512.375) ✗
0x1007 (4103 / 8 = 512.875) ✗
0x1000 = ...1000000000000 (Ends in 000)
0x1008 = ...1000000001000 (Ends in 000)

2. WHY: Efficiency (Space Saving)
struct page { void *mapping; int is_anon; };
struct page { unsigned long mapping_val; };
mapping_val = (anon_vma_ptr | 1).

3. WHERE: Between the Bits
0xffff8880abcd1000
0x1 (PAGE_MAPPING_ANON)
OR
1111...1000 (Pointer)
| 0000...0001 (Flag)
= 1111...1001 (Stored Value)
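
The OR step shown above, and its inverse, fit in a few lines of userspace C. This is a minimal sketch under the assumption that the pointer is 8-byte aligned; `encode_anon`, `decode_anon`, `is_anon`, and `ANON_FLAG` are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define ANON_FLAG UINT64_C(0x1) /* same value as PAGE_MAPPING_ANON */

/* Store side: OR the flag into an aligned pointer value. */
static uint64_t encode_anon(uint64_t anon_vma_addr)
{
    assert((anon_vma_addr & 0x7) == 0); /* 8-byte aligned, so bit 0 is free */
    return anon_vma_addr | ANON_FLAG;
}

/* Test side: bit 0 set means the stored value carries the anon flag. */
static int is_anon(uint64_t stored)
{
    return (stored & ANON_FLAG) != 0;
}

/* Load side: clear the flag to get the original address back. */
static uint64_t decode_anon(uint64_t stored)
{
    assert(is_anon(stored));
    return stored & ~ANON_FLAG;
}
```

For the example address, `encode_anon(0xffff8880abcd1000)` gives `0xffff8880abcd1001`, and `decode_anon` on that value returns the original address.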
4. WHO: The Kernel (mm/rmap.c)
page->mapping = (struct address_space *) ((unsigned long)anon_vma | PAGE_MAPPING_ANON);
struct anon_vma *av = (struct anon_vma *) ((unsigned long)page->mapping & ~PAGE_MAPPING_ANON);

5. WHEN: Page Fault Time
mmap(MAP_ANONYMOUS). VMA created.
anon_vma.
page->mapping.

6. WITHOUT: Crash
page->mapping->host directly on an ANON page:
0xffff8880abcd1001
1001 to 1009. Garbage data.

7. WHICH: The Least Significant Bit (LSB)
PID: 219766
ANONYMOUS VA: 0x746d87c00000
RAW MAPPING VALUE: 0xffff888102a9cd21
DECODING STEP-BY-STEP:
1. Value = 0xffff888102a9cd21
Binary (last 4 bits): 0001
2. Extract Flags:
Mask: 0x3 (0011)
Calculation: 0001 & 0011 = 0001
Result: 1 (PAGE_MAPPING_ANON)
Meaning: This page is backed by anon_vma, NOT a file.
3. Extract Pointer:
Mask: ~0x3 (1100)
Calculation: 0001 & 1100 = 0000
Last 4 bits become: ...d20 (was ...d21)
Full Pointer: 0xffff888102a9cd20
4. Verification:
Is 0xffff888102a9cd20 divisible by 8?
0x...20 = 32 (decimal). 32 / 8 = 4.
YES. It is a valid aligned struct pointer.
AXIOM 1: Computers address memory by Bytes.
AXIOM 2: A Pointer is a Number representing a Byte Address (0, 1, 2...).
AXIOM 3: The CPU fetches Data in Chunks, not Bytes (Architecture Dependent).
AXIOM 4: x86_64 CPU fetches 64-bit Words (8 Bytes) at a time.
AXIOM 5: For Efficiency (Alignment), 8-Byte Data must start at addresses divisible by 8.
struct page pointer.
sizeof(struct page) = 64 bytes (on typical build).
alignof(struct page) = 8 bytes (Word Size).
A must satisfy A % 8 == 0.
8 (decimal) = 1000 (binary).
0, 8, 16, 24...
0000, 1000, 10000, 11000…
000.
struct page, bits [2:0] are mathematically guaranteed to be ZERO.
P = P_real + 0.
P | 7 changes bits [2:0] without destroying P_real (if we mask later).
P (e.g., 0x1000). Binary: ...10000_000.
F (e.g., 0x1). Binary: 001.
F < 8 (must fit in 3 bits).
E = P | F.
...000 OR ...001 = ...001.
E = 0x1001.
E is NOT a valid pointer (Odd address).
E around as a value (unsigned long), but CANNOT dereference it directly.
E = 0x1001.
F.
M_flag = 7 (Binary 111).
E & M_flag.
0x1001 & 0x7 -> 0001 & 0111 -> 0001 (1, Correct).
P.
M_ptr = ~7 (Binary ...11111000).
E & M_ptr.
0x1001 & ~0x7 -> 0x1001 & ...111000.
  ...10000_001
& ...11111_000
= ...10000_000 (0x1000, Correct)
page->mapping.
struct address_space (Aligned 8).
bool is_anon adds 4-8 bytes to struct page.
page->mapping ends in 00 -> It is a Pointer.
page->mapping ends in 01 -> It is an Encoded Value (Anon Ptr | 1).
page->mapping ends in 10 -> It is an Encoded Value (Movable Ptr | 2).
page->mapping ends in 11 -> It is an Encoded Value (KSM Ptr | 3).
align(4) or align(8) objects have 2 or 3 spare bits.
struct address_space * (Tag: 00).
struct anon_vma * (Tag: 01).

Goal: Calculate P | F and (P|F) & ~M.
| Reference P | Alignment | Valid? | Flag F | Encoded E | Decoded P' |
|---|---|---|---|---|---|
| 0x100 | 8 | ✓ | 1 | 0x101 | 0x100 |
| 0x104 | 8 | ✗ | 1 | 0x105 | 0x100 (Wrong!) |
| 0x2000 | 4 | ✓ | 2 | 0x2002 | 0x2000 |
| 0x2000 | 4 | ✓ | 3 | 0x2003 | 0x2000 |
| 0xFFFF | 2 | ✗ (Odd) | 1 | 0xFFFF | 0xFFFE (Wrong!) |
| 0x80 | 16 | ✓ | 7 | 0x87 | 0x80 |
| 0x80 | 16 | ✓ | 8 | 0x88 | COLLISION! |
Surprise in Case 2: If the original pointer wasn’t aligned to the mask size, decoding destroys data (0x104 -> 0x100).
Surprise in Case 7: If the Flag does not fit under the decode mask (F = 8 with M = 7), it lands on one of the pointer’s own address bits. 0x80 | 8 = 0x88. Decoding 0x88 & ~7 = 0x88. The flag has merged into the address! (0x80 is 16-byte aligned, so a 4-bit mask, ~0xF, would still recover it; the 3-bit mask cannot.)
Therefore: the Flag MUST fit inside the Mask, and the Mask MUST be smaller than the Alignment (F <= M < Align).
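
The rows of the table can be checked mechanically. The sketch below uses hypothetical helper names (`tag_is_safe`, `tag_encode`, `tag_decode`) and takes the decode mask as an explicit parameter, so both failure modes (a misaligned pointer, and a flag that spills past the mask) show up as a failed precondition rather than as silent corruption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* mask must be (power of two) - 1, e.g. 0x1, 0x3, 0x7 or 0xF. */
static bool tag_is_safe(uint64_t ptr, uint64_t mask, uint64_t flag)
{
    assert((mask & (mask + 1)) == 0);

    bool ptr_bits_free = (ptr & mask) == 0;   /* pointer leaves masked bits zero */
    bool flag_fits     = (flag & ~mask) == 0; /* flag stays inside masked bits */
    return ptr_bits_free && flag_fits;
}

/* E = P | F */
static uint64_t tag_encode(uint64_t ptr, uint64_t flag)
{
    return ptr | flag;
}

/* P' = E & ~M */
static uint64_t tag_decode(uint64_t encoded, uint64_t mask)
{
    return encoded & ~mask;
}
```

A round trip is only guaranteed to return the original pointer when `tag_is_safe` holds; for example `tag_is_safe(0x80, 0x7, 8)` is false, while `tag_is_safe(0x80, 0xF, 8)` is true.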
AXIOM 1: A Theory is valid if and only if Real Data matches Prediction.
AXIOM 2: dmesg implies Kernel Truth.
open("temp_map_file", ...) -> Returns FD 3.
write(3, "F"..., 4096) -> Extends file to 4096 bytes.
mmap(..., MAP_SHARED, fd=3)
page->mapping lower bits must be 00.
mmap(..., MAP_ANONYMOUS|MAP_PRIVATE, ...)
anon_vma.
page->mapping lower bits must be 01.

We ran the driver against PID 219766 (captured live).
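
The two userspace setup steps (a file-backed MAP_SHARED page and an anonymous MAP_PRIVATE page) could look roughly like this. It is a sketch, not the actual mapping_user.c: the helper names `map_file_page` and `map_anon_page` and the fixed 4096-byte page size are assumptions made here.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define PAGE_SZ 4096 /* assumed page size */

/* File-backed page: write 4096 bytes of 'F' to a file, then map it shared. */
static void *map_file_page(const char *path)
{
    char buf[PAGE_SZ];
    void *p;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return MAP_FAILED;
    memset(buf, 'F', sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        close(fd);
        return MAP_FAILED;
    }
    p = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping itself keeps the file referenced */
    return p;
}

/* Anonymous page: the first write forces a page fault that allocates it. */
static void *map_anon_page(void)
{
    char *p = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p != MAP_FAILED)
        p[0] = 'A';
    return p;
}
```

A caller would print `getpid()` and both returned addresses, then pause, so the addresses can be handed to the inspecting module while the mappings are still alive.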
TRACE A: FILE PAGE
0x746d87e07000.
get_user_pages(...) -> Returns struct page *.
0xffff88810672e380.
0x...380 AND 3.
...10000000 & 00000011.
0.
0x...380 AND ~3.
0xffff88810672e380 (Unchanged).
0x...380 % 8 == 0. Aligned ✓.
struct address_space.

TRACE B: ANON PAGE AXIOMATIC RE-DERIVATION
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0).
MAP_ANONYMOUS is defined as 0x20 (32) in include/uapi/asm-generic/mman-common.h.
sys_mmap sees flags & MAP_ANONYMOUS is True (0x22 & 0x20 = 0x20).
mm/mmap.c -> vma->vm_ops = NULL (No file operations).
anon_ptr[0] = 'A'.
vma->vm_ops. It is NULL.
anon_vma_prepare to allocate struct anon_vma.
struct anon_vma *av = 0xffff888102a9cd20.
0xffff888102a9cd20 % 8 = 0. Last 3 bits are 000.
PAGE_MAPPING_ANON is defined as 0x1 in include/linux/page-flags.h.
stored_value = (unsigned long)av | PAGE_MAPPING_ANON.
| *Check: New Calc? OR operation. 0 | 1 = 1.* |
...00100000 | ...00000001 = ...00100001.
0xffff888102a9cd21 (This is the value we read in the driver).
value & 3 -> ...01 & ...11 -> 1.
value & ~3 -> ...01 & ...00 -> ...00.
...21 inherently proves the page is Anonymous because the only way bit 0 becomes 1 is via this explicit OR operation in the kernel source.
mapping field is simply Integer Math taking advantage of Hardware Alignment rules.

AXIOM 1: A CPU runs Instructions.
AXIOM 2: A “Process” is a container for Instructions + Memory Maps (struct mm_struct).
AXIOM 3: The Kernel must run its own instructions (to clean memory, manage disks, merge pages).
AXIOM 4: struct task_struct is the kernel definition of ANY schedulable entity (Process or Thread).
task->mm.
task->mm != NULL (Has valid userspace addresses 0x0…0x7fff…).
task_struct where task->mm == NULL.
ksmd access User Page X?
0xffff...).
0x7f...) -> HW Page Table -> Phys Frame P.
0xff...) -> Offset Math -> Phys Frame P.
ksmd accesses the Physical Data via its own Window, ignoring the User’s Window.
ksmd is a Kernel Thread created at boot (mm/ksm.c).
run=1).
madvise).
struct page metadata (the mapping field) to reflect this new “Shared” reality (PAGE_MAPPING_KSM).

AXIOM 1: Virtual Memory is an Array of Pages indexed by VPN (Virtual Page Number).
AXIOM 2: Physical Memory is an Array of Frames indexed by PFN (Physical Frame Number).
AXIOM 3: Hardware stores mapping in a Hierarchical Tree (PGD -> PUD -> PMD -> PTE).
AXIOM 4: Linux exposes this as a Flattened Linear Array in /proc/PID/pagemap.
PFN[Index].
read():
I.
VA = I * 4096.
CR3 -> PGD Entry.
PGD -> PUD Entry.
PUD -> PMD Entry.
PMD -> PTE Entry (The Bottom Level).
PTE.
VA (e.g., 4096).
SZ = 4096 bytes.
VPN = VA / SZ.
ES = 8 bytes (64 bits).
File_Offset = VPN * ES.
E read from file.
E & (1ULL << 63).
0x007FFFFFFFFFFFFF (55 bits).
PFN = E & Mask.
VA1 and VA2 (Different Virtual Addresses).
PFN1 and PFN2 using steps above.
PFN1 != PFN2: They point to different RAM chunks.
PFN1 == PFN2: They point to the SAME RAM chunk.
mmap(SHARED), the only way PFNs match is if the Kernel Deduplicated them (KSM).
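
The offset and PFN arithmetic above reduces to a handful of pure functions. This is a sketch assuming 4096-byte pages and 8-byte entries as stated; the helper names (`pagemap_offset`, `entry_present`, `entry_pfn`, `same_frame`) are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES  UINT64_C(4096)            /* SZ */
#define PAGEMAP_ENTRY_SZ UINT64_C(8)               /* ES */
#define ENTRY_PRESENT    (UINT64_C(1) << 63)       /* bit 63 */
#define ENTRY_PFN_MASK   ((UINT64_C(1) << 55) - 1) /* bits 0..54 */

/* File_Offset = VPN * ES, where VPN = VA / SZ. */
static uint64_t pagemap_offset(uint64_t va)
{
    return (va / PAGE_SIZE_BYTES) * PAGEMAP_ENTRY_SZ;
}

static bool entry_present(uint64_t e)
{
    return (e & ENTRY_PRESENT) != 0;
}

/* Only meaningful when the entry is present. */
static uint64_t entry_pfn(uint64_t e)
{
    assert(entry_present(e));
    return e & ENTRY_PFN_MASK;
}

/* Two virtual addresses share a frame when both are present and PFNs match. */
static bool same_frame(uint64_t e1, uint64_t e2)
{
    return entry_present(e1) && entry_present(e2) &&
           entry_pfn(e1) == entry_pfn(e2);
}
```

In a real reader, each entry would come from a `pread()` of 8 bytes at `pagemap_offset(va)` on `/proc/PID/pagemap`. On recent kernels the PFN field reads as zero without `CAP_SYS_ADMIN`, so the comparison is only meaningful when run with sufficient privilege.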
AXIOM 1: The Pagemap Entry is strictly 64 bits (8 bytes). * Check: New Axiom? API Standard.
AXIOM 2: The Kernel needs to store Metadata Flags AND the Address in these 64 bits. * Check: New Axiom? Design Requirement.
AXIOM 3: The Flags are stored in the High Bits (Top Down). * Check: New Axiom? API Standard.
CALCULATION OF RESERVED BITS:
PM_PRESENT).
PM_SWAP).
PM_FILE).
PM_SOFT_DIRTY - since 3.11).
PM_UFFD_WP).
PM_MMAP_EXCLUSIVE).
PM_SOFT_DIRTY).

SUBTRACTION:
64 - 9 = 55 bits.
VERIFICATION:
0x7FFFFFFFFFFFFF (55 PFN bits) with 4096-byte frames spans 2^67 bytes, which exceeds the physical address width of current architectures (x86_64 tops out at 52 bits).
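
The bit budget can be written down as constants and checked. The sketch below assumes the bit positions given in the kernel's pagemap documentation (55 soft-dirty, 56 exclusively mapped, 57 uffd-wp, 61 file-page or shared-anon, 62 swapped, 63 present); the `PM_BIT_*` and `pm_*` names are hypothetical and not the kernel's own macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PM_FLAG_BITS 9u                   /* bits 63..55 are not PFN */
#define PM_PFN_BITS  (64u - PM_FLAG_BITS) /* 55 */

/* Assumed bit positions, per the pagemap documentation. */
#define PM_BIT_SOFT_DIRTY     55u
#define PM_BIT_MMAP_EXCLUSIVE 56u
#define PM_BIT_UFFD_WP        57u
#define PM_BIT_FILE           61u
#define PM_BIT_SWAP           62u
#define PM_BIT_PRESENT        63u

/* Mask covering exactly the PFN field. */
static uint64_t pm_pfn_mask(void)
{
    return (UINT64_C(1) << PM_PFN_BITS) - 1;
}

/* Test a single flag bit of a pagemap entry. */
static bool pm_test(uint64_t entry, unsigned bit)
{
    assert(bit < 64);
    return ((entry >> bit) & 1) != 0;
}

/* log2 of the bytes reachable through the PFN field for a given page shift. */
static unsigned pm_span_log2(unsigned page_shift)
{
    return PM_PFN_BITS + page_shift;
}
```

With a page shift of 12, `pm_span_log2` returns 67, and `pm_pfn_mask()` equals `0x7FFFFFFFFFFFFF`, matching the subtraction above.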
TRACE C: KSM PAGE
VA_1 with content 0xCC (All bytes).
VA_2 with content 0xCC (All bytes).
madvise(VA_1, MADV_MERGEABLE) ∧ madvise(VA_2, MADV_MERGEABLE).
sleep(5) → ksmd thread wakes up.
ksmd hashes VA_1 → Hash_X.
ksmd hashes VA_2 → Hash_X.
Hash_X == Hash_X → Deduplication Triggered.
VA_2 to Page 1 (Physical).
page->mapping.
Old: struct anon_vma * | 0x1.
struct ksm_stable_node *.
PAGE_MAPPING_KSM (0x3).
VA_1 = 0x7744f147f000.
0xffff8cf5c4652483 (from dmesg).
0x...483 & 3 → ...0011 & 0011 → 3.
3 == PAGE_MAPPING_KSM (0x3) ✓.
0x...483 & ~3 → ...0011 & ...1100 → ...0000.
0xffff8cf5c4652480.
0x...480 % 8 == 0 ✓.

FINAL: ALL STATES VERIFIED
ERROR 1: CONFUSION OF LAYERS (HW vs SW)
ERROR 2: INVISIBLE MECHANISMS (KSM)
ksmd).
ksmd acts while User sleeps.

ERROR 3: EXECUTION BLINDNESS (IO BUFFERS)
grep buffering.
stdbuf -o0 or fflush).

ERROR 4: SOURCE FALLACY
linux-headers package contains .c implementation files.
ls) before asserting availability.

ERROR 5: BINARY ENDING CONFUSION