Chapter 2: Memory Foundational Concepts
Starting to feel worn out from theory? Good, because this chapter is the most theory-heavy one yet. I know, I know, you’re eager to dive into code. Honestly, I debated putting this chapter later in the docs, but it makes more sense to establish these foundations before moving forward. The good news? This will be the last theory-dense chapter (at least for now). The better news? If there’s one chapter you really shouldn’t skip, it’s this one.
In the last chapter, we covered the importance of having a solid build pipeline for our kernel, along with how to structure the project. With that foundation in place, we can now shift our focus to kernel development without extra distractions.
In this chapter, we’ll dive into Virtual and Physical Memory, Paging, the Higher Half Kernel design, and a few other essential concepts. These are just as important as the build system — we need to understand them early on so we can structure the project correctly from the start.
The last thing we want is to set everything up in a way that forces a complete rewrite later (I'm looking at you, u/BillyZeim).
When we talk about physical memory, we are referring to the actual hardware component of a computer system used for temporary data storage: Random Access Memory (RAM). These are the physical chips installed on the motherboard or memory modules (DIMMs) that store data and machine code currently being used by the CPU.
- Hardware-Based: It is a tangible resource with a finite capacity (e.g., 4 GB, 16 GB, 64 GB). Once all physical memory is occupied, the system must resort to other mechanisms, like swapping, which are significantly slower.
- Direct CPU Access: The CPU can read from and write to physical memory addresses directly via the memory bus. This is the fastest form of storage available to the CPU, aside from its internal caches (L1, L2, L3).
- Volatile: The contents of physical memory are lost when the system loses power.
- Flat Address Space: Physical memory is organized as a contiguous, linear array of bytes. Each byte has a unique physical address, starting from `0` and extending to the top of the installed RAM.
When a program runs, the CPU needs to fetch instructions and data. It does this by issuing requests for specific memory addresses. In a simple system without memory management, the address generated by the CPU (a physical address) is sent directly over the address bus to the RAM controller, which then accesses the corresponding location.
For example, if the CPU instructs, "read the byte at address 0x1000," the memory controller will physically access the byte at offset 4096 in the RAM chips (0x1000 is 4096 in decimal). Because addresses start at 0, that is the 4097th byte.
Important
This is a simplified explanation. In reality, modern systems are more complex:
- Caching: The CPU first checks its high-speed caches. If the data is found there (a cache hit), it avoids a slower access to main physical memory.
- Memory Management Unit (MMU): The CPU does not output physical addresses directly. Instead, it outputs virtual addresses*. A hardware component called the MMU translates these virtual addresses into physical addresses on-the-fly. This is the foundation of modern memory management.
- Burst Transfers: Modern memory systems transfer data in blocks (bursts) rather than single bytes for efficiency.
*We will talk about virtual addresses in depth later on in this document.
Physical memory is the fundamental pool of fast storage, but using it directly for every running program is inefficient and insecure. This led to the development of memory management schemes, starting with Segmentation.
Segmentation was an early memory management technique that provided a way to isolate processes and organize memory in a logical way. Instead of viewing memory as a single, flat array, segmentation divides it into variable-sized blocks called segments.
Each segment is designed to hold a logical unit of a program, such as:
| Segment Type | Purpose | Typical Contents |
|---|---|---|
| Code Segment (.text) | Stores executable instructions | Program instructions |
| Data Segment (.data) | Stores initialized global and static variables | Global/static data |
| Stack Segment (.stack) | Stores the function call stack, local variables, return addresses | Call stack, local variables, return addresses |
| Heap Segment (.heap) | Stores dynamically allocated memory | Memory allocated via malloc or similar |
- Segment Registers: The CPU contains special registers (e.g., `CS` for Code Segment, `DS` for Data Segment, `SS` for Stack Segment) that hold the base address (the starting location) of a segment in physical memory.
- Logical Addressing: A program's address (a logical address) is composed of two parts: a segment selector and an offset.
  `Logical Address = <Segment Selector, Offset>`
- Translation to Physical Address: The CPU translates the logical address into a physical address by taking the segment's base address and adding the offset.
  `Physical Address = Segment Base Address + Offset`
For example, if the Code Segment register (CS) points to base address 0x4000 and the instruction pointer (IP) is at offset 0x0100, the CPU accesses physical address 0x4000 + 0x0100 = 0x4100.
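The base-plus-offset rule is small enough to model directly. Here is a minimal Python sketch of it, reusing the numbers from the example above. The limit check and the `SegmentFault` name are assumptions added for illustration; they model the idea that a segment has a size, not any specific CPU behavior.

```python
class SegmentFault(Exception):
    """Raised when an offset falls outside the segment (hypothetical name)."""


def translate_segmented(base: int, limit: int, offset: int) -> int:
    """Return base + offset, rejecting offsets beyond the segment's limit."""
    if offset > limit:
        raise SegmentFault(f"offset {offset:#x} exceeds limit {limit:#x}")
    return base + offset


# CS base 0x4000, IP offset 0x0100; the limit 0xFFFF is an assumed value.
physical = translate_segmented(base=0x4000, limit=0xFFFF, offset=0x0100)
print(hex(physical))  # 0x4100
```

An offset past the limit raises instead of silently reaching into a neighboring segment, which is the protection property discussed below.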
In practice, segmentation is implemented through the Global Descriptor Table (GDT).
While this was the prevalent memory management scheme for a while, it had many drawbacks. Here's a detailed summary:
Advantages:
- Logical Organization: Memory is organized in a way that matches the program's structure, making it easier for programmers and compilers to manage.
- Protection: Segments can be assigned permissions (e.g., code segment is execute-only, data segment is read-write). An attempt to write to a code segment would generate a protection fault, enhancing security and stability.
- Isolation: Different processes can have their own segments, preventing one program from interfering with another.
Critical Limitations:
- External Fragmentation: Because segments are variable-sized, free memory becomes fragmented into many small, non-contiguous blocks over time. It becomes impossible to load a large segment, even if the total free memory is sufficient, because no single free block is large enough. This requires complex and slow memory compaction algorithms.
- Inefficient Memory Use: A segment must be loaded contiguously into physical memory as a single unit, which is a strict requirement that leads to the fragmentation problem.
The problem of external fragmentation was a major driver for the development of a more flexible memory management scheme: Paging.
The Global Descriptor Table (GDT) is the fundamental data structure that makes protected-mode segmentation possible on x86 architectures. It is essentially an array of segment descriptors that the CPU uses to define the characteristics of every segment in the system.
A segment descriptor is an 8-byte data structure that describes a single segment. It contains all the information the CPU needs to manage and control access to that segment. The key fields in a descriptor include:
- Base Address: The 32-bit linear starting address of the segment in memory.
- Segment Limit: The 20-bit size of the segment (either in bytes or in 4KB pages).
- Type: What kind of segment it is (e.g., Code, Data, Stack) and its access permissions (e.g., Read/Write, Execute-Only).
- Privilege Level (DPL): The Descriptor Privilege Level (0-3), where 0 is the most privileged (kernel) and 3 is the least privileged (user applications). This determines which privilege levels can access this segment.
- Segment Present Flag: Indicates if the segment is currently loaded in memory.
- Granularity Flag: If set, the segment limit is interpreted in 4KB units instead of bytes, allowing segments to be up to 4GB in size.
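To make the 8-byte layout concrete, here is a hedged Python sketch that packs these fields into a single 64-bit value. The bit positions follow the x86 descriptor format, in which the base and limit are split across several non-adjacent bit ranges. The function name and the example access/flag values (`0x9A`, `0x92`, `0xC`) are illustrative choices for flat ring-0 segments, not something this project prescribes.

```python
def encode_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack base, limit, access byte and flags nibble into a 64-bit descriptor."""
    assert 0 <= base <= 0xFFFFFFFF, "base is 32 bits"
    assert 0 <= limit <= 0xFFFFF, "limit is 20 bits"
    desc = limit & 0xFFFF                  # limit bits 0-15
    desc |= (base & 0xFFFFFF) << 16        # base bits 0-23
    desc |= (access & 0xFF) << 40          # present flag, DPL, type
    desc |= ((limit >> 16) & 0xF) << 48    # limit bits 16-19
    desc |= (flags & 0xF) << 52            # granularity and size flags
    desc |= ((base >> 24) & 0xFF) << 56    # base bits 24-31
    return desc


# Flat segments: base 0, limit 0xFFFFF, granularity flag set (4 KB units).
KERNEL_CODE = encode_descriptor(0, 0xFFFFF, access=0x9A, flags=0xC)
KERNEL_DATA = encode_descriptor(0, 0xFFFFF, access=0x92, flags=0xC)
print(f"{KERNEL_CODE:#018x}")  # 0x00cf9a000000ffff
```

With the granularity flag set, the 20-bit limit `0xFFFFF` counts 4 KB units, which is how a single descriptor can span the full 4 GB.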
- System Setup: The operating system kernel, during boot-up, creates the GDT in memory and loads its address and size into the special GDTR (GDT Register) using the `LGDT` assembly instruction.
- Segment Registers as Selectors: In protected mode, the segment registers (`CS`, `DS`, `SS`, etc.) no longer hold a base address directly. Instead, they hold a segment selector.
- The Selector's Role: A selector is a 16-bit value that acts as an index into the GDT.
  - Index (Bits 3-15): Specifies which entry (descriptor) in the GDT to use.
  - Table Indicator (TI, Bit 2): If 0, use the GDT. If 1, use a Local Descriptor Table (LDT) for per-process segments.
  - Requested Privilege Level (RPL, Bits 0-1): The privilege level of the current operation.
- Address Translation via the GDT:
  - The CPU takes the selector from a segment register (e.g., `CS`).
  - It uses the `TI` bit to choose the GDT (or LDT).
  - It uses the `Index` to find the corresponding 8-byte segment descriptor in the GDT.
  - It performs security checks (e.g., is the segment present? does the current privilege level have access?).
  - If checks pass, the CPU loads the base address and limit from the descriptor into an invisible, internal cache associated with the segment register.
  - For subsequent memory accesses, the CPU uses this cached base address, adding the instruction's offset to generate a linear address.
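The selector layout described above can be decoded with a few shifts and masks. This is a small Python sketch; the `Selector` type and function names are made up for illustration.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # bits 3-15: which descriptor to use
    ti: int     # bit 2: 0 = GDT, 1 = LDT
    rpl: int    # bits 0-1: requested privilege level


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    return Selector(index=(value >> 3) & 0x1FFF,
                    ti=(value >> 2) & 0x1,
                    rpl=value & 0x3)


def descriptor_offset(value: int) -> int:
    """Byte offset of the selected descriptor; each descriptor is 8 bytes."""
    return decode_selector(value).index * 8


print(decode_selector(0x08))  # Selector(index=1, ti=0, rpl=0)
print(decode_selector(0x1B))  # Selector(index=3, ti=0, rpl=3)
```

Because the low three bits hold TI and RPL, a ring-0 GDT selector is simply the byte offset of its descriptor: `0x08` is entry 1, `0x10` is entry 2, and so on.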
- Centralized Management: The GDT provides the OS with a single, secure table to define all memory segments available to the system.
- Hardware Enforcement: The CPU hardware enforces the protection rules defined by the OS on every memory access, using the descriptor it cached when the segment register was loaded. This prevents user applications from accessing kernel memory or executing data as code.
- Privilege Separation: The DPL field is crucial for implementing the separation between kernel mode (ring 0) and user mode (ring 3).
Important
In modern OSes that use paging as the primary memory management mechanism, the GDT is still present but is often used in a "minimal" or "flat" configuration. Segments are configured to span the entire 4GB (or more) address space with a base of 0, effectively bypassing segmentation's memory splitting function.
However, the GDT's role in enforcing privilege levels (kernel vs. user) remains critical for system security and stability.
Paging is a memory management scheme that fundamentally decouples a program's view of memory from the actual physical hardware (RAM). Its primary goal is to solve the critical flaw of segmentation: external fragmentation.
Instead of requiring that a program's code and data be stored in contiguous blocks of physical RAM (as segmentation does), paging introduces a powerful layer of indirection.
- Divide Memory into Fixed-Size Blocks:
  - Physical Memory is divided into equal-sized chunks called frames.
  - A program's address space is divided into chunks of the same size, called pages.
- The Indirection Map:
  - The kernel manages a page table for each process. This table is a map that translates the program's page numbers into the physical frame numbers where those pages are actually stored.
  - This mapping is often called the Page Map Table (PMT) or simply the page table.
Important
The kernel itself is also a program, so it has its own page table. The key difference is that the kernel also manages the page tables of all other processes.
The key insight is that any page can be placed into any available frame. Because all pages and frames are the same size, the problem of external fragmentation is completely eliminated. There is no need to find a contiguous block of memory for a large segment; the OS can simply find any free frames, wherever they are located.
Are you confused? Don't be! It’s really simple! Paging just introduces a table of mappings. That's it!
A program has pages (fixed size), which correspond to physical frames (also the same size as pages, but to differentiate, we call them frames) — and these frames can be scattered anywhere in physical memory.
To better understand it, think about this: In segmentation, all the program’s instructions would have to reside in a contiguous .text section in physical memory.
With paging, they can be scattered anywhere across physical memory (wherever there’s a free frame). The Page Map Table (PMT) maps these scattered frames into continuous pages that the program sees. So to the program, the code still appears as a single, continuous .text section, despite it being scattered all throughout physical memory in reality!
When it accesses a page (for example, to execute some code), the PMT simply translates it to the corresponding physical frame. That’s it!
Example:
Suppose a program has 4 pages, and its page table looks like this:
| Page | Frame (Physical Memory) |
|---|---|
| 0 | 5 |
| 1 | 2 |
| 2 | 8 |
| 3 | 1 |
If the program accesses page 2, the PMT translates it to frame 8, and the system reads from that physical memory location. The program still sees its memory as one continuous block through pages, even though the corresponding physical memory frames are scattered.
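The table above maps directly onto a dictionary lookup. The following Python sketch assumes 4 KB pages (the example itself does not fix a page size) and shows how an address inside page 2 ends up inside frame 8.

```python
PAGE_SIZE = 4096  # assumed page/frame size for this sketch

# Page -> frame, copied from the example table.
PAGE_TABLE = {0: 5, 1: 2, 2: 8, 3: 1}


def translate(vaddr: int) -> int:
    """Translate a program address into a physical address."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = PAGE_TABLE[page]  # a KeyError here stands in for an unmapped page
    return frame * PAGE_SIZE + offset


# Byte 0x10 of page 2 lives at byte 0x10 of frame 8.
print(hex(translate(2 * PAGE_SIZE + 0x10)))  # 0x8010
```

Only the page number is swapped for a frame number. The offset within the page is carried over unchanged.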
Tip
What we’ve been intuitively calling the program’s pages are, in practice, referred to as virtual pages.
This layer of indirection creates a powerful abstraction. A program does not work with physical addresses anymore. Instead, it operates within a Virtual Address Space.
- Each process believes it has its own private, large, and contiguous memory space, starting at address 0. This is its virtual address space.
- The addresses a program uses are called virtual addresses.
- The chunks that make up this virtual address space are called virtual pages.
Why are they called "Virtual" Pages? They are "virtual" because they are not tied to a fixed location in physical RAM. A virtual page might be:
- Located in any physical frame.
- Swapped out to disk (not in RAM at all!).
- Shared with another process, so the same physical frame is mapped into two different virtual address spaces.
- Not even allocated until the program first tries to use it (demand paging).
The virtual page is a placeholder, a unit of management within the program's illusory memory world. The page table (PMT) is what connects this virtual world to the physical reality of RAM.
The translation from a virtual address (used by the program) to a physical address (used by the RAM) is performed automatically and transparently by the CPU's Memory Management Unit (MMU).
- CPU Generates a Virtual Address: The running program attempts to access a memory location, say `0x4000`. This is a virtual address.
- MMU Splits the Address: The MMU splits the virtual address into two parts:
  - Virtual Page Number (VPN): The high-order bits of the address. This identifies which page the address belongs to.
  - Page Offset: The low-order bits of the address. This identifies the specific byte within the page.
  `Virtual Address: [ Virtual Page Number (VPN) | Page Offset ]`
- Consult the Page Table: The MMU uses the VPN as an index into the current process's page table. Each entry in the page table, called a Page Table Entry (PTE), contains the Physical Frame Number (PFN) where that virtual page is located, along with control bits (Present, Read/Write, etc.).
  Warning
  This is a simplified, high-level overview of how page table translation works. In reality, the process is more complex and involves 2, 3, 4, or even 5 levels of page tables, depending on the CPU mode. We’ll explore these details later in this document.
- Check Permissions and Presence: The MMU checks the PTE's control bits.
  - If the operation is a write but the page is not allowed to be written to, the MMU raises a page fault (a protection violation). The OS typically reports this to the offending process as a segmentation fault.
  - If the Present Bit is 0, the page is not currently in RAM (it may have been swapped out to disk, or it may never have been mapped). This also triggers a page fault. If the page is valid but absent, the OS loads it from disk into a free frame.
- Form the Physical Address: If all checks pass, the MMU takes the PFN from the PTE and combines it with the original page offset to form the final physical address.
  `Physical Address: [ Physical Frame Number (PFN) | Page Offset ]`
This entire process happens for every memory access, and is heavily optimized by hardware (e.g., with a Translation Lookaside Buffer (TLB), which is a cache for recent page table translations).
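The steps above can be sketched as a toy single-level MMU in Python. Everything here is a simplification under stated assumptions: 4 KB pages, a flat dictionary instead of multi-level tables, and a made-up `PageFault` exception standing in for the hardware fault.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12                      # assumed 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class PageFault(Exception):
    """Stand-in for the fault the MMU raises (hypothetical name)."""


@dataclass
class PTE:
    pfn: int                 # physical frame number
    present: bool = True
    writable: bool = False


def mmu_translate(page_table: dict, vaddr: int, write: bool = False) -> int:
    """Split the address, consult the PTE, check its bits, build the result."""
    vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
    pte = page_table.get(vpn)
    if pte is None or not pte.present:
        raise PageFault(f"page {vpn} is not present")
    if write and not pte.writable:
        raise PageFault(f"write to read-only page {vpn}")
    return (pte.pfn << PAGE_SHIFT) | offset


table = {4: PTE(pfn=0x2A, writable=False)}
print(hex(mmu_translate(table, 0x4000)))  # 0x2a000
```

Virtual address `0x4000` has VPN 4 and offset 0, so with the hypothetical PFN `0x2A` it lands at physical address `0x2A000`. A write to the same page raises `PageFault` instead.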
Important
The TLB is a hardware component inside the CPU, so our kernel doesn’t implement it. That said, we still have to interact with it: whenever we change a mapping, we must flush the stale entries so the TLB stays consistent with the page tables.
Virtual Memory is the overarching abstraction and technique that is implemented using paging (and to a lesser extent, segmentation). It is the illusion provided to each process that it has its own large, private, and contiguous address space, which may be larger than the actual physical RAM available.
Virtual Memory is not a single thing, but a combination of concepts and mechanisms:
- The Illusion of a Private Address Space: Each process operates as if it has exclusive use of the main memory, simplifying programming and ensuring isolation.
- Paging as the Primary Mechanism: Paging provides the core translation mechanism that makes this illusion possible by mapping virtual pages to physical frames.
- Demand Paging: This is a crucial aspect of Virtual Memory. Pages are only loaded into physical memory when the program actually accesses them (a "page fault" occurs). This allows programs to have virtual address spaces much larger than physical RAM. The rest of the program can reside on disk.
- Swap Space: The area on the disk (a file or partition) used to store pages that are not currently active in RAM. When physical memory fills up, the OS can "swap out" infrequently used pages to disk to free up frames, and "swap them in" later when needed.
Think of the hierarchy:
- Physical Memory (RAM): The real, finite hardware resource.
- Paging: The mechanism that manages physical memory by breaking it into frames and mapping them to virtual pages.
- Virtual Pages: The units of the illusory memory space that programs use.
- Virtual Memory: The final abstraction presented to the process, enabled by the paging mechanism. It encompasses the entire system, including the use of disk storage to extend the available "memory."
In modern operating systems, when we talk about a process's memory, we are almost always referring to its virtual memory. The complex work of translating virtual addresses to physical addresses and shuffling pages between RAM and disk is handled automatically by the OS and MMU, completely hidden from the application.
Paging in a 32-bit kernel is optional. If enabled, it consists of 2 levels of page tables that work together to translate virtual addresses to physical addresses. This hierarchical structure efficiently manages the 4 GB virtual address space (2^32 bytes) that 32-bit systems can address.
| Table Level | Entries | Entry Size | Purpose | Controls |
|---|---|---|---|---|
| Page Directory (PD) | 1,024 | 4 bytes | Top-level table; each entry points to a Page Table | 4 MB region of virtual space per entry |
| Page Table (PT) | 1,024 | 4 bytes | Second-level table; each entry points to a physical page frame | 4 KB page of physical memory |
Why 1,024 entries? Each table is 4KB in size (because it must fit within a single 4KB page). Since each entry is 4 bytes (32 bits), we get 4,096 ÷ 4 = 1,024 entries per table.
- PT Entry: Points to a 4 KB physical page frame. With 1,024 entries, one Page Table maps 1,024 × 4 KB = 4 MB of virtual memory.
- PD Entry: Points to an entire PT, which in turn maps 4 MB. With 1,024 entries, one Page Directory can map 1,024 × 4 MB = 4 GB of virtual memory (the entire 32-bit address space).
A 32-bit virtual address is split into three parts by the hardware Memory Management Unit (MMU):
- Directory Index (10 bits): Selects which entry in the PD to use (2^10 = 1,024 possibilities). This entry points to the target PT.
- Table Index (10 bits): Selects which entry in the target PT to use (2^10 = 1,024 possibilities). This entry points to the physical frame.
- Offset (12 bits): Locates the specific byte within the 4 KB physical page frame (2^12 = 4,096 bytes).
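The 10/10/12 split is plain bit arithmetic. A minimal Python sketch (the function name is illustrative):

```python
def split_32(vaddr: int) -> tuple:
    """Split a 32-bit virtual address into (PD index, PT index, offset)."""
    pd_index = (vaddr >> 22) & 0x3FF  # top 10 bits
    pt_index = (vaddr >> 12) & 0x3FF  # middle 10 bits
    offset = vaddr & 0xFFF            # low 12 bits
    return pd_index, pt_index, offset


print(split_32(0x00402010))  # (1, 2, 16)
```

The masks follow from the field widths: `0x3FF` keeps 10 bits and `0xFFF` keeps 12 bits.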
Paging is controlled by the CR0 and CR3 control registers.
- CR3 (Page Directory Base Register): This register must be loaded with the physical address of the current process's PD (highest level page table). The CPU uses this as the starting point for all translations.
- CR0 (Control Register 0): Bit 31 (the PG bit) is the master switch. Setting this bit to 1 globally enables the paging mechanism on the processor.
To enable paging, the kernel must:
- Identity map the necessary code and data structures (so instructions can continue to execute after the switch).
- Configure at least one PD and its corresponding PTs for the identity map.
- Load the physical address of the Page Directory into the `CR3` register.
- Set the PG bit in `CR0`. The moment this bit is set, the MMU immediately begins using the page tables for all subsequent memory accesses.
Note
We will talk about what the identity map is later on in this document.
Let's translate virtual address 0x00402010:
- Split the address:
  - Directory Index: `0x001` (bits 22-31)
  - Table Index: `0x002` (bits 12-21)
  - Offset: `0x010` (bits 0-11)
- Translation steps:
  - CPU reads CR3 to find Page Directory physical address
  - Uses index `0x001` to find PDE #1
  - PDE #1 points to physical address of Page Table #1
  - Uses index `0x002` to find PTE #2 in that table
  - PTE #2 contains physical frame address (e.g., `0x0000C000`)
  - Final physical address: `0x0000C000 + 0x010 = 0x0000C010`
This shows how the seemingly contiguous virtual address 0x00402010 actually maps to a completely different, non-contiguous physical location.
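To see the whole two-level walk in one place, here is a hedged Python simulation. Physical memory is modeled as a dictionary from table address to a list of 1,024 entries, and the table addresses `0x1000` and `0x2000` are invented for the sketch. Only the Present bit is modeled. For this address, `(0x00402010 >> 12) & 0x3FF` evaluates to 2, so the walk uses PDE #1 and PTE #2.

```python
PRESENT = 0x1
ADDR_MASK = 0xFFFFF000  # upper 20 bits of an entry: a 4 KB-aligned address


def walk_32(tables: dict, cr3: int, vaddr: int) -> int:
    """Walk a two-level page table and return the physical address."""
    pd_index = (vaddr >> 22) & 0x3FF
    pt_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF

    pde = tables[cr3][pd_index]
    if not pde & PRESENT:
        raise LookupError("page directory entry not present")

    pte = tables[pde & ADDR_MASK][pt_index]
    if not pte & PRESENT:
        raise LookupError("page table entry not present")

    return (pte & ADDR_MASK) | offset


PD_ADDR, PT_ADDR = 0x1000, 0x2000           # hypothetical table locations
tables = {PD_ADDR: [0] * 1024, PT_ADDR: [0] * 1024}
tables[PD_ADDR][1] = PT_ADDR | PRESENT       # PDE #1 -> page table
tables[PT_ADDR][2] = 0x0000C000 | PRESENT    # PTE #2 -> frame 0xC000

print(hex(walk_32(tables, PD_ADDR, 0x00402010)))  # 0xc010
```

Any address whose entries were never filled in raises `LookupError`, which plays the role of a page fault here.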
Paging in a 64-bit kernel is mandatory and consists of 4 levels of page tables that work together to translate virtual addresses to physical addresses. This hierarchical structure manages the massive 256 TB virtual address space (2^48 bytes) that 64-bit systems typically implement, though the architecture supports up to 2^64 bytes.
Note
Only the lower 48 bits of the virtual address are currently used. The remaining upper bits (bits 48–63) must be copies of bit 47 — either all 0s or all 1s — to ensure a canonical address. Accessing a non-canonical address will cause a CPU exception.
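The canonical-address rule can be checked with one shift. This Python sketch assumes 48-bit virtual addresses and an input already in the range 0 to 2^64 - 1; the function name is illustrative.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits (va_bits - 1) through 63 are all 0s or all 1s."""
    upper = addr >> (va_bits - 1)                # bit 47 and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1     # 17 one-bits for va_bits=48
    return upper == 0 or upper == all_ones


print(is_canonical(0x00007FFFFFFFFFFF))  # True  (top of the lower half)
print(is_canonical(0xFFFF800000000000))  # True  (bottom of the higher half)
print(is_canonical(0x0000800000000000))  # False (bit 47 set, upper bits clear)
```

The two valid ranges it accepts are exactly the lower and higher halves of the address space, with an unusable gap between them.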
| Table Level | Entries | Entry Size | Purpose | Controls |
|---|---|---|---|---|
| Page Map Level 4 (PML4) | 512 | 8 bytes | Top-level table; each entry points to a Page Directory Pointer Table | 512 GB region of virtual space per entry |
| Page Directory Pointer Table (PDPT) | 512 | 8 bytes | Second-level table; each entry points to a Page Directory | 1 GB region of virtual space per entry |
| Page Directory (PD) | 512 | 8 bytes | Third-level table; each entry points to a Page Table | 2 MB region of virtual space per entry |
| Page Table (PT) | 512 | 8 bytes | Bottom-level table; each entry points to a physical page frame | 4 KB page of physical memory |
Why 512 entries? Each table must fit within a single 4 KB page (4,096 bytes). Since each entry is 8 bytes (64 bits), we get 4,096 ÷ 8 = 512 entries per table.
- PT Entry: Points to a 4 KB physical page frame. With 512 entries, one PT maps 512 × 4 KB = 2 MB of virtual memory.
- PD Entry: Points to an entire PT, which in turn maps 2 MB. With 512 entries, one PD can map 512 × 2 MB = 1 GB of virtual memory.
- PDPT Entry: Points to an entire PD, which in turn maps 1 GB. With 512 entries, one PDPT can map 512 × 1 GB = 512 GB of virtual memory.
- PML4 Entry: Points to an entire PDPT, which in turn maps 512 GB. With 512 entries, one PML4 can map 512 × 512 GB = 256 TB of virtual memory.
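The same multiplication chain, written out as a short Python sketch so the numbers can be checked (names are illustrative):

```python
PAGE_SIZE = 4 * 1024   # 4 KB
ENTRIES = 512          # entries per table
LEVELS = ["PT", "PD", "PDPT", "PML4"]  # bottom level first


def table_coverage() -> dict:
    """Bytes of virtual memory mapped by one full table at each level."""
    coverage = {}
    span = PAGE_SIZE
    for level in LEVELS:
        span *= ENTRIES          # each level multiplies the reach by 512
        coverage[level] = span
    return coverage


for level, size in table_coverage().items():
    print(f"{level}: 2^{size.bit_length() - 1} bytes")
```

The exponents come out as 21, 30, 39 and 48, matching 2 MB, 1 GB, 512 GB and 256 TB.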
A 64-bit virtual address (though typically only 48 bits are used) is split into five parts by the hardware Memory Management Unit (MMU):
- PML4 Index (9 bits): Selects which entry in the PML4 to use (2^9 = 512 possibilities). This entry points to the target PDPT.
- PDPT Index (9 bits): Selects which entry in the PDPT to use (2^9 = 512 possibilities). This entry points to the target PD.
- PD Index (9 bits): Selects which entry in the PD to use (2^9 = 512 possibilities). This entry points to the target PT.
- PT Index (9 bits): Selects which entry in the PT to use (2^9 = 512 possibilities). This entry points to the physical frame.
- Offset (12 bits): Locates the specific byte within the 4 KB physical page frame (2^12 = 4,096 bytes).
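Here is a minimal Python sketch of the 9/9/9/9/12 split. It is written from the field widths above as an independent illustration, and the function name is made up.

```python
def split_64(vaddr: int) -> dict:
    """Split a 48-bit virtual address into its four indices and offset."""
    return {
        "pml4": (vaddr >> 39) & 0x1FF,
        "pdpt": (vaddr >> 30) & 0x1FF,
        "pd": (vaddr >> 21) & 0x1FF,
        "pt": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }


print(split_64(0x00007FFF00402010))
# {'pml4': 255, 'pdpt': 508, 'pd': 2, 'pt': 2, 'offset': 16}
```

Each index is 9 bits wide, so every mask is `0x1FF`. Only the shift changes from level to level.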
The translation works as a four-step lookup, followed by a final step that adds the offset:
- The CPU uses the PML4 Index to find the correct PML4 Entry, which points to a PDPT.
- The CPU uses the PDPT Index to find the correct PDPT Entry, which points to a PD.
- The CPU uses the PD Index to find the correct PD Entry, which points to a PT.
- The CPU uses the PT Index to find the correct PT Entry, which contains the physical frame address.
- The CPU combines this physical frame address with the 12-bit Offset to generate the final physical address.
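Because every level works the same way, the walk reduces to a loop. The Python sketch below models each table as a sparse dictionary and uses invented table addresses (`0x1000` through `0x4000`). Only the Present bit is modeled, and the address mask assumes bits 12 to 51 of an entry hold the next address.

```python
PRESENT = 0x1
ADDR_MASK = 0x000FFFFFFFFFF000   # bits 12-51: next table or frame address
LEVEL_SHIFTS = (39, 30, 21, 12)  # PML4, PDPT, PD, PT


def walk_64(memory: dict, cr3: int, vaddr: int) -> int:
    """Follow four levels of tables, then add the 12-bit offset."""
    table_addr = cr3
    for shift in LEVEL_SHIFTS:
        index = (vaddr >> shift) & 0x1FF
        entry = memory[table_addr].get(index, 0)
        if not entry & PRESENT:
            raise LookupError(f"entry {index} not present (shift {shift})")
        table_addr = entry & ADDR_MASK   # after the last level: the frame
    return table_addr | (vaddr & 0xFFF)


memory = {
    0x1000: {255: 0x2000 | PRESENT},   # PML4 entry -> PDPT
    0x2000: {508: 0x3000 | PRESENT},   # PDPT entry -> PD
    0x3000: {2: 0x4000 | PRESENT},     # PD entry   -> PT
    0x4000: {2: 0xC000 | PRESENT},     # PT entry   -> frame 0xC000
}
print(hex(walk_64(memory, 0x1000, 0x00007FFF00402010)))  # 0xc010
```

A missing entry at any level raises `LookupError`, which is the simulation's stand-in for a page fault.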
Paging in 64-bit mode is controlled by the same CR0 and CR3 control registers, plus the CR4 control register and a model-specific register (MSR).
- CR3 (Page Directory Base Register): This register contains the physical address of the current process's PML4 (highest level page table). The CPU uses this as the starting point for all translations.
- CR4 (Control Register 4): The Physical Address Extension (PAE) bit must be set before long mode can be activated.
- CR0 (Control Register 0): Bit 31 (the PG bit) must be set to enable paging.
- EFER MSR: The Long Mode Enable (LME) bit must be set before PG is set in order to activate 64-bit paging.
To enable 64-bit paging, the kernel must:
- Identity map the necessary code and data structures for the initial transition.
- Configure at least one PML4 and its corresponding PDPTs, PDs, and PTs for the identity map.
- Set the PAE bit in `CR4`.
- Load the physical address of the PML4 into the `CR3` register.
- Set the LME bit in the EFER MSR, then set the PG bit in `CR0` to activate 64-bit paging mode.
Let's translate the virtual address 0x00007FFF00402010:
- Split the address:
  - PML4 Index: `0x0FF` (bits 39–47)
  - PDPT Index: `0x1FC` (bits 30–38)
  - PD Index: `0x002` (bits 21–29)
  - PT Index: `0x002` (bits 12–20)
  - Offset: `0x010` (bits 0–11)
- Translation steps:
  - Step 1: CPU reads `CR3` to get the physical address of the PML4 table.
    - Uses PML4 Index `0x0FF` (Entry #255) to select the PML4 entry.
    - This entry points to the physical address of the PDPT table.
  - Step 2: Access the PDPT table.
    - Uses PDPT Index `0x1FC` (Entry #508) to select the PDPT entry.
    - This entry points to the physical address of the Page Directory (PD).
  - Step 3: Access the Page Directory.
    - Uses PD Index `0x002` (Entry #2) to select the PD entry.
    - This entry points to the physical address of the Page Table (PT).
  - Step 4: Access the Page Table.
    - Uses PT Index `0x002` (Entry #2) to select the PT entry.
    - This entry contains the physical frame address (e.g., `0x0000C000`).
  - Step 5: Combine the physical frame with the offset `0x010` to get the final physical address:
    - `0x0000C000 + 0x010 = 0x0000C010`.
This demonstrates how 64-bit paging adds two extra levels (PML4 and PDPT) to handle the vastly larger address space, while still using the same 4 KB page size.
Note
In the docs/tools subfolder of the project, I’ve included a Python script named virt_breakdown.py. This script takes a 64-bit virtual address as input and breaks it down into the corresponding indices for the PML4, PDPT, PD, and PT tables. These indices indicate which entries must be populated in each table to map the virtual address to a physical address.
When our kernel is still in 32-bit protected mode before we enable paging, the CPU is executing instructions directly from physical memory.
The moment we enable paging, the Memory Management Unit (MMU) wakes up and starts treating every single address the CPU uses as a virtual address.
Picture this scenario: Your early boot code is running normally at a specific physical address in memory (let's call it X). You carefully write the logic to build your page tables and then execute the instruction that flips the switch: cr0 is set, and paging is now enabled.
Immediately, the very next instruction the CPU needs to fetch (Let's call it X+1) is now treated as a virtual address. The CPU sends this address to the MMU, which dutifully walks the page tables to find out which physical frame it corresponds to.
But what if your page tables haven't created a mapping for the address X+1 (i.e. X+1 has not been mapped to a physical frame in your page tables)?
The MMU looks up the address, finds no valid entry, and raises a page fault. This early in boot there is usually no working fault handler, so the fault escalates until the machine resets or hangs, and you're left with a definitive sign of failure: a blank, unresponsive screen.
This is precisely why we need an identity map for this critical transition. An identity map establishes a one-to-one mapping for the kernel's entire physical address range — the region where the kernel's code and data (including its .boot, .text, .data, .bss, .rodata, and .stack sections) reside in physical memory.
Before enabling paging, the CPU executes instructions directly from a specific physical address within this range, say 0x00100000. The purpose of the identity map is to ensure that the moment paging is enabled, the virtual address 0x00100000 translates directly back to the physical address 0x00100000.
More formally, if the kernel occupies the physical memory range [0x0, KPHYS_END], an identity map defines a corresponding virtual address range [0x0, KVIRT_END] = [0x0, KPHYS_END], where every virtual address within it is mapped to its identical physical address. Through the page tables, we enforce the rule that for any address in this range, virt = phys. This guarantees that the CPU, immediately after switching to virtual memory, can continue to access the kernel's code and data at their original locations without causing a fault.
Without it, the CPU's very next instruction fetch would target a virtual address with no defined mapping, causing a page fault. The identity map provides a stable bridge, allowing the code that enabled paging to continue running without interruption before any more complex virtual memory layouts are established.
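As a rough illustration of the `virt = phys` rule, the Python sketch below builds a flat identity map for `[0x0, KPHYS_END]` and checks that translation returns the input unchanged. It ignores the multi-level table structure, uses only the Present and Writable bits, and the `KPHYS_END` value is a hypothetical placeholder.

```python
PAGE_SIZE = 4096
PRESENT, WRITABLE = 0x1, 0x2


def build_identity_map(phys_end: int) -> dict:
    """Return {virtual page number: PTE} covering [0, phys_end] inclusive."""
    page_count = phys_end // PAGE_SIZE + 1   # include the page holding phys_end
    return {vpn: (vpn * PAGE_SIZE) | PRESENT | WRITABLE
            for vpn in range(page_count)}


def translate(mapping: dict, vaddr: int) -> int:
    """Look up the PTE, strip its flag bits, and re-attach the offset."""
    pte = mapping[vaddr // PAGE_SIZE]
    return (pte & ~(PAGE_SIZE - 1)) | (vaddr % PAGE_SIZE)


KPHYS_END = 0x00200FFF  # hypothetical end of the kernel image
identity = build_identity_map(KPHYS_END)
print(hex(translate(identity, 0x00100000)))  # 0x100000
```

Every entry stores the frame whose number equals its own page number, which is all an identity map is.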
Tip
It is good practice to identity map the entire range [0x0, KPHYS_END] rather than just [KPHYS_START, KPHYS_END]. The lower physical memory region, [0x0, KPHYS_START), usually contains critical early boot data structures such as the GDT, or multiboot information — placed there by the bootloader.
If this low-memory region is not identity-mapped, the kernel will incur a page fault when it subsequently attempts to access these structures, as the MMU will find no valid translation for their addresses.
Some kernels (like GatOS) go further and identity map a large, fixed range (e.g., the first 1GB of physical memory) instead of just the exact kernel range (which is always rounded up to a multiple of 4KiB, since that's the page size we use). This provides significant memory leeway, giving your kernel a large pool of pre-mapped addresses to use later.
The question of how much memory to identity map is up to you, but it is critical to ensure you reserve enough page tables to accommodate the entire chosen range.
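To get a feel for how many page tables a given range costs, here is a small Python estimate. It assumes 4-level paging with 4 KB pages only (no larger page sizes) and a range that starts at address 0; the function name is illustrative.

```python
PAGE_SIZE = 4096
ENTRIES = 512


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def tables_needed(size_bytes: int) -> dict:
    """Number of tables at each level needed to map [0, size_bytes)."""
    counts = {}
    units = ceil_div(size_bytes, PAGE_SIZE)   # number of 4 KB pages
    for level in ("PT", "PD", "PDPT", "PML4"):
        units = ceil_div(units, ENTRIES)      # each table holds 512 entries
        counts[level] = units
    return counts


one_gib = tables_needed(1 << 30)
print(one_gib)                             # {'PT': 512, 'PD': 1, 'PDPT': 1, 'PML4': 1}
print(sum(one_gib.values()) * PAGE_SIZE)   # bytes of memory spent on tables
```

Under these assumptions, mapping the first 1 GB takes 515 tables, or a little over 2 MB of memory reserved for the tables themselves.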
While an identity map is a necessary tool for the transition to paging, relying on it permanently is not ideal. A more sophisticated and robust design, used by most modern operating systems, is the Higher Half Kernel. In this model, the kernel is mapped into the upper region of the virtual address space (for instance, starting at virtual address 0xFFFFFFFF80000000), while the lower portion is reserved for user-space applications.
Note
Placing the kernel at 0xFFFFFFFF80000000 offers a significant advantage: it enables the use of an efficient code model for the kernel, selected with the GCC flag -mcmodel=kernel.
This model assumes that all code and statically defined data will be located in the top 2GB of the virtual address space. The key benefit is that instructions can use 32-bit signed immediate displacements for addressing, which are smaller and faster, while still being able to reach any address within that 2GB window relative to the instruction pointer (RIP-relative addressing).
This is more efficient than needing to load full 64-bit addresses for every memory reference.
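One way to see why this particular base address matters: `0xFFFFFFFF80000000` is the lowest 64-bit address that a sign-extended 32-bit value can reach in the upper range. The Python sketch below demonstrates that arithmetic only; it says nothing about what the compiler actually emits, and the function names are illustrative.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32(value: int) -> int:
    """Sign-extend a 32-bit value to an unsigned 64-bit address."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:       # top bit set: the value is negative
        value -= 1 << 32
    return value & MASK64


def fits_sign_extended_32(addr: int) -> bool:
    """True if addr can be rebuilt from its low 32 bits by sign extension."""
    return sign_extend_32(addr & 0xFFFFFFFF) == addr


print(hex(sign_extend_32(0x80000000)))            # 0xffffffff80000000
print(fits_sign_extended_32(0xFFFFFFFF80000000))  # True
print(fits_sign_extended_32(0xFFFF800000000000))  # False
```

Every address in the top 2GB passes this check, while an address at the very start of the higher half does not, which is why it would need a full 64-bit address to reference.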
To understand why a higher-half design is superior, consider the limitations of permanently running the kernel from an identity-mapped region in the lower addresses:
- User Space Fragility: In a unified address space, a user application's code, heap, and stack would also be mapped into the lower virtual addresses. A single null pointer dereference in a userspace program (attempting to access virtual address 0x0) would actually access a valid, low physical address, likely one belonging to a critical kernel data structure. This would corrupt kernel state and crash the system, violating memory protection entirely.
- Wasted Virtual Space: The virtual address space is a valuable resource. By placing the kernel in the lower gigabytes, you fragment the contiguous virtual memory available to a single large user process.
The higher-half model elegantly solves these problems by segregating the address space:
- Lower Half (e.g., 0x0 to 0x00007FFFFFFFFFFF): Reserved exclusively for user processes. Each process gets its own independent, private view of this region through its per-process page tables.
- Higher Half (e.g., 0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF): Reserved for the kernel. This mapping is global, meaning it is present and identical in the page tables of every process.
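The split between the two halves can be written down as a tiny classifier. This is only a sketch, assuming 48-bit virtual addresses (4-level paging). The enum and function names are hypothetical. Anything between the two ranges is non-canonical, and the CPU refuses to use such addresses at all.

```c
#include <stdint.h>

typedef enum {
    ADDR_LOWER_HALF,     /* user space */
    ADDR_HIGHER_HALF,    /* kernel space */
    ADDR_NON_CANONICAL   /* the unusable hole between the two halves */
} addr_half_t;

/* Classify a 64-bit virtual address under 48-bit addressing. */
static addr_half_t classify_address(uint64_t vaddr) {
    if (vaddr <= 0x00007FFFFFFFFFFFULL) return ADDR_LOWER_HALF;
    if (vaddr >= 0xFFFF800000000000ULL) return ADDR_HIGHER_HALF;
    return ADDR_NON_CANONICAL;
}
```

A kernel could use a check like this to reject user-supplied pointers that reach into the higher half before ever dereferencing them.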
This separation yields critical benefits:
- Strong Memory Protection: A userspace application can only manipulate addresses in the lower half. Any attempt to access the kernel's higher half without privilege will trigger a page fault. Similarly, a null pointer dereference (0x0) in userspace will correctly trigger a segmentation fault, as that virtual page is unmapped, instead of silently corrupting the kernel.
- Efficient Context Switching: When switching from one user process to another, the operating system only needs to load the new process's top-level page table (the PML4 on x86-64) into CR3. The kernel's mappings remain unchanged and accessible, allowing the kernel code to run uninterrupted without needing to remap itself for each process.
- Simpler Virtual Memory Management: The kernel has a single, predictable virtual address for its own code and data, regardless of which user process is currently active. This greatly simplifies memory management within the kernel itself.
Important
The higher-half range in the 64-bit virtual address space spans from 0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF. This entire upper region is managed by the kernel.
However, the kernel's code and data are not typically located at the very start of this range (0xFFFF800000000000). Instead, they are placed at a specific offset within it, commonly defined as KERNEL_VIRTUAL_BASE = 0xFFFFFFFF80000000.
This specific placement, as previously explained, is strategic. It positions the kernel within the last 2 gigabytes of the address space, enabling efficient RIP-relative addressing with 32-bit displacements.
The vast portion of the higher-half range before KERNEL_VIRTUAL_BASE remains available for other kernel purposes, such as mapping physical memory (physmap), I/O spaces, or other system structures, while the kernel's core image resides at this optimized, fixed base address.
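To see where these two bases land in the paging hierarchy, it helps to split a virtual address into its four 9-bit table indices. The sketch below assumes 4-level paging with 4KiB pages. The struct and function names are made up for illustration.

```c
#include <stdint.h>

typedef struct {
    unsigned pml4;  /* bits 39..47 */
    unsigned pdpt;  /* bits 30..38 */
    unsigned pd;    /* bits 21..29 */
    unsigned pt;    /* bits 12..20 */
} pt_indices_t;

/* Split a virtual address into its page-table indices. */
static pt_indices_t split_vaddr(uint64_t vaddr) {
    pt_indices_t idx;
    idx.pml4 = (unsigned)((vaddr >> 39) & 0x1FF);
    idx.pdpt = (unsigned)((vaddr >> 30) & 0x1FF);
    idx.pd   = (unsigned)((vaddr >> 21) & 0x1FF);
    idx.pt   = (unsigned)((vaddr >> 12) & 0x1FF);
    return idx;
}
```

Feeding it 0xFFFFFFFF80000000 gives PML4 index 511 and PDPT index 510, while 0xFFFF800000000000 gives PML4 index 256. The kernel image and the start of the higher half therefore live under different top-level entries, which leaves plenty of room in between.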
Implementing a higher-half kernel introduces a key complexity during bootstrapping. The kernel is initially loaded by the bootloader into a physical address in low memory (e.g., KPHYS_START). However, the kernel's code is ultimately intended to be linked to run from a virtual address in the high half (e.g., KERNEL_VIRTUAL_BASE + KPHYS_START).
This creates a dilemma: the moment paging is enabled, the CPU expects to find the next instruction at a high virtual address, but the kernel's code is still physically located in low memory. The solution is a two-stage mapping at boot time:
- Temporary Identity Map: As described in the previous section, you must create a temporary identity map for the kernel's physical location. This allows the code that enables paging to continue executing without crashing.
- Higher-Half Map: Simultaneously, you set up the permanent higher-half mapping, linking the kernel's target virtual range ([KERNEL_VIRTUAL_BASE + KPHYS_START, KERNEL_VIRTUAL_BASE + KPHYS_END]) to its physical range ([KPHYS_START, KPHYS_END]). This is essentially the same range adjusted by adding KERNEL_VIRTUAL_BASE.
Tip
Once again, for safety, it is better to map the entire physical range [0x0, KPHYS_END] to the virtual range [KERNEL_VIRTUAL_BASE, KERNEL_VIRTUAL_BASE + KPHYS_END]. We will refer to this resulting virtual address space as [KERNEL_VIRTUAL_BASE, KVIRT_END] from this point forward.
Once paging is enabled and the CPU is successfully executing through the temporary identity map, the kernel immediately jumps to its higher-half virtual address (this jump involves adding the KERNEL_VIRTUAL_BASE value to all relevant registers, e.g., the instruction pointer RIP, the stack pointer RSP, and any other registers holding pointers to kernel data structures).
Since the kernel's higher-half mapping was created by adding the same KERNEL_VIRTUAL_BASE offset to its physical base address, adjusting these registers will cause them to resolve to the exact same physical addresses they did through the identity map, ensuring a seamless transition.
After this jump, the temporary identity map can often be removed, cleaning up the virtual address space and leaving only the clean, higher-half kernel mapping.
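The whole dance (two mappings, one physical range, then dropping the identity map) can be simulated on a host machine. The sketch below is not GatOS code. It fakes physical memory with four small arrays, uses 2MiB pages for brevity instead of the 4KiB pages used elsewhere in this chapter, and walks the tables in software the way the MMU would. All names are hypothetical.

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xFFFFFFFF80000000ULL
#define PTE_PRESENT 0x001ULL
#define PTE_HUGE    0x080ULL               /* PD entry maps a 2 MiB page */
#define PTE_ADDR    0x000FFFFFFFFFF000ULL  /* bits 12..51: next table or frame */
#define NO_MAPPING  UINT64_MAX             /* stands in for a page fault */

/* Four simulated 4 KiB frames standing in for physical memory:
 * frame 0 = PML4, 1 = low PDPT, 2 = high PDPT, 3 = shared PD. */
static uint64_t frames[4][512];

static void build_boot_tables(void) {
    /* One PD maps the first 1 GiB as 512 huge pages of 2 MiB each. */
    for (uint64_t i = 0; i < 512; i++)
        frames[3][i] = (i << 21) | PTE_PRESENT | PTE_HUGE;
    frames[1][0]   = (3ULL << 12) | PTE_PRESENT;  /* identity: PDPT[0] -> PD */
    frames[2][510] = (3ULL << 12) | PTE_PRESENT;  /* higher: PDPT[510] -> PD */
    frames[0][0]   = (1ULL << 12) | PTE_PRESENT;  /* PML4[0] -> low PDPT */
    frames[0][511] = (2ULL << 12) | PTE_PRESENT;  /* PML4[511] -> high PDPT */
}

/* Software page walk; this sketch only understands 2 MiB pages. */
static uint64_t translate(uint64_t vaddr) {
    uint64_t entry = frames[0][(vaddr >> 39) & 0x1FF];
    if (!(entry & PTE_PRESENT)) return NO_MAPPING;
    entry = frames[(entry & PTE_ADDR) >> 12][(vaddr >> 30) & 0x1FF];
    if (!(entry & PTE_PRESENT)) return NO_MAPPING;
    entry = frames[(entry & PTE_ADDR) >> 12][(vaddr >> 21) & 0x1FF];
    if (!(entry & PTE_PRESENT)) return NO_MAPPING;
    return (entry & PTE_ADDR) + (vaddr & 0x1FFFFF);  /* page base + offset */
}

/* Drop the temporary identity map once execution is in the higher half. */
static void remove_identity_map(void) {
    frames[0][0] = 0;
}
```

After `build_boot_tables()`, both `translate(0x10000)` and `translate(KERNEL_VIRTUAL_BASE + 0x10000)` resolve to physical 0x10000. After `remove_identity_map()`, the low address no longer translates while the higher-half one still does. On real hardware, the stale translations would also have to be flushed from the TLB (for example by reloading CR3).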
Have you been paying attention? If you’ve been following along carefully, this is the moment everything clicks. Every little choice we made up until now, it all comes together here. Let me show you what all the setup has been leading toward. This part is pure GatOS :)
Caution
Welcome to the payoff. Strap in because it's about to get bumpy.
Remember the toolchain from the last chapter? We chose GRUB as our bootloader because it gives us the multiboot2 struct, which we can parse to grab critical machine info (like how much RAM we’ve got). It also saved us the headache of writing a custom bootloader.
That means GatOS starts out in 32-bit protected mode, so we can focus on writing assembly there instead of mucking around with real mode. Now think back to the linker script we broke down: it linked everything at KERNEL_VIRTUAL_BASE. To quote myself:
The kernel is linked to run at a high virtual address (KERNEL_VIRTUAL_BASE = 0xFFFFFFFF80000000), but the bootloader actually loads it at a lower physical address (KPHYS_START = 0x10000).
The AT() directive tells the linker: “Place this section in the binary at this physical load address, but reference it in the code using this virtual address.”
Example (.boot section):
- Virtual Address (VMA): 0xFFFFFFFF80010000
- Physical Load Address (LMA): 0x10000
This separation lets the kernel start executing at its physical address, then seamlessly continue at its virtual address once paging kicks in.
So everything is loaded at low addresses but linked at high addresses. In practice, the kernel thinks it’s always running in the higher half. If you run objdump on the binary, you’ll see:
SYMBOL TABLE:
ffffffff80010000 l .boot 0000000000000000 header_start
ffffffff80010018 l .boot 0000000000000000 header_end
...
Every symbol shows up in higher-half space. Neat, right?
Not so fast. In 32-bit mode, our registers are only 32 bits wide. If we want to access any symbol where it's actually loaded, we’d need to subtract KERNEL_VIRTUAL_BASE from it — which is a full 64-bit constant.
Okay, so what? What's the problem? Well, in 32-bit x86 NASM:
- Registers (eax, ebx, etc.) are only 32 bits.
- You can’t shove a 64-bit immediate into a single instruction.
- So something like this is illegal:
sub eax, 0x123456789ABCDEF0
NASM will complain that the constant doesn’t fit. So… oops? What the hell? How in the world do we actually subtract KERNEL_VIRTUAL_BASE?
Well, that’s why we chose the GNU Assembler (GAS) over NASM. Yours truly told you that choice would eventually pay off :)
Since GAS is integrated with the C preprocessor, we can just include C headers and let macros do the heavy lifting for us.
For example, in paging.h:
#define KERNEL_VIRTUAL_BASE 0xFFFFFFFF80000000
#ifdef __ASSEMBLER__
#define KERNEL_V2P(a) ((a) - KERNEL_VIRTUAL_BASE)
#define KERNEL_P2V(a) ((a) + KERNEL_VIRTUAL_BASE)
#else
#include <stdint.h>
#define KERNEL_V2P(a) ((uintptr_t)(a) & ~KERNEL_VIRTUAL_BASE)
#define KERNEL_P2V(a) ((uintptr_t)(a) | KERNEL_VIRTUAL_BASE)
#endif
And then in assembly (.S file):
#include <paging.h>
.intel_syntax noprefix
.extern KERNEL_STACK_TOP # Defined in the linker
start:
# Example: set the stack pointer to the top of the stack
mov esp, offset KERNEL_V2P(KERNEL_STACK_TOP)
And boom! The preprocessor expands the macro, and the assembler and linker evaluate the resulting expression for us. Instead of worrying about subtracting giant 64-bit constants in 32-bit mode, we just use the macros and get the correct load address, the one set by the linker’s AT() directive.
With these in place, we can set up everything as needed in 32-bit mode, perform the long mode jump, and then, in 64-bit mode, jump directly to a higher-half-linked function. Assuming the Higher Half page mappings are correct, execution continues seamlessly.
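The C branch of paging.h uses bit masks rather than subtraction, so it is worth convincing yourself on a host machine that both forms agree. The snippet below is a sketch of that sanity check. It assumes a 64-bit host, adds a ULL suffix to the constant for portability, and the subtraction helper is a made-up name.

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xFFFFFFFF80000000ULL

/* Same shape as the C branch of paging.h. */
#define KERNEL_V2P(a) ((uintptr_t)(a) & ~KERNEL_VIRTUAL_BASE)
#define KERNEL_P2V(a) ((uintptr_t)(a) | KERNEL_VIRTUAL_BASE)

/* Subtraction-based variant, matching what the assembler branch computes. */
static uint64_t v2p_by_subtraction(uint64_t vaddr) {
    return vaddr - KERNEL_VIRTUAL_BASE;
}
```

For 0xFFFFFFFF80010000, both `KERNEL_V2P` and `v2p_by_subtraction` yield 0x10000. One caveat of the mask form is that it only round-trips physical addresses below 2GB, since bit 31 and above are swallowed by the mask. That is fine for the kernel image but not for arbitrary physical memory.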
But wait, there's more!
As mentioned earlier in this document, GatOS preallocates page tables to map the first 1GB of physical memory. These tables create two distinct virtual mappings for the same physical range:
- [0x0, 1GB] physical → [0x0, 1GB] virtual (Identity Map)
- [0x0, 1GB] physical → [KERNEL_VIRTUAL_BASE, KERNEL_VIRTUAL_BASE + 1GB] virtual (Higher Half Map)
The kernel's own range, [KPHYS_START, KPHYS_END], resides within this first gigabyte of physical memory.
Also remember how I mentioned that this 1GB fixed mapping gives a lot of leeway for GatOS to handle memory? Well, you're about to see why.
Note
For clarity, the range [KERNEL_VIRTUAL_BASE + KPHYS_START, KERNEL_VIRTUAL_BASE + KPHYS_END] will be referred to as [KVIRT_START, KVIRT_END].
All kernel sections (.text, .data, etc.) reside within the physical range [KPHYS_START, KPHYS_END] and, consequently, the virtual range [KVIRT_START, KVIRT_END]. However, our page tables have already mapped the entire larger range up to KERNEL_VIRTUAL_BASE + 1GB.
Here is how GatOS uses this extra space:
- It parses the Multiboot2 structure from GRUB to determine the total size of the system's RAM.
- It calculates how many page tables are required to map all of this physical memory into a virtual address space, and then calculates the total memory, X bytes, needed to store these page tables.
- GatOS then internally adjusts KVIRT_END to KVIRT_END + X. This is safe because we know KVIRT_END + X will not exceed KERNEL_VIRTUAL_BASE + 1GB for any reasonable amount of RAM.
- Using this newly reserved space X, GatOS populates the page tables to map the entire physical memory (the physmap) into the higher half, starting at the virtual address PHYSMAP_VIRTUAL_BASE = 0xFFFF800000000000. The kernel's own mapping at KERNEL_VIRTUAL_BASE remains intact within this new page structure.
- Finally, it unmaps all virtual memory regions except for two essential ones:
  - [KERNEL_VIRTUAL_BASE, KVIRT_END] for the kernel itself.
  - [PHYSMAP_VIRTUAL_BASE, PHYSMAP_VIRTUAL_BASE + RAM_SIZE] for the physmap.
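To make "any reasonable amount of RAM" a bit more concrete, here is a rough estimate of X for a given RAM size. It is a sketch under stated assumptions, not the GatOS implementation: 4KiB pages, 512-entry tables, and the existing PML4 being reused (so it is not counted). The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL
#define ENTRIES   512ULL
#define GIB       (1ULL << 30)

static uint64_t ceil_div(uint64_t n, uint64_t d) {
    return (n + d - 1) / d;
}

/* Bytes of page tables (PT, PD, and PDPT levels) needed to map
 * ram_bytes of physical memory with 4 KiB pages. */
static uint64_t physmap_table_bytes(uint64_t ram_bytes) {
    uint64_t pages = ceil_div(ram_bytes, PAGE_SIZE);
    uint64_t pts   = ceil_div(pages, ENTRIES);
    uint64_t pds   = ceil_div(pts, ENTRIES);
    uint64_t pdpts = ceil_div(pds, ENTRIES);
    return (pts + pds + pdpts) * PAGE_SIZE;
}

/* Does the kernel image plus X still fit in the pre-mapped 1 GiB window? */
static bool fits_in_boot_window(uint64_t kphys_end, uint64_t x_bytes) {
    return kphys_end + x_bytes <= GIB;
}
```

Under these assumptions, X grows by roughly 2MiB per gigabyte of RAM. For 16GB that is 8192 page tables, 16 page directories, and 1 PDPT, or about 32MiB in total, which sits comfortably inside the pre-mapped 1GB window.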
We will see in future documents why we need this physmap. In short, it lets us bootstrap our Physical Memory Manager (PMM) without page faults caused by touching physical addresses that aren't mapped.
Ain't ya friggin' excited?