Kernel98 is a small educational 32-bit operating system kernel, featuring:
- Copy-on-Write for `fork()`: When a process is forked, only the `task_struct` is duplicated initially. All other memory pages are shared between parent and child using Copy-on-Write (CoW). A physical page is copied only when either process writes to it.
- Shared pages for `execv()`: When multiple processes execute the same binary, code pages are shared between them to reduce memory usage.
- Demand paging: Pages are allocated from, or loaded into, physical memory (depending on whether the address is mapped to disk) only when they are actually accessed.
- Buffer cache for block devices: To speed up disk IO, all reads and writes go through reference-counted buffers. The buffers are linked into two kinds of doubly linked lists: the first connects all buffers to maintain Least Recently Used (LRU) order, and the second chains buffers with the same hash to resolve collisions. Each buffer has flags indicating whether it is up-to-date or dirty. Dirty buffers must be flushed to disk before reuse. Buffers are protected by mutexes to ensure safe concurrent access.
- Character device support with ring buffers and separate read/write queues.
- Support for Minix file system.
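
The copy-on-write item in the list above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not Kernel98's code: `pool`, `refcount`, `struct mapping`, `share_on_fork`, and `handle_write_fault` are hypothetical names, and a real kernel would edit page-table entries inside a page-fault handler rather than a plain struct.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NR_PAGES  8                     /* tiny pool, for illustration only */

static unsigned char pool[NR_PAGES][PAGE_SIZE];  /* stand-in for physical memory */
static unsigned char refcount[NR_PAGES];         /* 0 means the page is free */

/* Hypothetical per-process mapping: one virtual page -> one physical page. */
struct mapping {
    int page;       /* index into pool */
    int writable;   /* cleared while the page is shared */
};

/* Allocate a zeroed free page with a reference count of 1; -1 if none is left. */
static int alloc_page(void)
{
    for (int i = 0; i < NR_PAGES; i++) {
        if (refcount[i] == 0) {
            refcount[i] = 1;
            memset(pool[i], 0, PAGE_SIZE);
            return i;
        }
    }
    return -1;
}

/* fork(): the child shares the parent's page and both mappings become read-only. */
static void share_on_fork(struct mapping *parent, struct mapping *child)
{
    refcount[parent->page]++;
    parent->writable = 0;
    child->page = parent->page;
    child->writable = 0;
}

/* Write fault: copy the page only if another mapping still references it. */
static int handle_write_fault(struct mapping *m)
{
    assert(refcount[m->page] > 0);
    if (refcount[m->page] > 1) {
        int copy = alloc_page();
        if (copy < 0)
            return -1;                  /* out of memory */
        memcpy(pool[copy], pool[m->page], PAGE_SIZE);
        refcount[m->page]--;            /* drop our share of the old page */
        m->page = copy;
    }
    m->writable = 1;                    /* sole owner now, so write in place */
    return 0;
}
```

Note that the last writer to fault finds a reference count of 1 and simply regains write access without copying, which is why the count is checked before any allocation.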
To build the kernel, run `make`. You can run it with Bochs.
Here's a high-level overview of the boot and runtime design:
When powered on, the CPU starts in 16-bit real mode, executing BIOS firmware.
- BIOS loads the bootloader (first 512 bytes of the disk) into memory at `0x7c00` and jumps to it.
- The bootloader quickly copies `setup.S` to a higher address (`0x90000`) to avoid being overwritten.
- It collects hardware information and loads the system image from disk to `0x10000`.
- A Global Descriptor Table (GDT) is set up with flat segments.
- A20 line is enabled and protected mode is entered.
- BIOS is no longer usable, so the system image is moved to `0x0`.
- Paging is enabled using a single Page Directory Table (PDT) and four Page Tables to create a 1:1 identity mapping covering 16 MB (4 tables × 1024 entries × 4 KB pages).
- Physical memory is divided into 4 KB pages with a reference count array that tracks usage.
- All external interrupts are temporarily disabled during this setup.
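
The identity mapping described above can be sketched in user-space C to check the arithmetic. This is a minimal model, not the kernel's boot code: the flag values follow the standard x86 page-table entry layout, while `setup_identity_map`, `translate`, and the physical addresses passed for the page tables are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096u
#define ENTRIES_PER_TBL  1024u
#define NR_PAGE_TABLES   4u
#define PTE_PRESENT      0x1u
#define PTE_WRITABLE     0x2u

static uint32_t page_dir[ENTRIES_PER_TBL];
static uint32_t page_tables[NR_PAGE_TABLES][ENTRIES_PER_TBL];

/* Bytes covered by the mapping: 4 * 1024 * 4096 = 16 MB. */
static uint32_t mapped_bytes(void)
{
    return NR_PAGE_TABLES * ENTRIES_PER_TBL * PAGE_SIZE;
}

/* Fill the directory and tables so that every virtual address maps to itself.
 * table_phys holds the (assumed) physical address of each page table. */
static void setup_identity_map(const uint32_t table_phys[NR_PAGE_TABLES])
{
    for (uint32_t t = 0; t < NR_PAGE_TABLES; t++) {
        page_dir[t] = table_phys[t] | PTE_PRESENT | PTE_WRITABLE;
        for (uint32_t i = 0; i < ENTRIES_PER_TBL; i++) {
            uint32_t frame = (t * ENTRIES_PER_TBL + i) * PAGE_SIZE;
            page_tables[t][i] = frame | PTE_PRESENT | PTE_WRITABLE;
        }
    }
}

/* Walk the two-level structure the way the MMU would for a 32-bit address. */
static uint32_t translate(uint32_t vaddr)
{
    uint32_t dir = vaddr >> 22;             /* top 10 bits: directory index */
    uint32_t tbl = (vaddr >> 12) & 0x3ffu;  /* next 10 bits: table index */
    assert(dir < NR_PAGE_TABLES && (page_dir[dir] & PTE_PRESENT));
    return (page_tables[dir][tbl] & ~0xfffu) | (vaddr & 0xfffu);
}
```

Each page table maps 1024 pages of 4 KB, or 4 MB, so four tables reach exactly 16 MB; any address at or above `0x1000000` has no present directory entry in this model.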
- The VGA text buffer is mapped at `0xb8000`. Cursor position is controlled by writing to specific VGA I/O ports.
- A minimal `printf` implementation walks the stack according to the control string and formats the arguments into a buffer, which is then written directly to VGA memory.
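
The formatting path described above can be sketched as follows. This is an illustrative model rather than the kernel's implementation: it uses `<stdarg.h>` as the portable way to walk the arguments, writes into a plain array named `vga` instead of the real buffer at `0xb8000`, and wraps to the top of the screen instead of scrolling. The names `mini_printf`, `mini_vsprintf`, and `put_uint` are hypothetical, and only `%d`, `%x`, `%c`, `%s`, and `%%` are handled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

#define VGA_COLS 80
#define VGA_ROWS 25
#define VGA_ATTR 0x07                       /* light grey on black */

static uint16_t vga[VGA_COLS * VGA_ROWS];   /* stand-in for memory at 0xb8000 */
static size_t cursor;                       /* next cell to write */

/* Append v in the given base, never writing at or past end. */
static char *put_uint(char *out, char *end, unsigned v, unsigned base)
{
    char tmp[32];
    int n = 0;
    do {
        tmp[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0 && out < end)
        *out++ = tmp[--n];
    return out;
}

/* Format into buf (size >= 1), consuming one argument per conversion. */
static int mini_vsprintf(char *buf, size_t size, const char *fmt, va_list ap)
{
    char *out = buf;
    char *end = buf + size - 1;             /* keep room for the terminator */
    assert(size >= 1);
    for (; *fmt && out < end; fmt++) {
        if (*fmt != '%') { *out++ = *fmt; continue; }
        fmt++;
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned u = (unsigned)v;
            if (v < 0) { *out++ = '-'; u = 0u - u; }
            out = put_uint(out, end, u, 10);
            break;
        }
        case 'x': out = put_uint(out, end, va_arg(ap, unsigned), 16); break;
        case 'c': *out++ = (char)va_arg(ap, int); break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s && out < end) *out++ = *s++;
            break;
        }
        case '%':  *out++ = '%'; break;
        case '\0': fmt--; break;            /* lone '%' at end of string */
        default:   *out++ = *fmt; break;    /* unknown conversion: echo it */
        }
    }
    *out = '\0';
    return (int)(out - buf);
}

/* Format, then copy each character plus attribute byte into the text cells. */
static int mini_printf(const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    int n = mini_vsprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    for (int i = 0; i < n; i++) {
        if (buf[i] == '\n')
            cursor += VGA_COLS - cursor % VGA_COLS;   /* start of next row */
        else
            vga[cursor++] = (uint16_t)((VGA_ATTR << 8) | (unsigned char)buf[i]);
        if (cursor >= VGA_COLS * VGA_ROWS)
            cursor = 0;                               /* wrap, no scrolling */
    }
    return n;
}
```

Each text-mode cell is 16 bits wide: the low byte is the character and the high byte is the colour attribute, which is why the sketch stores `uint16_t` values.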
- Interrupt Descriptor Table (IDT) is populated with handler entries.
- External interrupts enabled include:
- Keyboard (for user input)
- Timer (for preemptive multitasking)
- Disk controller (for block IO)
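
Populating an IDT entry, as mentioned above, comes down to packing a handler address into an 8-byte gate descriptor. The sketch below follows the standard 32-bit x86 gate layout; the selector value `0x08`, the helper name `set_gate`, and the vector number used in the usage note are assumptions for illustration, not values taken from Kernel98.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit x86 gate descriptor: 8 bytes, handler offset split in two halves. */
struct idt_gate {
    uint16_t offset_low;    /* handler address bits 0..15 */
    uint16_t selector;      /* code segment selector */
    uint8_t  zero;          /* reserved, must be 0 */
    uint8_t  type_attr;     /* present bit, DPL, gate type */
    uint16_t offset_high;   /* handler address bits 16..31 */
};
_Static_assert(sizeof(struct idt_gate) == 8, "gate must be 8 bytes");

#define KERNEL_CS       0x08    /* assumed kernel code selector */
#define GATE_INTERRUPT  0x8E    /* present, DPL 0, 32-bit interrupt gate */
#define GATE_TRAP       0x8F    /* present, DPL 0, 32-bit trap gate */

static struct idt_gate idt[256];

/* Install handler for vector; interrupt gates clear IF on entry, traps do not. */
static void set_gate(int vector, uint32_t handler, uint8_t type_attr)
{
    assert(vector >= 0 && vector < 256);
    idt[vector].offset_low  = (uint16_t)(handler & 0xffffu);
    idt[vector].selector    = KERNEL_CS;
    idt[vector].zero        = 0;
    idt[vector].type_attr   = type_attr;
    idt[vector].offset_high = (uint16_t)(handler >> 16);
}
```

For example, if the interrupt controller were remapped so that the timer arrives on vector `0x20` (an assumption), the timer handler would be installed with `set_gate(0x20, handler_address, GATE_INTERRUPT)`.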
- Each process's memory is isolated using a local descriptor with a 64 MB limit.
- A new page is allocated for each process's `task_struct`. This structure has a `tss_struct` at the bottom (storing process state) and a kernel stack at the top. Interrupt handlers share these kernel stacks.
- Regular timer interrupts schedule the processes based on time slices.
- Wait lists (implemented as linked lists on different kernel stacks) allow blocked syscalls to put themselves to sleep and yield the CPU by calling `schedule()`.
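
The time-slice scheduling and sleep/wake behavior above can be modelled in a few lines of user-space C. The policy shown (run the runnable task with the most ticks left, and refill every slice from a per-task priority once all are exhausted) is an assumption in the style of classic small Unix-like kernels; the source does not specify Kernel98's exact policy. The sketch also replaces the stack-resident wait lists with a simple state flag, and `timer_tick`, `sleep_current`, and `wake_up` are hypothetical names.

```c
#include <assert.h>

#define NR_TASKS 4

enum task_state { TASK_UNUSED = 0, TASK_RUNNING, TASK_SLEEPING };

struct task {
    enum task_state state;
    int counter;    /* ticks left in the current time slice */
    int priority;   /* slice length granted on refill (assumed policy) */
};

static struct task tasks[NR_TASKS];
static int current;         /* index of the task that owns the CPU */

/* Pick the runnable task with the most ticks left; refill slices if none has any.
 * Returns the chosen index, or -1 if nothing can run. */
static int schedule(void)
{
    for (;;) {
        int best = -1;
        int best_counter = 0;
        for (int i = 0; i < NR_TASKS; i++) {
            if (tasks[i].state == TASK_RUNNING && tasks[i].counter > best_counter) {
                best = i;
                best_counter = tasks[i].counter;
            }
        }
        if (best >= 0) {
            current = best;
            return best;
        }
        /* Every runnable slice is used up: hand out fresh ones. */
        int refilled = 0;
        for (int i = 0; i < NR_TASKS; i++) {
            if (tasks[i].state == TASK_RUNNING && tasks[i].priority > 0) {
                tasks[i].counter = tasks[i].priority;
                refilled++;
            }
        }
        if (refilled == 0)
            return -1;      /* a real kernel would run an idle task here */
    }
}

/* Timer interrupt: charge one tick and reschedule when the slice runs out. */
static int timer_tick(void)
{
    assert(current >= 0 && current < NR_TASKS);
    if (tasks[current].counter > 0)
        tasks[current].counter--;
    if (tasks[current].counter == 0)
        schedule();
    return current;
}

/* A blocked syscall marks itself asleep and gives up the CPU. */
static void sleep_current(void)
{
    tasks[current].state = TASK_SLEEPING;
    schedule();
}

/* Whoever completes the awaited event makes the sleeper runnable again. */
static void wake_up(int i)
{
    if (tasks[i].state == TASK_SLEEPING)
        tasks[i].state = TASK_RUNNING;
}
```

A sleeping task keeps its remaining ticks in this model, so it can win the CPU back as soon as it is woken and `schedule()` runs again.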