Build your own Operating System #7

Gayan Malinda
9 min readSep 6, 2021

--

Virtual Memory and Paging

Hello everyone!

This is the seventh article of the “Build your own Operating System“ article series. I like to suggest, please refer to previous articles, before reading this. Before ending this article I mentioned my previous articles. Please go through and refer them before reading this. It will help you to a better understanding of this article.

Let’s move to today’s topic. First of all, I give a short Intro to what is virtual memory. Let’s move.

Virtual Memory

The virtual memory subsystem of a processor implements the virtual address spaces provided to each process. This makes each process think it is alone in the system. The list of advantages of virtual memory are described in detail elsewhere so they will not be repeated here.

A virtual address space is implemented by the Memory Management Unit (MMU) of the CPU. The OS has to fill out the page table data structures, but most CPUs do the rest of the work themselves. This is actually a pretty complicated mechanism; the best way to understand it is to introduce the data structures used to describe the virtual address space.

The input to the address translation performed by the MMU is a virtual address. There are usually few — if any — restrictions on its value. Virtual addresses are 32-bit values on 32-bit systems, and 64-bit values on 64-bit systems. On some systems, for instance x86 and x86–64, the addresses used actually involve another level of indirection: these architectures use segments which simply cause an offset to be added to every logical address.

Virtual Memory Through Segmentation?

You could skip paging entirely and just use segmentation for virtual memory. Each user mode process would get its own segment, with base address and limit properly set up. This way no process can see the memory of another process. A problem with this is that the physical memory for a process needs to be contiguous (or at least it is very convenient if it is). Either we need to know in advance how much memory the program will require, or we can move the memory segments to places where they can grow when the limit is reached (expensive, causes fragmentation — can result in “out of memory” even though enough memory is available). Paging solves both these problems.

It is interesting to note that in x86_64 (the 64-bit version of the x86 architecture), segmentation is almost completely removed.

Paging

Paging is a storage mechanism used to retrieve processes from the secondary storage into the main memory in the form of pages. The main idea behind the paging is to divide each process in the form of pages. The main memory will also be divided in the form of frames. One page of the process is to be stored in one of the frames of the memory. The pages can be stored at the different locations of the memory but the priority is always to find the contiguous frames or holes. Pages of the process are brought into the main memory only when they are required otherwise they reside in the secondary storage.

Segmentation translates a logical address into a linear address. Paging translates these linear addresses onto the physical address space, and determines access rights and how the memory should be cached.

Paging is optional, and some operating systems do not make use of it. But if we want to mark certain areas of memory accessible only to code running at a certain privilege level (to be able to have processes running at different privilege levels), paging is the neatest way to do it.

Page Entries

Virtual memory regions are generally independent of one another because each process has its own set of page mappings. Pages are fixed at 4KB in the x86 architecture (32-bit). Each page has a description word that directs the processor to which frame it should be mapped. The least significant 12 bits of the 32-bit word are always zero because pages and frames must be aligned on 4KB boundaries (4KB being 0x1000 bytes). This is taken advantage of by the architecture, which uses them to store information about the page, such as whether it is there, whether it is kernel-mode or user-mode, and so on. This word’s arrangement can be seen in the image to the below.

P  -     Set if the page is present in memory.R/W  -   If set, that page is writeable. If unset, the page is read-only. This does not apply when code is running in kernel-mode (unless a flag in CR0 is set).U/S  -   If set, this is a user-mode page. Else it is a supervisor (kernel)-mode page. User-mode code cannot write to or read from kernel-mode pages.Reserved  - These are used by the CPU internally and cannot be trampled.A  -      Set if the page has been accessed (Gets set by the CPU).D  -      Set if the page has been written to (dirty).AVAIL  -  These 3 bits are unused and available for kernel-use.Page frame address  -  The high 20 bits of the frame address in physical memory.

Identity Paging

The simplest kind of paging is when we map each virtual address onto the same physical address, called identity paging. This can be done at compile time by creating a page directory where each entry points to its corresponding 4 MB frame. In NASM this can be done with macros and commands (%rep, times, and dd). It also can be done at run-time by using ordinary assembly code instructions.

Enabling Paging

Paging is enabled by first writing the address of a page directory to cr3 and then setting bit 31 (the PG “paging-enable” bit) of cr0 to 1. To use 4 MB pages, set the PSE bit (Page Size Extensions, bit 4) of cr4.

To do this add paging_enable.s file to your working directory with the given code.

It is important to note that all addresses within the page directory, page tables and in cr3 need to be physical addresses to the structures, never virtual.

An instruction that is useful when an updating a PDT or PT is invlpg. It invalidates the Translation Lookaside Buffer (TLB) entry for a virtual address. The TLB is a cache for translated addresses, mapping physical addresses corresponding to virtual addresses. This is only required when changing a PDE or PTE that was previously mapped to something else. If the PDE or PTE had previously been marked as not present (bit 0 was set to 0), executing invlpg is unnecessary. Changing the value of cr3 will cause all entries in the TLB to be invalidated.

An example of invalidating a TLB entry is shown below:

; invalidate any TLB references to virtual address 0invlpg [0]

Paging and the Kernel

Generally, during the linking process, the linker assumes that the code will be loaded into the memory position 0x00000000. Therefore, when resolving absolute references, 0x00000000 will be the base address for calculating the exact position. But if the kernel is mapped on the beginning of the virtual address space (0x00000000, “Size of kernel”), the user mode process cannot be loaded at virtual address 0x00000000. Therefore, the assumption from the linker that the user mode process is loaded into memory at position 0x00000000 becomes wrong. Although this can be corrected by using a linker script which tells the linker to assume a different starting address, but it will be unmanageable for the users of the operating system.

That being the case, the kernel should be placed at a very high virtual memory address, for example 0xC0000000(3 GB). Now the only way that a user mode process can conflict with the kernel is it’s being 3 GB large, but it is not a common scenario.

When the kernel uses virtual addresses at 3 GB and above it is called a higher-half kernel. Selecting the correct address depends on how much virtual memory should be available for the kernel and how much virtual memory should be available for the process. If the user mode process is larger than 3 GB, some pages will be needed to swap out by the kernel. But, we are not discussing about swapping pages here.

Placing the Kernel at 0xC0000000

To start with, it is better to place the kernel at 0xC0100000 than 0xC0000000, since this makes it possible to map (0x00000000, 0x00100000) to (0xC0000000, 0xC0100000). This way, the entire range (0x00000000, “size of kernel”) of memory is mapped to the range (0xC0000000, 0xC0000000 + “size of kernel”).

Placing the kernel at 0xC0100000 isn’t hard, but it does require some thought. This is once again a linking problem. When the linker resolves all absolute references in the kernel, it will assume that our kernel is loaded at physical memory location 0x00100000, not 0x00000000, since relocation is used in the linker script. However, we want the jumps to be resolved using 0xC0100000 as base address, since otherwise a kernel jump will jump straight into the user mode process code (remember that the user mode process is loaded at virtual memory 0x00000000).

However, we can’t simply tell the linker to assume that the kernel starts at 0xC01000000, since we want it to be loaded at the physical address 0x00100000. The reason for having the kernel loaded at 1 MB is because it can’t be loaded at 0x00000000, since there is BIOS and GRUB code loaded below 1 MB. Furthermore, we cannot assume that we can load the kernel at 0xC0100000, since the machine might not have 3 GB of physical memory.

This can be solved by using both relocation (.=0xC0100000) and the AT instruction in the linker script. Relocation specifies that non-relative memory-references should should use the relocation address as base in address calculations. AT specifies where the kernel should be loaded into memory. Relocation is done at link time by GNU ld, the load address specified by AT is handled by GRUB when loading the kernel, and is part of the ELF format.

Higher-half Linker Script

To implement this we have to modify the link.ld. The following code can be used for it.

Entering the Higher Half

When GRUB jumps to the kernel code, there will be no paging table. Therefore, all references to 0xC0100000 + x won’t be mapped to the correct physical address, and will therefore cause a general protection exception or the computer will just crash.

It’s time to create initiate paging using the C language. To do this create common.h, common.c, paging.h, paging.c, kheap.h, and kheap.c files with given codes in your working directory. Let’s create files mentioned for this one by one.

common.h
common.c
paging.h
paging.c
kheap.h
kheap.c

Finally, update your kmain.c file as bellow code and add mentioned OBJECTS to Makefile.

kmain.c
Update Makefile OBJECTS

If you followed the given steps correctly, You can see the below-shown console after running make run command.

Virtual Memory Through Paging

Paging enables two things that are good for virtual memory. First, it allows for fine-grained access control to memory. You can mark pages as read-only, read-write, only for PL0 etc. Second, it creates the illusion of contiguous memory. User mode processes, and the kernel, can access memory as if it were contiguous, and the contiguous memory can be extended without the need to move data around in memory. We can also allow the user mode programs access to all memory below 3 GB, but unless they actually use it, we don’t have to assign page frames to the pages. This allows processes to have code located near 0x00000000 and the stack at just below 0xC0000000, and still not require more than two actual pages.

I hope you understand steps in executing a program in kernel mode. Let’s meet with the next article of ‘Build your own Operating System’ series. Thank you so much for reading!

The Little OS Book: https://littleosbook.github.io/book.pdf

LWN.net has an article on virtual memory: http://lwn.net/Articles/253361/

Gustavo Duarte has also written an article about virtual memory: https://manybutfinite.com/post/memory-translation-and-segmentation/

https://en.wikipedia.org/wiki/Memory_paging

The OSDev wiki has a page on paging: http://wiki.osdev.org/Paging and a tutorial for making a higher-half kernel: http://wiki.osdev.org/Higher_Half_bare_bones

http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf

Previous articles:

--

--

Gayan Malinda

Software Engineering Undergraduate - University of Kelaniya, Sri Lanka