Build your own Operating System #4

14 min read · Aug 13, 2021

Segmentation in x86

Hello everyone!

This is the fourth article of the "Build your own Operating System" series. I suggest reading the previous articles before this one. The last article is linked below in case you haven't followed them.

Build your own Operating System #3

Integrating Outputs

gayan1999malinda.medium.com

What is Memory Segmentation?

Memory is one of the most important resources in a computing system, and managing it is a primary concern in every environment. A number of techniques have been developed to use memory efficiently and effectively. One of these memory management techniques is known as Memory Segmentation (MS). Memory Segmentation is defined as a system of splitting processes into segments and loading them into different, non-contiguous address spaces in memory. The segments are referenced using memory addresses. A process is most commonly split into three segments: one to hold the data, another to hold the code and a third to hold the stack.

The data segment holds the variables that are used while running the program. The code segment holds the instructions that are executed, while the stack segment holds the call stack, which tracks the progress and status of the functions that are currently running. Depending on how complex or sophisticated a program is, it may be made up of many more segments.

Once a process has been segmented, it can be loaded into different areas of memory instead of one contiguous space. This allows smaller pieces of the process to be loaded into memory, so the physical memory is used more efficiently. The loading is done by a placement algorithm that gives each segment exactly the memory space it requires, as in dynamic partitioning. This technique allows for better memory management and reduces the occurrence of fragmentation. Programmers using this technique segment their programs according to the program logic, which makes segmentation feel more natural to the programmer.

Segmentation and Process Loading

We know that memory is divided into different sections. The operating system, as the diagram indicates, occupies a dedicated section in memory.

A program’s processes are divided into segments as seen in Figure 1. The processes are loaded into memory (see Figure 3) using a special placement algorithm similar to dynamic partitioning. If you are familiar with dynamic partitioning, you know that it is a system of loading the process into memory with memory allocations matching the exact size of processes eliminating internal fragmentation.

Now the processes of the program are loaded into memory using a placement algorithm. The algorithm determines which process segment is loaded into which memory location. As such, the different processes that comprise the program can be loaded into different parts of memory. They are referenced using memory addresses in their non-contiguous spaces as seen in Figure 3. Segmentation does not incur internal memory fragmentation but does incur external fragmentation.

The capability for both processes and memory to be segmented allows for optimum utilization of memory as well as allowing multiple running processes to share the memory space. The operating system is isolated in its own partition. Though the different segments that comprise a process occupy non-contiguous areas in memory, this separation does not in any way affect the running of the processes. Processes run unaware that the memory is being shared.

Now we know segmentation is an operating system memory management technique of division of addressable memory space into protected address spaces called segments. As mentioned earlier, segmentation in x86 means accessing the memory through segments. Segments are portions of the address space, possibly overlapping, specified by a base address and a limit.

To address a byte in segmented memory, a 48-bit logical address is used. Of these 48 bits, 16 bits specify the segment and 32 bits specify the offset to use within that segment.

The offset is checked against the segment's limit and, if it is within range, added to the base address of the segment to produce a linear address. The process is visualized in the figure shown below:

Translation of logical addresses to linear addresses

If everything works out fine the result is a linear address. Segmentation translates a logical address into a linear address while paging translates these linear addresses onto the physical address space. When paging is disabled, the linear address space is mapped 1:1 onto the physical address space, and the physical memory can be accessed.
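The translation step can be sketched in C. This is a simplified model, not how the processor is implemented: struct segment and logical_to_linear are hypothetical names, only the base and limit are modelled, and a failed check simply returns false where real hardware would raise a protection fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a segment: only base and limit. */
struct segment {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset inside the segment */
};

/*
 * Translate an offset within a segment to a linear address.
 * Returns false when the offset lies beyond the segment limit.
 */
bool logical_to_linear(const struct segment *seg, uint32_t offset,
                       uint32_t *linear)
{
    assert(seg != NULL && linear != NULL);

    if (offset > seg->limit) {
        return false;              /* outside the segment */
    }
    *linear = seg->base + offset;  /* unsigned arithmetic, modulo 2^32 */
    return true;
}
```

For a segment with base 0x00100000 and limit 0xFFFF, the offset 0x1234 maps to the linear address 0x00101234, while the offset 0x10000 is rejected.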

Segment Selector

A segment selector is the unique identifier of a segment and is used in the first part of logical address. It is a special pointer that identifies a segment in memory. The content of a segment selector is described in the figure below:

The value of a segment selector is held in a segment register. To access a particular segment in memory, the segment selector for that segment must be present in the appropriate segment register.

The processor has six 16-bit segment registers that are independent of one another:

CS — Code Segment

DS — Data Segment

SS — Stack Segment

ES — Extra Segment

FS/GS — General Purpose Segments

The OS is free to use the registers ES, GS and FS however it wants. Most of the time when accessing memory there is no need to explicitly specify the segment to use.

Below is an example showing implicit use of the segment registers:

func:
    mov eax, [esp+4]
    mov ebx, [eax]
    add ebx, 8
    mov [eax], ebx
    ret

Below is an example for the explicit use of the segment registers:

func:
    mov eax, [ss:esp+4]
    mov ebx, [ds:eax]
    add ebx, 8
    mov [ds:eax], ebx
    ret

With the explicit style, you do not have to use SS for the stack segment selector or DS for the data segment selector. The stack segment selector could be stored in DS and vice versa. However, to use the implicit style shown above, the segment selectors must be stored in their intended registers.

Segment Descriptor

A segment descriptor describes the memory segment referred to by the logical address. The segment descriptor contains the following fields:

A segment base address
The segment limit which specifies the segment size
Access rights byte containing the protection mechanism information
Control bits

The x86 and x86–64 segment descriptor has the following form:

Fields stand for:

Base Address : 32 bit starting memory address of the segment

Segment Limit : 20 bit length of the segment. (More specifically, the address of the last accessible data, so the length is one more than the value stored here.)

G=Granularity : If clear, the limit is in units of bytes, with a maximum of 2^20 bytes. If set, the limit is in units of 4096-byte pages, for a maximum of 2^32 bytes.

D=Default operand size : If clear, this is a 16-bit code segment; if set, this is a 32-bit segment.

B=Big : If set, the maximum offset size for a data segment is increased to 32-bit 0xffffffff. Otherwise it’s the 16-bit max 0x0000ffff.

L=Long : If set, this is a 64-bit segment (and D must be zero), and code in this segment uses the 64-bit instruction encoding. “L” cannot be set at the same time as “D” and “B”.

AVL=Available : For software use, not used by hardware

P=Present : If clear, a “segment not present” exception is generated on any reference to this segment

DPL=Descriptor privilege level: Privilege level required to access this descriptor

Type : If set, this is a code segment descriptor. If clear, this is a data/stack segment descriptor. Can not be both writable and executable at the same time.

C=Conforming : Code in this segment may be called from less-privileged levels.

E=Expand-Down : If clear, the segment expands from base address up to base + limit. If set, it expands from maximum offset down to limit, a behavior usually used for stacks.

R=Readable : If clear, the segment may be executed but not read from.

W=Writable : If clear, the data segment may be read but not written to.

A=Accessed : This bit is set to 1 by hardware when the segment is accessed, and cleared by software.
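To make the field list more concrete, here is a minimal sketch that packs a base, a limit, an access byte and the flag bits into the 8-byte descriptor format. The macro and function names are hypothetical and not taken from this series. The sketch also names the S bit (bit 4 of the access byte), which marks an ordinary code or data segment and is not covered in the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Access byte, stored in bits 40-47 of the descriptor (names are hypothetical). */
#define SEG_PRESENT   0x80u                /* P: segment is present            */
#define SEG_DPL(n)    (((n) & 0x3u) << 5)  /* DPL: descriptor privilege level  */
#define SEG_CODE_DATA 0x10u                /* S: code/data, not a system type  */
#define SEG_EXEC      0x08u                /* Type: set for a code segment     */
#define SEG_RW        0x02u                /* R (code) or W (data)             */

/* Flag nibble, stored in bits 52-55 of the descriptor. */
#define SEG_GRAN_4K   0x8u                 /* G: limit counted in 4096-byte pages */
#define SEG_SIZE_32   0x4u                 /* D/B: 32-bit segment                 */

/* Pack the fields into one 64-bit value laid out like an x86 descriptor. */
uint64_t encode_descriptor(uint32_t base, uint32_t limit,
                           uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);  /* the limit field is only 20 bits wide */
    assert(flags <= 0xFu);      /* four flag bits */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 16-39 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 48-51 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags       -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24  -> bits 56-63 */
    return d;
}
```

Calling encode_descriptor(0, 0xFFFFF, SEG_PRESENT | SEG_DPL(0) | SEG_CODE_DATA | SEG_EXEC | SEG_RW, SEG_GRAN_4K | SEG_SIZE_32) yields 0x00CF9A000000FFFF, a readable 32-bit code segment at privilege level 0 that starts at address 0 and spans 4 GiB.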

To enable segmentation, a table that describes each segment, called a segment descriptor table, must be set up. In x86, there are two types of descriptor tables: the Global Descriptor Table (GDT) and Local Descriptor Tables (LDT).

Local Descriptor Table (LDT)

A Local Descriptor Table (LDT) is like the Global Descriptor Table in that it holds Segment descriptors for access to memory. The difference is that every Task/thread can have its own LDT, and the OS can change the LDT Register (LDTR) on every Task switch.

That means that every program can have its own list of memory Segment descriptors, and keep them private from other programs:

· The Code, Data and Heap segments can be private in the LDT — separate from other programs, but available to this program;

· Each Task/thread within this program can have its own Stack in the LDT, and yet still be able to access the above Segments;

· Sharing within the program is automatic: just ‘know’ the correct Descriptor reference;

· If another program (with another LDT) was to attempt to access one of these Segments, it would access its own LDT’s Segment rather than the target Segment.

An LDT is set up and managed by user-space processes, and all processes have their own LDT. There will generally be one LDT per user process, describing privately held memory. The operating system switches the current LDT when scheduling a new process, using the LLDT machine instruction or a TSS. LDTs can be used if a more complex segmentation model is desired. In this article, we will use the Global Descriptor Table (GDT).

The Global Descriptor Table (GDT)

A vital part of the 386's various protection measures is the Global Descriptor Table, otherwise called a GDT. The GDT defines base access privileges for certain parts of memory. We can use an entry in the GDT to generate segment violation exceptions that give the kernel an opportunity to end a process that is doing something it shouldn't. Most modern operating systems use a memory mode called "Paging" to do this: it is a lot more versatile and allows for higher flexibility. The GDT can also define whether a section of memory is executable or whether it is, in fact, data. The GDT is also capable of defining what are called Task State Segments (TSSes). A TSS is used in hardware-based multitasking, and is not discussed here. Please note that a TSS is not the only way to enable multitasking.

Note that GRUB already installs a GDT for you, but if we overwrite the area of memory that GRUB was loaded into, we will trash the GDT and cause what is called a 'triple fault'. In short, it will reset the machine. To prevent that problem, we should set up our own GDT in a place in memory that we know and can access. This involves building our own GDT, telling the processor where it is, and finally loading the processor's CS, DS, SS, ES, FS, and GS registers with our new entries. The CS register is also known as the Code Segment. It tells the processor at which offset into the GDT it will find the access privileges for executing the current code. The DS register follows the same idea, but for data instead of code: it is the Data Segment and defines the access privileges for the current data. ES, FS, and GS are simply alternate DS registers, and are not important to us.

In short, the Global Descriptor Table (GDT) defines the characteristics of the memory areas, called segments, that are used during program execution, including the base address, the size, and access privileges such as executability and writability. The GDT is shared by everyone, which is why it is called global. Both the GDT and an LDT are arrays of 8-byte segment descriptors.

The first descriptor in the GDT is always a null descriptor and it can never be used to access memory. At least two segment descriptors along with the null descriptor are needed for the GDT, because the descriptor contains more information than just the base and limit fields.

The two most relevant fields in segment descriptors, are the Type field and the Descriptor Privilege Level (DPL) field I explained earlier in the article.

A segment's Type field cannot be both writable and executable at the same time. Therefore two segments are needed: one segment for executing code, to put in CS (Type is Execute-only or Execute-Read), and another segment for reading and writing data (Type is Read/Write), to put in the other segment registers.

The DPL specifies the privilege level required to use the segment. x86 allows four privilege levels, PL0, PL1, PL2 and PL3, where PL0 is the most privileged. The kernel should be able to do anything, so it uses segments with DPL set to 0. This is also called kernel mode. The current privilege level (CPL) is determined by the segment selector in CS.

The segments needed are described in the table below:

Note that the segments overlap — they both encompass the entire linear address space. In our minimal setup we’ll only use segmentation to get privilege levels.
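A flat setup like this is commonly encoded with the two 64-bit descriptor values used in the sketch below. Treat the constants as an assumption, since they are not taken from this series. The hypothetical helper functions decode them to confirm that both segments start at address 0, span the full 4 GiB, and differ only in the executable bit.

```c
#include <assert.h>
#include <stdint.h>

/* Commonly used flat descriptors (assumed values, not from this series). */
#define FLAT_CODE_DESCRIPTOR 0x00CF9A000000FFFFull
#define FLAT_DATA_DESCRIPTOR 0x00CF92000000FFFFull

/* Base address: bits 16-39 hold base 23:0, bits 56-63 hold base 31:24. */
uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}

/* Bytes covered: (limit + 1), scaled by 4096 when the G bit (bit 55) is set. */
uint64_t descriptor_size_bytes(uint64_t d)
{
    uint64_t limit = (d & 0xFFFFu) | (((d >> 48) & 0xFu) << 16);
    int granular = (int)((d >> 55) & 1u);
    return (limit + 1) << (granular ? 12 : 0);
}

/* DPL lives in bits 45-46, the executable (code) bit in bit 43. */
unsigned descriptor_dpl(uint64_t d)  { return (unsigned)((d >> 45) & 0x3u); }
int descriptor_is_code(uint64_t d)   { return (int)((d >> 43) & 1u); }

/* Check that a descriptor is "flat": base 0 and a size of 4 GiB. */
void assert_flat(uint64_t d)
{
    assert(descriptor_base(d) == 0);
    assert(descriptor_size_bytes(d) == (1ull << 32));
}
```

Both assert_flat(FLAT_CODE_DESCRIPTOR) and assert_flat(FLAT_DATA_DESCRIPTOR) pass, which is the overlap described above: the two segments cover exactly the same linear addresses.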

Loading the GDT

Loading the GDT into the processor is done with the lgdt assembly code instruction, which takes the address of a struct that specifies the start and size of the GDT. It is easiest to encode this information using a “packed struct” as shown in the following example:

struct gdt {
      unsigned short size;    /* size of the table in bytes, minus one */
      unsigned int address;   /* address of the first descriptor */
} __attribute__((packed));
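The lgdt operand is six bytes long: the processor reads a 16-bit size (stored as the table size in bytes minus one) followed by the 32-bit address of the table. The sketch below fills in such a struct and checks its layout. The names gdt_pointer and make_gdt_pointer are hypothetical, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Six-byte operand for lgdt: 16-bit size first, then the 32-bit address. */
struct gdt_pointer {
    uint16_t size;     /* size of the table in bytes, minus one */
    uint32_t address;  /* linear address of the first descriptor */
} __attribute__((packed));

/* Without the packed attribute the compiler would typically pad this to 8 bytes. */
_Static_assert(sizeof(struct gdt_pointer) == 6, "lgdt operand must be 6 bytes");

/* Build the operand for a table with entry_count 8-byte descriptors. */
struct gdt_pointer make_gdt_pointer(uint32_t table_address, size_t entry_count)
{
    assert(entry_count >= 1 && entry_count <= 8192);  /* size must fit in 16 bits */

    struct gdt_pointer p;
    p.size = (uint16_t)(entry_count * 8 - 1);
    p.address = table_address;
    return p;
}
```

For a table with three descriptors (null, code and data), make_gdt_pointer stores a size of 23, that is 3 * 8 - 1.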

What is a packed struct?

This section gives some basic knowledge of structure packing and padding in C programming.

Configuration bytes are a collection of bits in a very specific order. Shown below is an example with 32 bits:

Bit:     | 31 24 |  23 8   |   7 0  |
Content: | index | address | config |

Rather than using an unsigned integer (unsigned int) to handle such configurations, it is much easier to use "packed structures", as shown below:

struct example {
      unsigned char config; /* bit 0–7 */
      unsigned short address; /* bit 8–23 */
      unsigned char index; /* bit 24–31 */
 };

Structure padding is a concept in C where the compiler adds one or more empty bytes between struct members to align the data in memory. When using the struct, the compiler can add some padding between elements for various reasons. But when a struct is used to represent configuration bytes, it is very important that the compiler does not add any padding, because in the end the struct will be treated as a 32-bit unsigned integer by the hardware.

The packed attribute can be used, as shown below, to force GCC not to add any padding between elements.

struct example {
      unsigned char config; /* bit 0–7 */
      unsigned short address; /* bit 8–23 */
      unsigned char index; /* bit 24–31 */
 } __attribute__((packed));

Remember that __attribute__((packed)) is not part of the C standard, so it might not work with all C compilers.
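The effect of packing can be checked with a small sketch, assuming GCC or Clang on a little-endian machine such as x86. The names example_padded, example_packed and example_as_u32 are hypothetical. The packed struct is exactly 4 bytes, and copying those bytes into a 32-bit integer gives the bit layout from the diagram above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same members without the attribute: the compiler may insert padding. */
struct example_padded {
    unsigned char config;
    unsigned short address;
    unsigned char index;
};

/* With the attribute: members are laid out back to back. */
struct example_packed {
    unsigned char config;    /* bit 0-7   */
    unsigned short address;  /* bit 8-23  */
    unsigned char index;     /* bit 24-31 */
} __attribute__((packed));

_Static_assert(sizeof(struct example_packed) == 4, "packed struct must be 4 bytes");

/* View the packed struct as the 32-bit value the hardware would see. */
uint32_t example_as_u32(const struct example_packed *e)
{
    uint32_t raw;

    assert(e != NULL);
    memcpy(&raw, e, sizeof raw);  /* copy the 4 bytes as they sit in memory */
    return raw;
}
```

With config = 0xAB, address = 0x1234 and index = 0xCD, example_as_u32 returns 0xCD1234AB on a little-endian machine. The padded variant is typically 6 bytes on x86, which would no longer match the 32-bit layout.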

If the content of the eax register is the address to such a struct, then the GDT can be loaded with the assembly code shown below:

lgdt [eax]

It is a good idea to make the lgdt instruction available from C, in the same way as the in and out assembly instructions in the previous article. The lgdt instruction can be wrapped in an assembly function that is callable from C using the cdecl calling convention. To do this, first create a file named gdt.s in your working directory (you can change these names as you wish, but these are easier to understand) and save the following code in it:

After the GDT has been loaded, the segment registers need to be loaded with their corresponding segment selectors. The content of a segment selector is described in the figure and table below:

Bit:     |      15 3      |  2  | 1 0 |
Content: | offset (index) |  ti | rpl |

The layout of segment selectors

Name            Description
rpl             Requested Privilege Level. We want to execute in PL0 for now.
ti              Table Indicator. 0 means that this specifies a GDT segment, 1 means an LDT segment.
offset (index)  Offset within descriptor table.

The segment selector was introduced in the Segment Selector section above. The offset of the segment selector is added to the start of the GDT to get the address of the segment descriptor (0x08 for the first descriptor and 0x10 for the second, since each descriptor is 8 bytes). The Requested Privilege Level (RPL) should be 0, as the kernel of the OS should execute in privilege level 0.
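The selector values can be derived from the layout with a few bit operations. The function names in this sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build a selector from a table index, the table indicator and the RPL. */
uint16_t make_selector(uint16_t index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);  /* 13 + 1 + 2 bits */
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Index of the descriptor inside the table (bits 3-15). */
uint16_t selector_index(uint16_t selector)
{
    return (uint16_t)(selector >> 3);
}

/* Requested privilege level (bits 0-1). */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 0x3u;
}

/* Byte offset of the descriptor in the table: the index times 8. */
uint16_t selector_descriptor_offset(uint16_t selector)
{
    return (uint16_t)(selector & 0xFFF8u);
}
```

make_selector(1, 0, 0) gives 0x08 and make_selector(2, 0, 0) gives 0x10, the two values used for the kernel code and data segments. Because ti and rpl are both 0 there, the selector value equals the byte offset of the descriptor.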

Loading the data segment registers is easy. A segment register cannot be loaded directly from an immediate value, so we first copy the selector into a general-purpose register and then copy it to each segment register as follows:

 mov ax, 0x10
 mov ds, ax
 mov ss, ax
 mov es, ax
 .
 .
 .

To do this, update your gdt.s file with the following assembly code:

The process of loading the GDT can be implemented in C as follows. First, create a memory_segments.h file in your working directory and save this code in it:

Then create a memory_segments.c file containing the definitions of the functions declared above. The following source code can be used for it:

Then update your kmain.c file to call the segments_install_gdt() function. Your kmain.c file should now look like this:

Finally, update the OBJECTS variable of the Makefile as shown in the figure below:

To load cs we have to do a “far jump”. A far jump is a jump where we explicitly specify the full 48-bit logical address: the segment selector to use and the absolute address to jump to. It will first set cs to 0x08 and then jump to flush_cs using its absolute address.

To do this, update your gdt.s file as follows:

    ; code here uses the previous cs
    jmp 0x08:flush_cs     ; specify cs when jumping to flush_cs
flush_cs:
    ret                   ; now we've changed cs to 0x08

Finally, your gdt.s file should look like this:

Boot your OS using the "make run" command. If it boots successfully, you have integrated segmentation.

Quit Bochs and display the log generated by Bochs with the cat bochslog.txt command.

If segmentation was integrated successfully, you should see output like this in your terminal after running the cat bochslog.txt command.

Congratulations! Now you have finished integrating segmentation in x86, which is accessing the memory through segments. In the next article, you will be able to study interrupt handling and reading inputs from the keyboard. Thank you so much for reading!

References:

The Little OS Book: https://littleosbook.github.io/book.pdf

x86 Memory Segmentation — Wikipedia

Segment Descriptor — Wikipedia

Global Descriptor Table — Wikipedia

Global Descriptor Table

GDT Tutorial

Written by Gayan Malinda