mirror of
https://github.com/0xAX/linux-insides.git
synced 2025-01-04 04:40:54 +00:00
Make grammar and formatting fixes in linux-bootstrap-3.md
This commit is contained in:
parent
360b9a62b4
commit
c81fa6763b
@ -4,11 +4,11 @@ Kernel booting process. Part 3.
|
||||
Video mode initialization and transition to protected mode
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
This is the third part of the `Kernel booting process` series. In the previous [part](linux-bootstrap-2.md#kernel-booting-process-part-2), we stopped right before the call of the `set_video` routine from [main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L181). In this part, we will see:
|
||||
This is the third part of the `Kernel booting process` series. In the previous [part](linux-bootstrap-2.md#kernel-booting-process-part-2), we stopped right before the call to the `set_video` routine from [main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L181). In this part, we will look at:
|
||||
|
||||
* video mode initialization in the kernel setup code,
|
||||
* preparation before switching into protected mode,
|
||||
* transition to protected mode
|
||||
* the preparations made before switching into protected mode,
|
||||
* the transition to protected mode
|
||||
|
||||
**NOTE** If you don't know anything about protected mode, you can find some information about it in the previous [part](linux-bootstrap-2.md#protected-mode). Also, there are a couple of [links](linux-bootstrap-2.md#links) which can help you.
|
||||
|
||||
@ -18,7 +18,7 @@ As I wrote above, we will start from the `set_video` function which is defined i
|
||||
u16 mode = boot_params.hdr.vid_mode;
|
||||
```
|
||||
|
||||
which we filled in the `copy_boot_params` function (you can read about it in the previous post). The `vid_mode` is an obligatory field which is filled by the bootloader. You can find information about it in the kernel `boot protocol`:
|
||||
which we filled in the `copy_boot_params` function (you can read about it in the previous post). `vid_mode` is an obligatory field which is filled by the bootloader. You can find information about it in the kernel `boot protocol`:
|
||||
|
||||
```
|
||||
Offset Proto Name Meaning
|
||||
@ -38,7 +38,7 @@ vga=<mode>
|
||||
line is parsed.
|
||||
```
|
||||
|
||||
So we can add `vga` option to the grub or another bootloader configuration file and it will pass this option to the kernel command line. This option can have different values as mentioned in the description. For example, it can be an integer number `0xFFFD` or `ask`. If you pass `ask` to `vga`, you will see a menu like this:
|
||||
So we can add the `vga` option to the grub or another bootloader configuration file and it will pass this option to the kernel command line. This option can have different values as mentioned in the description. For example, it can be an integer number `0xFFFD` or `ask`. If you pass `ask` to `vga`, you will see a menu like this:
|
||||
|
||||
![video mode setup menu](http://oi59.tinypic.com/ejcz81.jpg)
|
||||
|
||||
@ -65,13 +65,13 @@ After we get `vid_mode` from `boot_params.hdr` in the `set_video` function, we c
|
||||
#define RESET_HEAP() ((void *)( HEAP = _end ))
|
||||
```
|
||||
|
||||
If you have read the second part, you will remember that we initialized the heap with the [`init_heap`](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L116) function. We have a couple of utility functions for heap which are defined in `boot.h`. They are:
|
||||
If you have read the second part, you will remember that we initialized the heap with the [`init_heap`](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L116) function. We have a couple of utility functions for managing the heap which are defined in `boot.h`. They are:
|
||||
|
||||
```C
|
||||
#define RESET_HEAP()
|
||||
```
|
||||
|
||||
As we saw just above, it resets the heap by setting the `HEAP` variable equal to `_end`, where `_end` is just `extern char _end[];`
|
||||
As we saw just above, it resets the heap by setting the `HEAP` variable to `_end`, where `_end` is just `extern char _end[];`
|
||||
|
||||
Next is the `GET_HEAP` macro:
|
||||
|
||||
@ -82,11 +82,11 @@ Next is the `GET_HEAP` macro:
|
||||
|
||||
for heap allocation. It calls the internal function `__get_heap` with 3 parameters:
|
||||
|
||||
* size of a type in bytes, which need be allocated
|
||||
* `__alignof__(type)` shows how variables of this type are aligned
|
||||
* `n` tells how many items to allocate
|
||||
* the size of the datatype to be allocated for
|
||||
* `__alignof__(type)` specifies how variables of this type are to be aligned
|
||||
* `n` specifies how many items to allocate
|
||||
|
||||
Implementation of `__get_heap` is:
|
||||
The implementation of `__get_heap` is:
|
||||
|
||||
```C
|
||||
static inline char *__get_heap(size_t s, size_t a, size_t n)
|
||||
@ -106,7 +106,7 @@ and we will further see its usage, something like:
|
||||
saved.data = GET_HEAP(u16, saved.x * saved.y);
|
||||
```
|
||||
|
||||
Let's try to understand how `__get_heap` works. We can see here that `HEAP` (which is equal to `_end` after `RESET_HEAP()`) is the address of aligned memory according to the `a` parameter. After this we save the memory address from `HEAP` to the `tmp` variable, move `HEAP` to the end of the allocated block and return `tmp` which is the start address of allocated memory.
|
||||
Let's try to understand how `__get_heap` works. We can see here that `HEAP` (which is equal to `_end` after `RESET_HEAP()`) is assigned the address of the aligned memory according to the `a` parameter. After this we save the memory address from `HEAP` to the `tmp` variable, move `HEAP` to the end of the allocated block and return `tmp` which is the start address of allocated memory.
|
||||
|
||||
And the last function is:
|
||||
|
||||
@ -117,7 +117,7 @@ static inline bool heap_free(size_t n)
|
||||
}
|
||||
```
|
||||
|
||||
which subtracts value of the `HEAP` from the `heap_end` (we calculated it in the previous [part](linux-bootstrap-2.md)) and returns 1 if there is enough memory for `n`.
|
||||
which subtracts value of the `HEAP` pointer from the `heap_end` (we calculated it in the previous [part](linux-bootstrap-2.md)) and returns 1 if there is enough memory available for `n`.
|
||||
|
||||
That's all. Now we have a simple API for heap and can setup video mode.
|
||||
|
||||
@ -126,20 +126,20 @@ Set up video mode
|
||||
|
||||
Now we can move directly to video mode initialization. We stopped at the `RESET_HEAP()` call in the `set_video` function. Next is the call to `store_mode_params` which stores video mode parameters in the `boot_params.screen_info` structure which is defined in [include/uapi/linux/screen_info.h](https://github.com/0xAX/linux/blob/0e271fd59fe9e6d8c932309e7a42a4519c5aac6f/include/uapi/linux/screen_info.h).
|
||||
|
||||
If we look at the `store_mode_params` function, we can see that it starts with the call to the `store_cursor_position` function. As you can understand from the function name, it gets information about cursor and stores it.
|
||||
If we look at the `store_mode_params` function, we can see that it starts with a call to the `store_cursor_position` function. As you can understand from the function name, it gets information about the cursor and stores it.
|
||||
|
||||
First of all, `store_cursor_position` initializes two variables which have type `biosregs` with `AH = 0x3`, and calls `0x10` BIOS interruption. After the interruption is successfully executed, it returns row and column in the `DL` and `DH` registers. Row and column will be stored in the `orig_x` and `orig_y` fields from the `boot_params.screen_info` structure.
|
||||
First of all, `store_cursor_position` initializes two variables which have type `biosregs` with `AH = 0x3`, and calls the `0x10` BIOS interruption. After the interruption is successfully executed, it returns row and column in the `DL` and `DH` registers. Row and column will be stored in the `orig_x` and `orig_y` fields of the `boot_params.screen_info` structure.
|
||||
|
||||
After `store_cursor_position` is executed, the `store_video_mode` function will be called. It just gets the current video mode and stores it in `boot_params.screen_info.orig_video_mode`.
|
||||
|
||||
After this, the `store_mode_params` checks the current video mode and sets the `video_segment`. After the BIOS transfers control to the boot sector, the following addresses are for video memory:
|
||||
After this, `store_mode_params` checks the current video mode and sets the `video_segment`. After the BIOS transfers control to the boot sector, the following addresses are for video memory:
|
||||
|
||||
```
|
||||
0xB000:0x0000 32 Kb Monochrome Text Video Memory
|
||||
0xB800:0x0000 32 Kb Color Text Video Memory
|
||||
```
|
||||
|
||||
So we set the `video_segment` variable to `0xb000` if the current video mode is MDA, HGC, or VGA in monochrome mode and to `0xb800` if the current video mode is in color mode. After setting up the address of the video segment, font size needs to be stored in `boot_params.screen_info.orig_video_points` with:
|
||||
So we set the `video_segment` variable to `0xb000` if the current video mode is MDA, HGC, or VGA in monochrome mode and to `0xb800` if the current video mode is in color mode. After setting up the address of the video segment, the font size needs to be stored in `boot_params.screen_info.orig_video_points` with:
|
||||
|
||||
```C
|
||||
set_fs(0);
|
||||
@ -156,7 +156,7 @@ y = (adapter == ADAPTER_CGA) ? 25 : rdfs8(0x484)+1;
|
||||
|
||||
Next, we get the amount of columns by address `0x44a` and rows by address `0x484` and store them in `boot_params.screen_info.orig_video_cols` and `boot_params.screen_info.orig_video_lines`. After this, execution of `store_mode_params` is finished.
|
||||
|
||||
Next we can see the `save_screen` function which just saves screen content to the heap. This function collects all data which we got in the previous functions like rows and columns amount etc. and stores it in the `saved_screen` structure, which is defined as:
|
||||
Next we can see the `save_screen` function which just saves the contents of the screen to the heap. This function collects all the data which we got in the previous functions (like the rows and columns, and stuff) and stores it in the `saved_screen` structure, which is defined as:
|
||||
|
||||
```C
|
||||
static struct saved_screen {
|
||||
@ -175,7 +175,7 @@ if (!heap_free(saved.x*saved.y*sizeof(u16)+512))
|
||||
|
||||
and allocates space in the heap if it is enough and stores `saved_screen` in it.
|
||||
|
||||
The next call is `probe_cards(0)` from [arch/x86/boot/video-mode.c](https://github.com/0xAX/linux/blob/0e271fd59fe9e6d8c932309e7a42a4519c5aac6f/arch/x86/boot/video-mode.c#L33). It goes over all video_cards and collects the number of modes provided by the cards. Here is the interesting moment, we can see the loop:
|
||||
The next call is `probe_cards(0)` from [arch/x86/boot/video-mode.c](https://github.com/0xAX/linux/blob/0e271fd59fe9e6d8c932309e7a42a4519c5aac6f/arch/x86/boot/video-mode.c#L33). It goes over all video_cards and collects the number of modes provided by the cards. Here is the interesting part, we can see the loop:
|
||||
|
||||
```C
|
||||
for (card = video_cards; card < video_cards_end; card++) {
|
||||
@ -183,7 +183,7 @@ for (card = video_cards; card < video_cards_end; card++) {
|
||||
}
|
||||
```
|
||||
|
||||
but `video_cards` is not declared anywhere. The answer is simple: every video mode presented in the x86 kernel setup code has definition like this:
|
||||
but `video_cards` is not declared anywhere. The answer is simple: every video mode presented in the x86 kernel setup code has a definition that looks like this:
|
||||
|
||||
```C
|
||||
static __videocard video_vga = {
|
||||
@ -199,7 +199,7 @@ where `__videocard` is a macro:
|
||||
#define __videocard struct card_info __attribute__((used,section(".videocards")))
|
||||
```
|
||||
|
||||
which means that `card_info` structure:
|
||||
which means that the `card_info` structure:
|
||||
|
||||
```C
|
||||
struct card_info {
|
||||
@ -224,13 +224,13 @@ is in the `.videocards` segment. Let's look in the [arch/x86/boot/setup.ld](http
|
||||
}
|
||||
```
|
||||
|
||||
It means that `video_cards` is just a memory address and all `card_info` structures are placed in this segment. It means that all `card_info` structures are placed between `video_cards` and `video_cards_end`, so we can use it in a loop to go over all of it. After `probe_cards` executes we have all structures like `static __videocard video_vga` with filled `nmodes` (number of video modes).
|
||||
It means that `video_cards` is just a memory address and all `card_info` structures are placed in this segment. It means that all `card_info` structures are placed between `video_cards` and `video_cards_end`, so we can use a loop to go over all of it. After `probe_cards` executes we have a bunch of structures like `static __videocard video_vga` with the `nmodes` (the number of video modes) filled in.
|
||||
|
||||
After `probe_cards` execution is finished, we move to the main loop in the `set_video` function. There is an infinite loop which tries to set up video mode with the `set_mode` function or prints a menu if we passed `vid_mode=ask` to the kernel command line or video mode is undefined.
|
||||
After the `probe_cards` function is done, we move to the main loop in the `set_video` function. There is an infinite loop which tries to set up the video mode with the `set_mode` function or prints a menu if we passed `vid_mode=ask` to the kernel command line or if video mode is undefined.
|
||||
|
||||
The `set_mode` function is defined in [video-mode.c](https://github.com/0xAX/linux/blob/0a07b238e5f488b459b6113a62e06b6aab017f71/arch/x86/boot/video-mode.c#L147) and gets only one parameter, `mode`, which is the number of video modes (we got it from the menu or in the start of `setup_video`, from the kernel setup header).
|
||||
The `set_mode` function is defined in [video-mode.c](https://github.com/0xAX/linux/blob/0a07b238e5f488b459b6113a62e06b6aab017f71/arch/x86/boot/video-mode.c#L147) and gets only one parameter, `mode`, which is the number of video modes (we got this value from the menu or in the start of `setup_video`, from the kernel setup header).
|
||||
|
||||
The `set_mode` function checks the `mode` and calls the `raw_set_mode` function. The `raw_set_mode` calls the `set_mode` function for the selected card i.e. `card->set_mode(struct mode_info*)`. We can get access to this function from the `card_info` structure. Every video mode defines this structure with values filled depending upon the video mode (for example for `vga` it is the `video_vga.set_mode` function. See above example of `card_info` structure for `vga`). `video_vga.set_mode` is `vga_set_mode`, which checks the vga mode and calls the respective function:
|
||||
The `set_mode` function checks the `mode` and calls the `raw_set_mode` function. The `raw_set_mode` calls the selected card's `set_mode` function, i.e. `card->set_mode(struct mode_info*)`. We can get access to this function from the `card_info` structure. Every video mode defines this structure with values filled depending upon the video mode (for example for `vga` it is the `video_vga.set_mode` function. See the above example of the `card_info` structure for `vga`). `video_vga.set_mode` is `vga_set_mode`, which checks the vga mode and calls the respective function:
|
||||
|
||||
```C
|
||||
static int vga_set_mode(struct mode_info *mode)
|
||||
@ -268,22 +268,22 @@ static int vga_set_mode(struct mode_info *mode)
|
||||
|
||||
Every function which sets up video mode just calls the `0x10` BIOS interrupt with a certain value in the `AH` register.
|
||||
|
||||
After we have set video mode, we pass it to `boot_params.hdr.vid_mode`.
|
||||
After we have set the video mode, we pass it to `boot_params.hdr.vid_mode`.
|
||||
|
||||
Next `vesa_store_edid` is called. This function simply stores the [EDID](https://en.wikipedia.org/wiki/Extended_Display_Identification_Data) (**E**xtended **D**isplay **I**dentification **D**ata) information for kernel use. After this `store_mode_params` is called again. Lastly, if `do_restore` is set, the screen is restored to an earlier state.
|
||||
Next, `vesa_store_edid` is called. This function simply stores the [EDID](https://en.wikipedia.org/wiki/Extended_Display_Identification_Data) (**E**xtended **D**isplay **I**dentification **D**ata) information for kernel use. After this `store_mode_params` is called again. Lastly, if `do_restore` is set, the screen is restored to an earlier state.
|
||||
|
||||
After this, we have set video mode and now we can switch to the protected mode.
|
||||
Having done this, the video mode setup is complete and now we can switch to the protected mode.
|
||||
|
||||
Last preparation before transition into protected mode
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
We can see the last function call - `go_to_protected_mode` - in [main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L184). As the comment says: `Do the last things and invoke protected mode`, so let's see these last things and switch into protected mode.
|
||||
We can see the last function call - `go_to_protected_mode` - in [main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/main.c#L184). As the comment says: `Do the last things and invoke protected mode`, so let's see what these last things are and switch into protected mode.
|
||||
|
||||
The `go_to_protected_mode` is defined in [arch/x86/boot/pm.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/pm.c#L104). It contains some functions which make the last preparations before we can jump into protected mode, so let's look at it and try to understand what they do and how it works.
|
||||
The `go_to_protected_mode` function is defined in [arch/x86/boot/pm.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/pm.c#L104). It contains some functions which make the last preparations before we can jump into protected mode, so let's look at it and try to understand what it does and how it works.
|
||||
|
||||
First is the call to the `realmode_switch_hook` function in `go_to_protected_mode`. This function invokes the real mode switch hook if it is present and disables [NMI](http://en.wikipedia.org/wiki/Non-maskable_interrupt). Hooks are used if the bootloader runs in a hostile environment. You can read more about hooks in the [boot protocol](https://www.kernel.org/doc/Documentation/x86/boot.txt) (see **ADVANCED BOOT LOADER HOOKS**).
|
||||
|
||||
The `realmode_switch` hook presents a pointer to the 16-bit real mode far subroutine which disables non-maskable interrupts. After `realmode_switch` hook (it isn't present for me) is checked, disabling of Non-Maskable Interrupts(NMI) occurs:
|
||||
The `realmode_switch` hook presents a pointer to the 16-bit real mode far subroutine which disables non-maskable interrupts. After the `realmode_switch` hook (it isn't present for me) is checked, Non-Maskable Interrupts(NMI) is disabled:
|
||||
|
||||
```assembly
|
||||
asm volatile("cli");
|
||||
@ -291,11 +291,11 @@ outb(0x80, 0x70); /* Disable NMI */
|
||||
io_delay();
|
||||
```
|
||||
|
||||
At first, there is an inline assembly instruction with a `cli` instruction which clears the interrupt flag (`IF`). After this, external interrupts are disabled. The next line disables NMI (non-maskable interrupt).
|
||||
At first, there is an inline assembly statement with a `cli` instruction which clears the interrupt flag (`IF`). After this, external interrupts are disabled. The next line disables NMI (non-maskable interrupt).
|
||||
|
||||
An interrupt is a signal to the CPU which is emitted by hardware or software. After getting the signal, the CPU suspends the current instruction sequence, saves its state and transfers control to the interrupt handler. After the interrupt handler has finished it's work, it transfers control to the interrupted instruction. Non-maskable interrupts (NMI) are interrupts which are always processed, independently of permission. It cannot be ignored and is typically used to signal for non-recoverable hardware errors. We will not dive into details of interrupts now but will discuss it in the next posts.
|
||||
An interrupt is a signal to the CPU which is emitted by hardware or software. After getting such a signal, the CPU suspends the current instruction sequence, saves its state and transfers control to the interrupt handler. After the interrupt handler has finished it's work, it transfers control back to the interrupted instruction. Non-maskable interrupts (NMI) are interrupts which are always processed, independently of permission. They cannot be ignored and are typically used to signal for non-recoverable hardware errors. We will not dive into the details of interrupts now but we will be discussing them in the coming posts.
|
||||
|
||||
Let's get back to the code. We can see that second line is writing `0x80` (disabled bit) byte to `0x70` (CMOS Address register). After that, a call to the `io_delay` function occurs. `io_delay` causes a small delay and looks like:
|
||||
Let's get back to the code. We can see in the second line that we are writing the byte `0x80` (disabled bit) to `0x70` (the CMOS Address register). After that, a call to the `io_delay` function occurs. `io_delay` causes a small delay and looks like:
|
||||
|
||||
```C
|
||||
static inline void io_delay(void)
|
||||
@ -305,9 +305,9 @@ static inline void io_delay(void)
|
||||
}
|
||||
```
|
||||
|
||||
To output any byte to the port `0x80` should delay exactly 1 microsecond. So we can write any value (value from `AL` register in our case) to the `0x80` port. After this delay `realmode_switch_hook` function has finished execution and we can move to the next function.
|
||||
To output any byte to the port `0x80` should delay exactly 1 microsecond. So we can write any value (the value from `AL` in our case) to the `0x80` port. After this delay the `realmode_switch_hook` function has finished execution and we can move to the next function.
|
||||
|
||||
The next function is `enable_a20`, which enables [A20 line](http://en.wikipedia.org/wiki/A20_line). This function is defined in [arch/x86/boot/a20.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/a20.c) and it tries to enable the A20 gate with different methods. The first is the `a20_test_short` function which checks if A20 is already enabled or not with the `a20_test` function:
|
||||
The next function is `enable_a20`, which enables the [A20 line](http://en.wikipedia.org/wiki/A20_line). This function is defined in [arch/x86/boot/a20.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/a20.c) and it tries to enable the A20 gate with different methods. The first is the `a20_test_short` function which checks if A20 is already enabled or not with the `a20_test` function:
|
||||
|
||||
```C
|
||||
static int a20_test(int loops)
|
||||
@ -333,11 +333,11 @@ static int a20_test(int loops)
|
||||
}
|
||||
```
|
||||
|
||||
First of all, we put `0x0000` in the `FS` register and `0xffff` in the `GS` register. Next, we read the value in address `A20_TEST_ADDR` (it is `0x200`) and put this value into the `saved` variable and `ctr`.
|
||||
First of all, we put `0x0000` in the `FS` register and `0xffff` in the `GS` register. Next, we read the value at the address `A20_TEST_ADDR` (it is `0x200`) and put this value into the variables `saved` and `ctr`.
|
||||
|
||||
Next, we write an updated `ctr` value into `fs:gs` with the `wrfs32` function, then delay for 1ms, and then read the value from the `GS` register by address `A20_TEST_ADDR+0x10`, if it's not zero we already have enabled the A20 line. If A20 is disabled, we try to enable it with a different method which you can find in the `a20.c`. For example with call of `0x15` BIOS interrupt with `AH=0x2041` etc.
|
||||
Next, we write an updated `ctr` value into `fs:gs` with the `wrfs32` function, then delay for 1ms, and then read the value from the `GS` register into the address `A20_TEST_ADDR+0x10`, if it's not zero we've already enabled the A20 line. If A20 is disabled, we try to enable it with a different method which you can find in `a20.c`. For example, it can be done with a call to the `0x15` BIOS interrupt with `AH=0x2041`.
|
||||
|
||||
If the `enabled_a20` function finished with fail, print an error message and call function `die`. You can remember it from the first source code file where we started - [arch/x86/boot/header.S](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/header.S):
|
||||
If the `enabled_a20` function finished with a failure, print an error message and call the function `die`. You can remember it from the first source code file where we started - [arch/x86/boot/header.S](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/header.S):
|
||||
|
||||
```assembly
|
||||
die:
|
||||
@ -366,10 +366,10 @@ This masks all interrupts on the secondary PIC (Programmable Interrupt Controlle
|
||||
|
||||
And after all of these preparations, we can see the actual transition into protected mode.
|
||||
|
||||
Set up Interrupt Descriptor Table
|
||||
Set up the Interrupt Descriptor Table
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
Now we set up the Interrupt Descriptor table (IDT). `setup_idt`:
|
||||
Now we set up the Interrupt Descriptor table (IDT) in the `setup_idt` function:
|
||||
|
||||
```C
|
||||
static void setup_idt(void)
|
||||
@ -379,7 +379,7 @@ static void setup_idt(void)
|
||||
}
|
||||
```
|
||||
|
||||
which sets up the Interrupt Descriptor Table (describes interrupt handlers and etc.). For now, the IDT is not installed (we will see it later), but now we just the load IDT with the `lidtl` instruction. `null_idt` contains address and size of IDT, but now they are just zero. `null_idt` is a `gdt_ptr` structure, it as defined as:
|
||||
which sets up the Interrupt Descriptor Table (describes interrupt handlers and etc.). For now, the IDT is not installed (we will see it later), but now we just load the IDT with the `lidtl` instruction. `null_idt` contains the address and size of the IDT, but for now they are just zero. `null_idt` is a `gdt_ptr` structure, it is defined as:
|
||||
|
||||
```C
|
||||
struct gdt_ptr {
|
||||
@ -393,7 +393,7 @@ where we can see the 16-bit length(`len`) of the IDT and the 32-bit pointer to i
|
||||
Set up Global Descriptor Table
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
Next is the setup of the Global Descriptor Table (GDT). We can see the `setup_gdt` function which sets up GDT (you can read about it in the [Kernel booting process. Part 2.](linux-bootstrap-2.md#protected-mode)). There is a definition of the `boot_gdt` array in this function, which contains the definition of the three segments:
|
||||
Next is the setup of the Global Descriptor Table (GDT). We can see the `setup_gdt` function which sets up the GDT (you can read about it in the post [Kernel booting process. Part 2.](linux-bootstrap-2.md#protected-mode)). There is a definition of the `boot_gdt` array in this function, which contains the definition of the three segments:
|
||||
|
||||
```C
|
||||
static const u64 boot_gdt[] __attribute__((aligned(16))) = {
|
||||
@ -403,7 +403,7 @@ static const u64 boot_gdt[] __attribute__((aligned(16))) = {
|
||||
};
|
||||
```
|
||||
|
||||
for code, data and TSS (Task State Segment). We will not use the task state segment for now, it was added there to make Intel VT happy as we can see in the comment line (if you're interested you can find commit which describes it - [here](https://github.com/torvalds/linux/commit/88089519f302f1296b4739be45699f06f728ec31)). Let's look at `boot_gdt`. First of all note that it has the `__attribute__((aligned(16)))` attribute. It means that this structure will be aligned by 16 bytes.
|
||||
for code, data and TSS (Task State Segment). We will not use the task state segment for now, it was added there to make Intel VT happy as we can see in the comment line (if you're interested you can find the commit which describes it - [here](https://github.com/torvalds/linux/commit/88089519f302f1296b4739be45699f06f728ec31)). Let's look at `boot_gdt`. First of all note that it has the `__attribute__((aligned(16)))` attribute. It means that this structure will be aligned by 16 bytes.
|
||||
|
||||
Let's look at a simple example:
|
||||
|
||||
@ -430,7 +430,7 @@ int main(void)
|
||||
}
|
||||
```
|
||||
|
||||
Technically a structure which contains one `int` field must be 4 bytes, but here `aligned` structure will be 16 bytes:
|
||||
Technically a structure which contains one `int` field must be 4 bytes in size, but an `aligned` structure will need 16 bytes to store in memory:
|
||||
|
||||
```
|
||||
$ gcc test.c -o test && test
|
||||
@ -438,9 +438,9 @@ Not aligned - 4
|
||||
Aligned - 16
|
||||
```
|
||||
|
||||
The `GDT_ENTRY_BOOT_CS` has index - 2 here, `GDT_ENTRY_BOOT_DS` is `GDT_ENTRY_BOOT_CS + 1` and etc. It starts from 2, because first is a mandatory null descriptor (index - 0) and the second is not used (index - 1).
|
||||
The `GDT_ENTRY_BOOT_CS` has index - 2 here, `GDT_ENTRY_BOOT_DS` is `GDT_ENTRY_BOOT_CS + 1` and etc. It starts from 2, because the first is a mandatory null descriptor (index - 0) and the second is not used (index - 1).
|
||||
|
||||
The `GDT_ENTRY` is a macro which takes flags, base, limit and builds GDT entry. For example, let's look at the code segment entry. `GDT_ENTRY` takes following values:
|
||||
`GDT_ENTRY` is a macro which takes flags, base, limit and builds a GDT entry. For example, let's look at the code segment entry. `GDT_ENTRY` takes the following values:
|
||||
|
||||
* base - 0
|
||||
* limit - 0xfffff
|
||||
@ -481,7 +481,7 @@ Next we get a pointer to the GDT with:
|
||||
gdt.ptr = (u32)&boot_gdt + (ds() << 4);
|
||||
```
|
||||
|
||||
Here we just get the address of `boot_gdt` and add it to the address of the data segment left-shifted by 4 bits (remember we're in the real mode now).
|
||||
Here we just get the address of `boot_gdt` and add it to the address of the data segment left-shifted by 4 bits (remember we're in real mode now).
|
||||
|
||||
Lastly we execute the `lgdtl` instruction to load the GDT into the GDTR register:
|
||||
|
||||
@ -492,7 +492,7 @@ asm volatile("lgdtl %0" : : "m" (gdt));
|
||||
Actual transition into protected mode
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
This is the end of the `go_to_protected_mode` function. We loaded IDT, GDT, disable interruptions and now can switch the CPU into protected mode. The last step is calling the `protected_mode_jump` function with two parameters:
|
||||
This is the end of the `go_to_protected_mode` function. We loaded the IDT and GDT, disabled interrupts and now can switch the CPU into protected mode. The last step is calling the `protected_mode_jump` function with two parameters:
|
||||
|
||||
```C
|
||||
protected_mode_jump(boot_params.hdr.code32_start, (u32)&boot_params + (ds() << 4));
|
||||
@ -502,12 +502,12 @@ which is defined in [arch/x86/boot/pmjump.S](https://github.com/torvalds/linux/b
|
||||
|
||||
It takes two parameters:
|
||||
|
||||
* address of protected mode entry point
|
||||
* address of the protected mode entry point
|
||||
* address of `boot_params`
|
||||
|
||||
Let's look inside `protected_mode_jump`. As I wrote above, you can find it in `arch/x86/boot/pmjump.S`. The first parameter will be in the `eax` register and the second one is in `edx`.
|
||||
|
||||
First of all, we put the address of `boot_params` in the `esi` register and the address of code segment register `cs` (0x1000) in `bx`. After this, we shift `bx` by 4 bits and add it to the memory location labeled `2` (which is `bx << 4 + in_pm32`, the physical address to jump after transitioned to 32-bit mode) and jump to label `1`. Next we put data segment and task state segment in the `cx` and `di` registers with:
|
||||
First of all, we put the address of `boot_params` in the `esi` register and the address of the code segment register `cs` (0x1000) in `bx`. After this, we shift `bx` by 4 bits and add it to the memory location labeled `2` (which is `bx << 4 + in_pm32`, the physical address to jump after transitioned to 32-bit mode) and jump to label `1`. Next we put the data segment and the task state segment in the `cx` and `di` registers with:
|
||||
|
||||
```assembly
|
||||
movw $__BOOT_DS, %cx
|
||||
@ -537,16 +537,16 @@ where:
|
||||
* `0x66` is the operand-size prefix which allows us to mix 16-bit and 32-bit code
|
||||
* `0xea` - is the jump opcode
|
||||
* `in_pm32` is the segment offset
|
||||
* `__BOOT_CS` is the code segment
|
||||
* `__BOOT_CS` is the code segment we want to jump to.
|
||||
|
||||
After this we are finally in the protected mode:
|
||||
After this we are finally in protected mode:
|
||||
|
||||
```assembly
|
||||
.code32
|
||||
.section ".text32","ax"
|
||||
```
|
||||
|
||||
Let's look at the first steps in protected mode. First of all we set up the data segment with:
|
||||
Let's look at the first steps taken in protected mode. First of all we set up the data segment with:
|
||||
|
||||
```assembly
|
||||
movl %ecx, %ds
|
||||
@ -558,13 +558,13 @@ movl %ecx, %ss
|
||||
|
||||
If you paid attention, you can remember that we saved `$__BOOT_DS` in the `cx` register. Now we fill it with all segment registers besides `cs` (`cs` is already `__BOOT_CS`).
|
||||
|
||||
And setup valid stack for debugging purposes:
|
||||
And setup a valid stack for debugging purposes:
|
||||
|
||||
```assembly
|
||||
addl %ebx, %esp
|
||||
```
|
||||
|
||||
The last step before jump into 32-bit entry point is clearing of general purpose registers:
|
||||
The last step before the jump into 32-bit entry point is to clear the general purpose registers:
|
||||
|
||||
```assembly
|
||||
xorl %ecx, %ecx
|
||||
@ -582,12 +582,12 @@ jmpl *%eax
|
||||
|
||||
Remember that `eax` contains the address of the 32-bit entry (we passed it as the first parameter into `protected_mode_jump`).
|
||||
|
||||
That's all. We're in the protected mode and stop at it's entry point. We will see what happens next in the next part.
|
||||
That's all. We're in protected mode and stop at its entry point. We will see what happens next in the next part.
|
||||
|
||||
Conclusion
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
This is the end of the third part about linux kernel insides. In next part, we will see first steps in the protected mode and transition into the [long mode](http://en.wikipedia.org/wiki/Long_mode).
|
||||
This is the end of the third part about linux kernel insides. In the next part, we will look at the first steps we take in protected mode and transition into [long mode](http://en.wikipedia.org/wiki/Long_mode).
|
||||
|
||||
If you have any questions or suggestions write me a comment or ping me at [twitter](https://twitter.com/0xAX).
|
||||
|
||||
|
Loading…
Reference in New Issue
Block a user