mirror of
https://github.com/0xAX/linux-insides.git
synced 2024-12-22 14:48:08 +00:00
Last update of the Booting/linux-bootstrap-4.md
This commit is contained in:
parent
0cfd3af65f
commit
f6c57be9bb
@@ -4,7 +4,7 @@ Kernel booting process. Part 4.
Transition to 64-bit mode
--------------------------------------------------------------------------------
It is the fourth part of the `Kernel booting process` and we will see first steps in the [protected mode](http://en.wikipedia.org/wiki/Protected_mode), like checking that the CPU supports the [long mode](http://en.wikipedia.org/wiki/Long_mode) and [SSE](http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions), [paging](http://en.wikipedia.org/wiki/Paging) and initialization of the page tables and transition to the long mode at the end of this part.
This is the fourth part of the `Kernel booting process` series. Here we will see the first steps in [protected mode](http://en.wikipedia.org/wiki/Protected_mode), such as checking that the CPU supports [long mode](http://en.wikipedia.org/wiki/Long_mode) and [SSE](http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions), setting up [paging](http://en.wikipedia.org/wiki/Paging) and initializing the page tables, and, at the end of this part, the transition to [long mode](https://en.wikipedia.org/wiki/Long_mode).
**NOTE: there will be a lot of assembly code in this part, so if you are not familiar with it, you might want to read a book about it first**
@@ -386,44 +386,44 @@ Now we are almost finished with all preparations before we can move into 64-bit
Long mode
--------------------------------------------------------------------------------
Long mode is the native mode for x86_64 processors. First of all let's look at some differences between `x86_64` and `x86`.
[Long mode](https://en.wikipedia.org/wiki/Long_mode) is the native mode for [x86_64](https://en.wikipedia.org/wiki/X86-64) processors. First of all, let's look at some differences between `x86_64` and `x86`.
It provides features such as:
The `64-bit` mode provides features such as:
* New 8 general purpose registers from `r8` to `r15` + all general purpose registers are 64-bit now
* 64-bit instruction pointer - `RIP`
* New operating mode - Long mode
* 64-Bit Addresses and Operands
* RIP Relative Addressing (we will see an example of it in the next parts)
* 8 new general purpose registers, `r8` to `r15`, and all general purpose registers are now 64 bits wide;
* 64-bit instruction pointer - `RIP`;
* New operating mode - Long mode;
* 64-Bit Addresses and Operands;
* RIP Relative Addressing (we will see an example of it in the next parts).
Long mode is an extension of legacy protected mode. It consists of two sub-modes:
* 64-bit mode
* compatibility mode
* 64-bit mode;
* compatibility mode.
To switch into 64-bit mode we need to do the following things:
To switch into `64-bit` mode we need to do the following things:
* enable PAE (we already did it, see above)
* build page tables and load the address of the top level page table into the `cr3` register
* enable `EFER.LME`
* enable paging
* To enable [PAE](https://en.wikipedia.org/wiki/Physical_Address_Extension);
* To build page tables and load the address of the top level page table into the `cr3` register;
* To enable `EFER.LME`;
* To enable paging.
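
The four steps in the list above can be modelled on a toy CPU state to make the order and the bits involved explicit. This is only an illustrative sketch: the `cpu` dictionary, the helper names and the `0x100000` table address are assumptions made for the example, while the bit positions are the ones documented for the `cr4` and `cr0` registers and the `EFER` model specific register.

```python
# Bit positions documented for the x86_64 control registers and the EFER MSR.
CR0_PE = 1 << 0          # cr0.PE: protected mode, already enabled at this point
CR0_PG = 1 << 31         # cr0.PG: paging
CR4_PAE = 1 << 5         # cr4.PAE: Physical Address Extension
MSR_EFER = 0xC0000080    # index of the Extended Feature Enable Register
EFER_LME = 1 << 8        # EFER.LME: Long Mode Enable


def enter_long_mode(cpu, pml4_address):
    """Apply the four steps, in order, to a toy CPU state (a plain dict)."""
    cpu["cr4"] |= CR4_PAE        # 1. enable PAE
    cpu["cr3"] = pml4_address    # 2. load the top level page table address
    cpu["efer"] |= EFER_LME      # 3. enable EFER.LME
    cpu["cr0"] |= CR0_PG         # 4. enable paging
    return cpu


def long_mode_active(cpu):
    """Recompute the condition under which the processor activates long mode."""
    return bool(cpu["cr4"] & CR4_PAE
                and cpu["efer"] & EFER_LME
                and cpu["cr0"] & CR0_PG)


cpu = {"cr0": CR0_PE, "cr3": 0, "cr4": 0, "efer": 0}
enter_long_mode(cpu, pml4_address=0x100000)  # hypothetical table address
print(hex(cpu["cr0"]), hex(cpu["cr4"]), hex(cpu["efer"]), long_mode_active(cpu))
```

Paging is enabled last on purpose: once the `PG` bit is set the processor starts to translate addresses, so `cr3` must already point to valid page tables.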
We already enabled `PAE` by setting the PAE bit in the `cr4` register. Now let's look at paging.
We already enabled `PAE` by setting the `PAE` bit in the `cr4` control register. Our next goal is to build the structures for [paging](https://en.wikipedia.org/wiki/Paging). We will see this in the next section.
Early page tables initialization
--------------------------------------------------------------------------------
Before we can move into 64-bit mode, we need to build page tables, so, let's look at the building of early 4G boot page tables.
We already know that we need to build page tables before we can move into `64-bit` mode, so let's look at how the early `4G` boot page tables are built.
**NOTE: I will not describe theory of virtual memory here, if you need to know more about it, see links in the end**
**NOTE: I will not describe the theory of virtual memory here. If you need to know more about it, see the links at the end of this part**
The Linux kernel uses 4-level paging, and generally we build 6 page tables:
The Linux kernel uses `4-level` paging, and generally we build 6 page tables:
* One PML4 table
* One PDP table
* Four Page Directory tables
* One `PML4` or `Page Map Level 4` table;
* One `PDP` or `Page Directory Pointer` table;
* Four Page Directory tables.
Let's look at the implementation of it. First of all we clear the buffer for the page tables in memory. Every table is 4096 bytes, so we need 24 kilobytes buffer:
Let's look at the implementation of this. First of all we clear the buffer for the page tables in memory. Every table is `4096` bytes, so we need to clear a `24` kilobyte buffer:
```assembly
leal pgtable(%ebx), %edi
@@ -432,7 +432,9 @@ Let's look at the implementation of it. First of all we clear the buffer for the
rep stosl
```
We put the address stored in `ebx` (remember that `ebx` contains the address to relocate the kernel for decompression) with `pgtable` offset to the `edi` register. `pgtable` is defined in the end of `head_64.S` and looks:
We put the address of `pgtable` relative to `ebx` (remember that `ebx` contains the address to relocate the kernel for decompression) into the `edi` register, clear the `eax` register and put `6144` into the `ecx` register. The `rep stosl` instruction writes the value of `eax` to the address in `edi`, increases the value of `edi` by `4` and decreases the value of `ecx` by `1`. This operation is repeated while the value of `ecx` is greater than zero. Every store writes `4` bytes, so `6144` iterations clear `6144 * 4 = 24576` bytes, which is exactly our `24` kilobyte buffer. That's why we put the magic number `6144` into `ecx`.
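
To make the counting concrete, here is a small model of what `rep stosl` does with these register values. It is a sketch under simplifying assumptions: memory is a `bytearray` that starts at offset `0`, the direction flag is taken to be clear, and `rep_stosl` is a helper name invented for this example.

```python
def rep_stosl(memory, edi, eax, ecx):
    """Model `rep stosl` with the direction flag clear: store ecx dwords of eax."""
    while ecx != 0:
        memory[edi:edi + 4] = eax.to_bytes(4, "little")
        edi += 4   # the destination pointer advances by 4 bytes per store
        ecx -= 1   # the counter drops by one per stored dword
    return edi, ecx


TABLE_SIZE = 4096
BUFFER_SIZE = 6 * TABLE_SIZE                 # six tables, 24 kilobytes
memory = bytearray(b"\xff" * BUFFER_SIZE)    # pretend the buffer holds garbage

edi, ecx = rep_stosl(memory, edi=0, eax=0, ecx=BUFFER_SIZE // 4)
print(BUFFER_SIZE // 4, edi, ecx, any(memory))
```

The counter is the buffer size divided by four because each store handles one 4-byte value, which is where `6144` comes from.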
The `pgtable` is defined at the end of the [arch/x86/boot/compressed/head_64.S](https://github.com/torvalds/linux/blob/master/arch/x86/boot/compressed/head_64.S) assembly file and looks like this:
```assembly
.section ".pgtable","a",@nobits
@@ -441,9 +443,9 @@ pgtable:
.fill 6*4096, 1, 0
```
It is in the `.pgtable` section and its size is 24 kilobytes. After we put the address in `edi`, we zero out the `eax` register and write zeros to the buffer with the `rep stosl` instruction.
As we can see, it is located in the `.pgtable` section and its size is `24` kilobytes.
Now we can build the top level page table - `PML4` - with:
Now that we have a buffer for the `pgtable` structure, we can start to build the top level page table - `PML4` - with:
```assembly
leal pgtable + 0(%ebx), %edi
@@ -451,9 +453,9 @@ Now we can build the top level page table - `PML4` - with:
movl %eax, 0(%edi)
```
Here we get the address stored in the `ebx` with `pgtable` offset and put it in `edi`. Next we put this address with offset `0x1007` in the `eax` register. `0x1007` is 4096 bytes (size of the PML4) + 7 (PML4 entry flags - `PRESENT+RW+USER`) and puts `eax` in `edi`. After this manipulation `edi` will contain the address of the first Page Directory Pointer Entry with flags - `PRESENT+RW+USER`.
Here again, we put the address of `pgtable` relative to `ebx`, or in other words relative to the address of `startup_32`, into the `edi` register. Next we put this address plus the offset `0x1007` into the `eax` register. `0x1007` is `4096` bytes, which is the size of the `PML4`, plus `7`. The `7` here represents the flags of the `PML4` entry. In our case, these flags are `PRESENT+RW+USER`. In the end we just write the address of the first `PDP` table, together with these flags, to the first entry of the `PML4`.
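
The value `0x1007` packs an offset and three flag bits into one number, which works because every table is aligned to `4096` bytes and the low 12 bits of its address are always zero. The sketch below splits such an entry back into its parts. The `pgtable` address and the helper names are assumptions made for the illustration; the three flag bits are the ones named in the text.

```python
PAGE_PRESENT = 1 << 0
PAGE_RW = 1 << 1
PAGE_USER = 1 << 2
FLAG_NAMES = {PAGE_PRESENT: "PRESENT", PAGE_RW: "RW", PAGE_USER: "USER"}


def make_entry(table_address, flags):
    """Combine a 4096-byte aligned table address with flag bits."""
    assert table_address % 4096 == 0, "tables must be page aligned"
    return table_address | flags


def decode_entry(entry):
    """Split an entry into the table address and the names of the set flags."""
    address = entry & ~0xFFF
    flags = [name for bit, name in FLAG_NAMES.items() if entry & bit]
    return address, flags


pgtable = 0x1000000  # hypothetical address of the page table buffer
pml4_entry = make_entry(pgtable + 0x1000, PAGE_PRESENT | PAGE_RW | PAGE_USER)

# Adding 0x1007 to an aligned address gives the same result as the bitwise OR.
assert pml4_entry == pgtable + 0x1007

address, flags = decode_entry(pml4_entry)
print(hex(pml4_entry), hex(address), flags)
```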
In the next step we build 4 Page Directory entries in the Page Directory Pointer table with `0x7` flags or present, write, userspace (`PRESENT | WRITE | USER`):
In the next step we will build four `Page Directory` entries in the `Page Directory Pointer` table with the same `PRESENT+RW+USER` flags:
```assembly
leal pgtable + 0x1000(%ebx), %edi
@@ -466,11 +468,7 @@ In the next step we build 4 Page Directory entries in the Page Directory Pointer
jnz 1b
```
We put the base address of the page directory pointer table in `edi` and the address of the first page directory pointer entry in `eax`. Put `4` in the `ecx` register, it will be a counter in the following loop and write the address of the first page directory pointer table entry to the `edi` register.
After this `edi` will contain the address of the first page directory pointer entry with flags `0x7`. Next we just calculate the address of following page directory pointer entries where each entry is 8 bytes, and write their addresses to `eax`.
The next step is building the `2048` page table entries with 2-MByte page:
We put the base address of the page directory pointer table, which is at offset `4096` or `0x1000` from `pgtable`, into `edi`, and the address of the first page directory together with the flags `0x7` into the `eax` register. We put `4` into the `ecx` register, where it serves as the counter for the following loop. On each iteration the loop writes the value of `eax` to the entry that `edi` points to, moves `eax` on to the next page directory and advances `edi` by `8` bytes, which is the size of one entry. After the loop the page directory pointer table contains four entries, each with the flags `0x7`. The next step is building the `2048` page table entries with `2-MByte` pages:
```assembly
leal pgtable + 0x2000(%ebx), %edi
@@ -483,16 +481,21 @@ The next step is building the `2048` page table entries with 2-MByte page:
jnz 1b
```
Here we do almost the same as in the previous example, all entries will be with flags - `$0x00000183` - `PRESENT + WRITE + MBZ`. In the end we will have 2048 pages with 2-MByte page.
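Here we do almost the same as in the previous example. All entries get the flags `$0x00000183`, which sets the present, writable, page size and global bits; the page size bit is what makes each entry map a `2-MByte` page directly. In the end we will have `2048` pages of `2-MByte` each, or: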
Our early page table structure are done, it maps 4 gigabytes of memory and now we can put the address of the high-level page table - `PML4` - in `cr3` control register:
```python
>>> 2048 * 0x00200000
4294967296
```
that is, a `4G` page table. We have just finished building our early page table structure, which maps `4` gigabytes of memory, and now we can put the address of the top level page table - `PML4` - in the `cr3` control register:
```assembly
leal pgtable(%ebx), %eax
movl %eax, %cr3
```
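
To check that the three levels described in this section really map the first `4` gigabytes one to one, the whole structure can be rebuilt and walked in a few lines. This is a model for illustration and not the kernel's code: `PGTABLE` is a hypothetical load address, the helper names are invented, and the strides (`0x1000` between page directories, `0x200000` between pages) are taken from the description in the text because the diff does not show the full listings.

```python
import struct

PGTABLE = 0x1000000               # hypothetical address of the 24 KB buffer
memory = bytearray(6 * 4096)      # one PML4, one PDP and four page directories


def write_entry(address, value):
    struct.pack_into("<Q", memory, address - PGTABLE, value)


def read_entry(address):
    return struct.unpack_from("<Q", memory, address - PGTABLE)[0]


# Level 4: a single PML4 entry that points to the PDP table.
write_entry(PGTABLE, PGTABLE + 0x1007)

# Level 3: four PDP entries, one per page directory.
for i in range(4):
    write_entry(PGTABLE + 0x1000 + 8 * i, PGTABLE + 0x2007 + 0x1000 * i)

# Level 2: 2048 page directory entries, each mapping one 2 MB page.
for i in range(2048):
    write_entry(PGTABLE + 0x2000 + 8 * i, 0x183 + 0x200000 * i)


def translate(vaddr, cr3=PGTABLE):
    """Walk the three levels for an address below 4 GB."""
    pml4e = read_entry(cr3 + 8 * ((vaddr >> 39) & 0x1FF))
    pdpte = read_entry((pml4e & ~0xFFF) + 8 * ((vaddr >> 30) & 0x1FF))
    pde = read_entry((pdpte & ~0xFFF) + 8 * ((vaddr >> 21) & 0x1FF))
    assert pml4e & pdpte & pde & 1, "an entry on the path is not present"
    return (pde & ~0x1FFFFF) + (vaddr & 0x1FFFFF)


for vaddr in (0x0, 0x12345678, 0xFFFFFFFF):
    print(hex(vaddr), "->", hex(translate(vaddr)))
```

Every address below `4G` translates to itself, which is what lets the code keep running from the same addresses once paging is switched on.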
That's all. Now we can see transition to the long mode.
That's all. All preparations are finished, and now we can look at the transition to long mode.
Transition to long mode
--------------------------------------------------------------------------------