1
0
mirror of https://github.com/0xAX/linux-insides.git synced 2025-01-03 04:10:56 +00:00

Fix typos in repository

This commit is contained in:
Anton Davydov 2015-04-19 22:15:28 +03:00
parent fd4520fee6
commit 984ef074ec
12 changed files with 47 additions and 47 deletions

View File

@ -11,7 +11,7 @@ Protected mode
Before we can move to the native Intel64 [Long mode](http://en.wikipedia.org/wiki/Long_mode), the kernel must switch the CPU into protected mode. What is the protected mode? The Protected mode was first added to the x86 architecture in 1982 and was the main mode of Intel processors from [80286](http://en.wikipedia.org/wiki/Intel_80286) processor until Intel 64 and long mode. The Main reason to move away from the real mode that there is very limited access to the RAM. As you can remember from the previous part, there is only 2^20 bytes or 1 megabyte, sometimes even only 640 kilobytes. Before we can move to the native Intel64 [Long mode](http://en.wikipedia.org/wiki/Long_mode), the kernel must switch the CPU into protected mode. What is the protected mode? The Protected mode was first added to the x86 architecture in 1982 and was the main mode of Intel processors from [80286](http://en.wikipedia.org/wiki/Intel_80286) processor until Intel 64 and long mode. The Main reason to move away from the real mode that there is very limited access to the RAM. As you can remember from the previous part, there is only 2^20 bytes or 1 megabyte, sometimes even only 640 kilobytes.
Protected mode brought many changes, but the main is a different memory management.The 24-bit address bus was replaced with a 32-bit address bus. It allows to access to 4 gigabytes of physical adress space. Also [paging](http://en.wikipedia.org/wiki/Paging) support was added which we will see in the next parts. Protected mode brought many changes, but the main is a different memory management.The 24-bit address bus was replaced with a 32-bit address bus. It allows to access to 4 gigabytes of physical address space. Also [paging](http://en.wikipedia.org/wiki/Paging) support was added which we will see in the next parts.
Memory management in the protected mode is divided into two, almost independent parts: Memory management in the protected mode is divided into two, almost independent parts:

View File

@ -317,7 +317,7 @@ struct mem_vector {
The next step after we collected all unsafe memory regions in the `mem_avoid` array will be search of the random address which does not overlap with the unsafe regions with the `find_random_addr` function. The next step after we collected all unsafe memory regions in the `mem_avoid` array will be search of the random address which does not overlap with the unsafe regions with the `find_random_addr` function.
First of all we can see allign of the output address in the `find_random_addr` function: First of all we can see align of the output address in the `find_random_addr` function:
```C ```C
minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN); minimum = ALIGN(minimum, CONFIG_PHYSICAL_ALIGN);
@ -399,7 +399,7 @@ static unsigned long slots[CONFIG_RANDOMIZE_BASE_MAX_OFFSET /
static unsigned long slot_max; static unsigned long slot_max;
``` ```
After `process_e820_entry` will be executed, we will have array of the addressess which are safe for the decompressed kernel. Next we call `slots_fetch_random` function for getting random item from this array: After `process_e820_entry` will be executed, we will have array of the addresses which are safe for the decompressed kernel. Next we call `slots_fetch_random` function for getting random item from this array:
```C ```C
if (slot_max == 0) if (slot_max == 0)
@ -418,7 +418,7 @@ After all these checks will see the familiar message:
Decompressing Linux... Decompressing Linux...
``` ```
and call `decompress` function which will decompress the kernel. `decompress` function depends on what decompression algorithm was choosen during kernel compilartion: and call `decompress` function which will decompress the kernel. `decompress` function depends on what decompression algorithm was chosen during kernel compilartion:
```C ```C
#ifdef CONFIG_KERNEL_GZIP #ifdef CONFIG_KERNEL_GZIP

View File

@ -25,6 +25,6 @@ If you want to contribute to [linux-insides](https://github.com/0xAX/linux-insid
**IMPORTANT** **IMPORTANT**
Please, make the actual changes. While you made your changes, I can merge changes from somebody else and your changes can conflict with `master` branch content. Please rebase on master everytime before you're going to push your changes and check that your branch doesn't conflict with `master`. Please, make the actual changes. While you made your changes, I can merge changes from somebody else and your changes can conflict with `master` branch content. Please rebase on master every time before you're going to push your changes and check that your branch doesn't conflict with `master`.
Thank you. Thank you.

View File

@ -19,7 +19,7 @@ set_cpu_present(cpu, true);
set_cpu_possible(cpu, true); set_cpu_possible(cpu, true);
``` ```
`set_cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot. `cpu_present` represents which CPUs are currently plugged in. `cpu_online` represents subset of the `cpu_present` and indicates CPUs which are available for scheduling. These masks depends on `CONFIG_HOTPLUG_CPU` configuration option and if this option is disabled `possible == present` and `active == online`. Implementation of the all of these functions are very similar. Every function checks the second paramter. If it is `true`, calls `cpumask_set_cpu` or `cpumask_clear_cpu` otherwise. `set_cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot. `cpu_present` represents which CPUs are currently plugged in. `cpu_online` represents subset of the `cpu_present` and indicates CPUs which are available for scheduling. These masks depends on `CONFIG_HOTPLUG_CPU` configuration option and if this option is disabled `possible == present` and `active == online`. Implementation of the all of these functions are very similar. Every function checks the second parameter. If it is `true`, calls `cpumask_set_cpu` or `cpumask_clear_cpu` otherwise.
There are two ways for a `cpumask` creation. First is to use `cpumask_t`. It defined as: There are two ways for a `cpumask` creation. First is to use `cpumask_t`. It defined as:
@ -27,7 +27,7 @@ There are two ways for a `cpumask` creation. First is to use `cpumask_t`. It def
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t; typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
``` ```
It wraps `cpumask` structure which contains one bitmak `bits` field. `DECLARE_BITMAP` macro gets two paramters: It wraps `cpumask` structure which contains one bitmak `bits` field. `DECLARE_BITMAP` macro gets two parameters:
* bitmap name; * bitmap name;
* number of bits. * number of bits.
@ -70,7 +70,7 @@ The second way to define cpumask is to use `DECLARE_BITMAP` macro directly and `
: (void *)sizeof(__check_is_bitmap(bitmap)))) : (void *)sizeof(__check_is_bitmap(bitmap))))
``` ```
We can see ternary operator operator here which is `true` everytime. `__check_is_bitmap` inline function defined as: We can see ternary operator operator here which is `true` every time. `__check_is_bitmap` inline function defined as:
```C ```C
static inline int __check_is_bitmap(const unsigned long *bitmap) static inline int __check_is_bitmap(const unsigned long *bitmap)
@ -79,7 +79,7 @@ static inline int __check_is_bitmap(const unsigned long *bitmap)
} }
``` ```
And returns `1` everytime. We need in it here only for one purpose: In compile time it checks that given `bitmap` is a bitmap, or with another words it checks that given `bitmap` has type - `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting array of `unsigned long` to the `struct cpumask *`. And returns `1` every time. We need in it here only for one purpose: In compile time it checks that given `bitmap` is a bitmap, or with another words it checks that given `bitmap` has type - `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting array of `unsigned long` to the `struct cpumask *`.
cpumask API cpumask API
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
@ -103,7 +103,7 @@ void set_cpu_online(unsigned int cpu, bool online)
} }
``` ```
First of all it checks the second `state` paramter and calls `cpumask_set_cpu` or `cpumask_clear_cpu` depends on it. Here we can see casting to the `struct cpumask *` of the second paramter in the `cpumask_set_cpu`. In our case it is `cpu_online_bits` which is bitmap and defined as: First of all it checks the second `state` parameter and calls `cpumask_set_cpu` or `cpumask_clear_cpu` depends on it. Here we can see casting to the `struct cpumask *` of the second parameter in the `cpumask_set_cpu`. In our case it is `cpu_online_bits` which is bitmap and defined as:
```C ```C
static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly; static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly;
@ -118,7 +118,7 @@ static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
} }
``` ```
`set_bit` function takes two paramter too, and sets a given bit (first paramter) in the memory (second paramter or `cpu_online_bits` bitmap). We can see here that before `set_bit` will be called, its two paramter will be passed to the `set_bit` function takes two parameter too, and sets a given bit (first parameter) in the memory (second parameter or `cpu_online_bits` bitmap). We can see here that before `set_bit` will be called, its two parameter will be passed to the
* cpumask_check; * cpumask_check;
* cpumask_bits. * cpumask_bits.
@ -153,7 +153,7 @@ This function looks scarry, but it is not so hard as it seems. First of all it p
#define IS_IMMEDIATE(nr) (__builtin_constant_p(nr)) #define IS_IMMEDIATE(nr) (__builtin_constant_p(nr))
``` ```
`__builtin_constant_p` checks that given paramter is known constant at compile-time. As our `cpu` is not compile-time constant, `else` clause will be executed: `__builtin_constant_p` checks that given parameter is known constant at compile-time. As our `cpu` is not compile-time constant, `else` clause will be executed:
```C ```C
asm volatile(LOCK_PREFIX "bts %1,%0" : BITOP_ADDR(addr) : "Ir" (nr) : "memory"); asm volatile(LOCK_PREFIX "bts %1,%0" : BITOP_ADDR(addr) : "Ir" (nr) : "memory");
@ -163,7 +163,7 @@ Let's try to understand how it works step by step:
`LOCK_PREFIX` is a x86 `lock` instruction. This instruction tells to the cpu to occupy the system bus while instruction will be executed. This allows to synchronize memory access, preventing simultaneous access of multiple processors (or devices - DMA controller for example) to one memory cell. `LOCK_PREFIX` is a x86 `lock` instruction. This instruction tells to the cpu to occupy the system bus while instruction will be executed. This allows to synchronize memory access, preventing simultaneous access of multiple processors (or devices - DMA controller for example) to one memory cell.
`BITOP_ADDR` casts given paramter to the `(*(volatile long *)` and adds `+m` constraints. `+` means that this operand is bot read and written by the instruction. `m` shows that this is memory operand. `BITOP_ADDR` is defined as: `BITOP_ADDR` casts given parameter to the `(*(volatile long *)` and adds `+m` constraints. `+` means that this operand is bot read and written by the instruction. `m` shows that this is memory operand. `BITOP_ADDR` is defined as:
```C ```C
#define BITOP_ADDR(x) "+m" (*(volatile long *) (x)) #define BITOP_ADDR(x) "+m" (*(volatile long *) (x))
@ -187,7 +187,7 @@ cpumask provides the set of macro for getting amount of the CPUs with different
#define num_online_cpus() cpumask_weight(cpu_online_mask) #define num_online_cpus() cpumask_weight(cpu_online_mask)
``` ```
This macro returns amount of the `online` CPUs. It calls `cpumask_weight` function with the `cpu_online_mask` bitmap (read about about it). `cpumask_wieght` function makes an one call of the `bitmap_wiegt` function with two paramters: This macro returns amount of the `online` CPUs. It calls `cpumask_weight` function with the `cpu_online_mask` bitmap (read about about it). `cpumask_wieght` function makes an one call of the `bitmap_wiegt` function with two parameters:
* cpumask bitmap; * cpumask bitmap;
* `nr_cpumask_bits` - which is `NR_CPUS` in our case. * `nr_cpumask_bits` - which is `NR_CPUS` in our case.

View File

@ -14,13 +14,13 @@ Kernel provides API for creating per-cpu variables - `DEFINE_PER_CPU` macro:
This macro defined in the [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/master/include/linux/percpu-defs.h) as many other macros for work with per-cpu variables. Now we will see how this feature implemented. This macro defined in the [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/master/include/linux/percpu-defs.h) as many other macros for work with per-cpu variables. Now we will see how this feature implemented.
Take a look on `DECLARE_PER_CPU` definition. We see that it takes 2 paramters: `type` and `name`. So we can use it for creation per-cpu variable, for example like this: Take a look on `DECLARE_PER_CPU` definition. We see that it takes 2 parameters: `type` and `name`. So we can use it for creation per-cpu variable, for example like this:
```C ```C
DEFINE_PER_CPU(int, per_cpu_n) DEFINE_PER_CPU(int, per_cpu_n)
``` ```
We pass type of our variable and name. `DEFI_PER_CPU` calls `DEFINE_PER_CPU_SECTION` macro and passes the same two paramaters and empty string to it. Let's look on the defintion of the `DEFINE_PER_CPU_SECTION`: We pass type of our variable and name. `DEFI_PER_CPU` calls `DEFINE_PER_CPU_SECTION` macro and passes the same two paramaters and empty string to it. Let's look on the definition of the `DEFINE_PER_CPU_SECTION`:
```C ```C
#define DEFINE_PER_CPU_SECTION(type, name, sec) \ #define DEFINE_PER_CPU_SECTION(type, name, sec) \
@ -83,7 +83,7 @@ and
#define raw_cpu_ptr(ptr) per_cpu_ptr(ptr, 0) #define raw_cpu_ptr(ptr) per_cpu_ptr(ptr, 0)
``` ```
where `per_cpu_ptr` returns a pointer to the per-cpu variable for the given cpu (second paramter). After that we got per-cpu variables and made any manipulations on it, we must call `put_cpu_var` macro which enables preemption with call of `preempt_enable` function. So the typical usage of a per-cpu variable is following: where `per_cpu_ptr` returns a pointer to the per-cpu variable for the given cpu (second parameter). After that we got per-cpu variables and made any manipulations on it, we must call `put_cpu_var` macro which enables preemption with call of `preempt_enable` function. So the typical usage of a per-cpu variable is following:
```C ```C
get_cpu_var(var); get_cpu_var(var);
@ -115,7 +115,7 @@ do {
which makes given `ptr` type of `const void __percpu *`, which makes given `ptr` type of `const void __percpu *`,
After this we can see the call of the `SHIFT_PERCPU_PTR` macro with two paramters. At first paramter we pass our ptr and sencond we pass cpu number to the `per_cpu_offset` macro which: After this we can see the call of the `SHIFT_PERCPU_PTR` macro with two parameters. At first parameter we pass our ptr and sencond we pass cpu number to the `per_cpu_offset` macro which:
```C ```C
#define per_cpu_offset(x) (__per_cpu_offset[x]) #define per_cpu_offset(x) (__per_cpu_offset[x])
@ -141,7 +141,7 @@ That's all! Of course it is not full API, but the general part. It can be hard f
Let's again look on the algorithm of getting pointer on per-cpu variable: Let's again look on the algorithm of getting pointer on per-cpu variable:
* Kernel creates multiply `.data..percpu` sections (ones perc-pu) during intialization process; * Kernel creates multiply `.data..percpu` sections (ones perc-pu) during initialization process;
* All variables created with the `DEFINE_PER_CPU` macro will be reloacated to the first section or for CPU0; * All variables created with the `DEFINE_PER_CPU` macro will be reloacated to the first section or for CPU0;
* `__per_cpu_offset` array filled with the distance (`BOOT_PERCPU_OFFSET`) between `.data..percpu` sections; * `__per_cpu_offset` array filled with the distance (`BOOT_PERCPU_OFFSET`) between `.data..percpu` sections;
* When `per_cpu_ptr` called for example for getting pointer on the certain per-cpu variable for the third CPU, `__per_cpu_offset` array will be accessed, where every index points to the certain CPU. * When `per_cpu_ptr` called for example for getting pointer on the certain per-cpu variable for the third CPU, `__per_cpu_offset` array will be accessed, where every index points to the certain CPU.

View File

@ -133,7 +133,7 @@ static inline void list_add(struct list_head *new, struct list_head *head)
} }
``` ```
It just calls internal function `__list_add` with the 3 given paramters: It just calls internal function `__list_add` with the 3 given parameters:
* new - new entry; * new - new entry;
* head - list head after which will be inserted new item; * head - list head after which will be inserted new item;
@ -230,7 +230,7 @@ The next offsetof macro calculates offset from the beginning of the structure to
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
``` ```
Let's summarize all about `container_of` macro. `container_of` macro returns address of the structure by the given address of the structure's field with `list_head` type, the name of the structure field with `list_head` type and type of the container structure. At the first line this macro declares the `__mptr` pointer which points to the field of the structure that `ptr` points to and assigns it to the `ptr`. Now `ptr` and `__mptr` points have the same address. Techincally we no need in this line, it needs for the type checking. First line ensures that that given structure (`type` paramter) has a member which called `member`. In the second line it calculates offset of the field from the structure with the `offsetof` macro and substracts it from the structure address. That's all. Let's summarize all about `container_of` macro. `container_of` macro returns address of the structure by the given address of the structure's field with `list_head` type, the name of the structure field with `list_head` type and type of the container structure. At the first line this macro declares the `__mptr` pointer which points to the field of the structure that `ptr` points to and assigns it to the `ptr`. Now `ptr` and `__mptr` points have the same address. Techincally we no need in this line, it needs for the type checking. First line ensures that that given structure (`type` parameter) has a member which called `member`. In the second line it calculates offset of the field from the structure with the `offsetof` macro and subtracts it from the structure address. That's all.
Of course `list_add` and `list_entry` is not only functions which provides `<linux/list.h>`. Implementation of the doubly linked list provides the following API: Of course `list_add` and `list_entry` is not only functions which provides `<linux/list.h>`. Implementation of the doubly linked list provides the following API:

View File

@ -3,7 +3,7 @@
You will find here a couple of posts which describe the full cycle of kernel initialization from its first steps after the kernel has decompressed to the start of the first process run by the kernel itself. You will find here a couple of posts which describe the full cycle of kernel initialization from its first steps after the kernel has decompressed to the start of the first process run by the kernel itself.
* [First steps after kernel decompression](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-1.md) - describes first steps in the kernel. * [First steps after kernel decompression](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-1.md) - describes first steps in the kernel.
* [Early interupt and exception handling](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-2.md) - describes early interrupts initialization and early page fault handler. * [Early interrupt and exception handling](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-2.md) - describes early interrupts initialization and early page fault handler.
* [Last preparations before the kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-3.md) - describes the last preparations before the call of the `start_kernel`. * [Last preparations before the kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-3.md) - describes the last preparations before the call of the `start_kernel`.
* [Kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-4.md) - describes first steps in the kernel generic code. * [Kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-4.md) - describes first steps in the kernel generic code.
* [Continue of architecture-specific initializations](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-5.md) - describes architecture-specific initialization. * [Continue of architecture-specific initializations](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-5.md) - describes architecture-specific initialization.

View File

@ -387,7 +387,7 @@ As we got `init_per_cpu__gdt_page` in `INIT_PER_CPU_VAR` and `INIT_PER_CPU` macr
Generally per-CPU variables is a 2.6 kernel feature. You can understand what is it from it's name. When we create `per-CPU` variable, each CPU will have will have it's own copy of this variable. Here we creating `gdt_page` per-CPU variable. There are many advantages for variables of this type, like there are no locks, because each CPU works with it's own copy of variable and etc... So every core on multiprocessor will have it's own `GDT` table and every entry in the table will represent a memory segment which can be accessed from the thread which runned on the core. You can read in details about `per-CPU` variables in the [Theory/per-cpu](http://0xax.gitbooks.io/linux-insides/content/Theory/per-cpu.html) post. Generally per-CPU variables is a 2.6 kernel feature. You can understand what is it from it's name. When we create `per-CPU` variable, each CPU will have will have it's own copy of this variable. Here we creating `gdt_page` per-CPU variable. There are many advantages for variables of this type, like there are no locks, because each CPU works with it's own copy of variable and etc... So every core on multiprocessor will have it's own `GDT` table and every entry in the table will represent a memory segment which can be accessed from the thread which runned on the core. You can read in details about `per-CPU` variables in the [Theory/per-cpu](http://0xax.gitbooks.io/linux-insides/content/Theory/per-cpu.html) post.
As we loaded new Global Descriptor Table, we reload segments as we did it everytime: As we loaded new Global Descriptor Table, we reload segments as we did it every time:
```assembly ```assembly
xorl %eax,%eax xorl %eax,%eax

View File

@ -136,7 +136,7 @@ As we have `init_level4_pgt` filled with zeros, we set the last `init_level4_pgt
init_level4_pgt[511] = early_level4_pgt[511]; init_level4_pgt[511] = early_level4_pgt[511];
``` ```
Remeber that we dropped all `early_level4_pgt` entries in the `reset_early_page_table` function and kept only kernel high mapping there. Remember that we dropped all `early_level4_pgt` entries in the `reset_early_page_table` function and kept only kernel high mapping there.
The last step in the `x86_64_start_kernel` function is the call of the: The last step in the `x86_64_start_kernel` function is the call of the:
@ -250,7 +250,7 @@ lowmem = min(lowmem, LOWMEM_CAP);
memblock_reserve(lowmem, 0x100000 - lowmem); memblock_reserve(lowmem, 0x100000 - lowmem);
``` ```
`memblock_reserve` function is defined at [mm/block.c](https://github.com/torvalds/linux/blob/master/mm/block.c) and takes two paramters: `memblock_reserve` function is defined at [mm/block.c](https://github.com/torvalds/linux/blob/master/mm/block.c) and takes two parameters:
* base physical address; * base physical address;
* region size. * region size.
@ -266,7 +266,7 @@ In the previous paragraph we stopped at the call of the `memblock_reserve` funct
memblock_reserve_region(base, size, MAX_NUMNODES, 0); memblock_reserve_region(base, size, MAX_NUMNODES, 0);
``` ```
function and passes 4 paramters there: function and passes 4 parameters there:
* physical base address of the memory region; * physical base address of the memory region;
* size of the memory region; * size of the memory region;
@ -382,7 +382,7 @@ NUMA node id depends on `MAX_NUMNODES` macro which is defined in the [include/li
#define MAX_NUMNODES (1 << NODES_SHIFT) #define MAX_NUMNODES (1 << NODES_SHIFT)
``` ```
where `NODES_SHIFT` depends on `CONFIG_NODES_SHIFT` configuration paramter and defined as: where `NODES_SHIFT` depends on `CONFIG_NODES_SHIFT` configuration parameter and defined as:
```C ```C
#ifdef CONFIG_NODES_SHIFT #ifdef CONFIG_NODES_SHIFT

View File

@ -235,13 +235,13 @@ Remember that we have passed `cpu_number` as `pcp` to the `this_cpu_read` from t
}) })
``` ```
Yes, it look a little strange, but it's easy. First of all we can see defintion of the `pscr_ret__` variable with the `int` type. Why int? Ok, `variable` is `common_cpu` and it was declared as per-cpu int variable: Yes, it look a little strange, but it's easy. First of all we can see definition of the `pscr_ret__` variable with the `int` type. Why int? Ok, `variable` is `common_cpu` and it was declared as per-cpu int variable:
```C ```C
DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number); DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
``` ```
In the next step we call `__verify_pcpu_ptr` with the address of `cpu_number`. `__veryf_pcpu_ptr` used to verifying that given paramter is an per-cpu pointer. After that we set `pscr_ret__` value which depends on the size of the variable. Our `common_cpu` variable is `int`, so it 4 bytes size. It means that we will get `this_cpu_read_4(common_cpu)` in `pscr_ret__`. In the end of the `__pcpu_size_call_return` we just call it. `this_cpu_read_4` is a macro: In the next step we call `__verify_pcpu_ptr` with the address of `cpu_number`. `__veryf_pcpu_ptr` used to verifying that given parameter is an per-cpu pointer. After that we set `pscr_ret__` value which depends on the size of the variable. Our `common_cpu` variable is `int`, so it 4 bytes size. It means that we will get `this_cpu_read_4(common_cpu)` in `pscr_ret__`. In the end of the `__pcpu_size_call_return` we just call it. `this_cpu_read_4` is a macro:
```C ```C
#define this_cpu_read_4(pcp) percpu_from_op("mov", pcp) #define this_cpu_read_4(pcp) percpu_from_op("mov", pcp)
@ -276,9 +276,9 @@ set_cpu_present(cpu, true);
set_cpu_possible(cpu, true); set_cpu_possible(cpu, true);
``` ```
All of these functions use the concept - `cpumask`. `cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot. `cpu_present` represents which CPUs are currently plugged in. `cpu_online` represents subset of the `cpu_present` and indicates CPUs which are available for scheduling. These masks depends on `CONFIG_HOTPLUG_CPU` configuration option and if this option is disabled `possible == present` and `active == online`. Implementation of the all of these functions are very similar. Every function checks the second paramter. If it is `true`, calls `cpumask_set_cpu` or `cpumask_clear_cpu` otherwise. All of these functions use the concept - `cpumask`. `cpu_possible` is a set of cpu ID's which can be plugged in anytime during the life of that system boot. `cpu_present` represents which CPUs are currently plugged in. `cpu_online` represents subset of the `cpu_present` and indicates CPUs which are available for scheduling. These masks depends on `CONFIG_HOTPLUG_CPU` configuration option and if this option is disabled `possible == present` and `active == online`. Implementation of the all of these functions are very similar. Every function checks the second parameter. If it is `true`, calls `cpumask_set_cpu` or `cpumask_clear_cpu` otherwise.
For example let's look on `set_cpu_possible`. As we passed `true` as the second paramter, the: For example let's look on `set_cpu_possible`. As we passed `true` as the second parameter, the:
```C ```C
cpumask_set_cpu(cpu, to_cpumask(cpu_possible_bits)); cpumask_set_cpu(cpu, to_cpumask(cpu_possible_bits));
@ -304,7 +304,7 @@ As we can see from its definition, `DECLARE_BITMAP` macro expands to the array o
: (void *)sizeof(__check_is_bitmap(bitmap)))) : (void *)sizeof(__check_is_bitmap(bitmap))))
``` ```
I don't know how about you, but it looked really weird for me at the first time. We can see ternary operator operator here which is `true` everytime, but why the `__check_is_bitmap` here? It's simple, let's look on it: I don't know how about you, but it looked really weird for me at the first time. We can see ternary operator operator here which is `true` every time, but why the `__check_is_bitmap` here? It's simple, let's look on it:
```C ```C
static inline int __check_is_bitmap(const unsigned long *bitmap) static inline int __check_is_bitmap(const unsigned long *bitmap)
@ -313,7 +313,7 @@ static inline int __check_is_bitmap(const unsigned long *bitmap)
} }
``` ```
Yeah, it just returns `1` everytime. Actually we need in it here only for one purpose: In compile time it checks that given `bitmap` is a bitmap, or with another words it checks that given `bitmap` has type - `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting array of `unsigned long` to the `struct cpumask *`. Now we can call `cpumask_set_cpu` function with the `cpu` - 0 and `struct cpumask *cpu_possible_bits`. This function makes only one call of the `set_bit` function which sets the given `cpu` in the cpumask. All of these `set_cpu_*` functions work on the same principle. Yeah, it just returns `1` every time. Actually we need in it here only for one purpose: In compile time it checks that given `bitmap` is a bitmap, or with another words it checks that given `bitmap` has type - `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting array of `unsigned long` to the `struct cpumask *`. Now we can call `cpumask_set_cpu` function with the `cpu` - 0 and `struct cpumask *cpu_possible_bits`. This function makes only one call of the `set_bit` function which sets the given `cpu` in the cpumask. All of these `set_cpu_*` functions work on the same principle.
If you're not sure that this `set_cpu_*` operations and `cpumask` are not clear for you, don't worry about it. You can get more info by reading of the special part about it - [cpumask](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) or [documentation](https://www.kernel.org/doc/Documentation/cpu-hotplug.txt). If you're not sure that this `set_cpu_*` operations and `cpumask` are not clear for you, don't worry about it. You can get more info by reading of the special part about it - [cpumask](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) or [documentation](https://www.kernel.org/doc/Documentation/cpu-hotplug.txt).
@ -335,7 +335,7 @@ as you can see it just expands to the `printk` call. For this moment we use `pr_
pr_notice("%s", linux_banner); pr_notice("%s", linux_banner);
``` ```
which is just kernel version with some additional paramters: which is just kernel version with some additional parameters:
``` ```
Linux version 4.0.0-rc6+ (alex@localhost) (gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) ) #319 SMP Linux version 4.0.0-rc6+ (alex@localhost) (gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) ) #319 SMP
@ -352,7 +352,7 @@ This function starts from the reserving memory block for the kernel `_text` and
memblock_reserve(__pa_symbol(_text), (unsigned long)__bss_stop - (unsigned long)_text); memblock_reserve(__pa_symbol(_text), (unsigned long)__bss_stop - (unsigned long)_text);
``` ```
You can read about `memblock` in the [Linux kernel memory management Part 1.](http://0xax.gitbooks.io/linux-insides/content/mm/linux-mm-1.html). As you can remember `memblock_reserve` function takes two paramters: You can read about `memblock` in the [Linux kernel memory management Part 1.](http://0xax.gitbooks.io/linux-insides/content/mm/linux-mm-1.html). As you can remember `memblock_reserve` function takes two parameters:
* base physical address of a memory block; * base physical address of a memory block;
* size of a memor block. * size of a memor block.
@ -364,7 +364,7 @@ Base physical address of the `_text` symbol we will get with the `__pa_symbol` m
__phys_addr_symbol(__phys_reloc_hide((unsigned long)(x))) __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))
``` ```
First of all it calls `__phys_reloc_hide` macro on the given paramter. `__phys_reloc_hide` macro does nothing for `x86_64` and just returns the given paramter. Implementation of the `__phys_addr_symbol` macro is easy. It just substracts the symbol address from the base addres of the kernel text mapping base virtual address (you can remember that it is `__START_KERNEL_map`) and adds `phys_base` which is base address of the `_text`: First of all it calls `__phys_reloc_hide` macro on the given parameter. `__phys_reloc_hide` macro does nothing for `x86_64` and just returns the given parameter. Implementation of the `__phys_addr_symbol` macro is easy. It just subtracts the symbol address from the base address of the kernel text mapping base virtual address (you can remember that it is `__START_KERNEL_map`) and adds `phys_base` which is base address of the `_text`:
```C ```C
#define __phys_addr_symbol(x) \ #define __phys_addr_symbol(x) \
@ -376,7 +376,7 @@ After we got physical address of the `_text` symbol, `memblock_reserve` can rese
Reserve memory for initrd Reserve memory for initrd
--------------------------------------------------------------------------------- ---------------------------------------------------------------------------------
In the next step after we reserved place for the kernel text and data is resering place for the [initrd](http://en.wikipedia.org/wiki/Initrd). We will not see details about `initrd` in this post, you just may know that it is temprary root file system stored in memory and used by the kernel during its startup. `early_reserve_initrd` function does all work. First of all this function get the base address of the ram disk, its size and the end address with: In the next step after we reserved place for the kernel text and data is resering place for the [initrd](http://en.wikipedia.org/wiki/Initrd). We will not see details about `initrd` in this post, you just may know that it is temporary root file system stored in memory and used by the kernel during its startup. `early_reserve_initrd` function does all work. First of all this function get the base address of the ram disk, its size and the end address with:
```C ```C
u64 ramdisk_image = get_ramdisk_image(); u64 ramdisk_image = get_ramdisk_image();
@ -384,7 +384,7 @@ u64 ramdisk_size = get_ramdisk_size();
u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size); u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size);
``` ```
All of these paramters it takes from the `boot_params`. If you have read chapter abot [Linux Kernel Booting Process](http://0xax.gitbooks.io/linux-insides/content/Booting/index.html), you must remember that we filled `boot_params` structure during boot time. Kerne setup header contains a couple of fields which describes ramdisk, for example: All of these parameters it takes from the `boot_params`. If you have read chapter abot [Linux Kernel Booting Process](http://0xax.gitbooks.io/linux-insides/content/Booting/index.html), you must remember that we filled `boot_params` structure during boot time. Kerne setup header contains a couple of fields which describes ramdisk, for example:
``` ```
Field name: ramdisk_image Field name: ramdisk_image

View File

@ -52,7 +52,7 @@ As you can read above, we passed address of the `#DB` handler as `&debug` in the
asmlinkage void debug(void); asmlinkage void debug(void);
``` ```
We can see `asmlinkage` attribute which tells to us that `debug` is function written with [assembly](http://en.wikipedia.org/wiki/Assembly_language). Yeah, again and again assembly :). Implementation of the `#DB` handler as other handlers is in ths [arch/x86/kernel/entry_64.S](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/entry_64.S) and defined with the `idtentry` assembly macro: We can see `asmlinkage` attribute which tells to us that `debug` is function written with [assembly](http://en.wikipedia.org/wiki/Assembly_language). Yeah, again and again assembly :). Implementation of the `#DB` handler as other handlers is in this [arch/x86/kernel/entry_64.S](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/entry_64.S) and defined with the `idtentry` assembly macro:
```assembly ```assembly
idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
@ -163,7 +163,7 @@ The next step is initialization of early `ioremap`. In general there are two way
We already saw first method (`outb/inb` instructions) in the part about linux kernel booting [process](http://0xax.gitbooks.io/linux-insides/content/Booting/linux-bootstrap-3.html). The second method is to map I/O physical addresses to virtual addresses. When a physical address is accessed by the CPU, it may refer to a portion of physical RAM which can be mapped on memory of the I/O device. So `ioremap` used to map device memory into kernel address space. We already saw first method (`outb/inb` instructions) in the part about linux kernel booting [process](http://0xax.gitbooks.io/linux-insides/content/Booting/linux-bootstrap-3.html). The second method is to map I/O physical addresses to virtual addresses. When a physical address is accessed by the CPU, it may refer to a portion of physical RAM which can be mapped on memory of the I/O device. So `ioremap` used to map device memory into kernel address space.
As i wrote above next function is the `early_ioremap_init` which re-maps I/O memory to kernel address space so it can access it. We need to initialize early ioremap for early intialization code which needs to temporarily map I/O or memory regions before the normal mapping functions like `ioremap` are available. Implementation of this function is in the [arch/x86/mm/ioremap.c](https://github.com/torvalds/linux/blob/master/arch/x86/mm/ioremap.c). At the start of the `early_ioremap_init` we can see definition of the `pmd` point with `pmd_t` type (which presents page middle directory entry `typedef struct { pmdval_t pmd; } pmd_t;` where `pmdval_t` is `unsigned long`) and make a check that `fixmap` aligned in a correct way: As i wrote above next function is the `early_ioremap_init` which re-maps I/O memory to kernel address space so it can access it. We need to initialize early ioremap for early initialization code which needs to temporarily map I/O or memory regions before the normal mapping functions like `ioremap` are available. Implementation of this function is in the [arch/x86/mm/ioremap.c](https://github.com/torvalds/linux/blob/master/arch/x86/mm/ioremap.c). At the start of the `early_ioremap_init` we can see definition of the `pmd` point with `pmd_t` type (which presents page middle directory entry `typedef struct { pmdval_t pmd; } pmd_t;` where `pmdval_t` is `unsigned long`) and make a check that `fixmap` aligned in a correct way:
```C ```C
pmd_t *pmd; pmd_t *pmd;
@ -196,7 +196,7 @@ After early `ioremap` was initialized, you can see the following code:
ROOT_DEV = old_decode_dev(boot_params.hdr.root_dev); ROOT_DEV = old_decode_dev(boot_params.hdr.root_dev);
``` ```
This code obtains major and minor numbers for the root device where `initrd` will be mounted later in the `do_mount_root` function. Major number of the device identifies a driver associated with the device. Minor number reffered on the device controlled by driver. Note that `old_decode_dev` takes one parameter from the `boot_params_structure`. As we can read from the x86 linux kernel boot protocol: This code obtains major and minor numbers for the root device where `initrd` will be mounted later in the `do_mount_root` function. Major number of the device identifies a driver associated with the device. Minor number referred on the device controlled by driver. Note that `old_decode_dev` takes one parameter from the `boot_params_structure`. As we can read from the x86 linux kernel boot protocol:
``` ```
Field name: root_dev Field name: root_dev

View File

@ -61,7 +61,7 @@ first of all it check that given index of `fixed_addresses` enum is not greater
#define __fix_to_virt(x) (FIXADDR_TOP - ((x) << PAGE_SHIFT)) #define __fix_to_virt(x) (FIXADDR_TOP - ((x) << PAGE_SHIFT))
``` ```
Here we shift left the given `fix-mapped` address index on the `PAGE_SHIFT` which determines size of a page as I wrote above and substract it from the `FIXADDR_TOP` which is the highest address of the `fix-mapped` area. There is inverse function for getting `fix-mapped` address from a virtual address: Here we shift left the given `fix-mapped` address index on the `PAGE_SHIFT` which determines size of a page as I wrote above and subtract it from the `FIXADDR_TOP` which is the highest address of the `fix-mapped` area. There is inverse function for getting `fix-mapped` address from a virtual address:
```C ```C
static inline unsigned long virt_to_fix(const unsigned long vaddr) static inline unsigned long virt_to_fix(const unsigned long vaddr)
@ -79,7 +79,7 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr)
A PFN is simply in index within physical memory that is counted in page-sized units. PFN for a physical address could be trivially defined as (page_phys_addr >> PAGE_SHIFT); A PFN is simply in index within physical memory that is counted in page-sized units. PFN for a physical address could be trivially defined as (page_phys_addr >> PAGE_SHIFT);
`__virt_to_fix` clears first 12 bits in the given address, substracts it from the last address the of `fix-mapped` area (`FIXADDR_TOP`) and shifts right result on `PAGE_SHIFT` which is `12`. Let I explain how it works. As i already wrote we will crear first 12 bits in the given address with `x & PAGE_MASK`. As we substract this from the `FIXADDR_TOP`, we will get last 12 bits of the `FIXADDR_TOP` which are represent. We know that first 12 bits of the virtual address present offset in the page frame. With the shiting it on `PAGE_SHIFT` we will get `Page frame number` which is just all bits in a virtual address besides first 12 offset bits. `Fix-mapped` addresses are used in different [places](http://lxr.free-electrons.com/ident?i=fix_to_virt) of the linux kernel. `IDT` descriptor stored there, [Intel Trusted Execution Technology](http://en.wikipedia.org/wiki/Trusted_Execution_Technology) UUID stored in the `fix-mapped` area started from `FIX_TBOOT_BASE` index, [Xen](http://en.wikipedia.org/wiki/Xen) bootmap and many more... We already saw a little about `fix-mapped` addresses in the fifth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-5.html) about linux kernel initialization. We used `fix-mapped` area in the early `ioremap` initialization. Let's look on it and try to understand what is it `ioremap`, how it implemented in the kernel and how it releated with the `fix-mapped` addresses. `__virt_to_fix` clears first 12 bits in the given address, subtracts it from the last address the of `fix-mapped` area (`FIXADDR_TOP`) and shifts right result on `PAGE_SHIFT` which is `12`. Let I explain how it works. As i already wrote we will crear first 12 bits in the given address with `x & PAGE_MASK`. As we subtract this from the `FIXADDR_TOP`, we will get last 12 bits of the `FIXADDR_TOP` which are represent. We know that first 12 bits of the virtual address present offset in the page frame. With the shiting it on `PAGE_SHIFT` we will get `Page frame number` which is just all bits in a virtual address besides first 12 offset bits. `Fix-mapped` addresses are used in different [places](http://lxr.free-electrons.com/ident?i=fix_to_virt) of the linux kernel. `IDT` descriptor stored there, [Intel Trusted Execution Technology](http://en.wikipedia.org/wiki/Trusted_Execution_Technology) UUID stored in the `fix-mapped` area started from `FIX_TBOOT_BASE` index, [Xen](http://en.wikipedia.org/wiki/Xen) bootmap and many more... We already saw a little about `fix-mapped` addresses in the fifth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-5.html) about linux kernel initialization. We used `fix-mapped` area in the early `ioremap` initialization. Let's look on it and try to understand what is it `ioremap`, how it implemented in the kernel and how it releated with the `fix-mapped` addresses.
ioremap ioremap
-------------------------------------------------------------------------------- --------------------------------------------------------------------------------
@ -164,7 +164,7 @@ struct resource iomem_resource = {
.flags = IORESOURCE_MEM, .flags = IORESOURCE_MEM,
}; };
As I wrote about `request_regions` is used for registering of I/O port region and this macro used in many [places](http://lxr.free-electrons.com/ident?i=request_region) in the kernel. For example let's look on the [drivers/char/rtc.c](https://github.com/torvalds/linux/blob/master/char/rtc.c). This source code file provides [Real Time Clock](http://en.wikipedia.org/wiki/Real-time_clock) interface in the linux kernel. As every kernel module, `rtc` module contains `module_init` defintion: As I wrote about `request_regions` is used for registering of I/O port region and this macro used in many [places](http://lxr.free-electrons.com/ident?i=request_region) in the kernel. For example let's look on the [drivers/char/rtc.c](https://github.com/torvalds/linux/blob/master/char/rtc.c). This source code file provides [Real Time Clock](http://en.wikipedia.org/wiki/Real-time_clock) interface in the linux kernel. As every kernel module, `rtc` module contains `module_init` definition:
```C ```C
module_init(rtc_init); module_init(rtc_init);
@ -254,7 +254,7 @@ static inline const char *e820_type_to_string(int e820_type)
and we can see it in the `/proc/iomem` (read above). and we can see it in the `/proc/iomem` (read above).
Now let's try to understand how `ioremap` works. We already know little about `ioremap`, we saw it in the fifth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-5.html) about linux kernel initalization. If you have read this part, you can remember call of the `early_ioremap_init` function from the [arch/x86/mm/ioremap.c](https://github.com/torvalds/linux/blob/master/arch/x86/mm/ioremap.c). Initialization of the `ioremap` splitten on two parts: there is early part which we can use before normal `ioremap` is available and normal `ioremap` which is available after `vmalloc` initialization and call of the `paging_init`. We do not know anything about `vmalloc` for now, so let's consider early initialization of the `ioremap`. First of all `early_ioremap_init` checks that `fixmap` is aligned on page middle directory boundary: Now let's try to understand how `ioremap` works. We already know little about `ioremap`, we saw it in the fifth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-5.html) about linux kernel initialization. If you have read this part, you can remember call of the `early_ioremap_init` function from the [arch/x86/mm/ioremap.c](https://github.com/torvalds/linux/blob/master/arch/x86/mm/ioremap.c). Initialization of the `ioremap` splitten on two parts: there is early part which we can use before normal `ioremap` is available and normal `ioremap` which is available after `vmalloc` initialization and call of the `paging_init`. We do not know anything about `vmalloc` for now, so let's consider early initialization of the `ioremap`. First of all `early_ioremap_init` checks that `fixmap` is aligned on page middle directory boundary:
```C ```C
BUILD_BUG_ON((fix_to_virt(0) + PAGE_SIZE) & ((1 << PMD_SHIFT) - 1)); BUILD_BUG_ON((fix_to_virt(0) + PAGE_SIZE) & ((1 << PMD_SHIFT) - 1));
@ -479,7 +479,7 @@ static inline void __native_flush_tlb_single(unsigned long addr)
} }
``` ```
or call `__flush_tlb` which just updates `cr3` register as we saw it above. After this step execution of the `__early_set_fixmap` function is finsihed and we can back to the `__early_ioremap` implementation. As we set fixmap area for the given addres, need to save the base virtual address of the I/O Re-mapped area in the `prev_map` with the `slot` index: or call `__flush_tlb` which just updates `cr3` register as we saw it above. After this step execution of the `__early_set_fixmap` function is finsihed and we can back to the `__early_ioremap` implementation. As we set fixmap area for the given address, need to save the base virtual address of the I/O Re-mapped area in the `prev_map` with the `slot` index:
```C ```C
prev_map[slot] = (void __iomem *)(offset + slot_virt[slot]); prev_map[slot] = (void __iomem *)(offset + slot_virt[slot]);