mirror of
https://github.com/0xAX/linux-insides.git
synced 2025-01-06 22:01:06 +00:00
Fix style and typos.
Signed-off-by: Jakub Duchniewicz <j.duchniewicz@gmail.com>
parent
859d98c037
commit
212d935372
@ -4,28 +4,28 @@ Kernel initialization. Part 4.
|
|||||||
Kernel entry point
|
Kernel entry point
|
||||||
================================================================================
|
================================================================================
|
||||||
|
|
||||||
If you have read the previous part - [Last preparations before the kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-3.md), you can remember that we finished all pre-initialization stuff and stopped right before the call to the `start_kernel` function from the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c). The `start_kernel` is the entry of the generic and architecture independent kernel code, although we will return to the `arch/` folder many times. If you look inside of the `start_kernel` function, you will see that this function is very big. For this moment it contains about `86` function calls. Yes, it's very big and of course this part will not cover all the processes that occur in this function. In the current part we will only start to do it. This part and all the next which will be in the [Kernel initialization process](https://github.com/0xAX/linux-insides/blob/master/Initialization/README.md) chapter will cover it.
|
If you have read the previous part - [Last preparations before the kernel entry point](https://github.com/0xAX/linux-insides/blob/master/Initialization/linux-initialization-3.md), you can remember that we finished all pre-initialization stuff and stopped right before the call to the `start_kernel` function from the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c). The `start_kernel` is the entry of the generic and architecture independent kernel code, although we will return to the `arch/` folder many times. If you look inside of the `start_kernel` function, you will see that this function is lengthy. As of now it contains about `86` function calls. Yes, it's very big and of course this part will not cover all the processes that occur in this function. In the current part we will only start covering it. This part and all the next in the [Kernel initialization process](https://github.com/0xAX/linux-insides/blob/master/Initialization/README.md) chapter will cover it.
|
||||||
|
|
||||||
The main purpose of the `start_kernel` to finish kernel initialization process and launch the first `init` process. Before the first process will be started, the `start_kernel` must do many things such as: to enable [lock validator](https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt), to initialize processor id, to enable early [cgroups](http://en.wikipedia.org/wiki/Cgroups) subsystem, to setup per-cpu areas, to initialize different caches in [vfs](http://en.wikipedia.org/wiki/Virtual_file_system), to initialize memory manager, rcu, vmalloc, scheduler, IRQs, ACPI and many many more. Only after these steps will we see the launch of the first `init` process in the last part of this chapter. So much kernel code awaits us, let's start.
|
The main purpose of the `start_kernel` function is to finish kernel initialization process and launch the first `init` process. Before the first process is started, the `start_kernel` must do many things such as: enabling the [lock validator](https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt), initializing processor id, enabling early [cgroups](http://en.wikipedia.org/wiki/Cgroups) subsystem, setting up per-cpu areas, initializing different caches in [vfs](http://en.wikipedia.org/wiki/Virtual_file_system), initializing the memory manager, rcu, vmalloc, scheduler, IRQs, ACPI and many many more. Only after these steps will we see the launch of the first `init` process in the last part of this chapter. So much kernel code awaits us, let's start.
|
||||||
|
|
||||||
**NOTE: All parts from this big chapter `Linux Kernel initialization process` will not cover anything about debugging. There will be a separate chapter about kernel debugging tips.**
|
**NOTE: All parts from this big chapter `Linux Kernel initialization process` will not cover anything about debugging. There will be a separate chapter about kernel debugging tips.**
|
||||||
|
|
||||||
A little about function attributes
|
A little about function attributes
|
||||||
---------------------------------------------------------------------------------
|
---------------------------------------------------------------------------------
|
||||||
|
|
||||||
As I wrote above, the `start_kernel` function is defined in the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c). This function defined with the `__init` attribute and as you already may know from other parts, all functions which are defined with this attribute are necessary during kernel initialization.
|
As I wrote above, the `start_kernel` function is defined in the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c). This function is defined with the `__init` attribute and as you already may know from other parts, all functions defined with this attribute are necessary during kernel initialization.
|
||||||
|
|
||||||
```C
|
```C
|
||||||
#define __init __section(.init.text) __cold notrace
|
#define __init __section(.init.text) __cold notrace
|
||||||
```
|
```
|
||||||
|
|
||||||
After the initialization process have finished, the kernel will release these sections with a call to the `free_initmem` function. Note also that `__init` is defined with two attributes: `__cold` and `notrace`. The purpose of the first `cold` attribute is to mark that the function is rarely used and the compiler must optimize this function for size. The second `notrace` is defined as:
|
After the initialization process is finished, the kernel will release these sections with a call to the `free_initmem` function. Also, note that `__init` is defined with two attributes: `__cold` and `notrace`. The purpose of the first, `__cold`, is to mark that the function is rarely used and that the compiler should optimize it for size. The second, `notrace`, is defined as:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
#define notrace __attribute__((no_instrument_function))
|
#define notrace __attribute__((no_instrument_function))
|
||||||
```
|
```
|
||||||
|
|
||||||
where `no_instrument_function` says to the compiler not to generate profiling function calls.
|
where `no_instrument_function` tells the compiler not to generate any profiling function calls.
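
To make the effect of these attributes a little more concrete, here is a minimal user-space sketch. The `demo_cold` and `demo_notrace` macros and the `demo_one_time_setup` function are hypothetical names invented for this illustration; only the underlying GCC attributes `cold` and `no_instrument_function` come from the text above. It is not how the kernel defines `__init`, just a small stand-in that should compile with GCC or Clang:

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for the kernel's __cold and notrace. */
#define demo_cold    __attribute__((cold))
#define demo_notrace __attribute__((no_instrument_function))

/*
 * A rarely executed setup routine. `cold` hints that the function is
 * unlikely to run, so the compiler may optimize it for size;
 * `no_instrument_function` keeps -finstrument-functions from adding
 * profiling hooks to it.
 */
demo_cold demo_notrace int demo_one_time_setup(int pages)
{
    assert(pages >= 0);      /* this sketch only accepts non-negative counts */
    return pages * 4096;     /* bytes covered, assuming 4 KiB pages */
}
```

The attributes do not change what the function computes; they only influence how the compiler places and instruments it.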
|
||||||
|
|
||||||
In the definition of the `start_kernel` function, you can also see the `__visible` attribute which expands to the:
|
In the definition of the `start_kernel` function, you can also see the `__visible` attribute which expands to the:
|
||||||
|
|
||||||
@ -33,7 +33,7 @@ In the definition of the `start_kernel` function, you can also see the `__visibl
|
|||||||
#define __visible __attribute__((externally_visible))
|
#define __visible __attribute__((externally_visible))
|
||||||
```
|
```
|
||||||
|
|
||||||
where `externally_visible` tells to the compiler that something uses this function or variable, to prevent marking this function/variable as `unusable`. You can find the definition of this and other macro attributes in [include/linux/init.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/init.h).
|
where `externally_visible` tells the compiler that something uses this function or variable, to prevent marking it as `unusable`. You can find the definition of this and other macro attributes in [include/linux/init.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/init.h).
|
||||||
|
|
||||||
First steps in the start_kernel
|
First steps in the start_kernel
|
||||||
--------------------------------------------------------------------------------
|
--------------------------------------------------------------------------------
|
||||||
@ -45,7 +45,7 @@ char *command_line;
|
|||||||
char *after_dashes;
|
char *after_dashes;
|
||||||
```
|
```
|
||||||
|
|
||||||
The first represents a pointer to the kernel command line and the second will contain the result of the `parse_args` function which parses an input string with parameters in the form `name=value`, looking for specific keywords and invoking the right handlers. We will not go into the details related with these two variables at this time, but will see it in the next parts. In the next step we can see a call to the `set_task_stack_end_magic` function. This function takes address of the `init_task` and sets `STACK_END_MAGIC` (`0x57AC6E9D`) as canary for it. `init_task` represents the initial task structure:
|
The first represents a pointer to the kernel command line and the second will contain the result of the `parse_args` function that parses an input string with parameters in the form `name=value`, looking for specific keywords and invoking the right handlers. We will not go into the details related to these two variables at this time, but will see it in the next parts. In the next step we can see a call to the `set_task_stack_end_magic` function. This function takes the address of the `init_task` and sets `STACK_END_MAGIC` (`0x57AC6E9D`) as a canary for it. `init_task` represents the initial task structure:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
struct task_struct init_task = INIT_TASK(init_task);
|
struct task_struct init_task = INIT_TASK(init_task);
|
||||||
@ -53,11 +53,11 @@ struct task_struct init_task = INIT_TASK(init_task);
|
|||||||
|
|
||||||
where `task_struct` stores all the information about a process. I will not explain this structure in this book because it's very big. You can find its definition in [include/linux/sched.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/sched.h#L1278). At this moment `task_struct` contains more than `100` fields! Although you will not see the explanation of the `task_struct` in this book, we will use it very often since it is the fundamental structure which describes the `process` in the Linux kernel. I will describe the meaning of the fields of this structure as we meet them in practice.
|
where `task_struct` stores all the information about a process. I will not explain this structure in this book because it's very big. You can find its definition in [include/linux/sched.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/sched.h#L1278). At this moment `task_struct` contains more than `100` fields! Although you will not see the explanation of the `task_struct` in this book, we will use it very often since it is the fundamental structure which describes the `process` in the Linux kernel. I will describe the meaning of the fields of this structure as we meet them in practice.
|
||||||
|
|
||||||
You can see the definition of the `init_task` and it initialized by the `INIT_TASK` macro. This macro is from [include/linux/init_task.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/init_task.h) and it just fills the `init_task` with the values for the first process. For example it sets:
|
You can see the definition of the `init_task` and it is initialized with the `INIT_TASK` macro. This macro is from [include/linux/init_task.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/init_task.h) and it just fills the `init_task` with the values for the first process. For example it sets:
|
||||||
|
|
||||||
* init process state to zero or `runnable`. A runnable process is one which is waiting only for a CPU to run on;
|
* init process state to zero or `runnable`. A runnable process is one that is waiting only for a CPU to run on;
|
||||||
* init process flags - `PF_KTHREAD` which means - kernel thread;
|
* init process flags - `PF_KTHREAD` meaning - a kernel thread;
|
||||||
* a list of runnable task;
|
* a list of runnable tasks;
|
||||||
* process address space;
|
* process address space;
|
||||||
* init process stack to the `&init_thread_info` which is `init_thread_union.thread_info` and `initthread_union` has type - `thread_union` which contains `thread_info` and process stack:
|
* init process stack to the `&init_thread_info` which is `init_thread_union.thread_info` and `initthread_union` has type - `thread_union` which contains `thread_info` and process stack:
|
||||||
|
|
||||||
@ -68,7 +68,7 @@ union thread_union {
|
|||||||
};
|
};
|
||||||
```
|
```
|
||||||
|
|
||||||
Every process has its own stack and it is 16 kilobytes or 4 page frames in `x86_64`. We can note that it is defined as array of `unsigned long`. The next field of the `thread_union` is - `thread_info` defined as:
|
Every process has its own stack and it is 16 kilobytes or 4 page frames in `x86_64`. We can note that it is defined as an array of `unsigned long`. The next field of the `thread_union` is - `thread_info` defined as:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
struct thread_info {
|
struct thread_info {
|
||||||
@ -86,7 +86,7 @@ struct thread_info {
|
|||||||
};
|
};
|
||||||
```
|
```
|
||||||
|
|
||||||
and occupies 52 bytes. The `thread_info` structure contains architecture-specific information on the thread. We know that on `x86_64` the stack grows down and `thread_union.thread_info` is stored at the bottom of the stack in our case. So the process stack is 16 kilobytes and `thread_info` is at the bottom. The remaining thread size will be `16 kilobytes - 62 bytes = 16332 bytes`. Note that `thread_union` represented as the [union](http://en.wikipedia.org/wiki/Union_type) and not structure, it means that `thread_info` and stack share the memory space.
|
and occupies 52 bytes. The `thread_info` structure contains architecture-specific information on the thread. We know that on `x86_64` the stack grows down and `thread_union.thread_info` is stored at the bottom of the stack in our case. So the process stack is 16 kilobytes and `thread_info` is at the bottom. The remaining thread size will be `16 kilobytes - 52 bytes = 16332 bytes`. Note that `thread_union` is represented as a [union](http://en.wikipedia.org/wiki/Union_type) and not a structure, which means that `thread_info` and the stack share the memory space.
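
The sharing of memory between `thread_info` and the stack can be sketched in user space. Everything prefixed with `demo_` below is hypothetical: the 52-byte `demo_thread_info` is an opaque placeholder rather than the real structure, and `DEMO_THREAD_SIZE` simply encodes the 16 kilobytes mentioned above. The sketch only shows that both union members start at the same address and that 16332 bytes remain for the stack:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_THREAD_SIZE (16 * 1024)    /* 16 KiB, as stated in the text */

/* Hypothetical placeholder: 52 opaque bytes instead of the real fields. */
struct demo_thread_info {
    unsigned char payload[52];
};

/* Both members overlap, so the union is as large as its biggest member. */
union demo_thread_union {
    struct demo_thread_info thread_info;
    unsigned long stack[DEMO_THREAD_SIZE / sizeof(unsigned long)];
};

/* Bytes left for the stack once thread_info sits at the lowest addresses. */
size_t demo_usable_stack_bytes(void)
{
    return sizeof(union demo_thread_union) - sizeof(struct demo_thread_info);
}

/* Returns 1 when thread_info and the stack begin at the same address. */
int demo_members_overlap(const union demo_thread_union *u)
{
    assert(u != NULL);
    return (const void *)&u->thread_info == (const void *)u->stack;
}
```

Because the stack grows down from the highest address, the data at the lowest addresses is the last part of the union that a growing stack would reach.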
|
||||||
|
|
||||||
Schematically it can be represented as follows:
|
Schematically it can be represented as follows:
|
||||||
|
|
||||||
@ -111,7 +111,7 @@ http://www.quora.com/In-Linux-kernel-Why-thread_info-structure-and-the-kernel-st
|
|||||||
|
|
||||||
So the `INIT_TASK` macro fills these `task_struct's` fields and many many more. As I already wrote above, I will not describe all the fields and values in the `INIT_TASK` macro but we will see them soon.
|
So the `INIT_TASK` macro fills these `task_struct's` fields and many many more. As I already wrote above, I will not describe all the fields and values in the `INIT_TASK` macro but we will see them soon.
|
||||||
|
|
||||||
Now let's go back to the `set_task_stack_end_magic` function. This function defined in the [kernel/fork.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/kernel/fork.c#L297) and sets a [canary](http://en.wikipedia.org/wiki/Stack_buffer_overflow) to the `init` process stack to prevent stack overflow.
|
Now let's go back to the `set_task_stack_end_magic` function. This function is defined in the [kernel/fork.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/kernel/fork.c#L297) and sets a [canary](http://en.wikipedia.org/wiki/Stack_buffer_overflow) to the `init` process stack to prevent stack overflow.
|
||||||
|
|
||||||
```C
|
```C
|
||||||
void set_task_stack_end_magic(struct task_struct *tsk)
|
void set_task_stack_end_magic(struct task_struct *tsk)
|
||||||
@ -134,7 +134,7 @@ where `task_thread_info` just returns the stack which we filled with the `INIT_T
|
|||||||
#define task_thread_info(task) ((struct thread_info *)(task)->stack)
|
#define task_thread_info(task) ((struct thread_info *)(task)->stack)
|
||||||
```
|
```
|
||||||
|
|
||||||
From the Linux kernel `v4.9-rc1` release, `thread_info` structure may contains only flags and stack pointer resides in `task_struct` structure which represents a thread in the Linux kernel. This depends on `CONFIG_THREAD_INFO_IN_TASK` kernel configuration option which is enabled by default for `x86_64`. You can be sure in this if you will look in the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c) configuration build file:
|
From the Linux kernel `v4.9-rc1` release, the `thread_info` structure may contain only flags, while the stack pointer resides in the `task_struct` structure that represents a thread in the Linux kernel. This depends on the `CONFIG_THREAD_INFO_IN_TASK` kernel configuration option, which is enabled by default for `x86_64`. You can be sure of this if you look in the [init/main.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/init/main.c) configuration build file:
|
||||||
|
|
||||||
```
|
```
|
||||||
config THREAD_INFO_IN_TASK
|
config THREAD_INFO_IN_TASK
|
||||||
@ -162,7 +162,7 @@ config X86
|
|||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
So, in this way we may just get end of a thread stack from the given `task_struct` structure:
|
So, this way we may just get end of a thread stack from the given `task_struct` structure:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
#ifdef CONFIG_THREAD_INFO_IN_TASK
|
#ifdef CONFIG_THREAD_INFO_IN_TASK
|
||||||
@ -191,9 +191,9 @@ void __init __weak smp_setup_processor_id(void)
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
as it not implemented for all architectures, but some such as [s390](http://en.wikipedia.org/wiki/IBM_ESA/390) and [arm64](http://en.wikipedia.org/wiki/ARM_architecture#64.2F32-bit_architecture).
|
as it is not implemented for all architectures, only for some such as [s390](http://en.wikipedia.org/wiki/IBM_ESA/390) and [arm64](http://en.wikipedia.org/wiki/ARM_architecture#64.2F32-bit_architecture).
|
||||||
|
|
||||||
The next function in `start_kernel` is `debug_objects_early_init`. Implementation of this function is almost the same as `lockdep_init`, but fills hashes for object debugging. As I wrote above, we will not see the explanation of this and other functions which are for debugging purposes in this chapter.
|
The next function in `start_kernel` is `debug_objects_early_init`. Implementation of this function is almost the same as `lockdep_init`, but fills hashes for object debugging. As I wrote above, we will not see the explanation of this and other functions used for debugging purposes in this chapter.
|
||||||
|
|
||||||
After the `debug_object_early_init` function we can see the call of the `boot_init_stack_canary` function which fills `task_struct->canary` with the `canary` value for the `-fstack-protector` gcc feature. This function depends on the `CONFIG_CC_STACKPROTECTOR` configuration option and if this option is disabled, `boot_init_stack_canary` does nothing, otherwise it generates random numbers based on random pool and the [TSC](http://en.wikipedia.org/wiki/Time_Stamp_Counter):
|
After the `debug_object_early_init` function we can see the call of the `boot_init_stack_canary` function which fills `task_struct->canary` with the `canary` value for the `-fstack-protector` gcc feature. This function depends on the `CONFIG_CC_STACKPROTECTOR` configuration option and if this option is disabled, `boot_init_stack_canary` does nothing, otherwise it generates random numbers based on random pool and the [TSC](http://en.wikipedia.org/wiki/Time_Stamp_Counter):
|
||||||
|
|
||||||
@ -215,7 +215,7 @@ and write this value to the top of the IRQ stack with the:
|
|||||||
this_cpu_write(irq_stack_union.stack_canary, canary); // read below about this_cpu_write
|
this_cpu_write(irq_stack_union.stack_canary, canary); // read below about this_cpu_write
|
||||||
```
|
```
|
||||||
|
|
||||||
Again, we will not dive into details here, we will cover it in the part about [IRQs](http://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29). As `canary` is set, we disable local and early boot IRQs and register the bootstrap CPU in the CPU maps. We disable local IRQs (interrupts for current CPU) with the `local_irq_disable` macro which expands to the call of the `arch_local_irq_disable` function from [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/percpu-defs.h):
|
Again, we will not dive into details here, we will cover it in the part about [IRQs](http://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29). As `canary` is set, we disable local and early boot IRQs and register the bootstrap CPU in the CPU maps. We disable local IRQs (interrupts for the current CPU) with the `local_irq_disable` macro that expands to the call to the `arch_local_irq_disable` function from [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/percpu-defs.h):
|
||||||
|
|
||||||
```C
|
```C
|
||||||
static inline notrace void arch_local_irq_disable(void)
|
static inline notrace void arch_local_irq_disable(void)
|
||||||
@ -241,7 +241,7 @@ For now it is just zero. If the `CONFIG_DEBUG_PREEMPT` configuration option is d
|
|||||||
#define raw_smp_processor_id() (this_cpu_read(cpu_number))
|
#define raw_smp_processor_id() (this_cpu_read(cpu_number))
|
||||||
```
|
```
|
||||||
|
|
||||||
`this_cpu_read` as many other function like this (`this_cpu_write`, `this_cpu_add` and etc...) defined in the [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/percpu-defs.h) and presents `this_cpu` operation. These operations provide a way of optimizing access to the [per-cpu](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-1) variables which are associated with the current processor. In our case it is `this_cpu_read`:
|
`this_cpu_read`, like many other functions of this kind (`this_cpu_write`, `this_cpu_add`, etc.), is defined in the [include/linux/percpu-defs.h](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/include/linux/percpu-defs.h) and represents a `this_cpu` operation. Those operations provide a way of optimizing access to the [per-cpu](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-1) variables which are associated with the current processor. In our case it is `this_cpu_read`:
|
||||||
|
|
||||||
```
|
```
|
||||||
__pcpu_size_call_return(this_cpu_read_, pcp)
|
__pcpu_size_call_return(this_cpu_read_, pcp)
|
||||||
@ -266,13 +266,13 @@ Remember that we have passed `cpu_number` as `pcp` to the `this_cpu_read` from t
|
|||||||
})
|
})
|
||||||
```
|
```
|
||||||
|
|
||||||
Yes, it looks a little strange but it's easy. First of all we can see the definition of the `pscr_ret__` variable with the `int` type. Why int? Ok, `variable` is `cpu_number` and it was declared as per-cpu int variable:
|
Yes, it looks a little strange but it's easy. First of all we can see the definition of the `pscr_ret__` variable with the `int` type. Why `int`? Ok, `variable` is `cpu_number` and it was declared as per-cpu `int` variable:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
|
DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
|
||||||
```
|
```
|
||||||
|
|
||||||
In the next step we call `__verify_pcpu_ptr` with the address of `cpu_number`. `__veryf_pcpu_ptr` used to verify that the given parameter is a per-cpu pointer. After that we set `pscr_ret__` value which depends on the size of the variable. Our `cpu_number` variable is `int`, so it's 4 bytes in size. It means that we will get `this_cpu_read_4(cpu_number)` in `pscr_ret__`. In the end of the `__pcpu_size_call_return` we just call it. `this_cpu_read_4` is a macro:
|
In the next step we call `__verify_pcpu_ptr` with the address of `cpu_number`. `__verify_pcpu_ptr` is used to verify that the given parameter is a per-cpu pointer. After that we set the `pscr_ret__` value depending on the size of the variable. Our `cpu_number` variable is `int`, so it's 4 bytes in size. It means that we will get `this_cpu_read_4(cpu_number)` in `pscr_ret__`. At the end of the `__pcpu_size_call_return` we just call it. `this_cpu_read_4` is a macro:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
#define this_cpu_read_4(pcp) percpu_from_op("mov", pcp)
|
#define this_cpu_read_4(pcp) percpu_from_op("mov", pcp)
|
||||||
@ -296,7 +296,7 @@ is the same as:
|
|||||||
movl %gs:$cpu_number, $pfo_ret__
|
movl %gs:$cpu_number, $pfo_ret__
|
||||||
```
|
```
|
||||||
|
|
||||||
As we didn't setup per-cpu area, we have only one - for the current running CPU, we will get `zero` as a result of the `smp_processor_id`.
|
As we didn't set up the per-cpu areas yet, we have only one - for the currently running CPU - so we will get `zero` as a result of the `smp_processor_id`.
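
The idea of choosing a reader by the size of the variable can be illustrated with a small user-space sketch. This is not the kernel macro: `demo_read_sized` and `demo_size_call_return` are hypothetical names, there is no `%gs`-relative addressing, and a plain `memcpy` stands in for the `mov` instruction. It only mirrors the dispatch on `sizeof` described above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the this_cpu_read_1/2/4/8 family: read an
 * unsigned value of the given width and zero-extend it to 64 bits.
 */
static inline uint64_t demo_read_sized(const void *p, size_t size)
{
    uint64_t v = 0;

    switch (size) {
    case 1: { uint8_t  t; memcpy(&t, p, sizeof(t)); v = t; break; }
    case 2: { uint16_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    case 4: { uint32_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    case 8: { uint64_t t; memcpy(&t, p, sizeof(t)); v = t; break; }
    default: assert(!"unsupported variable size");
    }
    return v;
}

/* The variable's own size picks the branch, so callers never pass it. */
#define demo_size_call_return(var) demo_read_sized(&(var), sizeof(var))
```

Since `sizeof(var)` is a compile-time constant, an optimizing compiler can usually reduce the `switch` to the single branch that applies, which is the same effect the per-size macros aim for.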
|
||||||
|
|
||||||
As we got the current processor id, `boot_cpu_init` sets the given CPU online, active, present and possible with the:
|
As we got the current processor id, `boot_cpu_init` sets the given CPU online, active, present and possible with the:
|
||||||
|
|
||||||
@ -315,13 +315,13 @@ For example let's look at `set_cpu_possible`. As we passed `true` as the second
|
|||||||
cpumask_set_cpu(cpu, to_cpumask(cpu_possible_bits));
|
cpumask_set_cpu(cpu, to_cpumask(cpu_possible_bits));
|
||||||
```
|
```
|
||||||
|
|
||||||
will be called. First of all let's try to understand the `to_cpumask` macro. This macro casts a bitmap to a `struct cpumask *`. CPU masks provide a bitmap suitable for representing the set of CPU's in a system, one bit position per CPU number. CPU mask presented by the `cpumask` structure:
|
will be called. First of all let's try to understand the `to_cpumask` macro. This macro casts a bitmap to a `struct cpumask *`. CPU masks provide a bitmap suitable for representing the set of CPU's in a system, one bit position per CPU number. CPU mask is represented by the `cpumask` structure:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
|
typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
|
||||||
```
|
```
|
||||||
|
|
||||||
which is just bitmap declared with the `DECLARE_BITMAP` macro:
|
which is just a bitmap declared with the `DECLARE_BITMAP` macro:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]
|
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]
|
||||||
@ -346,7 +346,7 @@ static inline int __check_is_bitmap(const unsigned long *bitmap)
|
|||||||
|
|
||||||
Yeah, it just returns `1` every time. Actually we need in it here only for one purpose: at compile time it checks that the given `bitmap` is a bitmap, or in other words it checks that the given `bitmap` has a type of `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting the array of `unsigned long` to the `struct cpumask *`. Now we can call `cpumask_set_cpu` function with the `cpu` - 0 and `struct cpumask *cpu_possible_bits`. This function makes only one call of the `set_bit` function which sets the given `cpu` in the cpumask. All of these `set_cpu_*` functions work on the same principle.
|
Yeah, it just returns `1` every time. Actually we need in it here only for one purpose: at compile time it checks that the given `bitmap` is a bitmap, or in other words it checks that the given `bitmap` has a type of `unsigned long *`. So we just pass `cpu_possible_bits` to the `to_cpumask` macro for converting the array of `unsigned long` to the `struct cpumask *`. Now we can call `cpumask_set_cpu` function with the `cpu` - 0 and `struct cpumask *cpu_possible_bits`. This function makes only one call of the `set_bit` function which sets the given `cpu` in the cpumask. All of these `set_cpu_*` functions work on the same principle.
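
A minimal user-space sketch of the same idea, a bitmap with one bit per CPU, might look as follows. All `demo_` and `DEMO_` names are hypothetical, `DEMO_NR_CPUS` is an arbitrary value chosen for the example, and unlike the kernel's `set_bit` this version is not atomic:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_NR_CPUS 128   /* arbitrary limit for this sketch */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
    (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit position per CPU number, stored in an array of unsigned long. */
struct demo_cpumask {
    unsigned long bits[DEMO_BITS_TO_LONGS(DEMO_NR_CPUS)];
};

/* Mark `cpu` in the mask: pick the word, then the bit inside that word. */
void demo_cpumask_set_cpu(unsigned int cpu, struct demo_cpumask *mask)
{
    assert(cpu < DEMO_NR_CPUS);
    mask->bits[cpu / DEMO_BITS_PER_LONG] |= 1UL << (cpu % DEMO_BITS_PER_LONG);
}

/* Report whether `cpu` is marked in the mask. */
bool demo_cpumask_test_cpu(unsigned int cpu, const struct demo_cpumask *mask)
{
    assert(cpu < DEMO_NR_CPUS);
    return (mask->bits[cpu / DEMO_BITS_PER_LONG] >> (cpu % DEMO_BITS_PER_LONG)) & 1UL;
}
```

Setting CPU `0`, as `boot_cpu_init` does for the bootstrap processor, turns on the lowest bit of the first word and leaves every other bit untouched.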
|
||||||
|
|
||||||
If you're not sure that this `set_cpu_*` operations and `cpumask` are not clear for you, don't worry about it. You can get more info by reading the special part about it - [cpumask](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-2) or [documentation](https://www.kernel.org/doc/Documentation/cpu-hotplug.txt).
|
If these `set_cpu_*` operations and `cpumask` are not clear to you, don't worry about it. You can get more info by reading the special part about it - [cpumask](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-2) or the [documentation](https://www.kernel.org/doc/Documentation/cpu-hotplug.txt).
|
||||||
|
|
||||||
As we activated the bootstrap processor, it's time to go to the next function in the `start_kernel.` Now it is `page_address_init`, but this function does nothing in our case, because it executes only when all `RAM` can't be mapped directly.
|
As we activated the bootstrap processor, it's time to go to the next function in the `start_kernel.` Now it is `page_address_init`, but this function does nothing in our case, because it executes only when all `RAM` can't be mapped directly.
|
||||||
|
|
||||||
@ -377,7 +377,7 @@ Architecture-dependent parts of initialization
|
|||||||
|
|
||||||
The next step is architecture-specific initialization. The Linux kernel does it with the call of the `setup_arch` function. This is a very big function like `start_kernel` and we do not have time to consider all of its implementation in this part. Here we'll only start to do it and continue in the next part. As it is `architecture-specific`, we need to go again to the `arch/` directory. The `setup_arch` function defined in the [arch/x86/kernel/setup.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/setup.c) source code file and takes only one argument - address of the kernel command line.
|
The next step is architecture-specific initialization. The Linux kernel does it with the call of the `setup_arch` function. This is a very big function like `start_kernel` and we do not have time to consider all of its implementation in this part. Here we'll only start to do it and continue in the next part. As it is `architecture-specific`, we need to go again to the `arch/` directory. The `setup_arch` function defined in the [arch/x86/kernel/setup.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/setup.c) source code file and takes only one argument - address of the kernel command line.
|
||||||
|
|
||||||
This function starts from the reserving memory block for the kernel `_text` and `_data` which starts from the `_text` symbol (you can remember it from the [arch/x86/kernel/head_64.S](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/kernel/head_64.S#L46)) and ends before `__bss_stop`. We are using `memblock` for the reserving of memory block:
|
This function starts from the reserving memory block for the kernel `_text` and `_data` which starts from the `_text` symbol (you can remember it from the [arch/x86/kernel/head_64.S](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/kernel/head_64.S#L46)) and ends before `__bss_stop`. We are using `memblock` to reserve a memory block:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
memblock_reserve(__pa_symbol(_text), (unsigned long)__bss_stop - (unsigned long)_text);
|
memblock_reserve(__pa_symbol(_text), (unsigned long)__bss_stop - (unsigned long)_text);
|
||||||
@ -415,7 +415,7 @@ u64 ramdisk_size = get_ramdisk_size();
|
|||||||
u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size);
|
u64 ramdisk_end = PAGE_ALIGN(ramdisk_image + ramdisk_size);
|
||||||
```
|
```
|
||||||
|
|
||||||
All of these parameters are taken from `boot_params`. If you have read the chapter about [Linux Kernel Booting Process](https://0xax.gitbook.io/linux-insides/summary/booting), you must remember that we filled the `boot_params` structure during boot time. The kernel setup header contains a couple of fields which describes ramdisk, for example:
|
All of these parameters are taken from `boot_params`. If you have read the chapter about [Linux Kernel Booting Process](https://0xax.gitbook.io/linux-insides/summary/booting), you must remember that we filled the `boot_params` structure during boot time. The kernel setup header contains a couple of fields which describe ramdisk, for example:
|
||||||
|
|
||||||
```
|
```
|
||||||
Field name: ramdisk_image
|
Field name: ramdisk_image
|
||||||
@ -440,13 +440,13 @@ static u64 __init get_ramdisk_image(void)
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Here we get the address of the ramdisk from the `boot_params` and shift left it on `32`. We need to do it because as you can read in the [Documentation/x86/zero-page.txt](https://github.com/0xAX/linux/blob/0a07b238e5f488b459b6113a62e06b6aab017f71/Documentation/x86/zero-page.txt):
|
Here we get the address of the ramdisk from the `boot_params` and shift it left by `32`. We need to do it because as you can read in the [Documentation/x86/zero-page.txt](https://github.com/0xAX/linux/blob/0a07b238e5f488b459b6113a62e06b6aab017f71/Documentation/x86/zero-page.txt):
|
||||||
|
|
||||||
```
|
```
|
||||||
0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
|
0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
|
||||||
```
|
```
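
Combining the low and high halves of the ramdisk address can be sketched in user space like this. The `demo_boot_params` structure is a hypothetical, flattened stand-in that keeps only the two fields named above; the real `boot_params` layout is different, and this is only an illustration of the shift-and-or step:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the two relevant fields. */
struct demo_boot_params {
    uint32_t ramdisk_image;       /* low 32 bits of the address  */
    uint32_t ext_ramdisk_image;   /* high 32 bits of the address */
};

/* Build the full 64-bit address from the two 32-bit halves. */
uint64_t demo_get_ramdisk_image(const struct demo_boot_params *bp)
{
    assert(bp);
    uint64_t image = bp->ramdisk_image;

    /* Widen before shifting so the high half is not lost. */
    image |= (uint64_t)bp->ext_ramdisk_image << 32;
    return image;
}
```

The cast to `uint64_t` before the shift matters: shifting a 32-bit value left by 32 is undefined behavior in C.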
|
||||||
|
|
||||||
So after shifting it on 32, we're getting a 64-bit address in `ramdisk_image` and we return it. `get_ramdisk_size` works on the same principle as `get_ramdisk_image`, but it used `ext_ramdisk_size` instead of `ext_ramdisk_image`. After we got ramdisk's size, base address and end address, we check that bootloader provided ramdisk with the:
|
So after shifting it by 32, we're getting a 64-bit address in `ramdisk_image` and we return it. `get_ramdisk_size` works on the same principle as `get_ramdisk_image`, but it uses `ext_ramdisk_size` instead of `ext_ramdisk_image`. After we got ramdisk's size, base address and end address, we check that bootloader provided ramdisk with the:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
if (!boot_params.hdr.type_of_loader ||
|
if (!boot_params.hdr.type_of_loader ||
|
||||||
@ -454,7 +454,7 @@ if (!boot_params.hdr.type_of_loader ||
|
|||||||
return;
|
return;
|
||||||
```
|
```
|
||||||
|
|
||||||
and reserve memory block with the calculated addresses for the initial ramdisk in the end:
|
and reserve a memory block with the calculated addresses for the initial ramdisk in the end:
|
||||||
|
|
||||||
```C
|
```C
|
||||||
memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
|
memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
|
||||||
|