From 872b5fe292469ed838173b40b52055adfa489b8f Mon Sep 17 00:00:00 2001 From: Jakub Kramarz Date: Sun, 5 Jul 2015 22:13:18 +0200 Subject: [PATCH 01/17] typo fixes in interrupts-6.md --- interrupts/interrupts-6.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/interrupts/interrupts-6.md b/interrupts/interrupts-6.md index 247e47e..ea9abac 100644 --- a/interrupts/interrupts-6.md +++ b/interrupts/interrupts-6.md @@ -4,9 +4,9 @@ Interrupts and Interrupt Handling. Part 6. Non-maskable interrupt handler -------------------------------------------------------------------------------- -It is sixth part of the [Interrupts and Interrupt Handling in the Linux kernel](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) chapter and in the previous [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-5.html) we saw implementation of some exception handlers for the [General Protection Fault](https://en.wikipedia.org/wiki/General_protection_fault) exception, divide excetpion, invalid [opcode](https://en.wikipedia.org/wiki/Opcode) excetpion and etc. As I wrote in the previous part we will see implementations of the rest excetpions in this part. We will see implementation of the following handlers: +It is sixth part of the [Interrupts and Interrupt Handling in the Linux kernel](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) chapter and in the previous [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-5.html) we saw implementation of some exception handlers for the [General Protection Fault](https://en.wikipedia.org/wiki/General_protection_fault) exception, divide exception, invalid [opcode](https://en.wikipedia.org/wiki/Opcode) exceptions and etc. As I wrote in the previous part we will see implementations of the rest exceptions in this part. 
We will see implementation of the following handlers: -* [Non-Masksable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) interrupt; +* [Non-Maskable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) interrupt; * [BOUND](http://pdos.csail.mit.edu/6.828/2005/readings/i386/BOUND.htm) Range Exceeded Exception; * [Coprocessor](https://en.wikipedia.org/wiki/Coprocessor) exception; * [SIMD](https://en.wikipedia.org/wiki/SIMD) coprocessor exception. @@ -16,7 +16,7 @@ in this part. So, let's start. Non-Maskable interrupt handling -------------------------------------------------------------------------------- -A [Non-Masksable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) interrupt is a hardware interrupt that cannot be ignore by standard masking techniques. In a general way, a non-maskable interrupt can be generated in either of two ways: +A [Non-Maskable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) interrupt is a hardware interrupt that cannot be ignored by standard masking techniques. In a general way, a non-maskable interrupt can be generated in either of two ways: * External hardware asserts the non-maskable interrupt [pin](https://en.wikipedia.org/wiki/CPU_socket) on the CPU. * The processor receives a message on the system bus or the APIC serial bus with a delivery mode `NMI`. @@ -49,7 +49,7 @@ ENTRY(nmi) END(nmi) ``` -in the same [arch/x86/entry/entry_64.S](https://github.com/torvalds/linux/blob/master/arch/x86/entry/entry_64.S) assembly file. Let's dive into it and will try to understand how `Non-Maskable` interrupt handler works. The `nmi` handlers starts from the call of the: +in the same [arch/x86/entry/entry_64.S](https://github.com/torvalds/linux/blob/master/arch/x86/entry/entry_64.S) assembly file. Let's dive into it and will try to understand how `Non-Maskable` interrupt handler works. 
The `nmi` handlers starts from the call of the: ```assembly PARAVIRT_ADJUST_EXCEPTION_FRAME @@ -123,7 +123,7 @@ pushq 11*8(%rsp) .endr ``` -with the [.rept](http://tigcc.ticalc.org/doc/gnuasm.html#SEC116) assembly directive. We need in the copy of the original stack frame. Generally we need in two copies of the interrupt stack. First is `copied` interrupts stack: `saved` stack frame and `copied` stack frame. Now we pushes original stack frame to the `saved` stack frame which locates after the just allocated `40` bytes (`copied` stack frame). This stack frame is used to fixup the `copied` stack frame that a nested NMI may change. The second - `copied` stack frame modifed by any nested `NMIs` to let the first `NMI` know that we triggered a second `NMI` and we shoult rebepat the first `NMI` handler. Ok, we have made first copy of the original stack frame, now time to make second copy: +with the [.rept](http://tigcc.ticalc.org/doc/gnuasm.html#SEC116) assembly directive. We need in the copy of the original stack frame. Generally we need in two copies of the interrupt stack. First is `copied` interrupts stack: `saved` stack frame and `copied` stack frame. Now we pushes original stack frame to the `saved` stack frame which locates after the just allocated `40` bytes (`copied` stack frame). This stack frame is used to fixup the `copied` stack frame that a nested NMI may change. The second - `copied` stack frame modified by any nested `NMIs` to let the first `NMI` know that we triggered a second `NMI` and we should repeat the first `NMI` handler. 
Ok, we have made first copy of the original stack frame, now time to make second copy: ```assembly addq $(10*8), %rsp @@ -162,7 +162,7 @@ After all of these manipulations our stack frame will be like this: +-------------------------+ ``` -After this we push dummy error code on the stack as we did it already in the previous exception handlers and allocate space for the general purpose regiseters on the stack: +After this we push dummy error code on the stack as we did it already in the previous exception handlers and allocate space for the general purpose registers on the stack: ```assembly pushq $-1 @@ -183,7 +183,7 @@ After space allocation for the general registers we can see call of the `paranoi call paranoid_entry ``` -We can remember from the previous parts this label. It pushes general purpose regisrers on the stack, reads `MSR_GS_BASE` [Model Specific regiser](https://en.wikipedia.org/wiki/Model-specific_register) and checks its value. If the value of the `MSR_GS_BASE` is negative, we came from the kernel mode and just return from the `paranoid_entry`, in other way it means that we came from the usermode and need to execeute `swapgs` instruction which will change user `gs` with the kernel `gs`: +We can remember from the previous parts this label. It pushes general purpose registers on the stack, reads `MSR_GS_BASE` [Model Specific register](https://en.wikipedia.org/wiki/Model-specific_register) and checks its value. If the value of the `MSR_GS_BASE` is negative, we came from the kernel mode and just return from the `paranoid_entry`, in other way it means that we came from the usermode and need to execute `swapgs` instruction which will change user `gs` with the kernel `gs`: ```assembly ENTRY(paranoid_entry) @@ -201,7 +201,7 @@ ENTRY(paranoid_entry) END(paranoid_entry) ``` -Note that after the `swapgs` instruction we zeroed the `ebx` register. 
Next time we will check content of this register and if we executed `swapgs` than `ebx` must contain `0` and `1` in other way. In the next step we store value of the `cr2` [control register](https://en.wikipedia.org/wiki/Control_register) to the `r12` register, because the `NMI` handler can cause `page fault` and corrup the value of this control register: +Note that after the `swapgs` instruction we zeroed the `ebx` register. Next time we will check content of this register and if we executed `swapgs` than `ebx` must contain `0` and `1` in other way. In the next step we store value of the `cr2` [control register](https://en.wikipedia.org/wiki/Control_register) to the `r12` register, because the `NMI` handler can cause `page fault` and corrupt the value of this control register: ```C movq %cr2, %r12 @@ -215,7 +215,7 @@ movq $-1, %rsi call do_nmi ``` -We will back to the `do_nmi` little later in this part, but now let's look what occurs after the `do_nmi` will finish its exceution. After the `do_nmi` handler will be finished we check the `cr2` register, because we can got page fault during `do_nmi` performed and if we got it we restore original `cr2`, in other way we jump on the label `1`. After this we test content of the `ebx` register (remember it must contain `0` if we have used `swapgs` instruction and `1` if we didn't use it) and execute `SWAPGS_UNSAFE_STACK` if it contains `1` or jump to the `nmi_restore` label. The `SWAPGS_UNSAFE_STACK` macro just expands to the `swapgs` instruction. In the `nmi_restore` label we restore general purpose registers, clear allocated space on the stack for this registers clear our temporary variable and exit from the interrupt handler with the `INTERRUPT_RETURN` macro: +We will back to the `do_nmi` little later in this part, but now let's look what occurs after the `do_nmi` will finish its execution. 
After the `do_nmi` handler finishes we check the `cr2` register, because we can get a page fault while `do_nmi` runs; if we got one we restore the original `cr2`, otherwise we jump to the label `1`. After this we test the content of the `ebx` register (remember it must contain `0` if we have used the `swapgs` instruction and `1` if we didn't use it) and execute `SWAPGS_UNSAFE_STACK` if it contains `0` or jump to the `nmi_restore` label otherwise. The `SWAPGS_UNSAFE_STACK` macro just expands to the `swapgs` instruction. In the `nmi_restore` label we restore general purpose registers, clear allocated space on the stack for these registers, clear our temporary variable and exit from the interrupt handler with the `INTERRUPT_RETURN` macro: ```assembly movq %cr2, %rcx @@ -239,14 +239,14 @@ nmi_restore: where `INTERRUPT_RETURN` is defined in the [arch/x86/include/irqflags.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/irqflags.h) and just expands to the `iret` instruction. That's all. -Now let's consider case when another `NMI` interrupt occured when previous `NMI` interrupt didn't finish its execution. You can remember from the beginnig of this part that we've made a check that we came from userspace and jump on the `first_nmi` in this case: +Now let's consider case when another `NMI` interrupt occurred when previous `NMI` interrupt didn't finish its execution. You can remember from the beginning of this part that we've made a check that we came from userspace and jump on the `first_nmi` in this case: ```assembly cmpl $__KERNEL_CS, 16(%rsp) jne first_nmi ``` -Note that in this case it is first `NMI` everytime, because if the first `NMI` catched page fault, breakpoint or another exception it will be executed in the kernel mode. 
If we didn't come from userspace, first of all we test our temporary variable: +Note that in this case it is first `NMI` every time, because if the first `NMI` caught page fault, breakpoint or another exception it will be executed in the kernel mode. If we didn't come from userspace, first of all we test our temporary variable: ```assembly cmpl $1, -8(%rsp) @@ -310,7 +310,7 @@ That's all. Range Exceeded Exception -------------------------------------------------------------------------------- -The next exception is the `BOUND` range exceeded exception. The `BOUND` instruction determines if the first operand (array index) is within the bounds of an array specified the second operand (bounds operand). If the index is not within bounds, a `BOUND` range exceeded exception or `#BR` is occured. The handler of the `#BR` exception is the `do_bounds` function that defined in the [arch/x86/kernel/traps.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/traps.c). The `do_bounds` handler starts with the call of the `exception_enter` function and ends with the call of the `exception_exit`: +The next exception is the `BOUND` range exceeded exception. The `BOUND` instruction determines if the first operand (array index) is within the bounds of an array specified by the second operand (bounds operand). If the index is not within bounds, a `BOUND` range exceeded exception or `#BR` occurs. The handler of the `#BR` exception is the `do_bounds` function that is defined in the [arch/x86/kernel/traps.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/traps.c). 
The `do_bounds` handler starts with the call of the `exception_enter` function and ends with the call of the `exception_exit`: ```C prev_state = exception_enter(); @@ -457,7 +457,7 @@ Links * [General Protection Fault](https://en.wikipedia.org/wiki/General_protection_fault) * [opcode](https://en.wikipedia.org/wiki/Opcode) -* [Non-Masksable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) +* [Non-Maskable](https://en.wikipedia.org/wiki/Non-maskable_interrupt) * [BOUND instruction](http://pdos.csail.mit.edu/6.828/2005/readings/i386/BOUND.htm) * [CPU socket](https://en.wikipedia.org/wiki/CPU_socket) * [Interrupt Descriptor Table](https://en.wikipedia.org/wiki/Interrupt_descriptor_table) From 3e84a906e4aaa533fb4260bb34442381d6593d00 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Wed, 8 Jul 2015 13:23:33 +0600 Subject: [PATCH 02/17] Update linux-bootstrap-1.md --- Booting/linux-bootstrap-1.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Booting/linux-bootstrap-1.md b/Booting/linux-bootstrap-1.md index bd9a0c5..42ad15d 100644 --- a/Booting/linux-bootstrap-1.md +++ b/Booting/linux-bootstrap-1.md @@ -135,7 +135,7 @@ We will see: In this example we can see that this code will be executed in 16 bit real mode and will start at 0x7c00 in memory. After the start it calls the [0x10](http://www.ctyme.com/intr/rb-0106.htm) interrupt which just prints `!` symbol. It fills rest of 510 bytes with zeros and finish with two magic bytes `0xaa` and `0x55`. -Although you can see binary dump of it with `objdump` util: +You can see binary dump of it with `objdump` util: ``` nasm -f bin boot.nasm From 25c03728b2f2f046b19bd5946669d741d34fa3cd Mon Sep 17 00:00:00 2001 From: Waqar Ahmed Date: Wed, 8 Jul 2015 19:24:29 +0500 Subject: [PATCH 03/17] Update Linux-Bootsrap-2.md Fix minor mistakes updated some sentences. added more explanation and code. 
--- Booting/linux-bootstrap-2.md | 51 +++++++++++++++++++++++++----------- 1 file changed, 36 insertions(+), 15 deletions(-) diff --git a/Booting/linux-bootstrap-2.md b/Booting/linux-bootstrap-2.md index fd08112..2379001 100644 --- a/Booting/linux-bootstrap-2.md +++ b/Booting/linux-bootstrap-2.md @@ -313,9 +313,9 @@ After that `biosregs` structure is filled with `memset`, `bios_putchar` calls th Heap initialization -------------------------------------------------------------------------------- -After the stack and bss section were prepared in [header.S](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S) (see previous [part](linux-bootstrap-1.md)), the kernel needs to initialize the [heap](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L116) with the [init_heap](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L116) function. +After the stack and bss section were prepared in [header.S](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S) (see previous [part](linux-bootstrap-1.md)), the kernel needs to initialize the [heap](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L116) with the [`init_heap`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L116) function. 
-First of all `init_heap` checks the [`CAN_USE_HEAP`](https://github.com/torvalds/linux/blob/master/arch/x86/include/uapi/asm/bootparam.h#L21) flag from the [`loadflags`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S#L321) kernel setup header and calculates the end of the stack if this flag was set: +First of all `init_heap` checks the [`CAN_USE_HEAP`](https://github.com/torvalds/linux/blob/master/arch/x86/include/uapi/asm/bootparam.h#L21) flag from the [`loadflags`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S#L321) in the kernel setup header and calculates the end of the stack if this flag was set: ```C char *stack_end; @@ -327,9 +327,13 @@ First of all `init_heap` checks the [`CAN_USE_HEAP`](https://github.com/torvalds or in other words `stack_end = esp - STACK_SIZE`. -Then there is the `heap_end` calculation which is `heap_end_ptr` or `_end` + 512 and a check if `heap_end` is greater than `stack_end` makes it equal. +Then there is the `heap_end` calculation: +```c + heap_end = (char *)((size_t)boot_params.hdr.heap_end_ptr + 0x200); +``` +which means `heap_end_ptr` or `_end` + `512` (`0x200`). Finally, it checks whether `heap_end` is greater than `stack_end`. If it is, then `stack_end` is assigned to `heap_end` to make them equal. -From this moment we can use the heap in the kernel setup code. We will see how to use it and how the API for it is implemented in the next posts. +Now the heap is initialized and we can use it using the `GET_HEAP` macro. We will see how it is used and how it is implemented in the next posts. 
CPU validation -------------------------------------------------------------------------------- @@ -344,12 +348,14 @@ check_cpu(&cpu_level, &req_level, &err_flags); return -1; } ``` -It checks the cpu's flags, presence of [long mode](http://en.wikipedia.org/wiki/Long_mode) (which we will see in more detail in the next sections) in case of x86_64(64-bit) CPU, checks the processor's vendor and makes preparation for certain vendors like turning off SSE+SSE2 for AMD if they are missing, etc. +`check_cpu` checks the cpu's flags, presence of [long mode](http://en.wikipedia.org/wiki/Long_mode) in case of x86_64(64-bit) CPU, checks the processor's vendor and makes preparation for certain vendors like turning off SSE+SSE2 for AMD if they are missing, etc. Memory detection -------------------------------------------------------------------------------- -The next step is memory detection by the `detect_memory` function. It uses different programming interfaces for memory detection like `0xe820`, `0xe801` and `0x88`. We will see only the implementation of 0xE820 here. Let's look into the `detect_memory_e820` implementation from the [arch/x86/boot/memory.c](https://github.com/torvalds/linux/blob/master/arch/x86/boot/memory.c) source file. First of all, the `detect_memory_e820` function initializes the `biosregs` structure as we saw above and fills registers with special values for the `0xe820` call: +The next step is memory detection by the `detect_memory` function. `detect_memory` basically provides a map of available RAM to the cpu. It uses different programming interfaces for memory detection like `0xe820`, `0xe801` and `0x88`. We will see only the implementation of **0xE820** here. + +Let's look into the `detect_memory_e820` implementation from the [arch/x86/boot/memory.c](https://github.com/torvalds/linux/blob/master/arch/x86/boot/memory.c) source file. 
First of all, the `detect_memory_e820` function initializes the `biosregs` structure as we saw above and fills registers with special values for the `0xe820` call: ```assembly initregs(&ireg); @@ -359,9 +365,13 @@ The next step is memory detection by the `detect_memory` function. It uses diffe ireg.di = (size_t)&buf; ``` -The `ax` register must contain the number of the function (0xe820 in our case), `cx` register contains size of the buffer which will contain data about memory, `edx` must contain the `SMAP` magic number, `es:di` must contain the address of the buffer which will contain memory data and `ebx` has to be zero. +* `ax` contains the number of the function (0xe820 in our case) +* `cx` register contains size of the buffer which will contain data about memory +* `edx` must contain the `SMAP` magic number +* `es:di` must contain the address of the buffer which will contain memory data +* `ebx` has to be zero. -Next is a loop where data about the memory will be collected. It starts from the call of the 0x15 bios interrupt, which writes one line from the address allocation table. For getting the next line we need to call this interrupt again (which we do in the loop). Before the next call `ebx` must contain the value returned previously: +Next is a loop where data about the memory will be collected. It starts from the call of the `0x15` BIOS interrupt, which writes one line from the address allocation table. For getting the next line we need to call this interrupt again (which we do in the loop). 
Before the next call `ebx` must contain the value returned previously: ```C intcall(0x15, &ireg, &oreg); @@ -389,7 +399,18 @@ You can see the result of this in the `dmesg` output, something like: Keyboard initialization -------------------------------------------------------------------------------- -The next step is the initialization of the keyboard with the call of the [`keyboard_init()`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L65) function. At first `keyboard_init` initializes registers using the `initregs` function and calling the [0x16](http://www.ctyme.com/intr/rb-1756.htm) interrupt for getting the keyboard status. After this it calls [0x16](http://www.ctyme.com/intr/rb-1757.htm) again to set repeat rate and delay. +The next step is the initialization of the keyboard with the call of the [`keyboard_init()`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/main.c#L65) function. At first `keyboard_init` initializes registers using the `initregs` function and calling the [0x16](http://www.ctyme.com/intr/rb-1756.htm) interrupt for getting the keyboard status. +```c + initregs(&ireg); + ireg.ah = 0x02; /* Get keyboard status */ + intcall(0x16, &ireg, &oreg); + boot_params.kbd_status = oreg.al; +``` +After this it calls [0x16](http://www.ctyme.com/intr/rb-1757.htm) again to set repeat rate and delay. +```c + ireg.ax = 0x0305; /* Set keyboard repeat rate */ + intcall(0x16, &ireg, NULL); +``` Querying -------------------------------------------------------------------------------- @@ -422,7 +443,7 @@ int query_mca(void) } ``` -It fills the `ah` register with `0xc0` and calls the `0x15` BIOS interruption. After the interrupt execution it checks the [carry flag](http://en.wikipedia.org/wiki/Carry_flag) and if it is set to 1, the BIOS doesn't support `MCA`. 
If carry flag is set to 0, `ES:BX` will contain a pointer to the system information table, which looks like this: +It fills the `ah` register with `0xc0` and calls the `0x15` BIOS interruption. After the interrupt execution it checks the [carry flag](http://en.wikipedia.org/wiki/Carry_flag) and if it is set to 1, the BIOS doesn't support [**MCA**](https://en.wikipedia.org/wiki/Micro_Channel_architecture). If carry flag is set to 0, `ES:BX` will contain a pointer to the system information table, which looks like this: ``` Offset Size Description ) @@ -460,15 +481,15 @@ static inline void set_fs(u16 seg) } ``` -There is inline assembly which gets the value of the `seg` parameter and puts it into the `fs` register. There are many functions in [boot.h](https://github.com/torvalds/linux/blob/master/arch/x86/boot/boot.h) like `set_fs`, for example `set_gs`, `fs`, `gs` for reading a value in it etc... +This function contains inline assembly which gets the value of the `seg` parameter and puts it into the `fs` register. There are many functions in [boot.h](https://github.com/torvalds/linux/blob/master/arch/x86/boot/boot.h) like `set_fs`, for example `set_gs`, `fs`, `gs` for reading a value in it etc... At the end of `query_mca` it just copies the table which pointed to by `es:bx` to the `boot_params.sys_desc_table`. The next step is getting [Intel SpeedStep](http://en.wikipedia.org/wiki/SpeedStep) information by calling the `query_ist` function. First of all it checks the CPU level and if it is correct, calls `0x15` for getting info and saves the result to `boot_params`. -The following [query_apm_bios](https://github.com/torvalds/linux/blob/master/arch/x86/boot/apm.c#L21) function gets [Advanced Power Management](http://en.wikipedia.org/wiki/Advanced_Power_Management) information from the BIOS. `query_apm_bios` calls the `0x15` BIOS interruption too, but with `ah` - `0x53` to check `APM` installation. 
After the `0x15` execution, `query_apm_bios` functions checks `PM` signature (it must be `0x504d`), carry flag (it must be 0 if `APM` supported) and value of the `cx` register (if it's 0x02, protected mode interface is supported). +The following [query_apm_bios](https://github.com/torvalds/linux/blob/master/arch/x86/boot/apm.c#L21) function gets [Advanced Power Management](http://en.wikipedia.org/wiki/Advanced_Power_Management) information from the BIOS. `query_apm_bios` calls the `0x15` BIOS interruption too, but with `ah` = `0x53` to check `APM` installation. After the `0x15` execution, `query_apm_bios` functions checks `PM` signature (it must be `0x504d`), carry flag (it must be 0 if `APM` supported) and value of the `cx` register (if it's 0x02, protected mode interface is supported). -Next it calls the `0x15` again, but with `ax = 0x5304` for disconnecting the `APM` interface and connect the 32bit protected mode interface. In the end it fills `boot_params.apm_bios_info` with values obtained from the BIOS. +Next it calls the `0x15` again, but with `ax = 0x5304` for disconnecting the `APM` interface and connecting the 32-bit protected mode interface. In the end it fills `boot_params.apm_bios_info` with values obtained from the BIOS. Note that `query_apm_bios` will be executed only if `CONFIG_APM` or `CONFIG_APM_MODULE` was set in configuration file: @@ -478,7 +499,7 @@ Note that `query_apm_bios` will be executed only if `CONFIG_APM` or `CONFIG_APM_ #endif ``` -The last is the [`query_edd`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/edd.c#L122) function, which asks `Enhanced Disk Drive` information from the BIOS. Let's look into the `query_edd` implementation. +The last is the [`query_edd`](https://github.com/torvalds/linux/blob/master/arch/x86/boot/edd.c#L122) function, which queries `Enhanced Disk Drive` information from the BIOS. Let's look into the `query_edd` implementation. 
First of all it reads the [edd](https://github.com/torvalds/linux/blob/master/Documentation/kernel-parameters.txt#L1023) option from kernel's command line and if it was set to `off` then `query_edd` just returns. @@ -496,7 +517,7 @@ If EDD is enabled, `query_edd` goes over BIOS-supported hard disks and queries E ... ``` -where `0x80` is the first hard drive and the `EDD_MBR_SIG_MAX` macro is 16. It collects data into the array of [edd_info](https://github.com/torvalds/linux/blob/master/include/uapi/linux/edd.h#L172) structures. `get_edd_info` checks that EDD is present by invoking the `0x13` interrupt with `ah` as `0x41` and if EDD is present, `get_edd_info` again calls the `0x13` interrupt, but with `ah` as `0x48` and `si` containing the address of the buffer where EDD information will be stored. +where `0x80` is the first hard drive and the value of `EDD_MBR_SIG_MAX` macro is 16. It collects data into the array of [edd_info](https://github.com/torvalds/linux/blob/master/include/uapi/linux/edd.h#L172) structures. `get_edd_info` checks that EDD is present by invoking the `0x13` interrupt with `ah` as `0x41` and if EDD is present, `get_edd_info` again calls the `0x13` interrupt, but with `ah` as `0x48` and `si` containing the address of the buffer where EDD information will be stored. 
Conclusion -------------------------------------------------------------------------------- From 40fa470600b6c51fa20fbb718f48c0ffca9855d2 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Fri, 10 Jul 2015 14:59:07 +0600 Subject: [PATCH 04/17] Update SUMMARY.md --- SUMMARY.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SUMMARY.md b/SUMMARY.md index e85464a..09f574b 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -24,6 +24,7 @@ * [Initialization of non-early interrupt gates](interrupts/interrupts-4.md) * [Implementation of some exception handlers](interrupts/interrupts-5.md) * [Handling Non-Maskable interrupts](interrupts/interrupts-6.md) + * [External hardware interrupts]() * [Memory management](mm/README.md) * [Memblock](mm/linux-mm-1.md) * [Fixmaps and ioremap](mm/linux-mm-2.md) From 0c01f882cd7f47d65d39d46a6908f659c101c3a6 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 12 Jul 2015 20:24:48 +0600 Subject: [PATCH 05/17] Create interrupts-7.md --- interrupts/interrupts-7.md | 461 +++++++++++++++++++++++++++++++++++++ 1 file changed, 461 insertions(+) create mode 100644 interrupts/interrupts-7.md diff --git a/interrupts/interrupts-7.md b/interrupts/interrupts-7.md new file mode 100644 index 0000000..43194b1 --- /dev/null +++ b/interrupts/interrupts-7.md @@ -0,0 +1,461 @@ +Interrupts and Interrupt Handling. Part 7. +================================================================================ + +Introduction to external interrupts +-------------------------------------------------------------------------------- + +This is the seventh part of the Interrupts and Interrupt Handling in the Linux kernel [chapter](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) and in the previous [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-6.html) we have finished with the exceptions which are generated by the processor. 
In this part we will continue to dive into interrupt handling and will start with external hardware interrupt handling. As you can remember, in the previous part we have finished with the `trap_init` function from the [arch/x86/kernel/traps.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/traps.c) and the next step is the call of the `early_irq_init` function from the [init/main.c](https://github.com/torvalds/linux/blob/master/init/main.c). + +Interrupts are signals that are sent across [IRQ](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29) or `Interrupt Request Line` by hardware or software. External hardware interrupts allow devices like keyboard, mouse, etc. to indicate that they need the attention of the processor. Once the processor receives the `Interrupt Request`, it will temporarily stop execution of the running program and invoke a special routine which depends on the interrupt. We already know that this routine is called an interrupt handler (from this part on we will also call it `ISR` or `Interrupt Service Routine`). The `ISR` or `Interrupt Handler Routine` can be found in the Interrupt Vector table that is located at a fixed address in memory. After the interrupt is handled the processor resumes the interrupted process. At the boot/initialization time, the Linux kernel identifies all devices in the machine, and appropriate interrupt handlers are loaded into the interrupt table. As we saw in the previous parts, most exceptions are handled simply by sending a [Unix signal](https://en.wikipedia.org/wiki/Unix_signal) to the interrupted process. That's why the kernel can handle an exception quickly. Unfortunately we can not use this approach for external hardware interrupts, because often they arrive after (and sometimes long after) the process to which they are related has been suspended. So it would make no sense to send a Unix signal to the current process. 
External interrupt handling depends on the type of an interrupt: + +* `I/O` interrupts; +* Timer interrupts; +* Interprocessor interrupts. + +I will try to describe all types of interrupts in this book. + +Generally, a handler of an `I/O` interrupt must be flexible enough to service several devices at the same time. For example in the [PCI](https://en.wikipedia.org/wiki/Conventional_PCI) bus architecture several devices may share the same `IRQ` line. In the simplest case the Linux kernel must do the following things when an `I/O` interrupt occurs: + +* Save the value of an `IRQ` and the register's contents on the kernel stack; +* Send an acknowledgment to the hardware controller which is servicing the `IRQ` line; +* Execute the interrupt service routine (next we will call it `ISR`) which is associated with the device; +* Restore registers and return from an interrupt. + +Ok, we know a little theory and now let's start with the `early_irq_init` function. The implementation of the `early_irq_init` function is in the [kernel/irq/irqdesc.c](https://github.com/torvalds/linux/blob/master/kernel/irq/irqdesc.c). This function performs early initialization of the `irq_desc` structure. The `irq_desc` structure is the foundation of interrupt management code in the Linux kernel. An array of this structure, which has the same name - `irq_desc`, keeps track of every interrupt request source in the Linux kernel. This structure is defined in the [include/linux/irqdesc.h](https://github.com/torvalds/linux/blob/master/include/linux/irqdesc.h) and as you can note it depends on the `CONFIG_SPARSE_IRQ` kernel configuration option. This kernel configuration option enables support for sparse irqs. 
The `irq_desc` structure contains many different fields: + +* `irq_common_data` - per irq and chip data passed down to chip functions; +* `status_use_accessors` - contains the status of the interrupt source, which can be a combination of the values from the `enum` from [include/linux/irq.h](https://github.com/torvalds/linux/blob/master/include/linux/irq.h) and different macros which are defined in the same source code file; +* `kstat_irqs` - irq stats per-cpu; +* `handle_irq` - high-level irq-events handler; +* `action` - identifies the interrupt service routines to be invoked when the [IRQ](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29) occurs; +* `irq_count` - counter of interrupt occurrences on the IRQ line; +* `depth` - `0` if the IRQ line is enabled and a positive value if it has been disabled at least once; +* `last_unhandled` - aging timer for unhandled count; +* `irqs_unhandled` - count of the unhandled interrupts; +* `lock` - a spin lock used to serialize the accesses to the `IRQ` descriptor; +* `pending_mask` - pending rebalanced interrupts; +* `owner` - an owner of the interrupt descriptor. Interrupt descriptors can be allocated from modules. This field is needed to provide a refcount on the module which provides the interrupts; +* and etc. + +Of course these are not all fields of the `irq_desc` structure, because it would take too long to describe each field of this structure, but we will see them all soon. Now let's start to dive into the implementation of the `early_irq_init` function. + +Early external interrupts initialization +-------------------------------------------------------------------------------- + +Now, let's look at the implementation of the `early_irq_init` function. Note that the implementation of the `early_irq_init` function depends on the `CONFIG_SPARSE_IRQ` kernel configuration option. For now we consider the implementation of the `early_irq_init` function when the `CONFIG_SPARSE_IRQ` kernel configuration option is not set.
This function starts from the declaration of the following variables: `irq` descriptors counter, loop counter, memory node and the `irq_desc` descriptor: + +```C +int __init early_irq_init(void) +{ + int count, i, node = first_online_node; + struct irq_desc *desc; + ... + ... + ... +} +``` + +The `node` is an online [NUMA](https://en.wikipedia.org/wiki/Non-uniform_memory_access) node which depends on the `MAX_NUMNODES` value which depends on the `CONFIG_NODES_SHIFT` kernel configuration parameter: + +```C +#define MAX_NUMNODES (1 << NODES_SHIFT) +... +... +... +#ifdef CONFIG_NODES_SHIFT + #define NODES_SHIFT CONFIG_NODES_SHIFT +#else + #define NODES_SHIFT 0 +#endif +``` + +As I already wrote, the implementation of the `first_online_node` macro depends on the `MAX_NUMNODES` value: + +```C +#if MAX_NUMNODES > 1 + #define first_online_node first_node(node_states[N_ONLINE]) +#else + #define first_online_node 0 +#endif +``` + +The `node_states` is an array of node masks indexed by the [enum](https://en.wikipedia.org/wiki/Enumerated_type) of node states, both defined in [include/linux/nodemask.h](https://github.com/torvalds/linux/blob/master/include/linux/nodemask.h). In our case we are searching for an online node and it will be `0` if `MAX_NUMNODES` is one. If `MAX_NUMNODES` is greater than one, `node_states[N_ONLINE]` is the mask of online nodes and the `first_node` macro expands to a call of the `__first_node` function which returns the `minimal`, or first, online node: + +```C +#define first_node(src) __first_node(&(src)) + +static inline int __first_node(const nodemask_t *srcp) +{ + return min_t(int, MAX_NUMNODES, find_first_bit(srcp->bits, MAX_NUMNODES)); +} +``` + +More about this will be in another chapter about `NUMA`. The next step after the declaration of these local variables is the call of the: + +```C +init_irq_default_affinity(); +``` + +function.
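To make the `first_online_node` lookup a bit more concrete, here is a small user-space sketch of what `__first_node` does: scan a bitmask for its lowest set bit and clamp the result to the maximum number of nodes. The `toy_nodemask` type, the `toy_` prefixed helpers and the value `TOY_MAX_NUMNODES = 64` are my assumptions for illustration; the kernel uses `nodemask_t` and `find_first_bit` instead:

```c
#include <stdint.h>

#define TOY_MAX_NUMNODES 64   /* assumption: 1 << NODES_SHIFT with NODES_SHIFT = 6 */

/* Hypothetical stand-in for nodemask_t: one bit per NUMA node. */
struct toy_nodemask {
    uint64_t bits;
};

/* Index of the lowest set bit, or TOY_MAX_NUMNODES if the mask is empty. */
static int toy_find_first_bit(uint64_t bits)
{
    for (int i = 0; i < TOY_MAX_NUMNODES; i++) {
        if (bits & (UINT64_C(1) << i))
            return i;
    }
    return TOY_MAX_NUMNODES;
}

/* Same idea as __first_node(): the smallest node number present in the mask. */
static int toy_first_node(const struct toy_nodemask *mask)
{
    int first = toy_find_first_bit(mask->bits);

    return first < TOY_MAX_NUMNODES ? first : TOY_MAX_NUMNODES;
}
```

With nodes `2` and `3` online (mask `0x0c`) `toy_first_node` returns `2`, and an empty mask yields `TOY_MAX_NUMNODES`, which plays the role of "no such node".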
The `init_irq_default_affinity` function is defined in the same source code file. Depending on the `CONFIG_SMP` kernel configuration option, it allocates a given [cpumask](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) structure (in our case it is the `irq_default_affinity`): + +```C +#if defined(CONFIG_SMP) +cpumask_var_t irq_default_affinity; + +static void __init init_irq_default_affinity(void) +{ + alloc_cpumask_var(&irq_default_affinity, GFP_NOWAIT); + cpumask_setall(irq_default_affinity); +} +#else +static void __init init_irq_default_affinity(void) +{ +} +#endif +``` + +We know that when a piece of hardware, such as a disk controller or a keyboard, needs attention from the processor, it raises an interrupt. The interrupt tells the processor that something has happened and that the processor should interrupt the current process and handle the incoming event. In order to prevent multiple devices from sending the same interrupts, the [IRQ](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29) system was established where each device in a computer system is assigned its own special IRQ so that its interrupts are unique. The Linux kernel can assign certain `IRQs` to specific processors. This is known as `SMP IRQ affinity`, and it allows you to control how your system will respond to various hardware events (that's why it has a real implementation only if the `CONFIG_SMP` kernel configuration option is set). After we have allocated the `irq_default_affinity` cpumask, we can see the `printk` output: + +```C +printk(KERN_INFO "NR_IRQS:%d\n", NR_IRQS); +``` + +which prints `NR_IRQS`: + +``` +~$ dmesg | grep NR_IRQS +[ 0.000000] NR_IRQS:4352 +``` + +The `NR_IRQS` is the maximum number of `irq` descriptors or, in other words, the maximum number of interrupts. Its value depends on the state of the `CONFIG_X86_IO_APIC` kernel configuration option.
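To see where a number like `4352` comes from, the selection can be replayed in plain C. This is only a sketch: the formula follows the `NR_IRQS` definition for the `CONFIG_X86_IO_APIC` case that is quoted right after this, while the `toy_nr_irqs` function and the `NR_CPUS = 8` example value (my configuration) are assumptions for illustration:

```c
#define NR_VECTORS            256
#define MAX_IO_APICS          128
#define IO_APIC_VECTOR_LIMIT  (32 * MAX_IO_APICS)   /* 4096 */

/* Hypothetical helper: NR_IRQS as a function of the configured CPU count. */
static int toy_nr_irqs(int nr_cpus)
{
    int cpu_vector_limit = 64 * nr_cpus;

    /* The larger of the two limits is used, plus NR_VECTORS. */
    if (cpu_vector_limit > IO_APIC_VECTOR_LIMIT)
        return NR_VECTORS + cpu_vector_limit;
    return NR_VECTORS + IO_APIC_VECTOR_LIMIT;
}
```

For `8` processors the per-CPU limit is `64 * 8 = 512`, which is not greater than `4096`, so the result is `256 + 4096 = 4352`, the value printed in the `dmesg` output above.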
If the `CONFIG_X86_IO_APIC` is not set and the Linux kernel uses an old [PIC](https://en.wikipedia.org/wiki/Programmable_Interrupt_Controller) chip, the `NR_IRQS` is: + +```C +#define NR_IRQS_LEGACY 16 + +#ifdef CONFIG_X86_IO_APIC +... +... +... +#else +# define NR_IRQS NR_IRQS_LEGACY +#endif +``` + +Otherwise, when the `CONFIG_X86_IO_APIC` kernel configuration option is set, the `NR_IRQS` depends on the number of processors and the number of interrupt vectors: + +```C +#define CPU_VECTOR_LIMIT (64 * NR_CPUS) +#define NR_VECTORS 256 +#define IO_APIC_VECTOR_LIMIT ( 32 * MAX_IO_APICS ) +#define MAX_IO_APICS 128 + +# define NR_IRQS \ + (CPU_VECTOR_LIMIT > IO_APIC_VECTOR_LIMIT ? \ + (NR_VECTORS + CPU_VECTOR_LIMIT) : \ + (NR_VECTORS + IO_APIC_VECTOR_LIMIT)) +... +... +... +``` + +We remember from the previous parts that we can set the number of processors during the Linux kernel configuration process with the `CONFIG_NR_CPUS` configuration option: + +![kernel](http://oi60.tinypic.com/1zdm1dt.jpg) + +In the first case (`CPU_VECTOR_LIMIT > IO_APIC_VECTOR_LIMIT`), the `NR_IRQS` will be `NR_VECTORS + CPU_VECTOR_LIMIT`; in the second case (`CPU_VECTOR_LIMIT <= IO_APIC_VECTOR_LIMIT`), the `NR_IRQS` will be `NR_VECTORS + IO_APIC_VECTOR_LIMIT`. In my case the `NR_CPUS` is `8` as you can see in my configuration, so the `CPU_VECTOR_LIMIT` is `512` and the `IO_APIC_VECTOR_LIMIT` is `4096`. As `512` is not greater than `4096`, the second case applies and `NR_IRQS` for my configuration is `256 + 4096 = 4352`: + +``` +~$ dmesg | grep NR_IRQS +[ 0.000000] NR_IRQS:4352 +``` + +In the next step we assign the array of IRQ descriptors to the `desc` variable which we declared at the start of the `early_irq_init` function and calculate the count of the `irq_desc` array with the `ARRAY_SIZE` macro: + +```C +desc = irq_desc; +count = ARRAY_SIZE(irq_desc); +``` + +The `irq_desc` array is defined in the same source code file and looks like: + +```C +struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = { + [0 ...
NR_IRQS-1] = { + .handle_irq = handle_bad_irq, + .depth = 1, + .lock = __RAW_SPIN_LOCK_UNLOCKED(irq_desc->lock), + } +}; +``` + +The `irq_desc` is an array of `irq` descriptors. It has three already initialized fields: + +* `handle_irq` - as I already wrote above, this field is the high-level irq-event handler. In our case it is initialized with the `handle_bad_irq` function that is defined in the [kernel/irq/handle.c](https://github.com/torvalds/linux/blob/master/kernel/irq/handle.c) source code file and handles spurious and unhandled irqs; +* `depth` - `0` if the IRQ line is enabled and a positive value if it has been disabled at least once; +* `lock` - a spin lock used to serialize the accesses to the `IRQ` descriptor. + +As we have calculated the count of the interrupts and initialized our `irq_desc` array, we start to fill the descriptors in the loop: + +```C +for (i = 0; i < count; i++) { + desc[i].kstat_irqs = alloc_percpu(unsigned int); + alloc_masks(&desc[i], GFP_KERNEL, node); + raw_spin_lock_init(&desc[i].lock); + lockdep_set_class(&desc[i].lock, &irq_desc_lock_class); + desc_set_defaults(i, &desc[i], node, NULL); +} +``` + +We go through all interrupt descriptors and do the following things: + +First of all we allocate a [percpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) variable for the `irq` kernel statistics with the `alloc_percpu` macro. This macro allocates one instance of an object of the given type for every processor on the system. You can access kernel statistics from userspace via `/proc/stat`: + +``` +~$ cat /proc/stat +cpu 207907 68 53904 5427850 14394 0 394 0 0 0 +cpu0 25881 11 6684 679131 1351 0 18 0 0 0 +cpu1 24791 16 5894 679994 2285 0 24 0 0 0 +cpu2 26321 4 7154 678924 664 0 71 0 0 0 +cpu3 26648 8 6931 678891 414 0 244 0 0 0 +... +... +... +``` + +Where the sixth column is the time spent servicing interrupts.
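If you want to pull that column out programmatically, a `cpu` line can be parsed with `sscanf`. This is a minimal sketch, assuming the field order `user nice system idle iowait irq softirq ...` that is documented for `/proc/stat`; the `toy_cpu_stat` structure and the `toy_parse_cpu_line` function are hypothetical names:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the first seven counters of a cpu line. */
struct toy_cpu_stat {
    unsigned long long user, nice, system, idle, iowait, irq, softirq;
};

/* Returns 0 on success, -1 if the line does not look like a cpu line. */
static int toy_parse_cpu_line(const char *line, struct toy_cpu_stat *st)
{
    char label[16];
    int n = sscanf(line, "%15s %llu %llu %llu %llu %llu %llu %llu",
                   label, &st->user, &st->nice, &st->system, &st->idle,
                   &st->iowait, &st->irq, &st->softirq);

    /* One label plus seven counters must have been converted. */
    if (n != 8)
        return -1;

    /* Accept "cpu", "cpu0", "cpu1" and so on, reject e.g. "intr". */
    return strncmp(label, "cpu", 3) == 0 ? 0 : -1;
}
```

For the `cpu0` line from the output above this gives `irq = 0` (the sixth number) and `softirq = 18` (the seventh one).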
After this we allocate a [cpumask](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) for the given irq descriptor affinity and initialize the [spinlock](https://en.wikipedia.org/wiki/Spinlock) for the given interrupt descriptor. From now on, the lock will be acquired with a call of `raw_spin_lock` before a [critical section](https://en.wikipedia.org/wiki/Critical_section) and released with a call of `raw_spin_unlock` after it. In the next step we call the `lockdep_set_class` macro which sets the [Lock validator](https://lwn.net/Articles/185666/) `irq_desc_lock_class` class for the lock of the given interrupt descriptor. More about `lockdep`, `spinlock` and other synchronization primitives will be described in a separate chapter. + +At the end of the loop we call the `desc_set_defaults` function from [kernel/irq/irqdesc.c](https://github.com/torvalds/linux/blob/master/kernel/irq/irqdesc.c). This function takes four parameters: + +* number of an irq; +* interrupt descriptor; +* online `NUMA` node; +* owner of the interrupt descriptor. Interrupt descriptors can be allocated from modules. This field is needed to provide a refcount on the module which provides the interrupts; + +and fills the rest of the `irq_desc` fields. The `desc_set_defaults` function fills the interrupt number, `irq` chip, platform-specific per-chip private data for the chip methods, per-IRQ data for the `irq_chip` methods and [MSI](https://en.wikipedia.org/wiki/Message_Signaled_Interrupts) descriptor for the per `irq` and `irq` chip data: + +```C +desc->irq_data.irq = irq; +desc->irq_data.chip = &no_irq_chip; +desc->irq_data.chip_data = NULL; +desc->irq_data.handler_data = NULL; +desc->irq_data.msi_desc = NULL; +... +... +... +``` + +The `irq_data.chip` structure provides a general `API` like `irq_set_chip`, `irq_set_irq_type`, etc. for the irq controller [drivers](https://github.com/torvalds/linux/tree/master/drivers/irqchip).
You can find it in the [kernel/irq/chip.c](https://github.com/torvalds/linux/blob/master/kernel/irq/chip.c) source code file. + +After this we set the status of the accessor for the given descriptor and set the disabled state of the interrupt: + +```C +... +... +... +irq_settings_clr_and_set(desc, ~0, _IRQ_DEFAULT_INIT_FLAGS); +irqd_set(&desc->irq_data, IRQD_IRQ_DISABLED); +... +... +... +``` + +In the next step we set the high-level interrupt handler to `handle_bad_irq`, which handles spurious and unhandled irqs (as the hardware is not initialized yet, we set this handler), set `irq_desc.depth` to `1`, which means that the `IRQ` is disabled, and reset the count of unhandled interrupts and of interrupts in general: + +```C +... +... +... +desc->handle_irq = handle_bad_irq; +desc->depth = 1; +desc->irq_count = 0; +desc->irqs_unhandled = 0; +desc->name = NULL; +desc->owner = owner; +... +... +... +``` + +After this we go through all [possible](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) processors with the [for_each_possible_cpu](https://github.com/torvalds/linux/blob/master/include/linux/cpumask.h#L714) helper and set the `kstat_irqs` to zero for the given interrupt descriptor: + +```C + for_each_possible_cpu(cpu) + *per_cpu_ptr(desc->kstat_irqs, cpu) = 0; +``` + +and call the `desc_smp_init` function from [kernel/irq/irqdesc.c](https://github.com/torvalds/linux/blob/master/kernel/irq/irqdesc.c) that initializes the `NUMA` node of the given interrupt descriptor, sets the default `SMP` affinity and, depending on the value of the `CONFIG_GENERIC_PENDING_IRQ` kernel configuration option, clears the `pending_mask` of the given interrupt descriptor: + +```C +static void desc_smp_init(struct irq_desc *desc, int node) +{ + desc->irq_data.node = node; + cpumask_copy(desc->irq_data.affinity, irq_default_affinity); +#ifdef CONFIG_GENERIC_PENDING_IRQ + cpumask_clear(desc->pending_mask); +#endif +} +``` + +In the end of the `early_irq_init` function we return
the return value of the `arch_early_irq_init` function: + +```C +return arch_early_irq_init(); +``` + +This function is defined in the [kernel/apic/vector.c](https://github.com/torvalds/linux/blob/master/kernel/apic/vector.c) and contains only one call of the `arch_early_ioapic_init` function from the [kernel/apic/io_apic.c](https://github.com/torvalds/linux/blob/master/kernel/apic/io_apic.c). As we can understand from the name of the `arch_early_ioapic_init` function, it makes early initialization of the [I/O APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller). First of all it checks the number of legacy interrupts with a call of the `nr_legacy_irqs` function. If we have no legacy interrupts from the [Intel 8259](https://en.wikipedia.org/wiki/Intel_8259) programmable interrupt controller, we set `io_apic_irqs` to `0xffffffffffffffff`: + +```C +if (!nr_legacy_irqs()) + io_apic_irqs = ~0UL; +``` + +After this we go through all `I/O APICs` and allocate space for the registers with a call of `alloc_ioapic_saved_registers`: + +```C +for_each_ioapic(i) + alloc_ioapic_saved_registers(i); +``` + +And at the end of the `arch_early_ioapic_init` function we go through all legacy irqs (from `IRQ0` to `IRQ15`) in a loop and allocate space for the `irq_cfg` which represents the configuration of an irq on the given `NUMA` node: + +```C +for (i = 0; i < nr_legacy_irqs(); i++) { + cfg = alloc_irq_and_cfg_at(i, node); + cfg->vector = IRQ0_VECTOR + i; + cpumask_setall(cfg->domain); +} +``` + +That's all. + +Sparse IRQs +-------------------------------------------------------------------------------- + +We already saw in the beginning of this part that the implementation of the `early_irq_init` function depends on the `CONFIG_SPARSE_IRQ` kernel configuration option.
Previously we saw the implementation of the `early_irq_init` function when the `CONFIG_SPARSE_IRQ` configuration option is not set; now let's look at its implementation when this option is set. The implementation of this function is very similar, but differs a little. We can see the same definition of variables and the call of `init_irq_default_affinity` in the beginning of the `early_irq_init` function: + +```C +#ifdef CONFIG_SPARSE_IRQ +int __init early_irq_init(void) +{ + int i, initcnt, node = first_online_node; + struct irq_desc *desc; + + init_irq_default_affinity(); + ... + ... + ... +} +#else +... +... +... +``` + +But after this we can see the following call: + +```C +initcnt = arch_probe_nr_irqs(); +``` + +The `arch_probe_nr_irqs` function is defined in the [arch/x86/kernel/apic/vector.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/apic/vector.c); it calculates the count of pre-allocated irqs and updates `nr_irqs` with this number. But stop. Why are there pre-allocated irqs? There is an alternative form of interrupts called [Message Signaled Interrupts](https://en.wikipedia.org/wiki/Message_Signaled_Interrupts) available in [PCI](https://en.wikipedia.org/wiki/Conventional_PCI). Instead of being assigned a fixed interrupt request number, the device is allowed to write a message to a particular memory address which is, in fact, mapped to the [Local APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#Integrated_local_APICs). `MSI` permits a device to allocate `1`, `2`, `4`, `8`, `16` or `32` interrupts and `MSI-X` permits a device to allocate up to `2048` interrupts. Now we know that irqs can be pre-allocated. More about `MSI` will be in a next part, but now let's look at the `arch_probe_nr_irqs` function.
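Before reading the function itself, here is its arithmetic replayed as a user-space sketch. The formula follows the kernel code that is walked through next, but unlike the kernel function (which updates the global `nr_irqs` and returns the number of legacy irqs) this sketch simply returns the resulting `nr_irqs`. The function name `toy_probe_nr_irqs`, its parameters, the assumption that the `CONFIG_PCI_MSI` branch is compiled in, and the sample values `nr_cpu_ids = 8` and `gsi_top = 24` are mine, chosen for illustration:

```c
#define NR_VECTORS      256
#define NR_IRQS         4352   /* assumption: value for NR_CPUS = 8 with the I/O APIC enabled */
#define NR_IRQS_LEGACY  16

/* Hypothetical user-space replay of the nr_irqs calculation. */
static int toy_probe_nr_irqs(int nr_cpu_ids, int gsi_top, int nr_legacy_irqs)
{
    int nr_irqs = NR_IRQS;
    int nr;

    /* Cap at NR_VECTORS interrupt vectors per possible CPU. */
    if (nr_irqs > NR_VECTORS * nr_cpu_ids)
        nr_irqs = NR_VECTORS * nr_cpu_ids;

    /* GSIs, legacy irqs and eight more per CPU. */
    nr = (gsi_top + nr_legacy_irqs) + 8 * nr_cpu_ids;

    /* Additional room that depends on gsi_top (MSI branch assumed enabled). */
    if (gsi_top <= NR_IRQS_LEGACY)
        nr += 8 * nr_cpu_ids;
    else
        nr += gsi_top * 16;

    /* The smaller of the two estimates is kept. */
    return nr < nr_irqs ? nr : nr_irqs;
}
```

With `8` processors, `gsi_top = 24` and `16` legacy irqs we get `nr = 24 + 16 + 64 + 384 = 488`, which is below the cap of `256 * 8 = 2048`, so the sketch returns `488`.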
We can see the check which caps `nr_irqs` at the number of interrupt vectors of all processors in the system if `nr_irqs` is greater than it, and the calculation of `nr` which represents the number of `MSI` interrupts: + +```C +int nr_irqs = NR_IRQS; + +if (nr_irqs > (NR_VECTORS * nr_cpu_ids)) + nr_irqs = NR_VECTORS * nr_cpu_ids; + +nr = (gsi_top + nr_legacy_irqs()) + 8 * nr_cpu_ids; +``` + +Take a look at the `gsi_top` variable. Each `APIC` is identified with its own `ID` and with the offset where its `IRQ` starts. It is called the `GSI` base or `Global System Interrupt` base. So the `gsi_top` represents it. We get the `Global System Interrupt` base from the [MultiProcessor Configuration Table](https://en.wikipedia.org/wiki/MultiProcessor_Specification) (you may remember that we parsed this table in the sixth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-6.html) of the Linux Kernel initialization process chapter). + +After this we update the `nr` depending on the value of the `gsi_top`: + +```C +#if defined(CONFIG_PCI_MSI) || defined(CONFIG_HT_IRQ) + if (gsi_top <= NR_IRQS_LEGACY) + nr += 8 * nr_cpu_ids; + else + nr += gsi_top * 16; +#endif +``` + +Update the `nr_irqs` if `nr` is less than it and return the number of the legacy irqs: + +```C +if (nr < nr_irqs) + nr_irqs = nr; + +return nr_legacy_irqs(); +} +``` + +The next step after the `arch_probe_nr_irqs` is printing information about the number of `IRQs`: + +```C +printk(KERN_INFO "NR_IRQS:%d nr_irqs:%d %d\n", NR_IRQS, nr_irqs, initcnt); +``` + +We can find it in the [dmesg](https://en.wikipedia.org/wiki/Dmesg) output: + +``` +$ dmesg | grep NR_IRQS +[ 0.000000] NR_IRQS:4352 nr_irqs:488 16 +``` + +After this we check that the `nr_irqs` and `initcnt` values are not greater than the maximum allowable number of `irqs`: + +```C +if (WARN_ON(nr_irqs > IRQ_BITMAP_BITS)) + nr_irqs = IRQ_BITMAP_BITS; + +if (WARN_ON(initcnt > IRQ_BITMAP_BITS)) + initcnt = IRQ_BITMAP_BITS; +``` + +where
`IRQ_BITMAP_BITS` is equal to `NR_IRQS` if `CONFIG_SPARSE_IRQ` is not set and to `NR_IRQS + 8196` otherwise. In the next step we go over all interrupt descriptors which need to be allocated in a loop, allocate space for each descriptor and insert it into the `irq_desc_tree` [radix tree](http://0xax.gitbooks.io/linux-insides/content/DataStructures/radix-tree.html): + +```C +for (i = 0; i < initcnt; i++) { + desc = alloc_desc(i, node, NULL); + set_bit(i, allocated_irqs); + irq_insert_desc(i, desc); +} +``` + +At the end of the `early_irq_init` function we return the value of the call of the `arch_early_irq_init` function as we already did in the previous variant when the `CONFIG_SPARSE_IRQ` option was not set: + +```C +return arch_early_irq_init(); +``` + +That's all. + +Conclusion +-------------------------------------------------------------------------------- + +It is the end of the seventh part of the [Interrupts and Interrupt Handling](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) chapter and we started to dive into external hardware interrupts in this part. We saw early initialization of the `irq_desc` structure which represents the description of an external interrupt and contains information about it like the list of irq actions, information about the interrupt handler, the interrupt's owner, the count of unhandled interrupts, etc. In the next part we will continue to research external interrupts. + +If you have any questions or suggestions write me a comment or ping me at [twitter](https://twitter.com/0xAX). + +**Please note that English is not my first language, And I am really sorry for any inconvenience.
If you will find any mistakes please send me PR to [linux-internals](https://github.com/0xAX/linux-internals).** + +Links +-------------------------------------------------------------------------------- + +* [IRQ](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29) +* [numa](https://en.wikipedia.org/wiki/Non-uniform_memory_access) +* [Enum type](https://en.wikipedia.org/wiki/Enumerated_type) +* [cpumask](http://0xax.gitbooks.io/linux-insides/content/Concepts/cpumask.html) +* [percpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) +* [spinlock](https://en.wikipedia.org/wiki/Spinlock) +* [critical section](https://en.wikipedia.org/wiki/Critical_section) +* [Lock validator](https://lwn.net/Articles/185666/) +* [MSI](https://en.wikipedia.org/wiki/Message_Signaled_Interrupts) +* [I/O APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller) +* [Local APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#Integrated_local_APICs) +* [Intel 8259](https://en.wikipedia.org/wiki/Intel_8259) +* [PIC](https://en.wikipedia.org/wiki/Programmable_Interrupt_Controller) +* [MultiProcessor Configuration Table](https://en.wikipedia.org/wiki/MultiProcessor_Specification) +* [radix tree](http://0xax.gitbooks.io/linux-insides/content/DataStructures/radix-tree.html) +* [dmesg](https://en.wikipedia.org/wiki/Dmesg) From 8cbd9e148c27e13a690392f73f56549b83a9f807 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 12 Jul 2015 20:25:13 +0600 Subject: [PATCH 06/17] Update SUMMARY.md --- SUMMARY.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SUMMARY.md b/SUMMARY.md index 09f574b..ee80dfd 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -24,7 +24,7 @@ * [Initialization of non-early interrupt gates](interrupts/interrupts-4.md) * [Implementation of some exception handlers](interrupts/interrupts-5.md) * [Handling Non-Maskable interrupts](interrupts/interrupts-6.md) - * [External hardware interrupts]() 
+ * [Dive into external hardware interrupts](interrupts/interrupts-7.md) * [Memory management](mm/README.md) * [Memblock](mm/linux-mm-1.md) * [Fixmaps and ioremap](mm/linux-mm-2.md) From f1e4e86d29e72af82098894c8cf0d38a6f0775ed Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 12 Jul 2015 20:25:59 +0600 Subject: [PATCH 07/17] Update README.md --- interrupts/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/interrupts/README.md b/interrupts/README.md index 7dfa3d8..19d57eb 100644 --- a/interrupts/README.md +++ b/interrupts/README.md @@ -8,3 +8,4 @@ You will find a couple of posts which describes an interrupts and an exceptions * [Interrupt handlers](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-4.md) - fourth part describes first non-early interrupt handlers. * [Implementation of exception handlers](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-5.md) - descripbes implementation of some exception handlers as double fault, divide by zero and etc. * [Handling Non-Maskable interrupts](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-6.md) - describes handling of non-maskable interrupts and the rest of interrupts handlers from the architecture-specific part. +* [Dive into external hardware interrupts](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-7.md) - this part describes early initialization of code which is related to handling of external hardware interrupts. 
From f935203ffddec65062874fcb0e35b3008609b35b Mon Sep 17 00:00:00 2001 From: Jean-Baptiste Cayrou Date: Sun, 12 Jul 2015 22:14:06 +0200 Subject: [PATCH 08/17] Fix linux-bootstrap-1.md incorrect merging Correct merging of commit da38b6038d2026b8d8b0de5dab0bdf0a16b19610 line 253 --- Booting/linux-bootstrap-1.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/Booting/linux-bootstrap-1.md b/Booting/linux-bootstrap-1.md index 42ad15d..2ccbad8 100644 --- a/Booting/linux-bootstrap-1.md +++ b/Booting/linux-bootstrap-1.md @@ -252,10 +252,6 @@ Start of Kernel Setup Finally we are in the kernel. Technically kernel didn't run yet, first of all we need to setup kernel, memory manager, process manager etc. Kernel setup execution starts from [arch/x86/boot/header.S](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S) at the [_start](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S#L293). It is a little strange at the first look, there are many instructions before it. -======= - -Finally we are in the kernel. Technically kernel didn't run yet, first of all we need to setup kernel, memory manager, process manager, etc. Kernel setup execution starts from [arch/x86/boot/header.S](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S) at the [_start](https://github.com/torvalds/linux/blob/master/arch/x86/boot/header.S#L293). It is little strange at the first look, there are many instructions before it. Actually.... 
- Actually Long time ago Linux kernel had its own bootloader, but now if you run for example: ``` From e8a12b712a21e09299240f8f5b946fab15de8750 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Thu, 16 Jul 2015 20:07:53 +0600 Subject: [PATCH 09/17] Update SUMMARY.md --- SUMMARY.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SUMMARY.md b/SUMMARY.md index ee80dfd..bfd1c2a 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -25,6 +25,7 @@ * [Implementation of some exception handlers](interrupts/interrupts-5.md) * [Handling Non-Maskable interrupts](interrupts/interrupts-6.md) * [Dive into external hardware interrupts](interrupts/interrupts-7.md) + * [Initialization of external hardware interrupts structures]() * [Memory management](mm/README.md) * [Memblock](mm/linux-mm-1.md) * [Fixmaps and ioremap](mm/linux-mm-2.md) From 2b8dfaded3b63534e2227306e0a605d512ab40e2 Mon Sep 17 00:00:00 2001 From: Diogo Kersting Date: Thu, 16 Jul 2015 18:23:46 -0300 Subject: [PATCH 10/17] Fix a few english typos - "can be runned not" --- Initialization/linux-initialization-1.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Initialization/linux-initialization-1.md b/Initialization/linux-initialization-1.md index 29942b6..c8eba9c 100644 --- a/Initialization/linux-initialization-1.md +++ b/Initialization/linux-initialization-1.md @@ -72,7 +72,7 @@ Now we know default physical and virtual addresses of the `startup_64` routine, subq $_text - __START_KERNEL_map, %rbp ``` -Here we just put the `rip-relative` address to the `rbp` register and than subtract `$_text - __START_KERNEL_map` from it. We know that compiled address of the `_text` is `0xffffffff81000000` and `__START_KERNEL_map` contains `0xffffffff81000000`, so `rbp` will contain physical address of the `text` - `0x1000000` after this calculation. We need to calculate it because kernel can be runned not on the default address, but now we know actual physical address. 
+Here we just put the `rip-relative` address to the `rbp` register and then subtract `$_text - __START_KERNEL_map` from it. We know that compiled address of the `_text` is `0xffffffff81000000` and `__START_KERNEL_map` contains `0xffffffff81000000`, so `rbp` will contain physical address of the `text` - `0x1000000` after this calculation. We need to calculate it because kernel can't be runned on the default address, but now we know the actual physical address. In the next step we checks that this address is aligned with: @@ -122,7 +122,7 @@ The first step before we started to setup identity paging, need to correct follo addq %rbp, level2_fixmap_pgt + (506*8)(%rip) ``` -Here we need to correct `early_level4_pgt` and other addresses of the page table directories, because as I wrote above, kernel can be runned not at the default `0x1000000` address. `rbp` register contains actuall address so we add to the `early_level4_pgt`, `level3_kernel_pgt` and `level2_fixmap_pgt`. Let's try to understand what this labels means. First of all let's look on their definition: +Here we need to correct `early_level4_pgt` and other addresses of the page table directories, because as I wrote above, kernel can't be runned at the default `0x1000000` address. `rbp` register contains actual address so we add to the `early_level4_pgt`, `level3_kernel_pgt` and `level2_fixmap_pgt`. Let's try to understand what these labels means. 
First of all let's look on their definition: ```assembly NEXT_PAGE(early_level4_pgt) From bfe6fc596aca25cb5b64330adace87bfc78803ac Mon Sep 17 00:00:00 2001 From: Diogo Kersting Date: Fri, 17 Jul 2015 09:33:14 -0300 Subject: [PATCH 11/17] Some english fixes - "runned" --- Initialization/linux-initialization-1.md | 6 +++--- Initialization/linux-initialization-10.md | 6 +++--- Initialization/linux-initialization-8.md | 2 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Initialization/linux-initialization-1.md b/Initialization/linux-initialization-1.md index c8eba9c..c93b246 100644 --- a/Initialization/linux-initialization-1.md +++ b/Initialization/linux-initialization-1.md @@ -72,7 +72,7 @@ Now we know default physical and virtual addresses of the `startup_64` routine, subq $_text - __START_KERNEL_map, %rbp ``` -Here we just put the `rip-relative` address to the `rbp` register and then subtract `$_text - __START_KERNEL_map` from it. We know that compiled address of the `_text` is `0xffffffff81000000` and `__START_KERNEL_map` contains `0xffffffff81000000`, so `rbp` will contain physical address of the `text` - `0x1000000` after this calculation. We need to calculate it because kernel can't be runned on the default address, but now we know the actual physical address. +Here we just put the `rip-relative` address to the `rbp` register and then subtract `$_text - __START_KERNEL_map` from it. We know that compiled address of the `_text` is `0xffffffff81000000` and `__START_KERNEL_map` contains `0xffffffff81000000`, so `rbp` will contain physical address of the `text` - `0x1000000` after this calculation. We need to calculate it because kernel can't be run on the default address, but now we know the actual physical address. 
In the next step we checks that this address is aligned with: @@ -122,7 +122,7 @@ The first step before we started to setup identity paging, need to correct follo addq %rbp, level2_fixmap_pgt + (506*8)(%rip) ``` -Here we need to correct `early_level4_pgt` and other addresses of the page table directories, because as I wrote above, kernel can't be runned at the default `0x1000000` address. `rbp` register contains actual address so we add to the `early_level4_pgt`, `level3_kernel_pgt` and `level2_fixmap_pgt`. Let's try to understand what these labels means. First of all let's look on their definition: +Here we need to correct `early_level4_pgt` and other addresses of the page table directories, because as I wrote above, kernel can't be run at the default `0x1000000` address. `rbp` register contains actual address so we add to the `early_level4_pgt`, `level3_kernel_pgt` and `level2_fixmap_pgt`. Let's try to understand what these labels means. First of all let's look on their definition: ```assembly NEXT_PAGE(early_level4_pgt) @@ -385,7 +385,7 @@ INIT_PER_CPU(gdt_page); As we got `init_per_cpu__gdt_page` in `INIT_PER_CPU_VAR` and `INIT_PER_CPU` macro from linker script will be expanded we will get offset from the `__per_cpu_load`. After this calculations, we will have correct base address of the new GDT. -Generally per-CPU variables is a 2.6 kernel feature. You can understand what is it from it's name. When we create `per-CPU` variable, each CPU will have will have it's own copy of this variable. Here we creating `gdt_page` per-CPU variable. There are many advantages for variables of this type, like there are no locks, because each CPU works with it's own copy of variable and etc... So every core on multiprocessor will have it's own `GDT` table and every entry in the table will represent a memory segment which can be accessed from the thread which runned on the core. 
You can read in details about `per-CPU` variables in the [Theory/per-cpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) post. +Generally per-CPU variables is a 2.6 kernel feature. You can understand what is it from it's name. When we create `per-CPU` variable, each CPU will have will have it's own copy of this variable. Here we creating `gdt_page` per-CPU variable. There are many advantages for variables of this type, like there are no locks, because each CPU works with it's own copy of variable and etc... So every core on multiprocessor will have it's own `GDT` table and every entry in the table will represent a memory segment which can be accessed from the thread which ran on the core. You can read in details about `per-CPU` variables in the [Theory/per-cpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) post. As we loaded new Global Descriptor Table, we reload segments as we did it every time: diff --git a/Initialization/linux-initialization-10.md b/Initialization/linux-initialization-10.md index 3424d62..f59f507 100644 --- a/Initialization/linux-initialization-10.md +++ b/Initialization/linux-initialization-10.md @@ -230,9 +230,9 @@ and a couple of directories depends on the different configuration options: In the end of the `proc_root_init` we call the `proc_sys_init` function which creates `/proc/sys` directory and initializes the [Sysctl](http://en.wikipedia.org/wiki/Sysctl). -It is the end of `start_kernel` function. I did not describe all functions which are called in the `start_kernel`. I missed it, because they are not so improtant for the generic kernel initialization stuff and depend on only different kernel configurations. 
They are `taskstats_init_early` which exports per-task statistic to the user-space, `delayacct_init` - initializes per-task delay accounting, `key_init` and `security_init` initialize diferent security stuff, `check_bugs` - makes fix up of the some architecture-dependent bugs, `ftrace_init` function executes initialization of the [ftrace](https://www.kernel.org/doc/Documentation/trace/ftrace.txt), `cgroup_init` makes initialization of the rest of the [cgroup](http://en.wikipedia.org/wiki/Cgroups) subsystem and etc... Many of these parts and subsystems will be described in the other chapters. +It is the end of `start_kernel` function. I did not describe all functions which are called in the `start_kernel`. I missed it, because they are not so important for the generic kernel initialization stuff and depend on only different kernel configurations. They are `taskstats_init_early` which exports per-task statistic to the user-space, `delayacct_init` - initializes per-task delay accounting, `key_init` and `security_init` initialize diferent security stuff, `check_bugs` - makes fix up of the some architecture-dependent bugs, `ftrace_init` function executes initialization of the [ftrace](https://www.kernel.org/doc/Documentation/trace/ftrace.txt), `cgroup_init` makes initialization of the rest of the [cgroup](http://en.wikipedia.org/wiki/Cgroups) subsystem and etc... Many of these parts and subsystems will be described in the other chapters. -That's all. Finally we passed through the long-long `start_kernel` function. But it is not the end of the linux kernel initialization process. We have no runned first process yet. In the end of the `start_kernel` we can see the last call of the - `rest_init` function. Let's go ahead. +That's all. Finally we passed through the long-long `start_kernel` function. But it is not the end of the linux kernel initialization process. We haven't run the first process yet. 
In the end of the `start_kernel` we can see the last call of the - `rest_init` function. Let's go ahead. First steps after the start_kernel -------------------------------------------------------------------------------- @@ -314,7 +314,7 @@ void init_idle_bootup_task(struct task_struct *idle) } ``` -where `idle` class is a low priority tasks and tasks can be runned only when the processor has not to run anything besides this tasks. The second function `schedule_preempt_disabled` disables preempt in `idle` tasks. And the third function `cpu_startup_entry` defined in the [kernel/sched/idle.c](https://github.com/torvalds/linux/blob/master/sched/idle.c) and calls `cpu_idle_loop` from the [kernel/sched/idle.c](https://github.com/torvalds/linux/blob/master/sched/idle.c). The `cpu_idle_loop` function works as process with `PID = 0` and works in the background. Main purpose of the `cpu_idle_loop` is usage of the idle CPU cycles. When there are no one process to run, this process starts to work. We have one process with `idle` scheduling class (we just set the `current` task to the `idle` with the call of the `init_idle_bootup_task` function), so the `idle` thread does not do useful work and checks that there is not active task to switch: +where `idle` class is a low priority tasks and tasks can be run only when the processor doesn't have to run anything besides this tasks. The second function `schedule_preempt_disabled` disables preempt in `idle` tasks. And the third function `cpu_startup_entry` defined in the [kernel/sched/idle.c](https://github.com/torvalds/linux/blob/master/sched/idle.c) and calls `cpu_idle_loop` from the [kernel/sched/idle.c](https://github.com/torvalds/linux/blob/master/sched/idle.c). The `cpu_idle_loop` function works as process with `PID = 0` and works in the background. Main purpose of the `cpu_idle_loop` is usage of the idle CPU cycles. When there are no one process to run, this process starts to work. 
We have one process with `idle` scheduling class (we just set the `current` task to the `idle` with the call of the `init_idle_bootup_task` function), so the `idle` thread does not do useful work and checks that there is not active task to switch: ```C static void cpu_idle_loop(void) diff --git a/Initialization/linux-initialization-8.md b/Initialization/linux-initialization-8.md index 0502360..0adf83c 100644 --- a/Initialization/linux-initialization-8.md +++ b/Initialization/linux-initialization-8.md @@ -441,7 +441,7 @@ init_idle(current, smp_processor_id()); calc_load_update = jiffies + LOAD_FREQ; ``` -So, the `init` process will be runned, when there will no other candidates (as it first process in the system). In the end we just set `scheduler_running` variable: +So, the `init` process will be run, when there will be no other candidates (as it is the first process in the system). In the end we just set `scheduler_running` variable: ```C scheduler_running = 1; From e142ad5d78ce5a2e40f99b09d0801ae1887d4448 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 19 Jul 2015 20:10:46 +0600 Subject: [PATCH 12/17] Create interrupts-8.md --- interrupts/interrupts-8.md | 542 +++++++++++++++++++++++++++++++++++++ 1 file changed, 542 insertions(+) create mode 100644 interrupts/interrupts-8.md diff --git a/interrupts/interrupts-8.md b/interrupts/interrupts-8.md new file mode 100644 index 0000000..0614375 --- /dev/null +++ b/interrupts/interrupts-8.md @@ -0,0 +1,542 @@ +Interrupts and Interrupt Handling. Part 8. 
+================================================================================ + +Non-early initialization of the IRQs +-------------------------------------------------------------------------------- + +This is the eighth part of the Interrupts and Interrupt Handling in the Linux kernel [chapter](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) and in the previous [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-7.html) we started to dive into the external hardware [interrupts](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29). We looked on the implementation of the `early_irq_init` function from the [kernel/irq/irqdesc.c](https://github.com/torvalds/linux/blob/master/kernel/irq/irqdesc.c) source code file and saw the initialization of the `irq_desc` structure in this function. Remind that `irq_desc` structure (defined in the [include/linux/irqdesc.h](https://github.com/torvalds/linux/blob/master/include/linux/irqdesc.h#L46) is the foundation of interrupt management code in the Linux kernel and represents an interrupt descriptor. In this part we will continue to dive into the initialization stuff which is related to the external hardware interrupts. + +Right after the call of the `early_irq_init` function in the [init/main.c](https://github.com/torvalds/linux/blob/master/init/main.c) we can see the call of the `init_IRQ` function. This function is architecture-specfic and defined in the [arch/x86/kernel/irqinit.c](https://github.com/torvalds/linux/blob/master/kernel/irqinit.c). The `init_IRQ` function makes initialization of the `vector_irq` [percpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) variable that defined in the same [arch/x86/kernel/irqinit.c](https://github.com/torvalds/linux/blob/master/kernel/irqinit.c) source code file: + +```C +... +DEFINE_PER_CPU(vector_irq_t, vector_irq) = { + [0 ... NR_VECTORS - 1] = -1, +}; +... 
+``` + +and represents `percpu` array of the interrupt vector numbers. The `vector_irq_t` defined in the [arch/x86/include/asm/hw_irq.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/hw_irq.h) and expands to the: + +```C +typedef int vector_irq_t[NR_VECTORS]; +``` + +where `NR_VECTORS` is count of the vector number and as you can remember from the first [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-1.html) of this chapter it is `256` for the [x86_64](https://en.wikipedia.org/wiki/X86-64): + +```C +#define NR_VECTORS 256 +``` + +So, in the start of the `init_IRQ` function we fill the `vecto_irq` [percpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html) array with the vector number of the `legacy` interrupts: + +```C +void __init init_IRQ(void) +{ + int i; + + for (i = 0; i < nr_legacy_irqs(); i++) + per_cpu(vector_irq, 0)[IRQ0_VECTOR + i] = i; +... +... +... +} +``` + +This `vector_irq` will be used during the first steps of an external hardware interrupt handling in the `do_IRQ` function from the [arch/x86/kernel/irq.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/irq.c): + +```C +__visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs) +{ + ... + ... + ... + irq = __this_cpu_read(vector_irq[vector]); + + if (!handle_irq(irq, regs)) { + ... + ... + ... + } + + exiting_irq(); + ... + ... + return 1; +} +``` + +Why is `legacy` here? Actuall all interrupts handled by the modern [IO-APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#I.2FO_APICs) controller. But these interrupts (from `0x30` to `0x3f`) by legacy interrupt-controllers like [Programmable Interrupt Controller](https://en.wikipedia.org/wiki/Programmable_Interrupt_Controller). If these interrupts are handled by the `I/O APIC` then this vector space will be freed and re-used. Let's look on this code closer. 
First of all the `nr_legacy_irqs` function is defined in the [arch/x86/include/asm/i8259.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/i8259.h) and just returns the `nr_legacy_irqs` field from the `legacy_pic` structure: + +```C +static inline int nr_legacy_irqs(void) +{ + return legacy_pic->nr_legacy_irqs; +} +``` + +This structure is defined in the same header file and represents a non-modern programmable interrupt controller: + +```C +struct legacy_pic { + int nr_legacy_irqs; + struct irq_chip *chip; + void (*mask)(unsigned int irq); + void (*unmask)(unsigned int irq); + void (*mask_all)(void); + void (*restore_mask)(void); + void (*init)(int auto_eoi); + int (*irq_pending)(unsigned int irq); + void (*make_irq)(unsigned int irq); +}; +``` + +Actually the default maximum number of the legacy interrupts is represented by the `NR_IRQS_LEGACY` macro from the [arch/x86/include/asm/irq_vectors.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/irq_vectors.h): + +```C +#define NR_IRQS_LEGACY 16 +``` + +In the loop we are accessing the `vector_irq` per-cpu array with the `per_cpu` macro by the `IRQ0_VECTOR + i` index and write the legacy IRQ number there. The `IRQ0_VECTOR` macro is defined in the [arch/x86/include/asm/irq_vectors.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/irq_vectors.h) header file and expands to the `0x30`: + +```C +#define FIRST_EXTERNAL_VECTOR 0x20 + +#define IRQ0_VECTOR ((FIRST_EXTERNAL_VECTOR + 16) & ~15) +``` + +Why is `0x30` here? You can remember from the first [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-1.html) of this chapter that first 32 vector numbers from `0` to `31` are reserved by the processor and used for the processing of architecture-defined exceptions and interrupts. Vector numbers from `0x30` to `0x3f` are reserved for the [ISA](https://en.wikipedia.org/wiki/Industry_Standard_Architecture).
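To make the arithmetic behind `IRQ0_VECTOR` concrete, here is a small user-space sketch (it is not kernel code). It reuses the macro definitions quoted above; the plain `vector_irq` array and the `fill_legacy_vectors` helper are hypothetical stand-ins for the per-cpu array and for the loop at the start of `init_IRQ`:

```c
#include <assert.h>

#define NR_VECTORS            256
#define NR_IRQS_LEGACY        16
#define FIRST_EXTERNAL_VECTOR 0x20
/* Same expression as quoted above: 0x20 + 16 rounded down to a multiple of 16. */
#define IRQ0_VECTOR           ((FIRST_EXTERNAL_VECTOR + 16) & ~15)

/* Hypothetical stand-in for the per-cpu vector_irq array of a single CPU. */
static int vector_irq[NR_VECTORS];

/*
 * Mirrors the idea of the start of init_IRQ(): every slot starts as -1
 * (no IRQ assigned), then legacy IRQs 0..nr_legacy-1 are written into
 * consecutive slots starting at IRQ0_VECTOR.
 */
static void fill_legacy_vectors(int nr_legacy)
{
	int i;

	assert(nr_legacy >= 0 && nr_legacy <= NR_IRQS_LEGACY);

	for (i = 0; i < NR_VECTORS; i++)
		vector_irq[i] = -1;

	for (i = 0; i < nr_legacy; i++)
		vector_irq[IRQ0_VECTOR + i] = i;
}
```

With these definitions `IRQ0_VECTOR` evaluates to `0x30`, so after `fill_legacy_vectors(NR_IRQS_LEGACY)` the slots `0x30` to `0x3f` hold the IRQ numbers `0` to `15` and every other slot stays `-1`.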
So, it means that we fill the `vector_irq` array starting at the `IRQ0_VECTOR`, which is equal to the `0x30` (`48`), and ending just before the `IRQ0_VECTOR + 16` (`0x40`), i.e. the vectors from `0x30` to `0x3f`. + +In the end of the `init_IRQ` function we can see the call of the following function: + +```C +x86_init.irqs.intr_init(); +``` + +from the [arch/x86/kernel/x86_init.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/x86_init.c) source code file. If you have read [chapter](http://0xax.gitbooks.io/linux-insides/content/Initialization/index.html) about the Linux kernel initialization process, you can remember the `x86_init` structure. This structure contains a couple of fields which point to the functions related to the platform setup (`x86_64` in our case), for example `resources` - related with the memory resources, `mpparse` - related with the parsing of the [MultiProcessor Configuration Table](https://en.wikipedia.org/wiki/MultiProcessor_Specification) and etc. As we can see the `x86_init` also contains the `irqs` field which contains the three following fields: + +```C +struct x86_init_ops x86_init __initdata = +{ + ... + ... + ... + .irqs = { + .pre_vector_init = init_ISA_irqs, + .intr_init = native_init_IRQ, + .trap_init = x86_init_noop, + }, + ... + ... + ... +} +``` + +Now, we are interested in the `native_init_IRQ`. As we can note, the name of the `native_init_IRQ` function contains the `native_` prefix which means that this function is architecture-specific. It is defined in the [arch/x86/kernel/irqinit.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/irqinit.c) and executes general initialization of the [Local APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#Integrated_local_APICs) and initialization of the [ISA](https://en.wikipedia.org/wiki/Industry_Standard_Architecture) irqs. Let's look on the implementation of the `native_init_IRQ` function and try to understand what occurs there.
The `native_init_IRQ` function starts from the execution of the following function: + +```C +x86_init.irqs.pre_vector_init(); +``` + +As we can see above, the `pre_vector_init` points to the `init_ISA_irqs` function that defined in the same [source code](https://github.com/torvalds/linux/blob/master/kernel/irqinit.c) file and as we can understand from the function's name, it makes initialization of the `ISA` related interrupts. The `init_ISA_irqs` function starts from the definition of the `chip` variable which has a `irq_chip` type: + +```C +void __init init_ISA_irqs(void) +{ + struct irq_chip *chip = legacy_pic->chip; + ... + ... + ... +``` + +The `irq_chip` structure defined in the [include/linux/irq.h](https://github.com/torvalds/linux/blob/master/include/linux/irq.h) header file and represents hardware interrupt chip descriptor. It contains: + +* `name` - name of a device. Used in the `/proc/interrupts`: + +```C +$ cat /proc/interrupts + CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 + 0: 16 0 0 0 0 0 0 0 IO-APIC 2-edge timer + 1: 2 0 0 0 0 0 0 0 IO-APIC 1-edge i8042 + 8: 1 0 0 0 0 0 0 0 IO-APIC 8-edge rtc0 +``` + +look on the last columnt; + +* `(*irq_mask)(struct irq_data *data)` - mask an interrupt source; +* `(*irq_ack)(struct irq_data *data)` - start of a new interrupt; +* `(*irq_startup)(struct irq_data *data)` - start up the interrupt; +* `(*irq_shutdown)(struct irq_data *data)` - shutdown the interrupt +* and etc. + +fields. Note that the `irq_data` structure represents set of the per irq chip data passed down to chip functions. It contains `mask` - precomputed bitmask for accessing the chip registers, `irq` - interrupt number, `hwirq` - hardware interrupt number, local to the interrupt domain chip low level interrupt hardware access and etc. 
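The callback style of `irq_chip` can be illustrated with a minimal user-space sketch. Everything prefixed with `demo_` is hypothetical and heavily trimmed; only the callback names `irq_mask` and `irq_unmask` and the `irq` and `hwirq` fields are borrowed from the kernel structures (`irq_unmask` is a real `irq_chip` callback that is not in the list above). The imaginary controller has one mask bit per line:

```c
#include <assert.h>

/* Hypothetical, trimmed stand-ins for struct irq_data and struct irq_chip. */
struct demo_irq_data {
	unsigned int  irq;    /* Linux interrupt number */
	unsigned long hwirq;  /* line number local to the chip */
};

struct demo_irq_chip {
	const char *name;
	void (*irq_mask)(struct demo_irq_data *data);
	void (*irq_unmask)(struct demo_irq_data *data);
};

/* Imaginary 16-line controller: one bit per line, 1 means masked. */
static unsigned int demo_mask_reg;

static void demo_mask(struct demo_irq_data *data)
{
	assert(data->hwirq < 16);
	demo_mask_reg |= 1u << data->hwirq;
}

static void demo_unmask(struct demo_irq_data *data)
{
	assert(data->hwirq < 16);
	demo_mask_reg &= ~(1u << data->hwirq);
}

static struct demo_irq_chip demo_chip = {
	.name       = "DEMO-PIC",
	.irq_mask   = demo_mask,
	.irq_unmask = demo_unmask,
};

/* Generic code never touches the register itself; it goes through the chip. */
static void demo_disable_line(struct demo_irq_chip *chip,
			      struct demo_irq_data *data)
{
	chip->irq_mask(data);
}
```

The point of the design is that the caller of `demo_disable_line` does not need to know which controller sits behind the `chip` pointer; swapping the controller means swapping the table of callbacks.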
+ +After this depends on the `CONFIG_X86_64` and `CONFIG_X86_LOCAL_APIC` kernel configuration option call the `init_bsp_APIC` function from the [arch/x86/kernel/apic/apic.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/apic/apic.c): + +```C +#if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC) + init_bsp_APIC(); +#endif +``` + +This function makes initialization of the [APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller) of `bootstrap processor` (or processor which starts first). It starts from the check that we found [SMP](https://en.wikipedia.org/wiki/Symmetric_multiprocessing) config (read more about it in the sixth [part](http://0xax.gitbooks.io/linux-insides/content/Initialization/linux-initialization-6.html) of the Linux kernel initialization process chapter) and the processor has `APIC`: + +```C +if (smp_found_config || !cpu_has_apic) + return; +``` + +In other way we return from this function. In the next step we call the `clear_local_APIC` function from the same source code file that shutdowns the local `APIC` (more about it will be in the chapter about the `Advanced Programmable Interrupt Controller`) and enable `APIC` of the first processor by the setting `unsigned int value` to the `APIC_SPIV_APIC_ENABLED`: + +```C +value = apic_read(APIC_SPIV); +value &= ~APIC_VECTOR_MASK; +value |= APIC_SPIV_APIC_ENABLED; +``` + +and writing it with the help of the `apic_write` function: + +```C +apic_write(APIC_SPIV, value); +``` + +After we have enabled `APIC` for the bootstrap processor, we return to the `init_ISA_irqs` function and in the next step we initalize legacy `Programmable Interrupt Controller` and set the legacy chip and handler for the each legacy irq: + +```C +legacy_pic->init(0); + +for (i = 0; i < nr_legacy_irqs(); i++) + irq_set_chip_and_handler(i, chip, handle_level_irq); +``` + +Where can we find `init` function? 
The `legacy_pic` is defined in the [arch/x86/kernel/i8259.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/i8259.c) and it is: + +```C +struct legacy_pic *legacy_pic = &default_legacy_pic; +``` + +Where the `default_legacy_pic` is: + +```C +struct legacy_pic default_legacy_pic = { + ... + ... + ... + .init = init_8259A, + ... + ... + ... +} +``` + +The `init_8259A` function is defined in the same source code file and executes initialization of the [Intel 8259](https://en.wikipedia.org/wiki/Intel_8259) `Programmable Interrupt Controller` (more about it will be in the separate chapter about `Programmable Interrupt Controllers` and `APIC`). + +Now we can return to the `native_init_IRQ` function, after the `init_ISA_irqs` function finished its work. The next step is the call of the `apic_intr_init` function that allocates special interrupt gates which are used by the [SMP](https://en.wikipedia.org/wiki/Symmetric_multiprocessing) architecture for the [Inter-processor interrupt](https://en.wikipedia.org/wiki/Inter-processor_interrupt). The `alloc_intr_gate` macro from the [arch/x86/include/asm/desc.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/desc.h) is used for the interrupt descriptor allocation: + +```C +#define alloc_intr_gate(n, addr) \ +do { \ + alloc_system_vector(n); \ + set_intr_gate(n, addr); \ +} while (0) +``` + +As we can see, first of all it expands to the call of the `alloc_system_vector` function that checks the given vector number in the `used_vectors` bitmap (read previous [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-7.html) about it) and if it is not set in the `used_vectors` bitmap we set it.
After this we test that the `first_system_vector` is greater than given interrupt vector number and if it is greater we assign it: + +```C +if (!test_bit(vector, used_vectors)) { + set_bit(vector, used_vectors); + if (first_system_vector > vector) + first_system_vector = vector; +} else { + BUG(); +} +``` + +We already saw the `set_bit` macro, now let's look on the `test_bit` and the `first_system_vector`. The first `test_bit` macro is defined in the [arch/x86/include/asm/bitops.h](https://github.com/torvalds/linux/blob/master/arch/x86/include/asm/bitops.h) and looks like this: + +```C +#define test_bit(nr, addr) \ + (__builtin_constant_p((nr)) \ + ? constant_test_bit((nr), (addr)) \ + : variable_test_bit((nr), (addr))) +``` + +We can see the [ternary operator](https://en.wikipedia.org/wiki/Ternary_operation) here, which uses the [gcc](https://en.wikipedia.org/wiki/GNU_Compiler_Collection) built-in function `__builtin_constant_p` to test whether the given vector number (`nr`) is known at compile time. If the `__builtin_constant_p` is not clear to you, we can make a simple test: + +```C +#include <stdio.h> + +#define PREDEFINED_VAL 1 + +int main() { + int i = 5; + printf("__builtin_constant_p(i) is %d\n", __builtin_constant_p(i)); + printf("__builtin_constant_p(PREDEFINED_VAL) is %d\n", __builtin_constant_p(PREDEFINED_VAL)); + printf("__builtin_constant_p(100) is %d\n", __builtin_constant_p(100)); + + return 0; +} +``` + +and look on the result: + +``` +$ gcc test.c -o test +$ ./test +__builtin_constant_p(i) is 0 +__builtin_constant_p(PREDEFINED_VAL) is 1 +__builtin_constant_p(100) is 1 +``` + +Now I think it must be clear for you. Let's get back to the `test_bit` macro.
If the `__builtin_constant_p` returns non-zero, we call the `constant_test_bit` function: + +```C +static inline int constant_test_bit(int nr, const void *addr) +{ + const u32 *p = (const u32 *)addr; + + return ((1UL << (nr & 31)) & (p[nr >> 5])) != 0; +} +``` + +and the `variable_test_bit` otherwise: + +```C +static inline int variable_test_bit(int nr, const void *addr) +{ + u8 v; + const u32 *p = (const u32 *)addr; + + asm("btl %2,%1; setc %0" : "=qm" (v) : "m" (*p), "Ir" (nr)); + return v; +} +``` + +What's the difference between these two functions and why do we need two different functions for the same purpose? As you already can guess the main purpose is optimization. If we write a simple example with these functions: + +```C +#define CONST 25 + +int main() { + int nr = 24; + variable_test_bit(nr, (int*)0x10000000); + constant_test_bit(CONST, (int*)0x10000000); + return 0; +} +``` + +and look on the assembly output of our example we will see the following assembly code: + +```assembly +pushq %rbp +movq %rsp, %rbp + +movl $268435456, %esi +movl $25, %edi +call constant_test_bit +``` + +for the `constant_test_bit`, and: + +```assembly +pushq %rbp +movq %rsp, %rbp + +subq $16, %rsp +movl $24, -4(%rbp) +movl -4(%rbp), %eax +movl $268435456, %esi +movl %eax, %edi +call variable_test_bit +``` + +for the `variable_test_bit`. These two code listings start with the same part, first of all we save base of the current stack frame in the `%rbp` register. But after this the code for both examples is different. In the first example we put `$268435456` (here the `$268435456` is our second parameter - `0x10000000`) to the `esi` and `$25` (our first parameter) to the `edi` register and call `constant_test_bit`. We put function parameters to the `esi` and `edi` registers because as we are learning Linux kernel for the `x86_64` architecture we use `System V AMD64 ABI` [calling convention](https://en.wikipedia.org/wiki/X86_calling_conventions). All is pretty simple.
When we are using a predefined constant, the compiler can just substitute its value. Now let's look on the second part. As you can see here, the compiler can not substitute value from the `nr` variable. In this case compiler must calculate its offset on the program's [stack frame](https://en.wikipedia.org/wiki/Call_stack). We subtract `16` from the `rsp` register to allocate stack for the local variables data and put the `$24` (value of the `nr` variable) to the stack at offset `-4` from the `rbp`. Our stack frame will be like this: + +``` + <- stack grows + + %[rbp] + | ++----------+ +---------+ +---------+ +--------+ +| | | | | return | | | +| nr |-| |-| |-| argc | +| | | | | address | | | ++----------+ +---------+ +---------+ +--------+ + | + %[rsp] +``` + +After this we put this value to the `eax`, so `eax` register now contains value of the `nr`. In the end we do the same that in the first example, we put the `$268435456` (the second parameter of the `variable_test_bit` function) to the `esi` register and the value of the `eax` (value of `nr`, the first parameter of the `variable_test_bit` function) to the `edi` register. + +The next step after the `apic_intr_init` function finishes its work is the setting of interrupt gates from the `FIRST_EXTERNAL_VECTOR` or `0x20` up to the `first_system_vector` (which is `NR_VECTORS`, or `256`, when there is no local APIC support): + +```C +i = FIRST_EXTERNAL_VECTOR; + +#ifndef CONFIG_X86_LOCAL_APIC +#define first_system_vector NR_VECTORS +#endif + +for_each_clear_bit_from(i, used_vectors, first_system_vector) { + set_intr_gate(i, irq_entries_start + 8 * (i - FIRST_EXTERNAL_VECTOR)); +} +``` + +But as we are using the `for_each_clear_bit_from` helper, we set only non-initialized interrupt gates.
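The effect of `for_each_clear_bit_from` can be illustrated with a minimal user-space sketch. This is not the kernel's implementation; `bitmap_test`, `bitmap_set` and `collect_clear_from` are hypothetical helpers. The last one walks a bitmap from a starting index and reports only the bits that are still clear, which are exactly the vectors for which the loop above installs a gate:

```c
#include <assert.h>
#include <limits.h>

#define NR_VECTORS    256
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((NR_VECTORS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Returns 1 if bit nr is set in the bitmap, 0 otherwise. */
static int bitmap_test(const unsigned long *map, unsigned int nr)
{
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* Sets bit nr in the bitmap. */
static void bitmap_set(unsigned long *map, unsigned int nr)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/*
 * Stores every clear bit in [from, end) into out[] and returns how many
 * were found. The bitmap itself is not modified, just like the kernel
 * loop only reads used_vectors while installing gates.
 */
static unsigned int collect_clear_from(const unsigned long *map,
				       unsigned int from, unsigned int end,
				       unsigned int *out)
{
	unsigned int i, found = 0;

	assert(end <= NR_VECTORS);

	for (i = from; i < end; i++) {
		if (bitmap_test(map, i))
			continue;	/* already initialized, skip it */
		out[found++] = i;	/* a gate would be installed here */
	}

	return found;
}
```

If, for example, bits `0` to `31` and one additional bit are set in the bitmap, a walk from `0x20` to `NR_VECTORS` skips the additional bit and reports the remaining `223` vectors.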
After this we use the same `for_each_clear_bit_from` helper to fill the non-filled interrupt gates in the interrupt table with the `spurious_interrupt`: + +```C +#ifdef CONFIG_X86_LOCAL_APIC +for_each_clear_bit_from(i, used_vectors, NR_VECTORS) + set_intr_gate(i, spurious_interrupt); +#endif +``` + +Where the `spurious_interrupt` function represents the interrupt handler for the `spurious` interrupt. Here the `used_vectors` is the bitmap that marks already initialized interrupt gates. We already filled first `32` interrupt vectors in the `trap_init` function from the [arch/x86/kernel/traps.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/traps.c) source code file: + +```C +for (i = 0; i < FIRST_EXTERNAL_VECTOR; i++) + set_bit(i, used_vectors); +``` + +You can remember how we did it in the sixth [part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-6.html) of this chapter. + +In the end of the `native_init_IRQ` function we can see the following check: + +```C +if (!acpi_ioapic && !of_ioapic && nr_legacy_irqs()) + setup_irq(2, &irq2); +``` + +First of all let's deal with the condition. The `acpi_ioapic` variable represents existence of [I/O APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#I.2FO_APICs). It is defined in the [arch/x86/kernel/acpi/boot.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/acpi/boot.c). This variable is set in the `acpi_set_irq_model_ioapic` function that is called during the processing of the `Multiple APIC Description Table`. This occurs during initialization of the architecture-specific stuff in the [arch/x86/kernel/setup.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/setup.c) (more about it we will know in the other chapter about [APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller)).
Note that the value of the `acpi_ioapic` variable depends on the `CONFIG_ACPI` and `CONFIG_X86_LOCAL_APIC` Linux kernel configuration options. If these options did not set, this variable will be just zero: + +```C +#define acpi_ioapic 0 +``` + +The second condition - `!of_ioapic && nr_legacy_irqs()` checks that we do not use [Open Firmware](https://en.wikipedia.org/wiki/Open_Firmware) `I/O APIC` and legacy interrupt controller. We already know about the `nr_legacy_irqs`. The second is `of_ioapic` variable defined in the [arch/x86/kernel/devicetree.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/devicetree.c) and initialized in the `dtb_ioapic_setup` function that build information about `APICs` in the [devicetree](https://en.wikipedia.org/wiki/Device_tree). Note that `of_ioapic` variable depends on the `CONFIG_OF` Linux kernel configuration opiotn. If this option is not set, the value of the `of_ioapic` will be zero too: + +```C +#ifdef CONFIG_OF +extern int of_ioapic; +... +... +... +#else +#define of_ioapic 0 +... +... +... +#endif +``` + +If the condition will return non-zero vaule we call the: + +```C +setup_irq(2, &irq2); +``` + +function. First of all about the `irq2`. The `irq2` is the `irqaction` structure that defined in the [arch/x86/kernel/irqinit.c](https://github.com/torvalds/linux/blob/master/arch/x86/kernel/irqinit.c) source code file and represents `IRQ 2` line that is used to query devices connected cascade: + +```C +static struct irqaction irq2 = { + .handler = no_action, + .name = "cascade", + .flags = IRQF_NO_THREAD, +}; +``` + +Some time ago interrupt controller consisted of two chips and one was connected to second. The second chip that was connected to the first chip via this `IRQ 2` line. This chip serviced lines from `8` to `15` and after after this lines of the first chip. 
So, for example [Intel 8259A](https://en.wikipedia.org/wiki/Intel_8259) has following lines: + +* `IRQ 0` - system timer; +* `IRQ 1` - keyboard; +* `IRQ 2` - used for devices which are cascade connected; +* `IRQ 8` - [RTC](https://en.wikipedia.org/wiki/Real-time_clock); +* `IRQ 9` - reserved; +* `IRQ 10` - reserved; +* `IRQ 11` - reserved; +* `IRQ 12` - `ps/2` mouse; +* `IRQ 13` - coprocessor; +* `IRQ 14` - hard drive controller; +* `IRQ 15` - reserved; +* `IRQ 3` - `COM2` and `COM4`; +* `IRQ 4` - `COM1` and `COM3`; +* `IRQ 5` - `LPT2`; +* `IRQ 6` - drive controller; +* `IRQ 7` - `LPT1`. + +The `setup_irq` function is defined in the [kernel/irq/manage.c](https://github.com/torvalds/linux/blob/master/kernel/irq/manage.c) and takes two parameters: + +* vector number of an interrupt; +* `irqaction` structure related with an interrupt. + +This function gets the interrupt descriptor for the given vector number at the beginning: + +```C +struct irq_desc *desc = irq_to_desc(irq); +``` + +And calls the `__setup_irq` function that sets up the given interrupt: + +```C +chip_bus_lock(desc); +retval = __setup_irq(irq, desc, act); +chip_bus_sync_unlock(desc); +return retval; +``` + +Note that the interrupt descriptor is locked while the `__setup_irq` function works. The `__setup_irq` function does many different things: It creates a handler thread when a thread function is supplied and the interrupt does not nest into another interrupt thread, sets the flags of the chip, fills the `irqaction` structure and many many more. + +In addition to all of the above it creates the `/proc/irq/vector_number` directory and fills it, but if you are using a modern computer all values will be zero there: + +``` +$ cat /proc/irq/2/node +0 + +$ cat /proc/irq/2/affinity_hint +00 + +$ cat /proc/irq/2/spurious +count 0 +unhandled 0 +last_unhandled 0 ms +``` + +because probably `APIC` handles interrupts on our machine. + +That's all.
+ +Conclusion +-------------------------------------------------------------------------------- + +It is the end of the eighth part of the [Interrupts and Interrupt Handling](http://0xax.gitbooks.io/linux-insides/content/interrupts/index.html) chapter and we continued to dive into external hardware interrupts in this part. In the previous part we started to do it and saw early initialization of the `IRQs`. In this part we saw non-early interrupts initialization in the `init_IRQ` function. We saw initialization of the `vector_irq` per-cpu array, which maps interrupt vector numbers to IRQ numbers and will be used during interrupt handling, and initialization of other stuff which is related to the external hardware interrupts. + +In the next part we will continue to learn interrupts handling related stuff and will see initialization of the `softirqs`. + +If you have any questions or suggestions write me a comment or ping me at [twitter](https://twitter.com/0xAX). + +**Please note that English is not my first language, and I am really sorry for any inconvenience.
If you find any mistakes, please send me a PR to [linux-internals](https://github.com/0xAX/linux-internals).**
+
+Links
+--------------------------------------------------------------------------------
+
+* [IRQ](https://en.wikipedia.org/wiki/Interrupt_request_%28PC_architecture%29)
+* [percpu](http://0xax.gitbooks.io/linux-insides/content/Concepts/per-cpu.html)
+* [x86_64](https://en.wikipedia.org/wiki/X86-64)
+* [Intel 8259](https://en.wikipedia.org/wiki/Intel_8259)
+* [Programmable Interrupt Controller](https://en.wikipedia.org/wiki/Programmable_Interrupt_Controller)
+* [ISA](https://en.wikipedia.org/wiki/Industry_Standard_Architecture)
+* [MultiProcessor Configuration Table](https://en.wikipedia.org/wiki/MultiProcessor_Specification)
+* [Local APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#Integrated_local_APICs)
+* [I/O APIC](https://en.wikipedia.org/wiki/Advanced_Programmable_Interrupt_Controller#I.2FO_APICs)
+* [SMP](https://en.wikipedia.org/wiki/Symmetric_multiprocessing)
+* [Inter-processor interrupt](https://en.wikipedia.org/wiki/Inter-processor_interrupt)
+* [ternary operator](https://en.wikipedia.org/wiki/Ternary_operation)
+* [gcc](https://en.wikipedia.org/wiki/GNU_Compiler_Collection)
+* [calling convention](https://en.wikipedia.org/wiki/X86_calling_conventions)
+* [PDF. 
System V Application Binary Interface AMD64](http://x86-64.org/documentation/abi.pdf) +* [Call stack](https://en.wikipedia.org/wiki/Call_stack) +* [Open Firmware](https://en.wikipedia.org/wiki/Open_Firmware) +* [devicetree](https://en.wikipedia.org/wiki/Device_tree) +* [RTC](https://en.wikipedia.org/wiki/Real-time_clock) +* [Previous part](http://0xax.gitbooks.io/linux-insides/content/interrupts/interrupts-7.html) From af25a2506371c03fea232565b4fae2116bcb2038 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 19 Jul 2015 20:11:31 +0600 Subject: [PATCH 13/17] Update README.md --- interrupts/README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/interrupts/README.md b/interrupts/README.md index 19d57eb..8382468 100644 --- a/interrupts/README.md +++ b/interrupts/README.md @@ -9,3 +9,4 @@ You will find a couple of posts which describes an interrupts and an exceptions * [Implementation of exception handlers](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-5.md) - descripbes implementation of some exception handlers as double fault, divide by zero and etc. * [Handling Non-Maskable interrupts](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-6.md) - describes handling of non-maskable interrupts and the rest of interrupts handlers from the architecture-specific part. * [Dive into external hardware interrupts](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-7.md) - this part describes early initialization of code which is related to handling of external hardware interrupts. +* [Non-early initialization of the IRQs](https://github.com/0xAX/linux-insides/blob/master/interrupts/interrupts-8.md) - this part describes non-early initialization of code which is related to handling of external hardware interrupts. 
From 32439b33f011fd32aef26e17359af58f31357892 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Sun, 19 Jul 2015 20:11:57 +0600 Subject: [PATCH 14/17] Update SUMMARY.md --- SUMMARY.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SUMMARY.md b/SUMMARY.md index bfd1c2a..58d9471 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -25,7 +25,7 @@ * [Implementation of some exception handlers](interrupts/interrupts-5.md) * [Handling Non-Maskable interrupts](interrupts/interrupts-6.md) * [Dive into external hardware interrupts](interrupts/interrupts-7.md) - * [Initialization of external hardware interrupts structures]() + * [Initialization of external hardware interrupts structures](interrupts/interrupts-8.md) * [Memory management](mm/README.md) * [Memblock](mm/linux-mm-1.md) * [Fixmaps and ioremap](mm/linux-mm-2.md) From 6e1c66f9f8bf8343cba3b71b16279a4cc51ad843 Mon Sep 17 00:00:00 2001 From: 0xAX Date: Mon, 20 Jul 2015 23:35:04 +0600 Subject: [PATCH 15/17] Update SUMMARY.md --- SUMMARY.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/SUMMARY.md b/SUMMARY.md index 58d9471..25166d4 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -45,7 +45,8 @@ * [Initial ram disk]() * [initrd]() * [Misc](Misc/README.md) - * [Kernel building and instalation]() + * [Kernel building and instalation]() + * [How kernel compiled]() * [Write and Submit your first Linux kernel Patch]() * [Data types in the kernel]() * [Useful links](LINKS.md) From 9a27e614eccf559de6615c30e72abe52ca3c3e4d Mon Sep 17 00:00:00 2001 From: 0xAX Date: Tue, 21 Jul 2015 22:48:40 +0600 Subject: [PATCH 16/17] Update contributors.md --- contributors.md | 1 + 1 file changed, 1 insertion(+) diff --git a/contributors.md b/contributors.md index d9f3af1..066d753 100644 --- a/contributors.md +++ b/contributors.md @@ -63,3 +63,4 @@ Thank you to all contributors: * [Adam Shannon](https://github.com/adamdecaf) * [Donny Nadolny](https://github.com/dnadolny) * [Ehsun N](https://github.com/imehsunn) +* [Waqar 
Ahmed](https://github.com/Waqar144) From b2134001ebaae60540651372bab8c6fd5a0cc55e Mon Sep 17 00:00:00 2001 From: 0xAX Date: Thu, 23 Jul 2015 23:39:50 +0600 Subject: [PATCH 17/17] Update SUMMARY.md --- SUMMARY.md | 1 - 1 file changed, 1 deletion(-) diff --git a/SUMMARY.md b/SUMMARY.md index 25166d4..29a2e75 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -45,7 +45,6 @@ * [Initial ram disk]() * [initrd]() * [Misc](Misc/README.md) - * [Kernel building and instalation]() * [How kernel compiled]() * [Write and Submit your first Linux kernel Patch]() * [Data types in the kernel]()