diff --git a/MM/linux-mm-1.md b/MM/linux-mm-1.md index b34ba96..1c645eb 100644 --- a/MM/linux-mm-1.md +++ b/MM/linux-mm-1.md @@ -171,7 +171,7 @@ First of all we get the end of the memory region with the: phys_addr_t end = base + memblock_cap_size(base, &size); ``` -`memblock_cap_size` adjusts `size` that `base + size` will not overflow. Its implementation is pretty easy: +`memblock_cap_size` adjusts `size` so that `base + size` will not overflow. Its implementation is pretty easy: ```C static inline phys_addr_t memblock_cap_size(phys_addr_t base, phys_addr_t *size) @@ -337,10 +337,10 @@ There is also `memblock_reserve` function which does the same as `memblock_add`, Of course this is not the full API. Memblock provides APIs not only for adding `memory` and `reserved` memory regions, but also: -* memblock_remove - removes memory region from memblock; -* memblock_find_in_range - finds free area in given range; -* memblock_free - releases memory region in memblock; -* for_each_mem_range - iterates through memblock areas. +* `memblock_remove` - removes memory region from memblock; +* `memblock_find_in_range` - finds free area in given range; +* `memblock_free` - releases memory region in memblock; +* `for_each_mem_range` - iterates through memblock areas. and many more.... @@ -349,8 +349,8 @@ Getting info about memory regions Memblock also provides an API for getting information about allocated memory regions in the `memblock`. It is split in two parts: -* get_allocated_memblock_memory_regions_info - getting info about memory regions; -* get_allocated_memblock_reserved_regions_info - getting info about reserved regions. +* `get_allocated_memblock_memory_regions_info` - getting info about memory regions; +* `get_allocated_memblock_reserved_regions_info` - getting info about reserved regions. Implementation of these functions is easy. Let's look at `get_allocated_memblock_reserved_regions_info` for example: @@ -401,9 +401,9 @@ And you will see something like this: Memblock also has support in [debugfs](http://en.wikipedia.org/wiki/Debugfs). If you run the kernel on another architecture than `X86` you can access: -* /sys/kernel/debug/memblock/memory -* /sys/kernel/debug/memblock/reserved -* /sys/kernel/debug/memblock/physmem +* `/sys/kernel/debug/memblock/memory` +* `/sys/kernel/debug/memblock/reserved` +* `/sys/kernel/debug/memblock/physmem` to get a dump of the `memblock` contents. diff --git a/MM/linux-mm-2.md b/MM/linux-mm-2.md index 2b2fca0..d31c20f 100644 --- a/MM/linux-mm-2.md +++ b/MM/linux-mm-2.md @@ -36,9 +36,9 @@ Base virtual address and size of the `fix-mapped` area are presented by the two #define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE) ``` -Here `__end_of_permanent_fixed_addresses` is an element of the `fixed_addresses` enum and as I wrote above: Every fix-mapped address is represented by an integer index which is defined in the `fixed_addresses`. `PAGE_SHIFT` determines the size of a page. For example size of the one page we can get with the `1 << PAGE_SHIFT` expression. +Here `__end_of_permanent_fixed_addresses` is an element of the `fixed_addresses` enum and as I wrote above, every fix-mapped address is represented by an integer index which is defined in the `fixed_addresses`. `PAGE_SHIFT` determines the size of a page. For example size of the one page we can get with the `1 << PAGE_SHIFT` expression. -In our case we need to get the size of the fix-mapped area, but not only of one page, that's why we are using `__end_of_permanent_fixed_addresses` for getting the size of the fix-mapped area. The `__end_of_permanent_fixed_addresses` is the last index of the `fixed_addresses` enum or in other words the `__end_of_permanent_fixed_addresses` contains amount of pages in a fixed-mapped area. So if multiply value of the `__end_of_permanent_fixed_addresses` on a page size value we will get size of fix-mapped area. In my case it's a little more than `536` kilobytes. In your case it might be a different number, because the size depends on amount of the fix-mapped addresses which are depends on your kernel's configuration. +In our case we need to get the size of the fix-mapped area, but not only of one page, that's why we are using `__end_of_permanent_fixed_addresses` for getting the size of the fix-mapped area. The `__end_of_permanent_fixed_addresses` is the last index of the `fixed_addresses` enum or in other words the `__end_of_permanent_fixed_addresses` contains amount of pages in a fixed-mapped area. So if we multiply the value of the `__end_of_permanent_fixed_addresses` on a page size value we will get size of fix-mapped area. In my case it's a little more than `536` kilobytes. In your case it might be a different number, because the size depends on amount of the fix-mapped addresses which depends on your kernel configuration. The second `FIXADDR_START` macro just subtracts the fix-mapped area size from the last address of the fix-mapped area to get its base virtual address. `FIXADDR_TOP` is a rounded up address from the base address of the [vsyscall](https://lwn.net/Articles/446528/) space: @@ -46,8 +46,8 @@ The second `FIXADDR_START` macro just subtracts the fix-mapped area size from th #define FIXADDR_TOP (round_up(VSYSCALL_ADDR + PAGE_SIZE, 1< Memory Debugging ``` - + menu of the Linux kernel configuration: ![kernel configuration menu](images/kernel_configuration_menu1.png) @@ -140,7 +140,7 @@ config X86 ... ``` -So, there is no anything which is specific for other architectures. +So, there isn't anything which is specific for other architectures. Ok, so we know that `kmemcheck` provides mechanism to check usage of `uninitialized memory` in the Linux kernel and how to enable it. How it does these checks? When the Linux kernel tries to allocate some memory i.e. something is called like this: @@ -148,7 +148,7 @@ Ok, so we know that `kmemcheck` provides mechanism to check usage of `uninitiali struct my_struct *my_struct = kmalloc(sizeof(struct my_struct), GFP_KERNEL); ``` -or in other words somebody wants to access a [page](https://en.wikipedia.org/wiki/Page_%28computer_memory%29), a [page fault](https://en.wikipedia.org/wiki/Page_fault) exception is generated. This is achieved by the fact that the `kmemcheck` marks memory pages as `non-present` (more about this you can read in the special part which is devoted to [Paging](https://0xax.gitbook.io/linux-insides/summary/theory/linux-theory-1)). If a `page fault` exception is occurred, the exception handler knows about it and in a case when the `kmemcheck` is enabled it transfers control to it. After the `kmemcheck` will finish its checks, the page will be marked as `present` and the interrupted code will be able to continue execution. There is little subtlety in this chain. When the first instruction of interrupted code will be executed, the `kmemcheck` will mark the page as `non-present` again. In this way next access to memory will be caught again. +or in other words somebody wants to access a [page](https://en.wikipedia.org/wiki/Page_%28computer_memory%29), a [page fault](https://en.wikipedia.org/wiki/Page_fault) exception is generated. This is achieved by the fact that the `kmemcheck` marks memory pages as `non-present` (more about this you can read in the special part which is devoted to [Paging](https://0xax.gitbook.io/linux-insides/summary/theory/linux-theory-1)). If a `page fault` exception occurred, the exception handler knows about it and in a case when the `kmemcheck` is enabled it transfers control to it. After the `kmemcheck` will finish its checks, the page will be marked as `present` and the interrupted code will be able to continue execution. There is little subtlety in this chain. When the first instruction of interrupted code will be executed, the `kmemcheck` will mark the page as `non-present` again. In this way next access to memory will be caught again. We just considered the `kmemcheck` mechanism from theoretical side. Now let's consider how it is implemented in the Linux kernel. @@ -215,9 +215,9 @@ if (!kmemcheck_selftest()) { printk(KERN_INFO "kmemcheck: Initialized\n"); ``` -and return with the `EINVAL` if this check is failed. The `kmemcheck_selftest` function checks sizes of different memory access related [opcodes](https://en.wikipedia.org/wiki/Opcode) like `rep movsb`, `movzwq` and etc. If sizes of opcodes are equal to expected sizes, the `kmemcheck_selftest` will return `true` and `false` in other way. +and return with the `EINVAL` if this check is failed. The `kmemcheck_selftest` function checks sizes of different memory access related [opcodes](https://en.wikipedia.org/wiki/Opcode) like `rep movsb`, `movzwq` and etc. If sizes of opcodes are equal to expected sizes, the `kmemcheck_selftest` will return `true` and `false` otherwise. -So when the somebody will call: +So when somebody calls: ```C struct my_struct *my_struct = kmalloc(sizeof(struct my_struct), GFP_KERNEL); @@ -236,7 +236,7 @@ if (kmemcheck_enabled && !(cachep->flags & SLAB_NOTRACK)) { } ``` -So, here we check that the if `kmemcheck` is enabled and the `SLAB_NOTRACK` bit is not set in flags we set `non-present` bit for the just allocated page. The `SLAB_NOTRACK` bit tell us to not track uninitialized memory. Additionally we check if a cache object has constructor (details will be considered in next parts) we mark allocated page as uninitialized or unallocated in other way. The `kmemcheck_alloc_shadow` function is defined in the [mm/kmemcheck.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/mm/kmemcheck.c) source code file and does following things: +So, here we check that the if `kmemcheck` is enabled and the `SLAB_NOTRACK` bit is not set in flags we set `non-present` bit for the just allocated page. The `SLAB_NOTRACK` bit tell us to not track uninitialized memory. Additionally we check if a cache object has constructor (details will be considered in next parts) we mark allocated page as uninitialized or unallocated otherwise. The `kmemcheck_alloc_shadow` function is defined in the [mm/kmemcheck.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/mm/kmemcheck.c) source code file and does following things: ```C void kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int node) @@ -276,7 +276,7 @@ void kmemcheck_hide_pages(struct page *p, unsigned int n) } ``` -Here we go through all pages and and tries to get `page table entry` for each page. If this operation was successful, we unset present bit and set hidden bit in each page. In the end we flush [translation lookaside buffer](https://en.wikipedia.org/wiki/Translation_lookaside_buffer), because some pages was changed. From this point allocated pages are tracked by the `kmemcheck`. Now, as `present` bit is unset, the [page fault](https://en.wikipedia.org/wiki/Page_fault) execution will be occurred right after the `kmalloc` will return pointer to allocated space and a code will try to access this memory. +Here we go through all pages and try to get `page table entry` for each page. If this operation was successful, we unset present bit and set hidden bit in each page. In the end we flush [translation lookaside buffer](https://en.wikipedia.org/wiki/Translation_lookaside_buffer), because some pages was changed. From this point allocated pages are tracked by the `kmemcheck`. Now, as `present` bit is unset, the [page fault](https://en.wikipedia.org/wiki/Page_fault) execution will be occurred right after the `kmalloc` will return pointer to allocated space and a code will try to access this memory. As you may remember from the [second part](https://0xax.gitbook.io/linux-insides/summary/initialization/linux-initialization-2) of the Linux kernel initialization chapter, the `page fault` handler is located in the [arch/x86/mm/fault.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/mm/fault.c) source code file and represented by the `do_page_fault` function. We can see following check from the beginning of the `do_page_fault` function: @@ -296,7 +296,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code, } ``` -The `kmemcheck_active` gets `kmemcheck_context` [per-cpu](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-1) structure and return the result of comparison of the `balance` field of this structure with zero: +The `kmemcheck_active` gets `kmemcheck_context` [per-cpu](https://0xax.gitbook.io/linux-insides/summary/concepts/linux-cpu-1) structure and returns the result of comparison of the `balance` field of this structure with zero: ``` bool kmemcheck_active(struct pt_regs *regs) @@ -314,7 +314,7 @@ if (kmemcheck_fault(regs, address, error_code)) return; ``` -First of all the `kmemcheck_fault` function checks that the fault was occurred by the correct reason. At first we check the [flags register](https://en.wikipedia.org/wiki/FLAGS_register) and check that we are in normal kernel mode: +First of all the `kmemcheck_fault` function checks that the fault occurred by the correct reason. At first we check the [flags register](https://en.wikipedia.org/wiki/FLAGS_register) and check that we are in normal kernel mode: ```C if (regs->flags & X86_VM_MASK) @@ -323,7 +323,7 @@ if (regs->cs != __KERNEL_CS) return false; ``` -If these checks wasn't successful we return from the `kmemcheck_fault` function as it was not `kmemcheck` related page fault. After this we try to lookup a `page table entry` related to the faulted address and if we can't find it we return: +If these checks weren't successful we return from the `kmemcheck_fault` function as it was not `kmemcheck` related page fault. After this we try to lookup a `page table entry` related to the faulted address and if we can't find it we return: ```C pte = kmemcheck_pte_lookup(address); @@ -331,7 +331,7 @@ if (!pte) return false; ``` -Last two steps of the `kmemcheck_fault` function is to call the `kmemcheck_access` function which check access to the given page and show addresses again by setting present bit in the given page. The `kmemcheck_access` function does all main job. It check current instruction which caused a page fault. If it will find an error, the context of this error will be saved by `kmemcheck` to the ring queue: +Last two steps of the `kmemcheck_fault` function is to call the `kmemcheck_access` function which check access to the given page and show addresses again by setting present bit in the given page. The `kmemcheck_access` function does all main job. It checks current instruction which caused a page fault. If it finds an error, the context of this error will be saved by `kmemcheck` to the ring queue: ```C static struct kmemcheck_error error_fifo[CONFIG_KMEMCHECK_QUEUE_SIZE]; @@ -343,7 +343,7 @@ The `kmemcheck` mechanism declares special [tasklet](https://0xax.gitbook.io/lin static DECLARE_TASKLET(kmemcheck_tasklet, &do_wakeup, 0); ``` -which runs the `do_wakeup` function from the [arch/x86/mm/kmemcheck/error.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/mm/kmemcheck/error.c) source code file when it will be scheduled to run. +which runs the `do_wakeup` function from the [arch/x86/mm/kmemcheck/error.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/mm/kmemcheck/error.c) source code file when it is scheduled to run. The `do_wakeup` function will call the `kmemcheck_error_recall` function which will print errors collected by `kmemcheck`. As we already saw the: