You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

9.5 KiB

CPU masks

Introduction

Cpumasks is a special way provided by the Linux kernel to store information about CPUs in the system. The relevant source code and header files which are contains API for Cpumasks manipulating:

As comment says from the include/linux/cpumask.h: Cpumasks provide a bitmap suitable for representing the set of CPU's in a system, one bit position per CPU number. We already saw a bit about cpumask in the boot_cpu_init function from the Kernel entry point part. This function makes first boot cpu online, active and etc...:

set_cpu_online(cpu, true);
set_cpu_active(cpu, true);
set_cpu_present(cpu, true);
set_cpu_possible(cpu, true);

set_cpu_possible is a set of cpu ID's which can be plugged in anytime during the life of that system boot. cpu_present represents which CPUs are currently plugged in. cpu_online represents subset of the cpu_present and indicates CPUs which are available for scheduling. These masks depends on CONFIG_HOTPLUG_CPU configuration option and if this option is disabled possible == present and active == online. Implementation of the all of these functions are very similar. Every function checks the second parameter. If it is true, calls cpumask_set_cpu or cpumask_clear_cpu otherwise.

There are two ways for a cpumask creation. First is to use cpumask_t. It defined as:

typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

It wraps cpumask structure which contains one bitmak bits field. DECLARE_BITMAP macro gets two parameters:

  • bitmap name;
  • number of bits.

and creates an array of unsigned long with the give name. It's implementation is pretty easy:

#define DECLARE_BITMAP(name,bits) \
        unsigned long name[BITS_TO_LONGS(bits)]

where BITS_TO_LONG:

#define BITS_TO_LONGS(nr)       DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

As we learning x86_64 architecture, unsigned long is 8-bytes size and our array will contain only one element:

(((8) + (8) - 1) / (8)) = 1

NR_CPUS macro presents the number of the CPUs in the system and depends on the CONFIG_NR_CPUS macro which defined in the include/linux/threads.h and looks like this:

#ifndef CONFIG_NR_CPUS
        #define CONFIG_NR_CPUS  1
#endif

#define NR_CPUS         CONFIG_NR_CPUS

The second way to define cpumask is to use DECLARE_BITMAP macro directly and to_cpumask macro which convertes given bitmap to the struct cpumask *:

#define to_cpumask(bitmap)                                              \
        ((struct cpumask *)(1 ? (bitmap)                                \
                            : (void *)sizeof(__check_is_bitmap(bitmap))))

We can see ternary operator operator here which is true every time. __check_is_bitmap inline function defined as:

static inline int __check_is_bitmap(const unsigned long *bitmap)
{
        return 1;
}

And returns 1 every time. We need in it here only for one purpose: In compile time it checks that given bitmap is a bitmap, or with another words it checks that given bitmap has type - unsigned long *. So we just pass cpu_possible_bits to the to_cpumask macro for converting array of unsigned long to the struct cpumask *.

cpumask API

As we can define cpumask with one of the method, Linux kernel provides API for manipulating a cpumask. Let's consider one of the function which presented above. For example set_cpu_online. This function takes two parameters:

  • Number of CPU;
  • CPU status;

Implementation of this function looks as:

void set_cpu_online(unsigned int cpu, bool online)
{
	if (online) {
		cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits));
		cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits));
	} else {
		cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits));
	}
}

First of all it checks the second state parameter and calls cpumask_set_cpu or cpumask_clear_cpu depends on it. Here we can see casting to the struct cpumask * of the second parameter in the cpumask_set_cpu. In our case it is cpu_online_bits which is bitmap and defined as:

static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly;

cpumask_set_cpu function makes only one call of the set_bit function inside:

static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
{
        set_bit(cpumask_check(cpu), cpumask_bits(dstp));
}

set_bit function takes two parameter too, and sets a given bit (first parameter) in the memory (second parameter or cpu_online_bits bitmap). We can see here that before set_bit will be called, its two parameter will be passed to the

  • cpumask_check;
  • cpumask_bits.

Let's consider these two macro. First if cpumask_check does nothing in our case and just returns given parameter. The second cpumask_bits just returns bits field from the given struct cpumask * structure:

#define cpumask_bits(maskp) ((maskp)->bits)

Now let's look on the set_bit implementation:

 static __always_inline void
 set_bit(long nr, volatile unsigned long *addr)
 {
         if (IS_IMMEDIATE(nr)) {
                asm volatile(LOCK_PREFIX "orb %1,%0"
                        : CONST_MASK_ADDR(nr, addr)
                        : "iq" ((u8)CONST_MASK(nr))
                        : "memory");
        } else {
                asm volatile(LOCK_PREFIX "bts %1,%0"
                        : BITOP_ADDR(addr) : "Ir" (nr) : "memory");
        }
 }

This function looks scarry, but it is not so hard as it seems. First of all it passes nr or number of the bit to the IS_IMMEDIATE macro which just makes call of the GCC internal __builtin_constant_p function:

#define IS_IMMEDIATE(nr)    (__builtin_constant_p(nr))

__builtin_constant_p checks that given parameter is known constant at compile-time. As our cpu is not compile-time constant, else clause will be executed:

asm volatile(LOCK_PREFIX "bts %1,%0" : BITOP_ADDR(addr) : "Ir" (nr) : "memory");

Let's try to understand how it works step by step:

LOCK_PREFIX is a x86 lock instruction. This instruction tells to the cpu to occupy the system bus while instruction will be executed. This allows to synchronize memory access, preventing simultaneous access of multiple processors (or devices - DMA controller for example) to one memory cell.

BITOP_ADDR casts given parameter to the (*(volatile long *) and adds +m constraints. + means that this operand is bot read and written by the instruction. m shows that this is memory operand. BITOP_ADDR is defined as:

#define BITOP_ADDR(x) "+m" (*(volatile long *) (x))

Next is the memory clobber. It tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters).

Ir - immideate register operand.

bts instruction sets given bit in a bit string and stores the value of a given bit in the CF flag. So we passed cpu number which is zero in our case and after set_bit will be executed, it sets zero bit in the cpu_online_bits cpumask. It would mean that the first cpu is online at this moment.

Besides the set_cpu_* API, cpumask ofcourse provides another API for cpumasks manipulation. Let's consider it in shoft.

Additional cpumask API

cpumask provides the set of macro for getting amount of the CPUs with different state. For example:

#define num_online_cpus()	cpumask_weight(cpu_online_mask)

This macro returns amount of the online CPUs. It calls cpumask_weight function with the cpu_online_mask bitmap (read about about it). cpumask_wieght function makes an one call of the bitmap_wiegt function with two parameters:

  • cpumask bitmap;
  • nr_cpumask_bits - which is NR_CPUS in our case.
static inline unsigned int cpumask_weight(const struct cpumask *srcp)
{
	return bitmap_weight(cpumask_bits(srcp), nr_cpumask_bits);
}

and calculates amount of the bits in the given bitmap. Besides the num_online_cpus, cpumask provides macros for the all CPU states:

  • num_possible_cpus;
  • num_active_cpus;
  • cpu_online;
  • cpu_possible.

and many more.

Besides that Linux kernel provides following API for the manipulating of cpumask:

  • for_each_cpu - iterates over every cpu in a mask;
  • for_each_cpu_not - iterates over every cpu in a complemented mask;
  • cpumask_clear_cpu - clears a cpu in a cpumask;
  • cpumask_test_cpu - tests a cpu in a mask;
  • cpumask_setall - set all cpus in a mask;
  • cpumask_size - returns size to allocate for a 'struct cpumask' in bytes;

and many many more...