Multipath and Anaconda
Authors: Ales Kozumplik <akozumpl@redhat.com>
Introduction
If there are two block devices in your /dev for which udev reports the same ‘ID_SERIAL’, then you can create a device mapper device that uses either of those devices to access the one physical device behind them. And that is Multipath [1].
For instance, suppose there are:
/dev/sda, with ID_SERIAL of 20090ef12700001d2, and
/dev/sdb, with the same ID_SERIAL.
Most likely the box reaches the same storage on a storage area network (SAN) somewhere through two adapters in the system. There are perhaps two cables, one for sda, one for sdb, and if one of the cables gets cut, the other can still transmit data. Normally the system won’t recognize that sda and sdb have this special relation to each other, but by creating a suitable device map using multipath tools [2] we can create a DM device /dev/mapper/mpatha and use it for storing and retrieving data.
The device mapper then automatically routes I/O requests made to /dev/mapper/mpatha to either sda or sdb, depending on, for example, the load on each line or congestion on the particular network.
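The detection idea above (several block devices reporting one ‘ID_SERIAL’) can be sketched in a few lines of Python. This is only an illustration, not Anaconda code: the function name, the plain dictionaries standing in for udev info and the serial given to ‘sdc’ are all assumptions made for the example.

```python
from collections import defaultdict


def group_paths_by_serial(udev_infos):
    """Return {serial: [device names]} for serials seen on more than one device.

    udev_infos is a list of plain dicts standing in for udev info
    (hypothetical layout: a 'name' key plus udev properties).
    """
    by_serial = defaultdict(list)
    for info in udev_infos:
        serial = info.get("ID_SERIAL")
        if serial:
            by_serial[serial].append(info["name"])
    # Only serials shared by two or more devices are multipath candidates.
    return {s: sorted(names) for s, names in by_serial.items() if len(names) > 1}


devices = [
    {"name": "sda", "ID_SERIAL": "20090ef12700001d2"},
    {"name": "sdb", "ID_SERIAL": "20090ef12700001d2"},
    {"name": "sdc", "ID_SERIAL": "made-up-serial-for-a-plain-disk"},
]
candidates = group_paths_by_serial(devices)
```

Here ‘sda’ and ‘sdb’ end up grouped under their shared serial, while ‘sdc’ is left out because nothing else reports its serial.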
The nomenclature I will use here is:
- ‘multipath device’ for the smart /dev/mapper/mpathX device.
- ‘multipath member device’ for the ‘/dev/sdX’ devices. Also ‘a path’.
What is expected from Anaconda
Anaconda is expected to:
- detect that there are multipath devices present
- coalesce all relevant (e.g. exclusiveDisks) multipath devices
- only let the user interact with the multipath devices in the filtering, cleardiskssel and partition screens; that is, once we know ‘sdc’ and ‘sdd’ are part of ‘mpathb’, show only ‘mpathb’ and never the paths
- install bootloader and boot from an mpath device
- make sure all the multipath devices we used for installation (whether or not they carry the root filesystem) are correctly coalesced in the booted system. This is achieved by generating a suitable /etc/multipath.conf and writing it into sysroot.
- be able to refer to mpath devices from kickstart, either by name like ‘mpatha’ or by their id like ‘disk/by-id/scsi-20090ef12700001d2’
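The last expectation, accepting either a name or a by-id path in kickstart, boils down to a lookup. A minimal sketch follows, assuming we already hold a mapping from multipath names to serials; the function name and the mapping are hypothetical and do not reflect Anaconda's actual kickstart handling.

```python
BY_ID_PREFIX = "disk/by-id/scsi-"


def resolve_mpath_spec(spec, mpaths):
    """Resolve a kickstart-style device spec to a multipath device name.

    mpaths is a hypothetical {name: serial} mapping, for example
    {"mpatha": "20090ef12700001d2"}. Returns the name, or None if the
    spec does not match any known multipath device.
    """
    # Tolerate fully qualified paths such as /dev/mapper/mpatha.
    for prefix in ("/dev/", "mapper/"):
        if spec.startswith(prefix):
            spec = spec[len(prefix):]
    if spec in mpaths:
        return spec
    if spec.startswith(BY_ID_PREFIX):
        serial = spec[len(BY_ID_PREFIX):]
        for name, known_serial in mpaths.items():
            if known_serial == serial:
                return name
    return None


known = {"mpatha": "20090ef12700001d2"}
by_name = resolve_mpath_spec("mpatha", known)
by_id = resolve_mpath_spec("disk/by-id/scsi-20090ef12700001d2", known)
```

Both calls resolve to ‘mpatha’; a spec naming a plain path such as ‘sda’ would return None, which matches the rule that the user only ever works with the multipath device.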
How Anaconda handles multipath
To detect the presence of multipath devices we rely on multipath tools, and we do the same for coalescing; see pyanaconda/storage/devicelibs/mpath.py, the file that provides some abstraction over the mpath tools. During the device scan we use the ‘multipath -d’ output to find out which devices are going to end up as multipath members. The MultipathTopology object also enhances the multipath members’ udev dictionaries with ‘ID_FS_TYPE’ set to ‘multipath_member’ (yes, this is a hack surviving from the original mpath implementation, and righteous is he who eradicates it). This information is picked up by DeviceTree when populating itself. That is, if ‘sda’ and ‘sdb’ are multipath member devices, DeviceTree gives them the MultipathMember format and creates one MultipathDevice for them (we know its name from ‘multipath -d’). We end up with:
DiskDevice ‘sda’, format ‘MultipathMember’
DiskDevice ‘sdb’, format ‘MultipathMember’
MultipathDevice ‘mpatha’, parents are ‘sda’ and ‘sdb’.
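The flow described above can be mimicked with a small standalone model. This is a sketch, not the real DeviceTree or MultipathTopology: the Device class and build_devices function are invented for illustration, and the topology dictionary is assumed to have been obtained already from ‘multipath -d’ (no parsing of that output is attempted here).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    name: str
    kind: str                      # "DiskDevice" or "MultipathDevice"
    format: Optional[str] = None   # "MultipathMember" for paths
    parents: List[str] = field(default_factory=list)


def build_devices(udev_infos, topology):
    """topology: hypothetical {mpath name: [member names]} mapping."""
    member_of = {m: mp for mp, members in topology.items() for m in members}
    devices = {}
    for info in udev_infos:
        name = info["name"]
        if name in member_of:
            # The "hack": tag the member's udev dictionary in place.
            info["ID_FS_TYPE"] = "multipath_member"
            info["ID_MPATH_NAME"] = member_of[name]
        is_member = info.get("ID_FS_TYPE") == "multipath_member"
        devices[name] = Device(name, "DiskDevice",
                               "MultipathMember" if is_member else None)
    for mp, members in topology.items():
        devices[mp] = Device(mp, "MultipathDevice", parents=list(members))
    return devices


infos = [{"name": "sda"}, {"name": "sdb"}, {"name": "sdc"}]
tree = build_devices(infos, {"mpatha": ["sda", "sdb"]})
```

Note that the member format is derived purely from the tagged dictionary, so a dictionary reloaded from udev without the tagging step would produce an ordinary disk.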
From then on, Anaconda only deals with the MultipathDevice and generally leaves anything with the ‘MultipathMember’ format alone (note that this is an inert format that is not really there; we use it just to mark the device as “useless beyond a multipath member”, kind of like MDRaidMember).
Partitioning happens on the multipath device, and during the preinstallconfig step /mnt/sysimage/etc/multipath.conf is created and filled with information about the coalesced devices. This is handled in the Storage.write() method. It is important that this file and /etc/multipath/wwids (autogenerated by mpath tools) make it to the sysimage before the dracut image is generated.
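To give an idea of what such a generated file can look like, here is a minimal sketch that renders a multipath.conf with one ‘multipath’ entry per coalesced device. The function and its arguments are hypothetical; the exact contents Anaconda writes in Storage.write() may well differ.

```python
def render_multipath_conf(mpaths, friendly_names=True):
    """Render a simple multipath.conf.

    mpaths is a hypothetical {alias: wwid} mapping of coalesced devices.
    """
    lines = [
        "defaults {",
        "    user_friendly_names %s" % ("yes" if friendly_names else "no"),
        "}",
        "multipaths {",
    ]
    for alias, wwid in sorted(mpaths.items()):
        lines += [
            "    multipath {",
            "        wwid %s" % wwid,
            "        alias %s" % alias,
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


conf = render_multipath_conf({"mpatha": "20090ef12700001d2"})
```

Pinning the alias to the wwid is what lets the booted system give the device the same name that was used during installation.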
Debugging multipath bugs
Unlike with iSCSI, to reproduce a multipath bug one does not need the same specific hardware as the reporter. Just find any box connected to a multipathed SAN and you are fine (at the moment, connecting to the same iSCSI target through both its IPv4 and IPv6 addresses also produces a multipathed device).
On top of that, much of the necessary information is already included in the anaconda logs or can be easily extracted from the reporter. The things to particularly look at are:
- storage.log, the output around ‘devices to scan for multipath’ and ‘devices post multipath scan’. The latter shows a triple with regular disks, disks comprising multipath devices and partitions. This helps you quickly find out what the target system is about.
- this information is also in program.log’s calls to ‘multipath’ [3]. If mpath devices are mysteriously appearing/disappearing between the filtering and partitioning screens, look at those. ‘multipath -ll’ is called to display currently coalesced mpath devices, ‘multipath -d’ is called to show the mpath devices that would be coalesced if we ran ‘multipath’ now. This is exploited by the device filtering screen.
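When a reporter attaches a long storage.log, the two markers mentioned above can be pulled out mechanically. The helper below is a small sketch: the function name is made up, and the sample log text is invented for the example and does not reproduce the real anaconda log format.

```python
MARKERS = (
    "devices to scan for multipath",
    "devices post multipath scan",
)


def multipath_scan_lines(log_text):
    """Return the log lines that mention either multipath scan marker."""
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in MARKERS)]


# Made-up sample, only the marker strings come from the text above.
sample = "\n".join([
    "DEBUG: scanning sda",
    "INFO: devices to scan for multipath: [sda, sdb, sdc]",
    "DEBUG: something unrelated",
    "INFO: devices post multipath scan: (...)",
])
hits = multipath_scan_lines(sample)
```

In practice one would read the attached file and pass its contents in, then compare the ‘post multipath scan’ line with what the reporter saw on screen.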
Future of multipath in Anaconda
Overall, as of RHEL6.2, the shape of multipath in Anaconda is good and, more importantly, it is flexible enough to sustain new RFEs and bugs. These are the RFEs and bugs I expect to appear sometime soon:
- enable or disable mpath_friendly_names in kickstart. Disabling friendly names just means the mpath devices are called by their wwid, e.g. /dev/mapper/360334332345343234, not ‘/dev/mapper/mpathc’. This is straightforward to implement.
- extend support for mpath devices in kickstart in general. Currently mpath devices should be accepted in most commands but I am sure there will be corner cases. Difficulty medium.
- [rawhide] stop extending the udev info dictionary with ‘ID_FS_TYPE’ and ‘ID_MPATH_NAME’. Doing it this way is asking for trouble: if the dictionary of a particular mpath device is reloaded from udev without running it through the MultipathTopology object, it will miss those entries (and DeviceTree depends on them a lot). Difficulty hard, but includes a lot of pleasant refactoring.
- Improve support for multipathing iSCSI devices. Someone might ask for it one day (in fact, with NIC bonding they already did), and it will make mpath debugging possible on any virt machine with multiple virt NICs.
[1] http://akozumpl.fedorapeople.org/archive/Multipass.jpg
[2] http://christophe.varoqui.free.fr/
[3] ‘man 8 multipath’