Commit https://github.com/QubesOS/qubes-linux-utils/commit/c1d42f1 --
"qfile-unpacker: do not call fdatasync() at each file" fixing
QubesOS/qubes-issues#1257 -- increased the chance of data loss with
qvm-move-to-vm: Say it nominally succeeds, and *deletes* the files from
the source VM. Soon after, the destination VM or the system could crash,
or an external drive hosting ~/QubesIncoming/srcVM could get unplugged
by accident, all before the data had really been persisted to disk.
But reverting the commit (ignoring the performance issue) wouldn't
completely solve this:
"Calling fsync() does not necessarily ensure that the entry in the
directory containing the file has also reached disk. For that an
explicit fsync() on a file descriptor for the directory is also
needed." - fsync(2)
It gets even worse for "slow symlinks" (whose target is too long to be
stored directly in the inode metadata), apparently they can't be synced
at all individually.
So instead, just call syncfs() once after everything has been unpacked:
+ Should prevent all data loss (if fs and disk are well behaved)
+ Allows caching and reordering -> no slowdown with many small files
- Blocks until any unrelated writes on the filesystem finish :\
(cherry picked from commit 4f59b3df6f)
The filesystem hosting ~/QubesIncoming/srcVM/ needs to support O_TMPFILE
too, in addition to the kernel. If it doesn't, take the use_tmpfile = 0
fallback.
(cherry picked from commit 74a1b4cc50)
It is required to prevent deadlocks in single-threaded select-based IO
programs (namely: qrexec). POSIX API doesn't support checking how much
can be written to pipe/socket without blocking, so to prevent blocking
application must use O_NONBLOCK mode, and somehow deal with non-written
data (buffer it).
QubesOS/qubes-issues#1347
(cherry picked from commit 6a44eaeb09)
POSIX requires that a read(2) which can be proved to occur after a
write() has returned returns the new data.
We want here only that other processes in the same VM will see the
file either fully written, or not see it at all. So ensuring that
linkat(2) is called after write is completed should be enough.
FixesQubesOS/qubes-issues#1257
(cherry picked from commit c1d42f1602)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU3OKsAAoJEBu5sftaTG2teD4P/2AgLmFI2x8RHqHz17T+lDZW
gD4QoQlTFm1jysvexwLSCKYh4yYOKmMOaCs8UVc44k1KTxr1l/XYPhTFpzDg1gmb
6zAKV36yxRZuB/3oRQ1tpUFN8obgV3GQh9Uz7zyOV8a34xSLkulUqp86ceW8gEyR
XHlUF2XSNpOLca56IOTHzSlvje+kTxTON9OAfNv18cPv+Um27xt+IAz2nl7jytlc
2SgZnJgcdf+blFvdxoEJQ9Dky3jLxcE/W4HMNmMPBEknSJYhoVNaSWtfgvJO66E9
M0CIUk2v068vDSdmC7OUIDgQ/URE6SW85OyTvQlKOft3k33mZkSWog6y7FEbuXAS
lWpXMR4xwnOqJtFTpKsGNyylqhNZhS1UQ4TpMgQijjxqs6oCWH42KwzSpPjd+zyq
Vn151qsBg2UGMT5OqePDBq0fLFbN1Jfk1Oja78XFZ4PAKsvTmKdMd2oEaU10Wzkr
jOpiEXtOK6QBWQYRySJH5GdFqEc2K4HFtHJPZPg6oIX7nMq9p8k3khfRDTgQ94nW
qMwOoGa/rfuh/8PmSoMsvsceGHDzVV1zZtIVPHnzoQcDjp4wkKodD0dSRV/FC/4B
lFsBS+UJMgOIvywzoRaU4lJowY0TPokg/MYPPYou3efWzDZCvB555n75gtRxYdg1
TQz5tLSVBp9E2JNqt7is
=ZgPE
-----END PGP SIGNATURE-----
Merge tag 'jm_96301f3c'
Tag for commit 96301f3cc1
# gpg: Signature made Thu Feb 12 18:28:12 2015 CET using RSA key ID 5A4C6DAD
# gpg: Good signature from "Jason Mehring (Qubes OS Signing Key) <nrgaway@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: E0E3 2283 FDCA C1A5 1007 8F27 1BB9 B1FB 5A4C 6DAD
Instead of pipes, stdin/out/err are created as sockets. This allows
qrexec-agent/daemon to decide to use some of them bidirectional. This is
up to qrexec-agent/daemon, such socket can still be used as
unidirectional channel.
The main reason for this feature is to use USBIP over qrexec, which
require single socket.
linux-utils/qrexec-lib/unpack.c:
Different compile errors will abort. Both different for fc20/21 but
based on same error below:
*
* FC21 ERROR: (but FC20 needs the code)
* unpack.c:31:0: error: "O_TMPFILE" redefined [-Werror]
* #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
* ^
* In file included from /usr/include/bits/fcntl.h:61:0,
* from /usr/include/fcntl.h:35,
* from unpack.c:4:
* /usr/include/bits/fcntl-linux.h:151:0: note: this is the location of the previous definition
* # define O_TMPFILE __O_TMPFILE / * Atomically create nameless file. * /
* ^
* cc1: all warnings being treated as errors
* <builtin>: recipe for target 'unpack.o' failed
*/
/* #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY) */
When file opened with O_TMPFILE but use_tmpfile==0, the file will not be
linked to the directory (the code at the end of process_one_file_reg).
Additionally it is waste of time trying using O_TMPFILE when it's
already known it shouldn't be.
Also use_tmpfile==0 can mean we don't have access to /proc
(set_procfs_fd wasn't called), so even if linking the file to its
directory would be attempted, it would fail. This is the case for
dom0-updates copy.
Otherwise source domain can modify (append) the file while the user
already is accessing it. While incoming files should be treated as
untrusted, this problem could allow file modification after the user
makes some sanity checks.
It can happen that we already cleared libvchan_fd pending state via
libvchan_wait, but data arrived later. This is especially true just
after connection, when client send unsolicited notification to server,
which can confuse it with some requested notification.
By passing an empty file with a declared negative size,
a hostile VM can decrease the total bytes counter, while
not have do supply a huge amount of data, thus disabing
the byte size check, and potentially filling the target
filesystem.
Also do not rely on unpack being called just once if we don't
have to and initialize counts.
Since we don't know directory size before populating with files,
we just accumulate the size on the second pass, but do not actually
check for the limit being reached. If there's any file after that,
that'll trip the check.
The byte limit would be hit if adding one byte to a buffer
that's half the limit, due to the temporary double copy.
Not sure if that's something that's worth changing.