arno/clair

Vulnerability Static Analysis for Containers

Quentin Machu c5d1a8e5f7 database: update vulnerabilities only when necessary		2016-02-24 16:34:54 -05:00
api	updater: port updater and its fetchers	2016-02-24 16:34:54 -05:00
cmd/clair	updater: move each fetcher to its own package	2016-02-24 16:32:21 -05:00
config	*: refactor & do initial work towards PostgreSQL implementation	2016-02-24 16:32:21 -05:00
contrib	Add output for package causing vulnerability	2016-02-18 15:03:28 +01:00
database	database: update vulnerabilities only when necessary	2016-02-24 16:34:54 -05:00
docs	docs: Add missing field in API Example	2016-01-21 11:27:48 -05:00
health	updater: port updater and its fetchers	2016-02-24 16:34:54 -05:00
notifier	*: refactor & do initial work towards PostgreSQL implementation	2016-02-24 16:32:21 -05:00
updater	updater: ignore Debian's "temp" vulnerabilities	2016-02-24 16:34:54 -05:00
utils	updater: port updater and its fetchers	2016-02-24 16:34:54 -05:00
vendor	main: Use configuration file instead of flags and simplify app extension.	2015-12-08 11:50:52 -05:00
worker	database: do insert/find layers (with their features and vulnerabilities)	2016-02-24 16:32:21 -05:00
.dockerignore	Initial commit	2015-11-13 14:11:28 -05:00
.travis.yml	travis: disable install step	2015-12-04 16:24:03 -05:00
clair.go	updater: port updater and its fetchers	2016-02-24 16:34:54 -05:00
config.example.yaml	docs: update config example	2016-02-24 16:32:21 -05:00
CONTRIBUTING.md	Initial commit	2015-11-13 14:11:28 -05:00
DCO	Initial commit	2015-11-13 14:11:28 -05:00
Dockerfile	dockerfile: syntax updates and s/xz/xz-utils	2016-01-19 13:35:27 -05:00
LICENSE	Initial commit	2015-11-13 14:11:28 -05:00
NOTICE	Initial commit	2015-11-13 14:11:28 -05:00
README.md	docs: provide information to run Clair in README	2016-02-14 21:05:40 -08:00

README.md

Clair

Clair is a container vulnerability analysis service. It provides a list of vulnerabilities that threaten a container, and can notify users when new vulnerabilities that affect existing containers become known.

We named the project « Clair », which in French means clear, bright, transparent, because we believe that it enables users to have a clear insight into the security of their container infrastructure.

Why should I use Clair?

Clair is a single-binary server that exposes a JSON HTTP API. It does not require any in-container monitoring agent, nor any other container modifications. It has been designed to perform massive analysis on the Quay.io Container Registry.

Whether you host a container registry, a continuous-integration system, or build anywhere from dozens to thousands of containers, you can benefit from Clair. More generally, if you consider that container security matters (and, honestly, you should), you should give it a try.

How do I run Clair?

Refer to the documentation here for a detailed overview of how to run Clair.

How Clair Detects Vulnerabilities

Clair analyzes each container layer once, and does not execute the container to perform its examination. The scanning engine extracts all required data to detect known vulnerabilities, and caches layer data for examination against vulnerabilities discovered in the future.

Detecting vulnerabilities can be achieved with several techniques. One option is to compute hashes of the binaries present in a layer and compare them with a database of known-vulnerable binaries. However, building this database would become tricky considering the number of different packages and library versions.

To detect vulnerabilities, Clair instead takes advantage of common package managers, which quickly and comprehensively provide lists of installed binary and source packages. Package lists are extracted for each layer that composes your container image: the difference between the layer’s package list and that of its parent is stored. This method is efficient in its use of storage, and allows Clair to scan each layer only once, even though that layer may be used in many container images. Coupled with vulnerability databases such as Debian’s Security Bug Tracker, Clair is able to tell which vulnerabilities threaten a container, and which layer and package introduced them.
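The per-layer difference described above can be sketched as follows. This is a minimal illustration, not Clair's actual code: the `Package` and `LayerDiff` types, the `diffPackages` function, and the package versions are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Package is a hypothetical name/version pair, as a package manager would list it.
type Package struct {
	Name    string
	Version string
}

// LayerDiff holds what a layer changed relative to its parent (hypothetical type).
type LayerDiff struct {
	Added   []Package
	Removed []Package
}

// diffPackages compares a layer's package list with its parent's list.
// An upgrade appears as one removal (old version) plus one addition (new version).
func diffPackages(parent, layer []Package) LayerDiff {
	inParent := make(map[Package]bool, len(parent))
	for _, p := range parent {
		inParent[p] = true
	}
	inLayer := make(map[Package]bool, len(layer))
	for _, p := range layer {
		inLayer[p] = true
	}

	var d LayerDiff
	for _, p := range layer {
		if !inParent[p] {
			d.Added = append(d.Added, p)
		}
	}
	for _, p := range parent {
		if !inLayer[p] {
			d.Removed = append(d.Removed, p)
		}
	}
	sortPackages(d.Added)
	sortPackages(d.Removed)
	return d
}

// sortPackages orders packages by name, then version string, for stable output.
func sortPackages(ps []Package) {
	sort.Slice(ps, func(i, j int) bool {
		if ps[i].Name != ps[j].Name {
			return ps[i].Name < ps[j].Name
		}
		return ps[i].Version < ps[j].Version
	})
}

func main() {
	// Illustrative data only.
	parent := []Package{{"openssl", "1.0.1f-1"}, {"bash", "4.3-7"}}
	layer := []Package{{"openssl", "1.0.1f-2"}, {"bash", "4.3-7"}}

	d := diffPackages(parent, layer)
	fmt.Printf("added: %v\nremoved: %v\n", d.Added, d.Removed)
}
```

Storing only the diff means an unchanged package such as `bash` above costs nothing in the child layer, which is where the storage saving comes from.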

Graph

Internally, Clair implements a graph structure to store and query layer data. The non-exhaustive example graph below corresponds to the following Dockerfile.

1.  MAINTAINER Quentin Machu <quentin.machu@coreos.com>
2.  FROM ubuntu:trusty
3.  RUN apt-get update && apt-get upgrade -y
4.  EXPOSE 22
5.  CMD ["/usr/sbin/sshd", "-D"]

The above image shows five layers represented by the purple nodes, associated with their IDs and parents. Because the second layer imports Ubuntu Trusty in the container, Clair can detect the operating system and some packages, depicted in green (we only show one here for the sake of simplicity). The third layer upgrades packages, so the graph reflects that this layer removes the previous version and installs the new one. Finally, the graph knows about a vulnerability, drawn in red, which is fixed by a particular package. Note that two synthetic package versions exist (0 and ∞): they ensure database consistency during parallel modification. ∞ also allows us to define very easily that a vulnerability is not yet fixed; thus, it affects every package version.

Querying this particular graph will tell us that our image is not vulnerable at all because none of the successor versions of its only package fix any vulnerability. However, an image based on the second layer could be vulnerable.
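The role of the two synthetic versions can be illustrated with a simplified model. The sketch below assumes versions are plain integers and that a vulnerability records the version that fixes it; the type and function names are hypothetical and this is not how Clair's graph is actually stored or queried.

```go
package main

import (
	"fmt"
	"math"
)

// Version is a simplified, totally ordered package version. This is an
// assumption: real distribution versions need a format-aware comparison.
type Version int

const (
	MinVersion Version = 0             // synthetic "0": lower than any real version
	MaxVersion Version = math.MaxInt32 // synthetic "∞": higher than any real version
)

// Vulnerability records which version of a package fixes it.
// FixedIn == MaxVersion means no fix exists yet.
type Vulnerability struct {
	ID      string
	Package string
	FixedIn Version
}

// affects reports whether the installed version of pkg predates the fix,
// i.e. whether a successor version of the installed package fixes v.
func (v Vulnerability) affects(pkg string, installed Version) bool {
	return pkg == v.Package && installed < v.FixedIn
}

func main() {
	// Illustrative identifiers only.
	fixed := Vulnerability{ID: "VULN-A", Package: "openssl", FixedIn: 3}
	unfixed := Vulnerability{ID: "VULN-B", Package: "openssl", FixedIn: MaxVersion}

	fmt.Println(fixed.affects("openssl", 2))    // installed version is older than the fix
	fmt.Println(fixed.affects("openssl", 3))    // installed version is the fixed one
	fmt.Println(unfixed.affects("openssl", 99)) // no fix yet, so every version is affected
}
```

With the ∞ sentinel, "not yet fixed" needs no special case: the same comparison covers both fixed and unfixed vulnerabilities.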

Architecture

Clair is divided into six main modules (which represent Go packages):

api defines how users interact with Clair and exposes a documented HTTP API.
worker extracts useful information from layers and stores everything in the database.
updater periodically updates Clair's vulnerability database from known vulnerability sources.
notifier dispatches notifications about vulnerable containers when vulnerabilities are released or updated.
database persists layer information and vulnerabilities in the Cayley graph database.
health summarizes the health checks of all of Clair's services.

Multiple backend databases are supported: a testing deployment would use in-memory storage, while a production deployment should use Bolt (single-instance deployment) or PostgreSQL (distributed deployment, probably behind a load balancer). To learn more about how to run Clair, take a look at the documentation.

Detectors & Fetchers

Clair currently supports three operating systems and their package managers, which we believe are the most common ones: Debian (dpkg), Ubuntu (dpkg), and CentOS (rpm).

Supporting an operating system implies that we are able to extract the operating system's name and version from a layer, along with the list of packages it contains. This is done inside the worker/detectors package, and extending it is straightforward.
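As an illustration of what extracting a package list can look like for dpkg-based systems, the sketch below parses the text of a dpkg status file (stanzas of `Field: value` lines separated by blank lines). It is not Clair's detector: `parseDpkgStatus` and the `Package` type are hypothetical, and keeping only packages whose status ends in `installed` is an assumption made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Package is a hypothetical name/version pair extracted from a layer.
type Package struct {
	Name    string
	Version string
}

// parseDpkgStatus reads the content of a dpkg status file and returns the
// packages whose Status field ends with " installed" (assumed filter).
func parseDpkgStatus(content string) []Package {
	var pkgs []Package
	var cur Package
	installed := false

	// flush stores the current stanza if it is complete, then resets the state.
	flush := func() {
		if cur.Name != "" && cur.Version != "" && installed {
			pkgs = append(pkgs, cur)
		}
		cur, installed = Package{}, false
	}

	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.TrimSpace(line) == "":
			flush() // a blank line ends a stanza
		case strings.HasPrefix(line, "Package: "):
			cur.Name = strings.TrimPrefix(line, "Package: ")
		case strings.HasPrefix(line, "Version: "):
			cur.Version = strings.TrimPrefix(line, "Version: ")
		case strings.HasPrefix(line, "Status: "):
			installed = strings.HasSuffix(line, " installed")
		}
	}
	flush() // the last stanza may not be followed by a blank line
	return pkgs
}

func main() {
	// Illustrative status content only.
	status := "Package: openssl\nStatus: install ok installed\nVersion: 1.0.1f-1\n\n" +
		"Package: oldpkg\nStatus: deinstall ok config-files\nVersion: 0.9\n"

	for _, p := range parseDpkgStatus(status) {
		fmt.Printf("%s %s\n", p.Name, p.Version)
	}
}
```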

All of this is useless if no vulnerability is known for any of these packages. The updater/fetchers package defines trusted sources of vulnerabilities, how to fetch them and parse them. For now, Clair uses three databases, one for each supported operating system:

Using these distro-specific sources gives us confidence that Clair can take into consideration all the different package implementations and backports without ever reporting anything possibly inaccurate.
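One way to picture a set of per-distribution fetchers is a small interface with one implementation per source. The sketch below is an assumption for illustration only: `Fetcher`, `staticFetcher`, and `fetchAll` are hypothetical names, not the API of the updater/fetchers package, and the data is canned rather than downloaded.

```go
package main

import (
	"errors"
	"fmt"
)

// Vulnerability is a hypothetical, minimal record produced by a fetcher.
type Vulnerability struct {
	ID      string
	Package string
	FixedIn string
}

// Fetcher is a hypothetical interface: one implementation per trusted source.
type Fetcher interface {
	Name() string
	Fetch() ([]Vulnerability, error)
}

// staticFetcher returns canned data; a real fetcher would download and parse
// a distribution's vulnerability tracker.
type staticFetcher struct {
	name  string
	vulns []Vulnerability
	err   error
}

func (f staticFetcher) Name() string                    { return f.name }
func (f staticFetcher) Fetch() ([]Vulnerability, error) { return f.vulns, f.err }

// errUnreachable simulates a source that cannot be contacted.
var errUnreachable = errors.New("source unreachable")

// fetchAll runs every fetcher, keeps the results of those that succeed,
// and returns the names of those that failed.
func fetchAll(fetchers []Fetcher) (vulns []Vulnerability, failed []string) {
	for _, f := range fetchers {
		vs, err := f.Fetch()
		if err != nil {
			failed = append(failed, f.Name())
			continue
		}
		vulns = append(vulns, vs...)
	}
	return vulns, failed
}

func main() {
	// Illustrative sources and data only.
	fetchers := []Fetcher{
		staticFetcher{name: "debian", vulns: []Vulnerability{
			{ID: "EXAMPLE-1", Package: "openssl", FixedIn: "1.0.1f-2"},
		}},
		staticFetcher{name: "ubuntu", err: errUnreachable},
	}

	vulns, failed := fetchAll(fetchers)
	fmt.Printf("fetched %d vulnerabilities, failed sources: %v\n", len(vulns), failed)
}
```

Keeping each source behind the same interface means one unreachable tracker does not prevent the others from updating the database.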

Coming Soon

Improved performance
Extended detection system
- More package managers
- Generic features such as detecting presence/absence of files
- ...
Expose more information about vulnerabilities
- Access vector
- Access complexity
- ...

Talk @ ContainerDays NYC 2015 [Slides] [Video]
Quay: First container registry using Clair.
