open-nomad/website/pages/docs/internals/security.mdx

---
layout: docs
page_title: Security Model
sidebar_title: Security Model
description: >-
  Nomad relies on both a lightweight gossip mechanism and an RPC system to
  provide various features. Both of the systems have different security
  mechanisms that stem from their designs. However, the security mechanisms of
  Nomad have a common goal: to provide confidentiality, integrity, and
  authentication.
---

## Overview

Nomad is a flexible workload orchestrator to deploy and manage any containerized
or legacy application using a single, unified workflow. It can run diverse
workloads including Docker, non-containerized, microservice, and batch
applications.

Nomad uses a lightweight gossip and RPC system, [similar to
Consul](https://www.consul.io/docs/internals/security.html), which provides
various essential features. Both of these systems provide security mechanisms
that should be used to help provide [confidentiality, integrity, and
authentication](https://en.wikipedia.org/wiki/Information_security).

Using defense in depth is crucial for cluster security, and deployment
requirements may differ drastically depending on your use case. Further security
features for multi-tenant deployments are offered exclusively in the enterprise
version. This documentation may need to be adapted to your deployment situation,
but the general mechanisms for a secure Nomad deployment revolve around:

* **[mTLS](/guides/security/securing-nomad.html)** -
    Mutual authentication of both the TLS server and client X.509 certificates
    prevents internal abuse by blocking unauthorized access to network
    components within the cluster.

* **[ACLs](/guides/security/acl.html)** - Allow
    roles to be applied to authenticated connections by granting capabilities
    to a token.

* **[Namespaces](/docs/enterprise/index.html#namespaces)**
    (**Enterprise Only**) - Access to read and write to a namespace can be
    controlled to allow for granular access to job information managed within a
    multi-tenant cluster.

* **[Sentinel Policies](/docs/enterprise/index.html#sentinel-policies)**
    (**Enterprise Only**) - Sentinel policies allow for granular control over
    components such as task drivers within a cluster.

### Personas

When thinking about Nomad, it helps to consider the following base personas
when managing the security requirements for a cluster deployment. The
granularity may change depending on your team’s use case. Rigorous roles can be
defined and managed using the [Nomad backend secret engine for
Vault](https://www.vaultproject.io/docs/secrets/nomad/index.html), which is
described further, with getting started steps using a development server, in
the [Vault integration section of the ACL guide](/guides/security/acl.html#vault-integration).

It’s important to note that there is no traditional concept of a user within
Nomad itself.

* **System Administrator** - This is someone who has access to the
    infrastructure underlying a Nomad cluster. Often they can SSH or RDP
    directly into a server within a cluster through a bastion host. Ultimately
    they have read, write and execute permissions for the actual Nomad binary.
    This binary is the same for server and client nodes; only the
    configuration files differ. These users potentially have sudo,
    administrative, or some other super-user access to the underlying compute
    resource. Users like these are essentially totally trusted by Nomad as they
    have administrative rights to the system and can start or stop the agent.

* **Nomad Administrator** - This is someone (probably the same **System
    Administrator**) who has access to define the Nomad agent configurations
    for servers and clients. They also have full rights to every part of the
    Nomad system, including the ability to start and stop all jobs within a
    cluster.

* **Nomad Operator** - This is someone who likely has selective access with
    restricted capabilities to manage jobs applicable to their namespace within
    a cluster.

* **User** - This is someone who is a user of an application being run on the
    system. In some cases applications may be public facing and exposed to the
    internet such as a web server. This is someone who shouldn’t have any
    network access to the Nomad server API.

### Secure Configuration

Nomad’s security model is applicable only if all parts of the system are running
with a secure configuration; it is not secure-by-default. Without the following
mechanisms enabled in Nomad’s configuration, it may be possible to abuse access
to a cluster. As with any security decision, determine which concerns apply to
your environment and adapt these recommendations accordingly.

#### Requirements

* **[mTLS enabled](/guides/security/securing-nomad.html)** -
    Mutual TLS (mTLS) enables [mutual
    authentication](https://en.wikipedia.org/wiki/Mutual_authentication) with
    security properties to prevent the following problems:

  * Unauthorized access is prevented because both servers and clients must
    provide valid TLS [X.509](https://en.wikipedia.org/wiki/X.509)
    certificates signed by the same valid
    [CA](https://en.wikipedia.org/wiki/Certificate_authority) in order to
    communicate within the cluster.

  * Observation of or tampering with communication between nodes is thwarted
    because traffic is encrypted using the well known network security protocol
    [TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) version 1.2,
    with a [configurable minimal
    version](/docs/configuration/tls.html#tls_min_version).
    Both server and client agents must be configured to validate each other's
    certificates to ensure mTLS is actually enabled. This requires appropriate
    certificates to be distributed to servers, clients, and any machines or
    operators that need them for things like CLI usage. It is recommended to use
    [Vault](/guides/security/vault-pki-integration.html)
    to securely manage the certificate creation and rotation for nodes.

  * Agent role misconfiguration is prevented using the X.509
    [SAN](https://en.wikipedia.org/wiki/Subject_Alternative_Name) extension.
    The SAN is essentially a domain name used to verify that a node’s region
    and role name are configured as expected (e.g. `client.us-east.nomad`).

  * Using the previously mentioned role name prevents maliciously masquerading
    as a server or client node, and allows other services to be signed easily by
    the same CA. This also avoids any potential pitfalls with certificates using
    the IP or Hostname of nodes within a cluster.

* **[ACLs enabled](/guides/security/acl.html)** - The
    access control list (ACL) system provides a capability-based control
    mechanism for Nomad administrators, allowing custom roles (typically
    managed within Vault) to be tied to an individual human or machine operator
    identity. This allows access to capabilities within the cluster to be
    restricted to specific users.

* **[Sentinel Policies](/guides/governance-and-policy/sentinel/sentinel-policy.html)**
    (**Enterprise Only**) - [Sentinel](https://www.hashicorp.com/sentinel/) is
    a feature which enables
    [policy-as-code](https://docs.hashicorp.com/sentinel/concepts/policy-as-code/)
    to enforce further restrictions on operators. This is used to augment the
    built-in ACL system for fine-grained control over jobs.

* **[Namespaces](/guides/governance-and-policy/namespaces.html)**
    (**Enterprise Only**) - This feature allows a cluster to be shared by
    multiple teams within a company. This logical separation is important for
    multi-tenant clusters because it prevents users from interfering with jobs
    in namespaces they do not have access to. ACLs must be enabled for
    namespaces to be enforced.

* **[Resource Quotas](/guides/governance-and-policy/quotas.html)**
    (**Enterprise Only**) - Quotas limit a namespace’s access to the underlying
    compute resources in the cluster by setting upper limits for operators.
    Access to these resource quotas can be managed via ACLs to give operators
    read-only access so that they cannot change their own quotas.
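
As a minimal sketch, the mTLS and ACL requirements above might translate into agent configuration along the following lines. The certificate paths are hypothetical placeholders, and the option names should be checked against the [TLS configuration reference](/docs/configuration/tls.html) and the ACL guide for your Nomad version.

```hcl
# Agent configuration fragment (sketch); all file paths are placeholders.
tls {
  # Enable TLS on both the HTTP API and the RPC layer.
  http = true
  rpc  = true

  ca_file   = "/etc/nomad.d/tls/nomad-ca.pem"
  cert_file = "/etc/nomad.d/tls/server.pem"
  key_file  = "/etc/nomad.d/tls/server-key.pem"

  # Verify that outgoing connections reach a server whose certificate name
  # matches server.<region>.nomad.
  verify_server_hostname = true

  # Require HTTP API clients (CLI, UI) to present a valid client certificate.
  verify_https_client = true
}

acl {
  enabled = true
}
```

Client agents would use the same stanzas with their own certificate and key files.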

#### Recommendations

The following are security recommendations that can help significantly improve
the security of your cluster depending on your use case. We recommend always
practicing defense in depth when architecting the security mechanisms for your
environment.

* **[Rotate Credentials](/docs/job-specification/vault.html)** -
    Using something like [Vault](/docs/vault-integration/index.html) to
    create and manage dynamic, rotated credentials is highly recommended to
    prevent secrets from being easily exposed within the [job
    specification](/docs/job-specification/index.html)
    itself which may be leaked into version control or otherwise be accidentally
    stored on disk on an operator’s local machine. It is also possible to
    [integrate with Vault’s PKI secret engine](/guides/security/vault-pki-integration.html)
    to automatically generate and renew dynamic, unique X.509 certificates for
    each Nomad node with a short
    [TTL](https://en.wikipedia.org/wiki/Time_to_live).

* **[Running without Root](https://groups.google.com/forum/#!topic/nomad-tool/pSyMwC_FSFA)** -
    Certain features of Nomad can be used without needing to run the Nomad agent
    server or client as the `root` user. Instead you can granularly assign the
    appropriate capabilities in various ways for your Nomad agents. For example:
    Nomad servers only require access to the data directory; it is possible to
    use Nomad to orchestrate Docker containers by adding a non-root `nomad` user
    to the `docker` group to access the [default unix
    socket](https://docs.docker.com/engine/reference/commandline/dockerd/#daemon-socket-option).

* **Containers with Sandbox Runtimes** - In some situations, such as running
    untrusted code as a service, it may be worth considering using different
    container runtimes such as [gVisor](https://gvisor.dev/) or [Kata
    Containers](https://katacontainers.io/). These types of runtimes provide
    sandboxing features which help prevent raw access to the underlying shared
    kernel for other containers and the Nomad client agent itself.

* **[Disable Unused Drivers](/docs/configuration/client#driver-blacklist)** -
    Each driver provides different degrees of isolation, and bugs may allow
    unintended privilege escalation. If a task driver is not needed, you can
    disable it to reduce risk.

* **Linux Security Modules** - Security modules that integrate directly with
    the operating system, such as AppArmor, SELinux, and Seccomp, can be used
    on the Nomad hosts and applied to containers for an extra layer of
    security. Seccomp profiles can be passed directly to containers
    using the
    **[security_opt](/docs/drivers/docker.html#security_opt)**
    parameter available in the default [Docker
    driver](/docs/drivers/docker.html).

* **[Service Mesh](https://www.hashicorp.com/resources/service-mesh-microservices-networking)** -
    Integrating service mesh technologies such as
    **[Consul](https://www.consul.io/)** can be extremely useful for limiting
    and efficiently load balancing network connectivity within a cluster.
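
Two of the recommendations above can be sketched in configuration. These fragments are illustrative only: the driver names, image, and seccomp profile path are hypothetical, and the `driver.blacklist` option name is taken from the link above and may differ in other Nomad versions.

```hcl
# Client agent configuration fragment (sketch).
client {
  enabled = true

  options = {
    # Comma-separated list of task drivers this client should not load.
    "driver.blacklist" = "java,qemu"
  }
}
```

```hcl
# Job specification fragment (sketch); image and profile path are placeholders.
job "example" {
  datacenters = ["dc1"]

  group "app" {
    task "web" {
      driver = "docker"

      config {
        image = "example/web:1.0"

        # Security options handed to the Docker driver, here a seccomp profile.
        security_opt = ["seccomp=/etc/docker/seccomp/web.json"]
      }
    }
  }
}
```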

### Threat Model

The following are parts of the Nomad threat model:

* **Nomad agent-to-agent communication** - Transport encryption for
    agent-to-agent communication is required to prevent eavesdropping. TCP and
    UDP based protocols within Nomad provide different mechanisms for enabling
    encryption including symmetric (shared gossip encryption keys) and
    asymmetric keys (TLS).

* **Tampering of data in transit** - Any tampering should be detectable via mTLS
    and cause Nomad to avoid processing the request.

* **Access to data without authentication or authorization** - Requests to the
    server should be authenticated and authorized using mTLS and ACLs
    respectively.

* **State modification or corruption due to malicious messages** - Improperly
    formatted messages are discarded while properly formatted messages require
    authentication and authorization.

* **Non-server members accessing raw data** - All servers that join the cluster
    require proper authentication and authorization in order to begin
    participating in Raft. All Raft traffic should be encrypted in transit
    with TLS.

* **Denial of Service against a node** - DoS attacks against a single node
    should not compromise the security posture of Nomad.
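
As a rough sketch of the symmetric half of agent-to-agent encryption mentioned above, gossip traffic between servers can be encrypted with a shared key set in the `server` stanza. The value below is a placeholder, not a real key. In many Nomad releases a key can be generated with `nomad operator keygen`, though the command name is an assumption to verify for your version.

```hcl
# Server agent configuration fragment (sketch).
server {
  enabled = true

  # Shared symmetric key for gossip (Serf) encryption. Every server in the
  # cluster must be configured with the same value.
  encrypt = "<base64-encoded key>"
}
```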

The following are not part of the threat model for server agents:

* **Access (read or write) to the Nomad data directory** - Information about the
    jobs managed by Nomad is persisted to a server’s data directory.

* **Access (read or write) to the Nomad configuration directory** - Access to
    Nomad’s configuration file(s) directory can enable and disable features for
    a cluster.

* **Memory access to a running Nomad server agent** - Direct access to the
    memory of the Nomad server agent process (usually requiring a shell on the
    system through various means) results in almost all aspects of the agent
    being compromised, including access to certificates and other secrets.

The following are not part of the threat model for client agents:

* **Access (read or write) to the Nomad data directory** - Information about the
    allocations scheduled to a Nomad client is persisted to its data directory.
    This would include any secrets in any of the allocation’s file systems.

* **Access (read or write) to the Nomad configuration directory** - Access to a
    client’s configuration file can enable and disable features for a client
    including insecure drivers such as
    [raw_exec](/docs/drivers/raw_exec.html).

* **Memory access to a running Nomad client agent** - Direct access to the
    memory of the Nomad client agent process allows an attacker to extract
    secrets from clients such as Vault tokens.

* **Lax Client Driver Sandbox** - Drivers may allow some privileged operations,
    e.g. filesystem access to configuration directories, or raw access to host
    devices. Such privileges can be used to compromise other workloads or to
    cause denial-of-service attacks.

#### Internal Threats

* **Operator** - Someone with a valid mTLS cert and ACL token may still be a
    threat to your cluster in certain situations, especially in multi-team
    cluster deployments. They may accidentally or intentionally use a malicious
    jobspec to harm a cluster. Namespaces and Sentinel policies can help
    protect against this.

* **Workload** - Workloads may have host network access within a cluster.
    Application security issues outside the scope of Nomad, such as SSRF, can
    then lead to internal access within the cluster. Using mTLS, ACLs
    and Sentinel policies together can add layers of protection against
    malicious workloads.

* **RPC / API Access** - RPC and HTTP API endpoints without mTLS can expose
    clusters to abuse within the cluster from malicious workloads.

* **Client driver** - Drivers implement various workload types for a cluster,
    and the backend configuration of these drivers should be considered to
    implement defense in depth. For example, a custom Docker driver that limits
    the ability to mount the host file system may be subverted by network access
    to an exposed Docker daemon API through other means such as the raw_exec
    driver.
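
One way to limit the operator threat described above is a narrowly scoped ACL policy. The following is a minimal sketch: the namespace name `team-a` and the chosen capabilities are hypothetical, and the `quota` rule is only meaningful on Enterprise clusters.

```hcl
# ACL policy (sketch) for an operator restricted to a single namespace.
namespace "team-a" {
  # Grant only the listed capabilities within this namespace.
  capabilities = ["list-jobs", "read-job", "submit-job", "read-logs"]
}

# Operators may view quotas but not modify them (Enterprise only).
quota {
  policy = "read"
}
```

A policy like this would typically be registered with `nomad acl policy apply` and then attached to the operator's token.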


#### External Threats

There are two main components to consider for external threats in a Nomad cluster:

* **Server agent** - Internal cluster leader elections and replication are
    managed via Raft between server agents and are encrypted in transit.
    However, information about the server is stored unencrypted at rest in the
    agent’s data directory. This may include sensitive data such as ACL tokens
    and TLS certificates.

* **Client agent** - Client-to-server communication within a cluster is
    encrypted and authenticated using mTLS. Information about the allocations on
    a client node is stored unencrypted in the agent’s data and configuration
    directories.

### Network Ports


| **Port / Protocol**  | Agents  | Description |
|----------------------|---------|-------------|
| **4646** / TCP       | All     | [HTTP](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol) to provide [UI](/guides/web-ui/access.html) and [API](/api/index.html) access to agents. |
| **4647** / TCP       | Servers | [RPC](https://en.wikipedia.org/wiki/Remote_procedure_call) protocol used by agents. |
| **4648** / TCP + UDP | Servers | [gossip](/docs/internals/gossip.html) protocol to manage server membership using [Serf](https://www.serf.io/). |
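
For reference, the ports in the table correspond to the agent's `ports` stanza. The sketch below only restates the defaults listed above.

```hcl
# Agent configuration fragment (sketch) showing the default ports.
ports {
  http = 4646 # UI and API
  rpc  = 4647 # RPC between agents
  serf = 4648 # Gossip between servers (TCP and UDP)
}
```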