website: working on internals documentation

Armon Dadgar 2015-09-17 17:59:45 -07:00
parent 35881dc469
commit 9a0d50bdc2
11 changed files with 290 additions and 348 deletions


@ -126,6 +126,7 @@ ensuring PCI compliant workloads run on appropriate servers.
# Getting in Depth
This has been a brief high-level overview of the architecture of Nomad. There
are more details available for each of the sub-systems.
are more details available for each of the sub-systems. The [consensus protocol](/docs/internals/consensus.html) is
documented in detail, as is the [gossip protocol](/docs/internals/gossip.html).
For other details, either consult the code, ask in IRC, or reach out to the mailing list.


@ -0,0 +1,205 @@
---
layout: "docs"
page_title: "Consensus Protocol"
sidebar_current: "docs-internals-consensus"
description: |-
Nomad uses a consensus protocol to provide Consistency as defined by CAP. The consensus protocol is based on Raft: In Search of an Understandable Consensus Algorithm. For a visual explanation of Raft, see The Secret Lives of Data.
---
# Consensus Protocol
Nomad uses a [consensus protocol](http://en.wikipedia.org/wiki/Consensus_(computer_science))
to provide [Consistency (as defined by CAP)](http://en.wikipedia.org/wiki/CAP_theorem).
The consensus protocol is based on
["Raft: In search of an Understandable Consensus Algorithm"](https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf).
For a visual explanation of Raft, see [The Secret Lives of Data](http://thesecretlivesofdata.com/raft).
~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You don't need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.
## Raft Protocol Overview
Raft is a consensus algorithm that is based on
[Paxos](http://en.wikipedia.org/wiki/Paxos_%28computer_science%29). Compared
to Paxos, Raft is designed to have fewer states and a simpler, more
understandable algorithm.
There are a few key terms to know when discussing Raft:
* Log - The primary unit of work in a Raft system is a log entry. The problem
of consistency can be decomposed into a *replicated log*. A log is an ordered
sequence of entries. We consider the log consistent if all members agree on
the entries and their order.
* FSM - [Finite State Machine](http://en.wikipedia.org/wiki/Finite-state_machine).
An FSM is a collection of finite states with transitions between them. As new logs
are applied, the FSM is allowed to transition between states. Application of the
same sequence of logs must result in the same state, meaning behavior must be deterministic.
* Peer set - The peer set is the set of all members participating in log replication.
For Nomad's purposes, all server nodes are in the peer set of the local region.
* Quorum - A quorum is a majority of members from a peer set: for a set of size `n`,
quorum requires at least `(n/2)+1` members (a small calculation of this rule is
sketched just after this list). For example, if there are 5 members in the peer set,
we would need 3 nodes to form a quorum. If a quorum of nodes is unavailable for any
reason, the cluster becomes *unavailable* and no new logs can be committed.
* Committed Entry - An entry is considered *committed* when it is durably stored
on a quorum of nodes. Once an entry is committed it can be applied.
* Leader - At any given time, the peer set elects a single node to be the leader.
The leader is responsible for ingesting new log entries, replicating to followers,
and managing when an entry is considered committed.
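The quorum rule is simple enough to express directly. Below is a standalone Go sketch
(not Nomad source code) that computes quorum size and failure tolerance from the
`(n/2)+1` rule; its output matches the [deployment table](#deployment_table) at the end
of this page.

```go
package main

import "fmt"

// quorumSize returns the minimum number of members that must agree,
// using the (n/2)+1 rule with integer division.
func quorumSize(n int) int {
	return n/2 + 1
}

// faultTolerance is how many members can fail while a quorum remains.
func faultTolerance(n int) int {
	return n - quorumSize(n)
}

func main() {
	for _, n := range []int{1, 2, 3, 5, 7} {
		fmt.Printf("peers=%d quorum=%d tolerates=%d failures\n",
			n, quorumSize(n), faultTolerance(n))
	}
}
```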
Raft is a complex protocol and will not be covered here in detail (for those who
desire a more comprehensive treatment, the full specification is available in this
[paper](https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf)).
We will, however, attempt to provide a high level description which may be useful
for building a mental model.
Raft nodes are always in one of three states: follower, candidate, or leader. All
nodes initially start out as a follower. In this state, nodes can accept log entries
from a leader and cast votes. If no entries are received for some time, nodes
self-promote to the candidate state. In the candidate state, nodes request votes from
their peers. If a candidate receives a quorum of votes, then it is promoted to a leader.
The leader must accept new log entries and replicate to all the other followers.
In addition, if stale reads are not acceptable, all queries must also be performed on
the leader.
Once a cluster has a leader, it is able to accept new log entries. A client can
request that a leader append a new log entry (from Raft's perspective, a log entry
is an opaque binary blob). The leader then writes the entry to durable storage and
attempts to replicate to a quorum of followers. Once the log entry is considered
*committed*, it can be *applied* to a finite state machine. The finite state machine
is application specific; in Nomad's case, we use
[MemDB](https://github.com/hashicorp/go-memdb) to maintain cluster state.
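To make the FSM abstraction concrete, the following is a minimal sketch of an
application FSM built on the `hashicorp/raft` and `go-memdb` libraries. The `nodes`
table, the `Node` type, and the assumption that log entries are JSON-encoded are
illustrative only; Nomad's actual FSM and state schema are more involved.

```go
package state

import (
	"encoding/json"
	"io"

	"github.com/hashicorp/go-memdb"
	"github.com/hashicorp/raft"
)

// Node is a hypothetical object stored in cluster state.
type Node struct {
	ID         string
	Datacenter string
}

// exampleFSM sketches an application FSM backed by an in-memory database.
type exampleFSM struct {
	db *memdb.MemDB
}

func newFSM() (*exampleFSM, error) {
	schema := &memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"nodes": {
				Name: "nodes",
				Indexes: map[string]*memdb.IndexSchema{
					"id": {
						Name:    "id",
						Unique:  true,
						Indexer: &memdb.StringFieldIndex{Field: "ID"},
					},
				},
			},
		},
	}
	db, err := memdb.NewMemDB(schema)
	if err != nil {
		return nil, err
	}
	return &exampleFSM{db: db}, nil
}

// Apply is invoked by Raft once a log entry is committed. The entry is an
// opaque blob to Raft; here we assume it is a JSON-encoded Node.
func (f *exampleFSM) Apply(entry *raft.Log) interface{} {
	var node Node
	if err := json.Unmarshal(entry.Data, &node); err != nil {
		return err
	}
	txn := f.db.Txn(true) // write transaction
	defer txn.Abort()     // no-op if the transaction was committed
	if err := txn.Insert("nodes", &node); err != nil {
		return err
	}
	txn.Commit()
	return nil
}

// Snapshot and Restore complete the raft.FSM interface; a real implementation
// would serialize and reload the MemDB contents here.
func (f *exampleFSM) Snapshot() (raft.FSMSnapshot, error) { return nil, nil }
func (f *exampleFSM) Restore(rc io.ReadCloser) error      { return rc.Close() }
```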
Obviously, it would be undesirable to allow a replicated log to grow in an unbounded
fashion. Raft provides a mechanism by which the current state is snapshotted and the
log is compacted. Because of the FSM abstraction, restoring the state of the FSM must
result in the same state as a replay of old logs. This allows Raft to capture the FSM
state at a point in time and then remove all the logs that were used to reach that
state. This is performed automatically without user intervention and prevents unbounded
disk usage while also minimizing time spent replaying logs. One of the advantages of
using MemDB is that it allows Nomad to continue accepting new transactions even while
old state is being snapshotted, preventing any availability issues.
Consensus is fault-tolerant up to the point where quorum is available.
If a quorum of nodes is unavailable, it is impossible to process log entries or reason
about peer membership. For example, suppose there are only 2 peers: A and B. The quorum
size is also 2, meaning both nodes must agree to commit a log entry. If either A or B
fails, it is now impossible to reach quorum. This means the cluster is unable to add
or remove a node or to commit any additional log entries. This results in
*unavailability*. At this point, manual intervention would be required to remove
either A or B and to restart the remaining node in bootstrap mode.
A Raft cluster of 3 nodes can tolerate a single node failure while a cluster
of 5 can tolerate 2 node failures. The recommended configuration is to run either
3 or 5 Nomad servers per region. This maximizes availability without
greatly sacrificing performance. The [deployment table](#deployment_table) below
summarizes the potential cluster size options and the fault tolerance of each.
In terms of performance, Raft is comparable to Paxos. Assuming stable leadership,
committing a log entry requires a single round trip to half of the cluster.
Thus, performance is bound by disk I/O and network latency.
## Raft in Nomad
Only Nomad server nodes participate in Raft and are part of the peer set. All
client nodes forward requests to servers. The clients in Nomad only need to know
about their allocations and query that information from the servers, while the
servers need to maintain the global state of the cluster.
Since all servers participate as part of the peer set, they all know the current
leader. When an RPC request arrives at a non-leader server, the request is
forwarded to the leader. If the RPC is a *query* type, meaning it is read-only,
the leader generates the result based on the current state of the FSM. If
the RPC is a *transaction* type, meaning it modifies state, the leader
generates a new log entry and applies it using Raft. Once the log entry is committed
and applied to the FSM, the transaction is complete.
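A rough sketch of that query/transaction split is shown below. The `Server` type and
method names are illustrative rather than Nomad's actual RPC layer, and forwarding to
the leader (dialing the other server) is elided.

```go
package rpcsketch

import (
	"errors"
	"time"

	"github.com/hashicorp/raft"
)

// Server is a pared-down stand-in for a Nomad server; the fields and
// methods here are illustrative, not Nomad's actual types.
type Server struct {
	raft *raft.Raft
}

// query serves a read-only RPC. The leader answers directly from the
// current state of the FSM without going through the log.
func (s *Server) query(read func() (interface{}, error)) (interface{}, error) {
	if s.raft.State() != raft.Leader {
		return nil, errors.New("not leader: forward to current leader")
	}
	return read()
}

// write serves a state-changing RPC by appending a new log entry and
// waiting for it to be committed and applied to the FSM.
func (s *Server) write(entry []byte) (interface{}, error) {
	if s.raft.State() != raft.Leader {
		return nil, errors.New("not leader: forward to current leader")
	}
	future := s.raft.Apply(entry, 10*time.Second)
	if err := future.Error(); err != nil { // blocks until committed and applied
		return nil, err
	}
	return future.Response(), nil // value returned by the FSM's Apply
}
```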
Because of the nature of Raft's replication, performance is sensitive to network
latency. For this reason, each region elects an independent leader and maintains
a disjoint peer set. Data is partitioned by region, so each leader is responsible
only for data in its own region. When a request is received for a remote region,
the request is forwarded to the correct leader. This design allows for lower latency
transactions and higher availability without sacrificing consistency.
## Consistency Modes
Although all writes to the replicated log go through Raft, reads are more
flexible. To support various trade-offs that developers may want, Nomad
supports 2 different consistency modes for reads.
The two read modes are:
* `default` - Raft makes use of leader leasing, providing a time window
in which the leader assumes its role is stable. However, if a leader
is partitioned from the remaining peers, a new leader may be elected
while the old leader is holding the lease. This means there are 2 leader
nodes. There is no risk of a split-brain since the old leader will be
unable to commit new logs. However, if the old leader services any reads,
the values are potentially stale. The default consistency mode relies only
on leader leasing, exposing clients to potentially stale values. We make
this trade-off because reads are fast, usually strongly consistent, and
only stale in a hard-to-trigger situation. The time window of stale reads
is also bounded since the leader will step down due to the partition.
* `stale` - This mode allows any server to service the read regardless of whether
it is the leader. This means reads can be arbitrarily stale but are generally
within 50 milliseconds of the leader. The trade-off is very fast and scalable
reads at the cost of potentially stale values. This mode also allows reads without
a leader, meaning a cluster that is unavailable can still respond. A brief
client-side sketch of both modes follows this list.
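As a hedged illustration, here is how a client written against the Nomad Go API
package (`github.com/hashicorp/nomad/api`) might issue the same read in both modes.
Whether that client and its `AllowStale` option match the version described on this
page is an assumption.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/nomad/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Default mode: the request is forwarded to the leader, which answers
	// from its own state under its leader lease.
	nodes, meta, err := client.Nodes().List(nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("default read:", len(nodes), "nodes, index", meta.LastIndex)

	// Stale mode: any server may answer, trading freshness for lower
	// latency and the ability to respond without a leader.
	nodes, meta, err = client.Nodes().List(&api.QueryOptions{AllowStale: true})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("stale read:", len(nodes), "nodes, last contact", meta.LastContact)
}
```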
## <a name="deployment_table"></a>Deployment Table
Below is a table that shows quorum size and failure tolerance for various
cluster sizes. The recommended deployment is either 3 or 5 servers. A single
server deployment is _**highly**_ discouraged as data loss is inevitable in a
failure scenario.
<table class="table table-bordered table-striped">
<tr>
<th>Servers</th>
<th>Quorum Size</th>
<th>Failure Tolerance</th>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
<td>0</td>
</tr>
<tr class="warning">
<td>3</td>
<td>2</td>
<td>1</td>
</tr>
<tr>
<td>4</td>
<td>3</td>
<td>1</td>
</tr>
<tr class="warning">
<td>5</td>
<td>3</td>
<td>2</td>
</tr>
<tr>
<td>6</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>7</td>
<td>4</td>
<td>3</td>
</tr>
</table>


@ -0,0 +1,34 @@
---
layout: "docs"
page_title: "Gossip Protocol"
sidebar_current: "docs-internals-gossip"
description: |-
Nomad uses a gossip protocol to manage membership. All of this is provided through the use of the Serf library.
---
# Gossip Protocol
Nomad uses a [gossip protocol](http://en.wikipedia.org/wiki/Gossip_protocol)
to manage membership. This is provided through the use of the [Serf library](https://www.serfdom.io/).
The gossip protocol used by Serf is based on
["SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol"](http://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf),
with a few minor adaptations. There are more details about [Serf's protocol here](https://www.serfdom.io/docs/internals/gossip.html).
~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You don't need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.
## Gossip in Nomad
Nomad makes use of a single global WAN gossip pool that all servers participate in.
Membership information provided by the gossip pool allows servers to perform cross region
requests. The integrated failure detection allows Nomad to gracefully handle an entire region
losing connectivity, or just a single server in a remote region. The gossip protocol
is also used to detect servers in the same region to perform automatic clustering
via the [consensus protocol](/docs/internals/consensus.html).
All of these features are provided by leveraging [Serf](https://www.serfdom.io/), which
is used as an embedded library. From a user's perspective, this is not important, since
the abstraction should be masked by Nomad. It can be useful, however, for a developer
to understand how this library is leveraged.
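For developers, a minimal example of embedding Serf directly is sketched below. The
node name, bind port, and join address are placeholders, and this is not how Nomad
itself wires up its gossip pool; it only shows the library primitives (create, join,
list members) that Nomad builds on.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/serf/serf"
)

func main() {
	conf := serf.DefaultConfig()
	conf.NodeName = "server-1.region-a"  // illustrative name
	conf.MemberlistConfig.BindPort = 4648 // Nomad's serf port; placeholder here

	s, err := serf.Create(conf)
	if err != nil {
		log.Fatal(err)
	}
	defer s.Shutdown()

	// Join the gossip pool via any known member; membership and failure
	// information then spreads to all participants.
	if _, err := s.Join([]string{"10.0.0.2:4648"}, false); err != nil {
		log.Printf("join failed: %v", err)
	}

	for _, m := range s.Members() {
		fmt.Printf("%s %s:%d %s\n", m.Name, m.Addr, m.Port, m.Status)
	}
}
```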


@ -1,49 +0,0 @@
---
layout: "docs"
page_title: "High Availability"
sidebar_current: "docs-internals-ha"
description: |-
Learn about the high availability design of Nomad.
---
# High Availability
Nomad is primarily used in production environments to manage secrets.
As a result, any downtime of the Nomad service can affect downstream clients.
Nomad is designed to support a highly available deploy to ensure a machine
or process failure is minimally disruptive.
~> **Advanced Topic!** This page covers technical details
of Nomad. You don't need to understand these details to
effectively use Nomad. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code. However, if you're an
operator of Nomad, we recommend learning about the architecture
due to the importance of Nomad in an environment.
# Design Overview
The primary design goal in making Nomad highly available (HA) was to
minimize downtime, not horizontal scalability. Nomad is typically
bound by the IO limits of the storage backend rather than the compute
requirements. This simplifies the HA approach and allows more complex
coordination to be avoided.
Certain storage backends, such as Consul, provide additional coordination
functions that enable Nomad to run in an HA configuration. When supported
by the backend, Nomad will automatically run in HA mode without additional
configuration.
When running in HA mode, Nomad servers have two additional states they
can be in: standby and active. For multiple Nomad servers sharing a storage
backend, only a single instance will be active at any time while all other
instances are hot standbys.
The active server operates in a standard fashion and processes all requests.
The standby servers do not process requests, and instead redirect to the active
Nomad. Meanwhile, if the active server is sealed, fails, or loses network connectivity
then one of the standbys will take over and become the active instance.
It is important to note that only _unsealed_ servers act as a standby.
If a server is still in the sealed state, then it cannot act as a standby
as it would be unable to serve any requests should the active server fail.


@ -9,7 +9,7 @@ description: |-
# Nomad Internals
This section covers the internals of Nomad and explains the technical
details of how Nomad functions, its architecture and security properties.
details of how Nomad functions, its architecture and sub-systems.
-> **Note:** Knowledge of Nomad internals is not
required to use Nomad. If you aren't interested in the internals


@ -1,58 +0,0 @@
---
layout: "docs"
page_title: "Key Rotation"
sidebar_current: "docs-internals-rotation"
description: |-
Learn about the details of key rotation within Nomad.
---
# Key Rotation
Nomad has multiple encryption keys that are used for various purposes. These keys support
rotation so that they can be changed periodically or in response to a potential leak or
compromise. It is useful to first understand the
[high-level architecture](/docs/internals/architecture.html) before learning about key rotation.
As a review, Nomad starts in a _sealed_ state. Nomad is unsealed by providing the unseal keys.
By default, Nomad uses a technique known as [Shamir's secret sharing algorithm](http://en.wikipedia.org/wiki/Shamir's_Secret_Sharing)
to split the master key into 5 shares, any 3 of which are required to reconstruct the master
key. The master key is used to protect the encryption key, which is ultimately used to protect
data written to the storage backend.
![Keys](/assets/images/keys.png)
To support key rotation, we need to support changing the unseal keys, master key, and the
backend encryption key. We split this into two separate operations, `rekey` and `rotate`.
The `rekey` operation is used to generate a new master key. When this is being done,
it is possible to change the parameters of the key splitting, so that the number of shares
and the threshold required to unseal can be changed. To perform a rekey a threshold of the
current unseal keys must be provided. This is to prevent a single malicious operator from
performing a rekey and invalidating the existing master key.
Performing a rekey is fairly straightforward. The rekey operation must be initialized with
the new parameters for the split and threshold. Once initialized, the current unseal keys
must be provided until the threshold is met. Once met, Nomad will generate the new master
key, perform the splitting, and re-encrypt the encryption key with the new master key.
The new unseal keys are then provided to the operator, and the old unseal keys are no
longer usable.
The `rotate` operation is used to change the encryption key used to protect data written
to the storage backend. This key is never provided or visible to operators, who only
have unseal keys. This simplifies the rotation, as it does not require the current key
holders unlike the `rekey` operation. When `rotate` is triggered, a new encryption key
is generated and added to a keyring. All new values written to the storage backend are
encrypted with the new key. Old values written with previous encryption keys can still
be decrypted since older keys are saved in the keyring. This allows key rotation to be
done online, without an expensive re-encryption process.
Both the `rekey` and `rotate` operations can be done online and in a highly available
configuration. Only the active Nomad instance can perform either of the operations
but standby instances can still assume an active role after either operation. This is
done by providing an online upgrade path for standby instances. If the current encryption
key is `N` and a rotation installs `N+1`, Nomad creates a special "upgrade" key, which
provides the `N+1` encryption key protected by the `N` key. This upgrade key is only available
for a few minutes, enabling standby instances to do a periodic check for upgrades.
This allows standby instances to update their keys and stay in-sync with the active Nomad
without requiring operators to perform another unseal.


@ -1,148 +0,0 @@
---
layout: "docs"
page_title: "Security Model"
sidebar_current: "docs-internals-security"
description: |-
Learn about the security model of Nomad.
---
# Security Model
Due to the nature of Nomad and the confidentiality of data it is managing,
the Nomad security model is very critical. The overall goal of Nomad's security
model is to provide [confidentiality, integrity, availability, accountability,
authentication](http://en.wikipedia.org/wiki/Information_security).
This means that data at rest and in transit must be secure from eavesdropping
or tampering. Clients must be appropriately authenticated and authorized
to access data or modify policy. All interactions must be auditable and traced
uniquely back to the origin entity. The system must be robust against intentional
attempts to bypass any of its access controls.
# Threat Model
The following are the various parts of the Nomad threat model:
* Eavesdropping on any Nomad communication. Client communication with Nomad
should be secure from eavesdropping as well as communication from Nomad to
its storage backend.
* Tampering with data at rest or in transit. Any tampering should be detectable
and cause Nomad to abort processing of the transaction.
* Access to data or controls without authentication or authorization. All requests
must be subject to the applicable security policies.
* Access to data or controls without accountability. If audit logging
is enabled, requests and responses must be logged before the client receives
any secret material.
* Confidentiality of stored secrets. Any data that leaves Nomad to rest in the
storage backend must be safe from eavesdropping. In practice, this means all
data at rest must be encrypted.
* Availability of secret material in the face of failure. Nomad supports
running in a highly available configuration to avoid loss of availability.
The following are not parts of the Nomad threat model:
* Protecting against arbitrary control of the storage backend. An attacker
that can perform arbitrary operations against the storage backend can
undermine security in any number of ways that are difficult or impossible to protect
against. As an example, an attacker could delete or corrupt all the contents
of the storage backend causing total data loss for Nomad. The ability to control
reads would allow an attacker to snapshot in a well-known state and rollback state
changes if that would be beneficial to them.
* Protecting against the leakage of the existence of secret material. An attacker
that can read from the storage backend may observe that secret material exists
and is stored, even if it is kept confidential.
* Protecting against memory analysis of a running Nomad. If an attacker is able
to inspect the memory state of a running Nomad instance then the confidentiality
of data may be compromised.
# External Threat Overview
Given the architecture of Nomad, there are 3 distinct systems we are concerned with.
There is the client, which speaks to Nomad over an API. There is Nomad, or more
accurately the server, which provides an API and serves requests. Lastly, there is
the storage backend, which the server uses to read and write data.
There is no mutual trust between the Nomad client and server. Clients use
[TLS](http://en.wikipedia.org/wiki/Transport_Layer_Security) to verify the identity
of the server and to establish a secure communication channel. Servers require that
a client provides a client token for every request which is used to identify the client.
A client that does not provide their token is only permitted to make login requests.
The storage backends used by Nomad are also untrusted by design. Nomad uses a security
barrier for all requests made to the backend. The security barrier automatically encrypts
all data leaving Nomad using the [Advanced Encryption Standard (AES)](http://en.wikipedia.org/wiki/Advanced_Encryption_Standard)
cipher in the [Galois Counter Mode (GCM)](http://en.wikipedia.org/wiki/Galois/Counter_Mode).
The nonce is randomly generated for every encrypted object. When data is read from the
security barrier the GCM authentication tag is verified prior to decryption to detect
any tampering.
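As an illustration of that scheme, the sketch below seals and opens values with AES-GCM
using Go's standard library, prepending a random nonce to each encrypted object. It is
not the actual barrier format, only the general pattern described above.

```go
package barrier

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"io"
)

// seal encrypts a value before it is written to the storage backend and
// prepends the randomly generated nonce to the ciphertext.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // key must be 16, 24, or 32 bytes
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Seal appends the GCM authentication tag to the ciphertext.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open verifies the authentication tag and decrypts; tampering with the
// stored bytes causes an error rather than silently returning bad data.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed value too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}
```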
Depending on the backend used, Nomad may communicate with the backend over TLS
to provide an added layer of security. In some cases, such as a file backend this
is not applicable. Because storage backends are untrusted, an eavesdropper would
only gain access to encrypted data even if communication with the backend was intercepted.
# Internal Threat Overview
Within the Nomad system, a critical security concern is an attacker attempting
to gain access to secret material they are not authorized to access. This is an internal
threat if the attacker is already permitted some level of access to Nomad and is
able to authenticate.
When a client first authenticates with Nomad, a credential backend is used to
verify the identity of the client and to return a list of associated ACL policies.
This association is configured by operators of Nomad ahead of time. For example,
GitHub users in the "engineering" team may be mapped to the "engineering" and "ops"
Nomad policies. Nomad then generates a client token which is a randomly generated
UUID and maps it to the policy list. This client token is then returned to the client.
On each request a client provides this token. Nomad then uses it to check that the token
is valid and has not been revoked or expired, and generates an ACL based on the associated
policies. Nomad uses a strict default deny or whitelist enforcement. This means unless
an associated policy allows for a given action, it will be denied. Each policy specifies
a level of access granted to a path in Nomad. When the policies are merged (if multiple
policies are associated with a client), the highest access level permitted is used.
For example, if the "engineering" policy permits read/write access to the "eng/" path,
and the "ops" policy permits read access to the "ops/" path, then the user gets the
union of those. Policy is matched using the most specific defined policy, which may be
an exact match or the longest-prefix match glob pattern.
Certain operations are only permitted by "root" users, which is a distinguished
policy built into Nomad. This is similar to the concept of a root user on a Unix system
or an Administrator on Windows. Although clients could be provided with root tokens
or associated with the root policy, instead Nomad supports the notion of "sudo" privilege.
As part of a policy, users may be granted "sudo" privileges to certain paths, so that
they can still perform security sensitive operations without being granted global
root access to Nomad.
Lastly, Nomad supports using a [Two-man rule](http://en.wikipedia.org/wiki/Two-man_rule) for
unsealing using [Shamir's Secret Sharing technique](http://en.wikipedia.org/wiki/Shamir's_Secret_Sharing).
When Nomad is started, it starts in a _sealed_ state. This means that the encryption key
needed to read and write from the storage backend is not yet known. The process of unsealing
requires providing the master key so that the encryption key can be retrieved. The risk of distributing
the master key is that a single malicious actor with access to it can decrypt everything
stored in Nomad. Instead, Shamir's technique allows us to split the master key into multiple shares or parts.
The number of shares and the threshold needed is configurable, but by default Nomad generates
5 shares, any 3 of which must be provided to reconstruct the master key.
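The sketch below splits and recombines a key using the `shamir` package from HashiCorp's
Vault repository; the import path and exact signatures are assumptions, and the key
material is a placeholder.

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/hashicorp/vault/shamir"
)

func main() {
	masterKey := []byte("an example 32-byte master key!!!") // illustrative only

	// Split into 5 shares, any 3 of which reconstruct the secret.
	shares, err := shamir.Split(masterKey, 5, 3)
	if err != nil {
		log.Fatal(err)
	}

	// Any 3 shares are sufficient; fewer than the threshold yield nothing useful.
	recovered, err := shamir.Combine(shares[:3])
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("recovered:", bytes.Equal(recovered, masterKey))
}
```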
By using a secret sharing technique, we avoid the need to place absolute trust in the holder
of the master key, and avoid storing the master key at all. The master key is only
retrievable by reconstructing the shares. The shares are not useful for making any requests
to Nomad, and can only be used for unsealing. Once unsealed the standard ACL mechanisms
are used for all requests.
To make an analogy, a bank puts security deposit boxes inside of a vault.
Each security deposit box has a key, while the vault door has both a combination and a key.
The vault is encased in steel and concrete so that the door is the only practical entrance.
The analogy to Nomad is that the cryptosystem is the steel and concrete protecting the data.
While you could tunnel through the concrete or brute force the encryption keys, it would be
prohibitively time consuming. Opening the bank vault requires two factors: the key and the combination.
Similarly, Nomad requires multiple shares be provided to reconstruct the master key.
Once unsealed, each security deposit box still requires the owner to provide a key, and similarly
the Nomad ACL system protects all the secrets stored.


@ -25,24 +25,39 @@ as well as statsd based on providing the appropriate configuration options.
Below is sample output of a telemetry dump:
```text
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.num_goroutines': 12.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.free_count': 11882.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.total_gc_runs': 9.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.expire.num_leases': 1.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.alloc_bytes': 502992.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.sys_bytes': 3999992.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.malloc_count': 17315.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.heap_objects': 5433.000
[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.total_gc_pause_ns': 3794124.000
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.audit.log_response': Count: 2 Min: 0.001 Mean: 0.001 Max: 0.001 Stddev: 0.000 Sum: 0.002
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.route.read.secret-': Count: 1 Sum: 0.036
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.barrier.get': Count: 3 Min: 0.004 Mean: 0.021 Max: 0.050 Stddev: 0.025 Sum: 0.064
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.token.lookup': Count: 2 Min: 0.040 Mean: 0.074 Max: 0.108 Stddev: 0.048 Sum: 0.148
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.policy.get_policy': Count: 2 Min: 0.003 Mean: 0.004 Max: 0.005 Stddev: 0.001 Sum: 0.009
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.core.check_token': Count: 2 Min: 0.053 Mean: 0.087 Max: 0.121 Stddev: 0.048 Sum: 0.174
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.audit.log_request': Count: 2 Min: 0.001 Mean: 0.001 Max: 0.001 Stddev: 0.000 Sum: 0.002
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.barrier.put': Count: 3 Min: 0.004 Mean: 0.010 Max: 0.019 Stddev: 0.008 Sum: 0.029
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.route.write.secret-': Count: 1 Sum: 0.035
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.core.handle_request': Count: 2 Min: 0.097 Mean: 0.228 Max: 0.359 Stddev: 0.186 Sum: 0.457
[2015-04-20 12:24:30 -0700 PDT][S] 'vault.expire.register': Count: 1 Sum: 0.18
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_blocked': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.plan.queue_depth': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.malloc_count': 7568.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_runs': 8.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_ready': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.num_goroutines': 56.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.sys_bytes': 3999992.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.heap_objects': 4135.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.heartbeat.active': 1.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_unacked': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_waiting': 0.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.alloc_bytes': 634056.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.free_count': 3433.000
[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_pause_ns': 6572135.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.memberlist.msg.alive': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.serf.member.join': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.barrier': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.apply': Count: 1 Sum: 1.000
[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.nomad.rpc.query': Count: 2 Sum: 2.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Query': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.fsm.register_node': Count: 1 Sum: 1.296
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Intent': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.runtime.gc_pause_ns': Count: 8 Min: 126492.000 Mean: 821516.875 Max: 3126670.000 Stddev: 1139250.294 Sum: 6572135.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.leader.dispatchLog': Count: 3 Min: 0.007 Mean: 0.018 Max: 0.039 Stddev: 0.018 Sum: 0.054
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcileMember': Count: 1 Sum: 0.007
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcile': Count: 1 Sum: 0.025
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.fsm.apply': Count: 1 Sum: 1.306
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.get_allocs': Count: 1 Sum: 0.110
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.worker.dequeue_eval': Count: 29 Min: 0.003 Mean: 363.426 Max: 503.377 Stddev: 228.126 Sum: 10539.354
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Event': Count: 6 Sum: 0.000
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.commitTime': Count: 3 Min: 0.013 Mean: 0.037 Max: 0.079 Stddev: 0.037 Sum: 0.110
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.barrier': Count: 1 Sum: 0.071
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.register': Count: 1 Sum: 1.626
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.eval.dequeue': Count: 21 Min: 500.610 Mean: 501.753 Max: 503.361 Stddev: 1.030 Sum: 10536.813
[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.memberlist.gossip': Count: 12 Min: 0.009 Mean: 0.017 Max: 0.025 Stddev: 0.005 Sum: 0.204
```


@ -1,53 +0,0 @@
---
layout: "docs"
page_title: "Token Authentication"
sidebar_current: "docs-internals-token"
description: |-
Learn about the client token authentication in Nomad.
---
# Token Authentication
The `token` authentication backend is built-in and is at the core of
client authentication. Other authentication backends may be used to
authenticate a client, but they eventually result in the generation of a client
token managed by the `token` backend.
Every token has a number of properties:
* ID - The primary ID of a token is a randomly generated UUID
* Display Name - Optionally, a human readable display name
* Metadata - Metadata used for audit logging
* Number of Uses - Optionally, a restricted use count
* Parent ID - Optionally, a parent token which created this child token
* Policies - An associated list of ACL policies
* Source Path - The path at which the token was generated (e.g. `auth/github/login`)
The properties of a token are immutable once created. The exception to this
is the number of uses, which is decremented on each request. Each of these
properties enables Nomad to do a number of interesting things.
Each token maintains the source path, or the login path, that was used
to create the token. This is used to allow source based revocation. For example,
if we believe our GitHub organization was compromised, we may want to revoke
all tokens generated via `auth/github/login`. This would be done by using the
`auth/token/revoke-prefix/` API with the `auth/github/` prefix. Revoking the
prefix will revoke all client tokens generated at that path, as well as all
dynamic secrets generated by those tokens. This provides a powerful "break glass"
procedure during a potential compromise.
If a token is created by another authentication backend, it does not have
a parent token. However, any tokens created by the `auth/token/create` API
have a parent token, namely the token used to make that request. By maintaining
this parent-child relationship, Nomad models token trees. Child tokens can
be created with a subset of the parent policies, allowing for dropping of
privileges. When a token is revoked, the entire sub-tree of tokens is revoked
with it. This allows clients to safely generate child tokens and then revoke
them all along with the root.
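A toy model of that revocation behavior is sketched below; the `Token` fields mirror the
properties listed earlier, but the types and store are illustrative, not the actual
implementation.

```go
package tokens

// Token is an illustrative model of the token properties described above.
type Token struct {
	ID       string
	Parent   string // ID of the token that created this one, if any
	Policies []string
	NumUses  int // 0 means unlimited
}

// Store keeps tokens keyed by ID and an index of children by parent ID.
type Store struct {
	byID     map[string]*Token
	children map[string][]string
}

// Revoke removes a token and, by following the parent-child relationship,
// the entire sub-tree of tokens created from it.
func (s *Store) Revoke(id string) {
	for _, child := range s.children[id] {
		s.Revoke(child)
	}
	delete(s.children, id)
	delete(s.byID, id)
}
```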
Child tokens are very useful, especially when combined with limited use tokens.
When a token is created, its use count can be optionally specified. Providing
a use count of one makes a _one time token_. This means the token can be used
for a single request before being automatically revoked. This can be generalized
to any number of uses. Limited use tokens cannot be used to create sub-tokens,
but they can be a powerful way to allow extremely limited access to Nomad.


@ -15,8 +15,8 @@
<div id="hero-logotype"></div>
</div>
<div id="hero-text">
<h1>Applications on a global fleet.</h1>
<h3>As simple as a Single Machine.</h3>
<h1>Easily deploy applications at any scale</h1>
<h3>Any App. Any OS. Any Cloud.</h3>
<div id="hero-btns">
<a class="h-btn light lrg has-caret intro" href="/intro">Learn More<span class="h-caret"></span></a>
<a class="h-btn green lrg has-caret has-border try" href="">Try Nomad<span class="h-caret"></span></a>
@ -47,10 +47,13 @@
<div id="deploy" class="feature">
<div class="feature-header">
<h3>Deploy to any cloud</h3>
<p>Deploy Applications and Docker containers across datacenters to any cloud</p>
<p>
Nomad supports multi-datacenter and multi-region clusters. Deploy applications that
span multiple geographic locations or cloud providers.
</p>
</div>
<div class="feature-footer">
<p>Phasellus quis arcu nec turpis aliquet malesuada. Pellentesque auctor fermentum cursus.</p>
<p>Applications containerized with Docker can be quickly deployed, making it easy to scale.</p>
<span class="docker-outline-logo"></span>
</div>
</div> <!-- .feature -->
@ -106,7 +109,7 @@
<div id="density" class="feature">
<div class="feature-header">
<h3>Increase density and reduce cost</h3>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque hendrerit nulla ut orci ultricies viverra.</p>
<p>Nomad automatically bin packs tasks to maximize efficiency, increase density, and reduce costs.</p>
</div>
<div class="feature-graphic"></div>
</div> <!-- .feature -->


@ -13,24 +13,16 @@
<a href="/docs/internals/architecture.html">Architecture</a>
</li>
<li<%= sidebar_current("docs-internals-ha") %>>
<a href="/docs/internals/high-availability.html">High Availability</a>
<li<%= sidebar_current("docs-internals-consensus") %>>
<a href="/docs/internals/consensus.html">Consensus Protocol</a>
</li>
<li<%= sidebar_current("docs-internals-security") %>>
<a href="/docs/internals/security.html">Security Model</a>
<li<%= sidebar_current("docs-internals-gossip") %>>
<a href="/docs/internals/gossip.html">Gossip Protocol</a>
</li>
<li<%= sidebar_current("docs-internals-telemetry") %>>
<a href="/docs/internals/telemetry.html">Telemetry</a>
</li>
<li<%= sidebar_current("docs-internals-token") %>>
<a href="/docs/internals/token.html">Token Authentication</a>
</li>
<li<%= sidebar_current("docs-internals-rotation") %>>
<a href="/docs/internals/rotation.html">Key Rotation</a>
</li>
</ul>
</li>