website: working on internals documentation
commit 9a0d50bdc2 (parent 35881dc469)

@@ -126,6 +126,7 @@ ensuring PCI compliant workloads run on appropriate servers.
 # Getting in Depth
 
 This has been a brief high-level overview of the architecture of Nomad. There
-are more details available for each of the sub-systems.
+are more details available for each of the sub-systems. The [consensus protocol](/docs/internals/consensus.html) is
+documented in detail, as is the [gossip protocol](/docs/internals/gossip.html).
 
 For other details, either consult the code, ask in IRC or reach out to the mailing list.

@@ -0,0 +1,205 @@
---
layout: "docs"
page_title: "Consensus Protocol"
sidebar_current: "docs-internals-consensus"
description: |-
  Nomad uses a consensus protocol to provide Consistency as defined by CAP. The consensus protocol is based on "Raft: In Search of an Understandable Consensus Algorithm". For a visual explanation of Raft, see The Secret Lives of Data.
---

# Consensus Protocol

Nomad uses a [consensus protocol](http://en.wikipedia.org/wiki/Consensus_(computer_science))
to provide [Consistency (as defined by CAP)](http://en.wikipedia.org/wiki/CAP_theorem).
The consensus protocol is based on
["Raft: In Search of an Understandable Consensus Algorithm"](https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf).
For a visual explanation of Raft, see [The Secret Lives of Data](http://thesecretlivesofdata.com/raft).

~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You don't need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.

## Raft Protocol Overview

Raft is a consensus algorithm that is based on
[Paxos](http://en.wikipedia.org/wiki/Paxos_%28computer_science%29). Compared
to Paxos, Raft is designed to have fewer states and a simpler, more
understandable algorithm.

There are a few key terms to know when discussing Raft:

* Log - The primary unit of work in a Raft system is a log entry. The problem
of consistency can be decomposed into a *replicated log*. A log is an ordered
sequence of entries. We consider the log consistent if all members agree on
the entries and their order.

* FSM - [Finite State Machine](http://en.wikipedia.org/wiki/Finite-state_machine).
An FSM is a collection of finite states with transitions between them. As new logs
are applied, the FSM is allowed to transition between states. Application of the
same sequence of logs must result in the same state, meaning behavior must be deterministic.

* Peer set - The peer set is the set of all members participating in log replication.
For Nomad's purposes, all server nodes are in the peer set of the local region.

* Quorum - A quorum is a majority of members from a peer set: for a set of size `n`,
quorum requires at least `(n/2)+1` members.
For example, if there are 5 members in the peer set, we would need 3 nodes
to form a quorum. If a quorum of nodes is unavailable for any reason, the
cluster becomes *unavailable* and no new logs can be committed.

* Committed Entry - An entry is considered *committed* when it is durably stored
on a quorum of nodes. Once an entry is committed it can be applied.

* Leader - At any given time, the peer set elects a single node to be the leader.
The leader is responsible for ingesting new log entries, replicating to followers,
and managing when an entry is considered committed.

Raft is a complex protocol and will not be covered here in detail (for those who
desire a more comprehensive treatment, the full specification is available in this
[paper](https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf)).
We will, however, attempt to provide a high level description which may be useful
for building a mental model.

Raft nodes are always in one of three states: follower, candidate, or leader. All
nodes initially start out as followers. In this state, nodes can accept log entries
from a leader and cast votes. If no entries are received for some time, nodes
self-promote to the candidate state. In the candidate state, nodes request votes from
their peers. If a candidate receives a quorum of votes, then it is promoted to a leader.
The leader must accept new log entries and replicate to all the other followers.
In addition, if stale reads are not acceptable, all queries must also be performed on
the leader.
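
As a rough sketch only (this is not Nomad's implementation), the states and
transitions just described can be modeled as:

```go
package raftsketch

// State is one of the three roles a Raft node can hold.
type State int

const (
	Follower  State = iota // accepts entries from a leader and casts votes
	Candidate              // requests votes after an election timeout
	Leader                 // ingests new entries and replicates to followers
)

// Step sketches the transitions described above.
func Step(s State, electionTimeout, wonElection, sawHigherTerm bool) State {
	switch {
	case sawHigherTerm:
		// Seeing a higher term demotes a candidate or leader.
		return Follower
	case s == Follower && electionTimeout:
		// No entries received for some time: self-promote.
		return Candidate
	case s == Candidate && wonElection:
		// A quorum of votes promotes the candidate.
		return Leader
	}
	return s
}
```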

Once a cluster has a leader, it is able to accept new log entries. A client can
request that a leader append a new log entry (from Raft's perspective, a log entry
is an opaque binary blob). The leader then writes the entry to durable storage and
attempts to replicate to a quorum of followers. Once the log entry is considered
*committed*, it can be *applied* to a finite state machine. The finite state machine
is application specific; in Nomad's case, we use
[MemDB](https://github.com/hashicorp/go-memdb) to maintain cluster state.
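
To make the contract concrete, here is a toy FSM written against the `raft.FSM`
interface of the [hashicorp/raft](https://github.com/hashicorp/raft) library; the
counter is a stand-in for real application state, and snapshotting is stubbed
(it is discussed next):

```go
package fsmsketch

import (
	"io"
	"sync/atomic"

	"github.com/hashicorp/raft"
)

// counterFSM is a toy finite state machine: each committed log entry
// (an opaque binary blob as far as Raft is concerned) just bumps a
// counter. Nomad's real FSM instead applies entries to MemDB.
type counterFSM struct {
	applied uint64
}

var _ raft.FSM = (*counterFSM)(nil)

// Apply is invoked for every committed entry and must be deterministic:
// the same sequence of logs must always produce the same state.
func (f *counterFSM) Apply(l *raft.Log) interface{} {
	return atomic.AddUint64(&f.applied, 1)
}

// Snapshot and Restore support log compaction; they are stubbed here
// for brevity (a real FSM returns a point-in-time raft.FSMSnapshot).
func (f *counterFSM) Snapshot() (raft.FSMSnapshot, error) { return nil, nil }
func (f *counterFSM) Restore(r io.ReadCloser) error       { return r.Close() }
```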

Obviously, it would be undesirable to allow a replicated log to grow in an unbounded
fashion. Raft provides a mechanism by which the current state is snapshotted and the
log is compacted. Because of the FSM abstraction, restoring the state of the FSM must
result in the same state as a replay of old logs. This allows Raft to capture the FSM
state at a point in time and then remove all the logs that were used to reach that
state. This is performed automatically without user intervention and prevents unbounded
disk usage while also minimizing time spent replaying logs. One of the advantages of
using MemDB is that it allows Nomad to continue accepting new transactions even while
old state is being snapshotted, preventing any availability issues.
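
The last point is worth a small demonstration using
[go-memdb](https://github.com/hashicorp/go-memdb) directly; the `Job` type and
table layout below are made up for illustration and are not Nomad's actual schema:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/go-memdb"
)

type Job struct {
	ID string
}

func main() {
	// A single "jobs" table, indexed by ID.
	db, err := memdb.NewMemDB(&memdb.DBSchema{
		Tables: map[string]*memdb.TableSchema{
			"jobs": {
				Name: "jobs",
				Indexes: map[string]*memdb.IndexSchema{
					"id": {Name: "id", Unique: true,
						Indexer: &memdb.StringFieldIndex{Field: "ID"}},
				},
			},
		},
	})
	if err != nil {
		panic(err)
	}

	txn := db.Txn(true)
	txn.Insert("jobs", &Job{ID: "web"})
	txn.Commit()

	// A read transaction is an isolated point-in-time view: this is
	// the "snapshot" that can be persisted in the background.
	snap := db.Txn(false)

	// New writes are still accepted while the snapshot is read...
	txn = db.Txn(true)
	txn.Insert("jobs", &Job{ID: "cache"})
	txn.Commit()

	// ...and they do not appear in the earlier snapshot.
	if raw, _ := snap.First("jobs", "id", "cache"); raw == nil {
		fmt.Println("snapshot is unaffected by later writes")
	}
}
```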

Consensus is fault-tolerant up to the point where quorum is available.
If a quorum of nodes is unavailable, it is impossible to process log entries or reason
about peer membership. For example, suppose there are only 2 peers: A and B. The quorum
size is also 2, meaning both nodes must agree to commit a log entry. If either A or B
fails, it is now impossible to reach quorum. This means the cluster is unable to add
or remove a node or to commit any additional log entries. This results in
*unavailability*. At this point, manual intervention would be required to remove
either A or B and to restart the remaining node in bootstrap mode.

A Raft cluster of 3 nodes can tolerate a single node failure while a cluster
of 5 can tolerate 2 node failures. The recommended configuration is to run
either 3 or 5 Nomad servers per region. This maximizes availability without
greatly sacrificing performance. The [deployment table](#deployment_table) below
summarizes the potential cluster size options and the fault tolerance of each.

In terms of performance, Raft is comparable to Paxos. Assuming stable leadership,
committing a log entry requires a single round trip to half of the cluster.
Thus, performance is bound by disk I/O and network latency.

## Raft in Nomad

Only Nomad server nodes participate in Raft and are part of the peer set. All
client nodes forward requests to servers. The clients in Nomad only need to know
about their allocations and query that information from the servers, while the
servers need to maintain the global state of the cluster.

Since all servers participate as part of the peer set, they all know the current
leader. When an RPC request arrives at a non-leader server, the request is
forwarded to the leader. If the RPC is a *query* type, meaning it is read-only,
the leader generates the result based on the current state of the FSM. If
the RPC is a *transaction* type, meaning it modifies state, the leader
generates a new log entry and applies it using Raft. Once the log entry is committed
and applied to the FSM, the transaction is complete.
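
A sketch of that routing logic (every type and method name here is illustrative,
not Nomad's actual internals):

```go
package rpcsketch

import "errors"

// Request and Server are illustrative stand-ins for Nomad's internals.
type Request struct {
	ReadOnly bool // query type if true, transaction type otherwise
	Payload  []byte
}

type Server struct {
	isLeader   bool
	leaderAddr string
}

func (s *Server) HandleRPC(req Request) (interface{}, error) {
	if !s.isLeader {
		// Non-leader servers forward the request to the leader.
		return s.forwardToLeader(req)
	}
	if req.ReadOnly {
		// Query: answered from the current state of the FSM.
		return s.queryFSM(req)
	}
	// Transaction: append a log entry via Raft and wait until it is
	// committed and applied to the FSM before responding.
	return s.applyViaRaft(req)
}

func (s *Server) forwardToLeader(req Request) (interface{}, error) {
	return nil, errors.New("network RPC to " + s.leaderAddr + " omitted in this sketch")
}

func (s *Server) queryFSM(req Request) (interface{}, error)     { return "fsm state", nil }
func (s *Server) applyViaRaft(req Request) (interface{}, error) { return "applied", nil }
```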

Because of the nature of Raft's replication, performance is sensitive to network
latency. For this reason, each region elects an independent leader and maintains
a disjoint peer set. Data is partitioned by region, so each leader is responsible
only for data in its region. When a request is received for a remote region,
the request is forwarded to the correct leader. This design allows for lower latency
transactions and higher availability without sacrificing consistency.

## Consistency Modes

Although all writes to the replicated log go through Raft, reads are more
flexible. To support various trade-offs that developers may want, Nomad
supports 2 different consistency modes for reads.

The two read modes are:

* `default` - Raft makes use of leader leasing, providing a time window
in which the leader assumes its role is stable. However, if a leader
is partitioned from the remaining peers, a new leader may be elected
while the old leader is holding the lease. This means there are 2 leader
nodes. There is no risk of a split-brain since the old leader will be
unable to commit new logs. However, if the old leader services any reads,
the values are potentially stale. The default consistency mode relies only
on leader leasing, exposing clients to potentially stale values. We make
this trade-off because reads are fast, usually strongly consistent, and
only stale in a hard-to-trigger situation. The time window of stale reads
is also bounded since the leader will step down due to the partition.

* `stale` - This mode allows any server to service the read regardless of whether
it is the leader. This means reads can be arbitrarily stale but are generally
within 50 milliseconds of the leader. The trade-off is very fast and scalable
reads but with stale values. This mode allows reads without a leader, meaning
a cluster that is unavailable will still be able to respond.
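
For illustration only (these names are not Nomad's API), the server-side
difference between the two modes reduces to who may answer:

```go
package readsketch

// Mode selects the read consistency trade-off.
type Mode int

const (
	Default Mode = iota // leader-only, backed by leader leasing
	Stale               // any server answers from local state
)

// canServiceRead reports whether this server may answer a read locally.
// Under Default a follower must forward to the leader; under Stale any
// server answers, even if the cluster currently has no leader at all.
func canServiceRead(mode Mode, isLeader bool) bool {
	return mode == Stale || isLeader
}
```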

## <a name="deployment_table"></a>Deployment Table

Below is a table that shows quorum size and failure tolerance for various
cluster sizes. The recommended deployment is either 3 or 5 servers. A single
server deployment is _**highly**_ discouraged as data loss is inevitable in a
failure scenario.

<table class="table table-bordered table-striped">
  <tr>
    <th>Servers</th>
    <th>Quorum Size</th>
    <th>Failure Tolerance</th>
  </tr>
  <tr>
    <td>1</td>
    <td>1</td>
    <td>0</td>
  </tr>
  <tr>
    <td>2</td>
    <td>2</td>
    <td>0</td>
  </tr>
  <tr class="warning">
    <td>3</td>
    <td>2</td>
    <td>1</td>
  </tr>
  <tr>
    <td>4</td>
    <td>3</td>
    <td>1</td>
  </tr>
  <tr class="warning">
    <td>5</td>
    <td>3</td>
    <td>2</td>
  </tr>
  <tr>
    <td>6</td>
    <td>4</td>
    <td>2</td>
  </tr>
  <tr>
    <td>7</td>
    <td>4</td>
    <td>3</td>
  </tr>
</table>
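
The table values follow from the quorum arithmetic described earlier; a quick
sketch that reproduces them:

```go
package main

import "fmt"

// quorum is the majority of a peer set: (n/2)+1 with integer division.
func quorum(n int) int { return n/2 + 1 }

func main() {
	fmt.Println("servers  quorum  failure tolerance")
	for n := 1; n <= 7; n++ {
		// Tolerance is how many servers can fail with a quorum remaining.
		fmt.Printf("%7d %7d %18d\n", n, quorum(n), n-quorum(n))
	}
}
```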

@@ -0,0 +1,34 @@
---
layout: "docs"
page_title: "Gossip Protocol"
sidebar_current: "docs-internals-gossip"
description: |-
  Nomad uses a gossip protocol to manage membership. All of this is provided through the use of the Serf library.
---

# Gossip Protocol

Nomad uses a [gossip protocol](http://en.wikipedia.org/wiki/Gossip_protocol)
to manage membership. This is provided through the use of the [Serf library](https://www.serfdom.io/).
The gossip protocol used by Serf is based on
["SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol"](http://www.cs.cornell.edu/~asdas/research/dsn02-swim.pdf),
with a few minor adaptations. There are more details about [Serf's protocol here](https://www.serfdom.io/docs/internals/gossip.html).

~> **Advanced Topic!** This page covers technical details of
the internals of Nomad. You don't need to know these details to effectively
operate and use Nomad. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.

## Gossip in Nomad

Nomad makes use of a single global WAN gossip pool that all servers participate in.
Membership information provided by the gossip pool allows servers to perform cross-region
requests. The integrated failure detection allows Nomad to gracefully handle an entire region
losing connectivity, or just a single server in a remote region. The gossip protocol
is also used to detect servers in the same region to perform automatic clustering
via the [consensus protocol](/docs/internals/consensus.html).

All of these features are provided by leveraging [Serf](https://www.serfdom.io/), which
is used as an embedded library. From a user perspective, this is not important, since
the abstraction should be masked by Nomad. It can be useful, however, for developers
to understand how this library is leveraged.
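
As a rough sketch of what embedding Serf looks like (the node name, tags, and
join address are invented for illustration):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/serf/serf"
)

func main() {
	eventCh := make(chan serf.Event, 64)

	conf := serf.DefaultConfig()
	conf.NodeName = "server1.region-a"
	conf.Tags = map[string]string{"region": "region-a"}
	conf.EventCh = eventCh

	s, err := serf.Create(conf)
	if err != nil {
		panic(err)
	}
	defer s.Shutdown()

	// Join the global WAN pool through any known server.
	if _, err := s.Join([]string{"10.0.0.10:4648"}, true); err != nil {
		fmt.Println("join failed:", err)
	}

	// Joins, leaves, and failures arrive as membership events, which
	// is what drives failure detection and automatic clustering.
	for e := range eventCh {
		fmt.Println("event:", e.EventType())
	}
}
```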

@@ -1,49 +0,0 @@
---
layout: "docs"
page_title: "High Availability"
sidebar_current: "docs-internals-ha"
description: |-
  Learn about the high availability design of Nomad.
---

# High Availability

Nomad is primarily used in production environments to manage secrets.
As a result, any downtime of the Nomad service can affect downstream clients.
Nomad is designed to support a highly available deploy to ensure a machine
or process failure is minimally disruptive.

~> **Advanced Topic!** This page covers technical details
of Nomad. You don't need to understand these details to
effectively use Nomad. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code. However, if you're an
operator of Nomad, we recommend learning about the architecture
due to the importance of Nomad in an environment.

# Design Overview

The primary design goal in making Nomad highly available (HA) was to
minimize downtime, not to add horizontal scalability. Nomad is typically
bound by the IO limits of the storage backend rather than the compute
requirements. This simplifies the HA approach and allows more complex
coordination to be avoided.

Certain storage backends, such as Consul, provide additional coordination
functions that enable Nomad to run in an HA configuration. When supported
by the backend, Nomad will automatically run in HA mode without additional
configuration.

When running in HA mode, Nomad servers have two additional states they
can be in: standby and active. For multiple Nomad servers sharing a storage
backend, only a single instance will be active at any time while all other
instances are hot standbys.

The active server operates in a standard fashion and processes all requests.
The standby servers do not process requests, and instead redirect to the active
Nomad. Meanwhile, if the active server is sealed, fails, or loses network connectivity,
then one of the standbys will take over and become the active instance.

It is important to note that only _unsealed_ servers act as a standby.
If a server is still in the sealed state, then it cannot act as a standby
as it would be unable to serve any requests should the active server fail.

@@ -9,7 +9,7 @@ description: |-
 # Nomad Internals
 
 This section covers the internals of Nomad and explains the technical
-details of how Nomad functions, its architecture and security properties.
+details of how Nomad functions, its architecture and sub-systems.
 
 -> **Note:** Knowledge of Nomad internals is not
 required to use Nomad. If you aren't interested in the internals

@@ -1,58 +0,0 @@
---
layout: "docs"
page_title: "Key Rotation"
sidebar_current: "docs-internals-rotation"
description: |-
  Learn about the details of key rotation within Nomad.
---

# Key Rotation

Nomad has multiple encryption keys that are used for various purposes. These keys support
rotation so that they can be changed periodically or in response to a potential leak or
compromise. It is useful to first understand the
[high-level architecture](/docs/internals/architecture.html) before learning about key rotation.

As a review, Nomad starts in a _sealed_ state. Nomad is unsealed by providing the unseal keys.
By default, Nomad uses a technique known as [Shamir's secret sharing algorithm](http://en.wikipedia.org/wiki/Shamir's_Secret_Sharing)
to split the master key into 5 shares, any 3 of which are required to reconstruct the master
key. The master key is used to protect the encryption key, which is ultimately used to protect
data written to the storage backend.

![Keys](/assets/images/keys.png)

To support key rotation, we need to support changing the unseal keys, master key, and the
backend encryption key. We split this into two separate operations, `rekey` and `rotate`.

The `rekey` operation is used to generate a new master key. When this is being done,
it is possible to change the parameters of the key splitting, so that the number of shares
and the threshold required to unseal can be changed. To perform a rekey, a threshold of the
current unseal keys must be provided. This is to prevent a single malicious operator from
performing a rekey and invalidating the existing master key.

Performing a rekey is fairly straightforward. The rekey operation must be initialized with
the new parameters for the split and threshold. Once initialized, the current unseal keys
must be provided until the threshold is met. Once met, Nomad will generate the new master
key, perform the splitting, and re-encrypt the encryption key with the new master key.
The new unseal keys are then provided to the operator, and the old unseal keys are no
longer usable.

The `rotate` operation is used to change the encryption key used to protect data written
to the storage backend. This key is never provided or visible to operators, who only
have unseal keys. This simplifies the rotation, as it does not require the current key
holders, unlike the `rekey` operation. When `rotate` is triggered, a new encryption key
is generated and added to a keyring. All new values written to the storage backend are
encrypted with the new key. Old values written with previous encryption keys can still
be decrypted since older keys are saved in the keyring. This allows key rotation to be
done online, without an expensive re-encryption process.

Both the `rekey` and `rotate` operations can be done online and in a highly available
configuration. Only the active Nomad instance can perform either of the operations,
but standby instances can still assume an active role after either operation. This is
done by providing an online upgrade path for standby instances. If the current encryption
key is `N` and a rotation installs `N+1`, Nomad creates a special "upgrade" key, which
provides the `N+1` encryption key protected by the `N` key. This upgrade key is only available
for a few minutes, enabling standby instances to do a periodic check for upgrades.
This allows standby instances to update their keys and stay in-sync with the active Nomad
without requiring operators to perform another unseal.

@@ -1,148 +0,0 @@
---
layout: "docs"
page_title: "Security Model"
sidebar_current: "docs-internals-security"
description: |-
  Learn about the security model of Nomad.
---

# Security Model

Due to the nature of Nomad and the confidentiality of data it is managing,
the Nomad security model is very critical. The overall goal of Nomad's security
model is to provide [confidentiality, integrity, availability, accountability,
authentication](http://en.wikipedia.org/wiki/Information_security).

This means that data at rest and in transit must be secure from eavesdropping
or tampering. Clients must be appropriately authenticated and authorized
to access data or modify policy. All interactions must be auditable and traced
uniquely back to the origin entity. The system must be robust against intentional
attempts to bypass any of its access controls.

# Threat Model

The following are the various parts of the Nomad threat model:

* Eavesdropping on any Nomad communication. Client communication with Nomad
should be secure from eavesdropping as well as communication from Nomad to
its storage backend.

* Tampering with data at rest or in transit. Any tampering should be detectable
and cause Nomad to abort processing of the transaction.

* Access to data or controls without authentication or authorization. All requests
must be preceded by the applicable security policies.

* Access to data or controls without accountability. If audit logging
is enabled, requests and responses must be logged before the client receives
any secret material.

* Confidentiality of stored secrets. Any data that leaves Nomad to rest in the
storage backend must be safe from eavesdropping. In practice, this means all
data at rest must be encrypted.

* Availability of secret material in the face of failure. Nomad supports
running in a highly available configuration to avoid loss of availability.

The following are not parts of the Nomad threat model:

* Protecting against arbitrary control of the storage backend. An attacker
that can perform arbitrary operations against the storage backend can
undermine security in any number of ways that are difficult or impossible to protect
against. As an example, an attacker could delete or corrupt all the contents
of the storage backend causing total data loss for Nomad. The ability to control
reads would allow an attacker to snapshot in a well-known state and rollback state
changes if that would be beneficial to them.

* Protecting against the leakage of the existence of secret material. An attacker
that can read from the storage backend may observe that secret material exists
and is stored, even if it is kept confidential.

* Protecting against memory analysis of a running Nomad. If an attacker is able
to inspect the memory state of a running Nomad instance then the confidentiality
of data may be compromised.

# External Threat Overview

Given the architecture of Nomad, there are 3 distinct systems we are concerned
with. There is the client, which is speaking to Nomad over an API. There is Nomad
(or, more accurately, the server), which provides an API and serves requests. Lastly,
there is the storage backend, which the server is utilizing to read and write data.

There is no mutual trust between the Nomad client and server. Clients use
[TLS](http://en.wikipedia.org/wiki/Transport_Layer_Security) to verify the identity
of the server and to establish a secure communication channel. Servers require that
a client provides a client token for every request, which is used to identify the client.
A client that does not provide its token is only permitted to make login requests.

The storage backends used by Nomad are also untrusted by design. Nomad uses a security
barrier for all requests made to the backend. The security barrier automatically encrypts
all data leaving Nomad using the [Advanced Encryption Standard (AES)](http://en.wikipedia.org/wiki/Advanced_Encryption_Standard)
cipher in the [Galois Counter Mode (GCM)](http://en.wikipedia.org/wiki/Galois/Counter_Mode).
The nonce is randomly generated for every encrypted object. When data is read from the
security barrier, the GCM authentication tag is verified prior to decryption to detect
any tampering.

Depending on the backend used, Nomad may communicate with the backend over TLS
to provide an added layer of security. In some cases, such as a file backend, this
is not applicable. Because storage backends are untrusted, an eavesdropper would
only gain access to encrypted data even if communication with the backend was intercepted.

# Internal Threat Overview

Within the Nomad system, a critical security concern is an attacker attempting
to gain access to secret material they are not authorized to. This is an internal
threat if the attacker is already permitted some level of access to Nomad and is
able to authenticate.

When a client first authenticates with Nomad, a credential backend is used to
verify the identity of the client and to return a list of associated ACL policies.
This association is configured by operators of Nomad ahead of time. For example,
GitHub users in the "engineering" team may be mapped to the "engineering" and "ops"
Nomad policies. Nomad then generates a client token which is a randomly generated
UUID and maps it to the policy list. This client token is then returned to the client.

On each request a client provides this token. Nomad then uses it to check that the token
is valid and has not been revoked or expired, and generates an ACL based on the associated
policies. Nomad uses a strict default deny or whitelist enforcement. This means unless
an associated policy allows for a given action, it will be denied. Each policy specifies
a level of access granted to a path in Nomad. When the policies are merged (if multiple
policies are associated with a client), the highest access level permitted is used.
For example, if the "engineering" policy permits read/write access to the "eng/" path,
and the "ops" policy permits read access to the "ops/" path, then the user gets the
union of those. Policy is matched using the most specific defined policy, which may be
an exact match or the longest-prefix match glob pattern.

Certain operations are only permitted by "root" users, which is a distinguished
policy built into Nomad. This is similar to the concept of a root user on a Unix system
or an Administrator on Windows. Although clients could be provided with root tokens
or associated with the root policy, instead Nomad supports the notion of "sudo" privilege.
As part of a policy, users may be granted "sudo" privileges to certain paths, so that
they can still perform security sensitive operations without being granted global
root access to Nomad.

Lastly, Nomad supports using a [Two-man rule](http://en.wikipedia.org/wiki/Two-man_rule) for
unsealing using [Shamir's Secret Sharing technique](http://en.wikipedia.org/wiki/Shamir's_Secret_Sharing).
When Nomad is started, it starts in a _sealed_ state. This means that the encryption key
needed to read and write from the storage backend is not yet known. The process of unsealing
requires providing the master key so that the encryption key can be retrieved. The risk of distributing
the master key is that a single malicious actor with access to it can decrypt the entire
Nomad. Instead, Shamir's technique allows us to split the master key into multiple shares or parts.
The number of shares and the threshold needed is configurable, but by default Nomad generates
5 shares, any 3 of which must be provided to reconstruct the master key.

By using a secret sharing technique, we avoid the need to place absolute trust in the holder
of the master key, and avoid storing the master key at all. The master key is only
retrievable by reconstructing the shares. The shares are not useful for making any requests
to Nomad, and can only be used for unsealing. Once unsealed, the standard ACL mechanisms
are used for all requests.

To make an analogy, a bank puts security deposit boxes inside of a vault.
Each security deposit box has a key, while the vault door has both a combination and a key.
The vault is encased in steel and concrete so that the door is the only practical entrance.
The analogy to Nomad is that the cryptosystem is the steel and concrete protecting the data.
While you could tunnel through the concrete or brute force the encryption keys, it would be
prohibitively time consuming. Opening the bank vault requires two factors: the key and the combination.
Similarly, Nomad requires multiple shares be provided to reconstruct the master key.
Once unsealed, each security deposit box still requires the owner to provide a key, and similarly
the Nomad ACL system protects all the secrets stored.

@@ -25,24 +25,39 @@ as well as statsd based on providing the appropriate configuration options.
 Below is sample output of a telemetry dump:
 
 ```text
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.num_goroutines': 12.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.free_count': 11882.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.total_gc_runs': 9.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.expire.num_leases': 1.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.alloc_bytes': 502992.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.sys_bytes': 3999992.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.malloc_count': 17315.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.heap_objects': 5433.000
-[2015-04-20 12:24:30 -0700 PDT][G] 'vault.runtime.total_gc_pause_ns': 3794124.000
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.audit.log_response': Count: 2 Min: 0.001 Mean: 0.001 Max: 0.001 Stddev: 0.000 Sum: 0.002
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.route.read.secret-': Count: 1 Sum: 0.036
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.barrier.get': Count: 3 Min: 0.004 Mean: 0.021 Max: 0.050 Stddev: 0.025 Sum: 0.064
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.token.lookup': Count: 2 Min: 0.040 Mean: 0.074 Max: 0.108 Stddev: 0.048 Sum: 0.148
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.policy.get_policy': Count: 2 Min: 0.003 Mean: 0.004 Max: 0.005 Stddev: 0.001 Sum: 0.009
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.core.check_token': Count: 2 Min: 0.053 Mean: 0.087 Max: 0.121 Stddev: 0.048 Sum: 0.174
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.audit.log_request': Count: 2 Min: 0.001 Mean: 0.001 Max: 0.001 Stddev: 0.000 Sum: 0.002
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.barrier.put': Count: 3 Min: 0.004 Mean: 0.010 Max: 0.019 Stddev: 0.008 Sum: 0.029
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.route.write.secret-': Count: 1 Sum: 0.035
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.core.handle_request': Count: 2 Min: 0.097 Mean: 0.228 Max: 0.359 Stddev: 0.186 Sum: 0.457
-[2015-04-20 12:24:30 -0700 PDT][S] 'vault.expire.register': Count: 1 Sum: 0.18
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_blocked': 0.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.plan.queue_depth': 0.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.malloc_count': 7568.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_runs': 8.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_ready': 0.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.num_goroutines': 56.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.sys_bytes': 3999992.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.heap_objects': 4135.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.heartbeat.active': 1.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_unacked': 0.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.nomad.broker.total_waiting': 0.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.alloc_bytes': 634056.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.free_count': 3433.000
+[2015-09-17 16:59:40 -0700 PDT][G] 'nomad.runtime.total_gc_pause_ns': 6572135.000
+[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.memberlist.msg.alive': Count: 1 Sum: 1.000
+[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.serf.member.join': Count: 1 Sum: 1.000
+[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.barrier': Count: 1 Sum: 1.000
+[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.raft.apply': Count: 1 Sum: 1.000
+[2015-09-17 16:59:40 -0700 PDT][C] 'nomad.nomad.rpc.query': Count: 2 Sum: 2.000
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Query': Count: 6 Sum: 0.000
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.fsm.register_node': Count: 1 Sum: 1.296
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Intent': Count: 6 Sum: 0.000
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.runtime.gc_pause_ns': Count: 8 Min: 126492.000 Mean: 821516.875 Max: 3126670.000 Stddev: 1139250.294 Sum: 6572135.000
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.leader.dispatchLog': Count: 3 Min: 0.007 Mean: 0.018 Max: 0.039 Stddev: 0.018 Sum: 0.054
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcileMember': Count: 1 Sum: 0.007
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.reconcile': Count: 1 Sum: 0.025
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.fsm.apply': Count: 1 Sum: 1.306
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.get_allocs': Count: 1 Sum: 0.110
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.worker.dequeue_eval': Count: 29 Min: 0.003 Mean: 363.426 Max: 503.377 Stddev: 228.126 Sum: 10539.354
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.serf.queue.Event': Count: 6 Sum: 0.000
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.raft.commitTime': Count: 3 Min: 0.013 Mean: 0.037 Max: 0.079 Stddev: 0.037 Sum: 0.110
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.leader.barrier': Count: 1 Sum: 0.071
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.client.register': Count: 1 Sum: 1.626
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.nomad.eval.dequeue': Count: 21 Min: 500.610 Mean: 501.753 Max: 503.361 Stddev: 1.030 Sum: 10536.813
+[2015-09-17 16:59:40 -0700 PDT][S] 'nomad.memberlist.gossip': Count: 12 Min: 0.009 Mean: 0.017 Max: 0.025 Stddev: 0.005 Sum: 0.204
 ```

@@ -1,53 +0,0 @@
---
layout: "docs"
page_title: "Token Authentication"
sidebar_current: "docs-internals-token"
description: |-
  Learn about the client token authentication in Nomad.
---

# Token Authentication

The `token` authentication backend is built-in and is at the core of
client authentication. Other authentication backends may be used to
authenticate a client, but they eventually result in the generation of a client
token managed by the `token` backend.

Every token has a number of properties:

* ID - The primary ID of a token is a randomly generated UUID
* Display Name - Optionally, a human-readable display name
* Metadata - Metadata used for audit logging
* Number of Uses - Optionally, a restricted use count
* Parent ID - Optionally, a parent token which created this child token
* Policies - An associated list of ACL policies
* Source Path - The path at which the token was generated (e.g. `auth/github/login`)

The properties of a token are immutable once created. The exception to this
is the number of uses, which is decremented on each request. Each of these
properties enables Nomad to do a number of interesting things.

Each token maintains the source path, or the login path, that was used
to create the token. This is used to allow source-based revocation. For example,
if we believe our GitHub organization was compromised, we may want to revoke
all tokens generated via `auth/github/login`. This would be done by using the
`auth/token/revoke-prefix/` API with the `auth/github/` prefix. Revoking the
prefix will revoke all client tokens generated at that path, as well as all
dynamic secrets generated by those tokens. This provides a powerful "break glass"
procedure during a potential compromise.

If a token is created by another authentication backend, it does not have
a parent token. However, any tokens created by the `auth/token/create` API
have a parent token, namely the token used to make that request. By maintaining
this parent-child relationship, Nomad models token trees. Child tokens can
be created with a subset of the parent policies, allowing for dropping of
privileges. When a token is revoked, the entire sub-tree of tokens is revoked
with it. This allows clients to safely generate child tokens and then revoke
them all along with the root.

Child tokens are very useful, especially when combined with limited use tokens.
When a token is created, its use count can be optionally specified. Providing
a use count of one makes a _one time token_. This means the token can be used
for a single request before being automatically revoked. This can be generalized
to any number of uses. Limited use tokens cannot be used to create sub-tokens,
but they can be a powerful way to allow extremely limited access to Nomad.

@@ -15,8 +15,8 @@
     <div id="hero-logotype"></div>
   </div>
   <div id="hero-text">
-    <h1>Applications on a global fleet.</h1>
-    <h3>As simple as a Single Machine.</h3>
+    <h1>Easily deploy applications at any scale</h1>
+    <h3>Any App. Any OS. Any Cloud.</h3>
     <div id="hero-btns">
       <a class="h-btn light lrg has-caret intro" href="/intro">Learn More<span class="h-caret"></span></a>
       <a class="h-btn green lrg has-caret has-border try" href="">Try Nomad<span class="h-caret"></span></a>

@@ -47,10 +47,13 @@
       <div id="deploy" class="feature">
         <div class="feature-header">
           <h3>Deploy to any cloud</h3>
-          <p>Deploy Applications and Docker containers across datacenters to any cloud</p>
+          <p>
+            Nomad supports multi-datacenter and multi-region clusters. Deploy applications that
+            span multiple geographic locations or cloud providers.
+          </p>
         </div>
         <div class="feature-footer">
-          <p>Phasellus quis arcu nec turpis aliquet malesuada. Pellentesque auctor fermentum cursus.</p>
+          <p>Applications containerized with Docker can be quickly deployed, making it easy to scale.</p>
           <span class="docker-outline-logo"></span>
         </div>
       </div> <!-- .feature -->

@@ -106,7 +109,7 @@
       <div id="density" class="feature">
         <div class="feature-header">
           <h3>Increase density and reduce cost</h3>
-          <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque hendrerit nulla ut orci ultricies viverra.</p>
+          <p>Nomad automatically bin packs tasks to maximize efficiency, increase density and reduce costs.</p>
         </div>
         <div class="feature-graphic"></div>
       </div> <!-- .feature -->

@@ -13,25 +13,17 @@
       <a href="/docs/internals/architecture.html">Architecture</a>
     </li>
 
-    <li<%= sidebar_current("docs-internals-ha") %>>
-      <a href="/docs/internals/high-availability.html">High Availability</a>
-    </li>
+    <li<%= sidebar_current("docs-internals-consensus") %>>
+      <a href="/docs/internals/consensus.html">Consensus Protocol</a>
+    </li>
 
-    <li<%= sidebar_current("docs-internals-security") %>>
-      <a href="/docs/internals/security.html">Security Model</a>
+    <li<%= sidebar_current("docs-internals-gossip") %>>
+      <a href="/docs/internals/gossip.html">Gossip Protocol</a>
     </li>
 
     <li<%= sidebar_current("docs-internals-telemetry") %>>
       <a href="/docs/internals/telemetry.html">Telemetry</a>
     </li>
 
-    <li<%= sidebar_current("docs-internals-token") %>>
-      <a href="/docs/internals/token.html">Token Authentication</a>
-    </li>
-
-    <li<%= sidebar_current("docs-internals-rotation") %>>
-      <a href="/docs/internals/rotation.html">Key Rotation</a>
-    </li>
-
     </ul>
   </li>