Remove old refs

Seth Vargo 2016-10-09 14:58:31 +08:00
parent 5dc729a6fc
commit 1d783513bd
3 changed files with 0 additions and 344 deletions


@@ -1,75 +0,0 @@
---
layout: "docs"
page_title: "Resource Utilization - Operating a Job"
sidebar_current: "docs-operating-a-job-resource-utilization"
description: |-
  Nomad supports reporting detailed job statistics and resource utilization
  metrics for most task drivers. This section describes the ways to inspect a
  job's resource consumption and utilization.
---
# Determining Resource Utilization
Understanding the resource utilization of your application is important for many
reasons, and Nomad supports reporting detailed statistics in many of its drivers.
The main interface for viewing resource utilization is the [`alloc-status`
command](/docs/commands/alloc-status.html) with the `-stats` flag.

In the example below, we are running `redis` and can see its resource
utilization:
```text
$ nomad alloc-status -stats c3e0
ID            = c3e0e3e0
Eval ID       = 617e5e39
Name          = example.cache[0]
Node ID       = 39acd6e0
Job ID        = example
Client Status = running
Created At    = 06/28/16 16:42:42 UTC

Task "redis" is "running"
Task Resources
CPU       Memory          Disk     IOPS  Addresses
957/1000  30 MiB/256 MiB  300 MiB  0     db: 127.0.0.1:34907

Memory Stats
Cache   Max Usage  RSS     Swap
32 KiB  79 MiB     30 MiB  0 B

CPU Stats
Percent  Throttled Periods  Throttled Time
73.66%   0                  0

Recent Events:
Time                   Type      Description
06/28/16 16:43:50 UTC  Started   Task started by client
06/28/16 16:42:42 UTC  Received  Task received by client
```
Here we can see that we are near the limit of our configured CPU, but we have
plenty of memory headroom. We can use this information to alter our job's
resources to better reflect its actual needs:
```hcl
resources {
  cpu    = 2000
  memory = 100
}
```
Adjusting resources is very important for a variety of reasons:
* Ensuring your application does not get OOM killed if it hits its memory limit.
* Ensuring the application performs well by giving it an adequate CPU allowance.
* Optimizing cluster density by reserving what you need and not over-allocating.
While point-in-time resource usage measurements are useful, it is often more
valuable to graph resource usage over time to better understand and estimate
usage. Nomad supports streaming resource data to statsite and statsd, which is
the recommended way of monitoring resources. For more information about
outputting telemetry, see the [Telemetry documentation](/docs/agent/telemetry.html).
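As a minimal sketch, assuming a statsd collector is already listening on
`127.0.0.1:8125` (the address is illustrative), the Nomad agent can be
configured to stream metrics with a `telemetry` block:

```hcl
# Nomad agent configuration (client and/or server)
telemetry {
  # Address of the statsd collector; assumed to be running locally.
  statsd_address = "127.0.0.1:8125"

  # Alternatively (or additionally), stream metrics to statsite.
  # statsite_address = "statsite.example.com:8125"
}
```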
For more advanced use cases, the resource usage data may also be accessed via
the client's HTTP API. See the documentation of the client's
[Allocation HTTP API](/docs/http/client-allocation-stats.html).
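As a rough sketch, assuming a Nomad client agent listening on its default HTTP
port (4646), the stats endpoint can be queried directly. Note that the full
allocation ID is required, not the truncated ID shown by `alloc-status`:

```shell
# Query resource usage statistics for a single allocation
$ curl http://127.0.0.1:4646/v1/client/allocation/<full-alloc-id>/stats
```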


@@ -1,93 +0,0 @@
---
layout: "docs"
page_title: "Submitting Jobs - Operating a Job"
sidebar_current: "docs-operating-a-job-submitting"
description: |-
  The job file is the unit of work in Nomad. Upon authoring, the job file is
  submitted to the server for evaluation and scheduling. This section discusses
  some techniques for submitting jobs.
---
# Submitting Jobs
In Nomad, the description of the job and all its requirements are maintained in
a single file called the "job file". This job file resides locally on disk and
it is highly recommended that you check job files into source control.
The general flow for submitting a job in Nomad is:
1. Author a job file according to the job specification
1. Plan and review changes with a Nomad server
1. Submit the job file to a Nomad server
1. (Optional) Review job status and logs
Here is a very basic example to get you started.
## Author a Job File
Authoring a job file is very easy. For more detailed information, please see the
[job specification](/docs/jobspec/index.html). Here is a sample job file which
runs a small web server in a Docker container.
```hcl
job "docs" {
datacenters = ["dc1"]
group "example" {
task "server" {
driver = "docker"
config {
image = "hashicorp/http-echo"
args = ["-text", "hello world"]
}
resources {
memory = 32
}
}
}
}
```
This job file exists on your local workstation in plain text. When you are
satisfied with this job file, you will plan and review the scheduler decision.
It is generally a best practice to commit job files to source control,
especially if you are working in a team.
## Planning the Job
Once the job file is authored, we need to plan out the changes. The `nomad plan`
command may be used to perform a dry-run of the scheduler and inform us of
which scheduling decisions would take place.
```shell
$ nomad plan example.nomad
```
The resulting output will look like:
```text
TODO: Output
```
Note that no action has been taken. This is a complete dry-run and no
allocations have taken place.
## Submitting the Job
Assuming the output of the plan looks acceptable, we can ask Nomad to execute
this job. This is done via the `nomad run` command. We can optionally supply
the modify index provided to us by the plan command to ensure no changes to this
job have taken place between our plan and now.
```shell
$ nomad run -check-index=123 example.nomad
```
The resulting output will look like:
```text
TODO: Output
```
Now that the job is scheduled, it may or may not be running. We need to inspect
the allocation status and logs to make sure the job started correctly. The next
section on [inspecting state](/docs/operating-a-job/inspecting-state.html) details ways to
examine this job.
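As a quick sketch, assuming the job name "docs" from the example above, the job,
its allocations, and its logs can be inspected from the command line (output
omitted; replace `<alloc-id>` with an allocation ID from the status output):

```shell
# High-level job status, including its allocations
$ nomad status docs

# Detailed status for a specific allocation
$ nomad alloc-status <alloc-id>

# Logs from the "server" task in that allocation
$ nomad logs <alloc-id> server
```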


@@ -1,176 +0,0 @@
---
layout: "docs"
page_title: "Update Strategies - Operating a Job"
sidebar_current: "docs-operating-a-job-updating"
description: |-
  Learn how to safely update Nomad jobs.
---
# Updating a Job
When operating a service, updating the version of the job is a common task.
The same best practices for reliably deploying new versions apply under a
cluster scheduler, including rolling updates, blue-green deploys, and canaries,
which are a special case of blue-green deploys. This section explores how to do
each of these safely with Nomad.
## Rolling Updates
In order to update a service without introducing downtime, Nomad has built-in
support for rolling updates. When a job specifies a rolling update with the
below syntax, Nomad will update only `max_parallel` task groups at a time and
will wait the `stagger` duration before updating the next set.
```hcl
job "example" {
# ...
update {
stagger = "30s"
max_parallel = 1
}
# ...
}
```
We can use the [`nomad plan` command](/docs/commands/plan.html) while updating
jobs to ensure the scheduler will do as we expect. In this example, we have 3
web server instances whose version we want to update. After modifying the job
file, we can run `plan`:
```text
$ nomad plan my-web.nomad
+/- Job: "my-web"
+/- Task Group: "web" (3 create/destroy update)
  +/- Task: "web" (forces create/destroy update)
    +/- Config {
      +/- image:             "nginx:1.10" => "nginx:1.11"
          port_map[0][http]: "80"
        }

Scheduler dry-run:
- All tasks successfully allocated.
- Rolling update, next evaluation will be in 10s.

Job Modify Index: 7
To submit the job with version verification run:

nomad run -check-index 7 my-web.nomad

When running the job with the check-index flag, the job will only be run if the
server side version matches the job modify index returned. If the index has
changed, another user has modified the job and the plan's results are
potentially invalid.
```
Here we can see that Nomad will destroy the 3 existing tasks and create 3
replacements, but it will do so via a rolling update with a stagger of `10s`.
For more details on the update block, see
the [Jobspec documentation](/docs/jobspec/index.html#update).
## Blue-green and Canaries
Blue-green deploys go by several names (red/black, A/B, blue/green), but the
concept is the same: run two sets of the application, with only one of them
live at a given time, except while transitioning from one set to the other.
The set that is "live" is the one receiving traffic.
So imagine we have an API server that has 10 instances deployed to production
at version 1 and we want to upgrade to version 2. Hopefully the new version has
been tested in a QA environment and is now ready to start accepting production
traffic.
In this case we would consider version 1 to be the live set and we want to
transition to version 2. We can model this workflow with the below job:
```hcl
job "my-api" {
# ...
group "api-green" {
count = 10
task "api-server" {
driver = "docker"
config {
image = "api-server:v1"
}
}
}
group "api-blue" {
count = 0
task "api-server" {
driver = "docker"
config {
image = "api-server:v2"
}
}
}
}
```
Here we can see the live group is "api-green" since it has a non-zero count. To
transition to v2, we increase the count of "api-blue" and decrease the count of
"api-green". We can now see how the canary process is a special case of
blue-green: if we set "api-blue" to `count = 1` and "api-green" to `count = 9`,
there will still be 10 instances in total, but only one of them runs the new
version, essentially canarying it.
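For illustration, a sketch of just the counts during this canary phase (the
rest of the job is unchanged):

```hcl
job "my-api" {
  # ...

  # Nine instances continue running the current version...
  group "api-green" {
    count = 9
    # ...
  }

  # ...while a single canary instance runs the new version.
  group "api-blue" {
    count = 1
    # ...
  }
}
```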
If at any time we notice that the new version is behaving incorrectly and we
want to roll back, all we have to do is drop the count of the new group to 0 and
restore the original group to 10. This fine-grained control lets job operators
be confident that deployments will not cause downtime. If the deploy is
successful and we fully transition from v1 to v2, the job file will look like
this:
```hcl
job "my-api" {
# ...
group "api-green" {
count = 0
task "api-server" {
driver = "docker"
config {
image = "api-server:v1"
}
}
}
group "api-blue" {
count = 10
task "api-server" {
driver = "docker"
config {
image = "api-server:v2"
}
}
}
}
```
Now "api-blue" is the live group and when we are ready to update the api to v3,
we would modify "api-green" and repeat this process. The rate at which the count
of groups are incremented and decremented is totally up to the user. It is
usually good practice to start by transition one at a time until a certain
confidence threshold is met based on application specific logs and metrics.
## Handling Drain Signals
On operating systems that support signals, Nomad will signal the application
before killing it. This gives the application time to gracefully drain
connections and conduct any other necessary cleanup. Certain applications take
longer to drain than others, so Nomad lets the job file specify how long to wait
between signaling the application to exit and forcefully killing it. This is
configurable via `kill_timeout`. More details can be found in the
[Jobspec documentation](/docs/jobspec/index.html#kill_timeout).
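As a minimal sketch, `kill_timeout` is set at the task level; the 45-second
value here is illustrative and should be tuned to how long your application
needs to drain:

```hcl
task "api-server" {
  driver = "docker"

  config {
    image = "api-server:v2"
  }

  # Give the task up to 45 seconds after the kill signal is sent
  # to drain connections before it is forcefully terminated.
  kill_timeout = "45s"
}
```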