Content for the submitting a job guide

2019-08-16 00:42:59 -07:00 · 2019-08-16 00:42:59 -07:00 · d7261e20a8
parent b3cd3b095a
commit d7261e20a8
1 changed files with 144 additions and 0 deletions
--- a/website/source/guides/web-ui/submitting-a-job.html.md
+++ b/website/source/guides/web-ui/submitting-a-job.html.md
@ -8,24 +8,168 @@ description: |-
 # Submitting a Job
 On the Jobs List page of the Web UI (this is the home page), there is a "Run Job" button in the
 top-left corner. Clicking this button will take you to the Job Run page.
 The first step in running a job is authoring the job HCL or JSON. Code can be authored directly in
 the UI, complete with syntax highlighting, or it can be pasted in. After you have authored the job,
 the next step is to run the plan.
 ~> Screenshot (Job submit editor)
 ## Nomad plan
 It is best practice to run `nomad plan` before running `nomad run`, so the Web UI enforcing this
 best practice. From the Job Run page, underneath the code editor, there is a Plan button. Clicking
 this button will proceed the run process to the second step.
 The second step to submitting a job is reviewing the job plan. If you are submitting a new job, the
 plan will only show additions. If you are submitting a new version of the job, the plan will include
 details on what has been changed, added, and removed.
 ~> Screenshot (Job plan no issues)
 The plan operation will also perform a scheduler dry-run. This dry-run is helpful for catching
 potential issues early. Some potential issues are:
 1. There is not enough capacity in the cluster to start your job.
 2. There is not enough capacity remaining in your quota to start your job.
 3. Your job has an unresolvable hard constraint (e.g., required port not available).
 4. In order to start your job, other jobs must be preempted.
 Read more about placement failures and preemptions below.
 ~> Screenshot (Job plan placement failures)
 From the plan step, you can either cancel to make edits, or run the job. When you run the job, you
 are redirected to the Job Overview page.
 ## Placement failures
 One class of potential issues when planning a job is a placement failure. This happens when Nomad
 can tell ahead of time that a job cannot be started. Since Nomad does bookkeeping on cluster state
 and node metadata, Nomad will already know the answer to basic constraints, such as available
 capacity, available hardware, and available ports.
 Nomad will always let you submit a job to the cluster despite placement failures. The job will just
 remain in a queued state until the placement failures are resolved.
 Keep in mind that there will always be the possibility that Nomad cannot start a job despite there
 being no placement failures (e.g., artifact cannot download or container startup script errors).
 ## Preemptions
 Another class of potential issues when planning a job is preemptions. This happens when the cluster
 does not have capacity for your job, but your job is a high priority and the cluster has preemptions
 enabled.
 ~> Screenshot (Job plan preemptions)
 Unlike with placement failures, when you submit a job that has expected preemptions, the job is will
 start. However, other allocations will be stopped to free up capacity.
 Note that with Nomad OSS, only system jobs can preempt allocations. Nomad Enterprise allows for both
 service and batch type jobs to preempt lower priority allocations.
 ## Job Overview
 Upon submitting a job, you will be redirected to the Job Overview page for the job you submitted.
 If this is a new job, the job will start in a queued state. If there are no placement failures,
 allocations for the job will naturally transition from a starting to a running or failed state.
 Nomad is quick to schedule allocations (i.e., find a client node to start the allocation on), but an
 allocation may sit in the starting state for awhile if it has to download source images or other
 artifacts. It may also sit in a starting state if the task fails to start and requires retry
 attempts.
 If this is was an existing job that was resubmitted, the job overview will just show old allocations
 moving into a completed status before new allocations are spun up. The exact sequence of events
 depends on the configuration of the job.
 No matter the configuration of the job, the Job Overview page will live-update as the state of the
 job and its allocations change.
 ~> Screenshot (Job overview)
 ## Job Definition
 From the subnav on any job detail page, you can access the Job Definition page.
 The Job Definition page will show the job's underlying JSON representation. This can be useful for
 quickly verifying how the job was configuring. Many properties from the job configuration will also
 be on the Job Overview page, but some deeper properties may only be available in the definition
 itself. It can also be convenient to see everything at once rather than traversing through task
 groups, allocations, and tasks.
 ~> Screenshot (Job definition)
 ## Job Versions
 From the subnav on any job detail page, you can access the Job Versions page.
 The Job Versions page will show a timeline view of every version of the job. Each version in the
 timeline includes the version number, the time the version was submitted, whether the version is/was
 stable, the number of changes, and the job diff itself.
 Reviewing the job diffs version by version can be used to debug issues in a similar manner to `git log`.
 ~> Screenshot (Job version)
 ## Job Deployments
 From the subnav on any service job detail page, you can access the Job Deployments page.
 The Job Deployments page will show a timeline view of every deployment of the job. Each deployment
 in the timeline includes the deployment ID, the deployment status, whether or not the deployment
 requires promotion, the associated version number, the relative time the deployment started, and a
 detailed allocation breakdown.
 The allocation breakdown includes information on allocation placement, including how many canaries
 have been placed, how many canaries are expected, how many total allocations have been placed, how
 many total allocations are desired, and the health of each allocation.
 ~> Screenshot (Job deployments)
 ## Job Allocations
 From the subnav on any job detail page, you can access the Job Allocations page.
 The Job Allocations page will show a complete table of every allocation for a job. Allocations,
 being the unit of work in Nomad, are accessible from many places. The Job Overview page lists some
 of the recent allocations for a job for convenience and the Job Task Group page will list all
 allocations for that task group, but only the Job Allocations page shows every allocation across all
 task groups for the job.
 ~> Screenshot (Job allocations)
 ## Job Evaluations
 From the subnav on any job detail page, you can access the Job Evaluations page.
 The Job Evaluations page will show the most recent evaluations for the job. Evaluations are an
 internal detail of Nomad's inner scheduling process and as such are generally unimportant to
 monitor, but an experienced Nomad user can use evaluations to diagnose potential issues.
 ~> Screenshot (Job evaluations)
 ## Access Control
 Depending on the size of your team and the details of your Nomad deployment, you may wish to control
 which features different internal users have access to.
 Nomad has an access control list system for doing just that.
 By default, all features—read and write—are available to all users of the Web UI. Check out the
 [Securing the Web UI with ACLs](/guides/web-ui/securing.html) guide to learn how to prevent
 anonymous users from having write permissions as well as how to continue to use Web UI write
 features as a privileged user.
 ## Best Practices
 Although the Web UI lets users submit jobs in an ad-hoc manner, Nomad was deliberately designed to
 declare jobs using a configuration language. It is recommended to treat your job definitions, like
 the rest of your infrastructure, as code.
 By checking in your job definition files as source control, you will always have a log of changes to
 assist in debugging issues, rolling back versions, and collaborating on changes using development
 best practices like code review.