Commit graph

21279 commits

Author SHA1 Message Date
Tim Gross bf2ab548b8 changelog: ensure all backports shown 2021-04-30 14:53:37 -04:00
Mahmood Ali ba49661198
Docs memory oversubscription (#10478)
* update docs

* document memory_oversubscription_enabled scheduler config
2021-04-30 14:07:56 -04:00
Mahmood Ali e17082b9cf
update golang to 1.16.3 (#10484) 2021-04-30 13:52:05 -04:00
Mahmood Ali 2e01d623b7 batch update changelog 2021-04-30 13:18:00 -04:00
Michael Schurter 547a718ef6
Merge pull request #10248 from hashicorp/f-remotetask-2021
core: propagate remote task handles
2021-04-30 08:57:26 -07:00
Michael Schurter 982c65c0c7 comment out unused consts to make linter happy 2021-04-30 08:31:31 -07:00
Michael Schurter b9f3d8e3c7 docs: make bootstrap installs buf now
No need to specify a version in the contributing docs. Let `make
bootstrap` handle that.
2021-04-30 08:31:31 -07:00
Michael Schurter 641eb1dc1a clarify docs from pr comments 2021-04-30 08:31:31 -07:00
Tim Gross 81afcdc435 docs: remove API doc for license PUT 2021-04-30 10:39:21 -04:00
Buck Doyle ef21c5f75b
Add guard against missing namespace in Mirage (#10474)
Similarly to 735f056, this won’t happen with real data,
but can happen in the current Mirage factory setup.
2021-04-30 09:18:23 -05:00
Mahmood Ali 98a9a9052f
Port OSS changes for Enterprise Quota accounting (#10481) 2021-04-30 09:48:03 -04:00
Mahmood Ali 52d881f567
Allow configuring memory oversubscription (#10466)
Cluster operators want to have better control over memory
oversubscription and may want to enable/disable it based on their
experience.

This PR adds a scheduler configuration field to control memory
oversubscription. It's an additional field that can be set via the [Scheduler Config API](https://www.nomadproject.io/api-docs/operator/scheduler) or [the agent server config](https://www.nomadproject.io/docs/configuration/server#configuring-scheduler-config).

I opted to make memory oversubscription opt-in, but I'm happy to change it. To enable it, operators should call the API with:
```json
{
  "MemoryOversubscriptionEnabled": true
}
```

If memory oversubscription is disabled, submitting jobs that specify `memory_max` produces a "Memory oversubscription is not
enabled" warning, but the jobs are accepted without access to
the additional memory.

The warning message is like:
```
$ nomad job run /tmp/j
Job Warnings:
1 warning(s):

* Memory oversubscription is not enabled; Task cache.redis memory_max value will be ignored

==> Monitoring evaluation "7c444157"
    Evaluation triggered by job "example"
==> Monitoring evaluation "7c444157"
    Evaluation within deployment: "9d826f13"
    Allocation "aa5c3cad" created: node "9272088e", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "7c444157" finished with status "complete"

# then you can examine the Alloc AllocatedResources to validate whether the task is allowed to exceed memory:
$ nomad alloc status -json aa5c3cad | jq '.AllocatedResources.Tasks["redis"].Memory'
{
  "MemoryMB": 256,
  "MemoryMaxMB": 0
}
```
2021-04-29 22:09:56 -04:00
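The behaviour described in commit 52d881f567 above can be illustrated with a small Python sketch. This is not Nomad's Go implementation: the `TaskMemory` class and `apply_oversubscription_policy` function are hypothetical names chosen for illustration, and only the warning text and the `MemoryMaxMB: 0` outcome come from the commit message.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TaskMemory:
    """Hypothetical, simplified view of one task's memory request."""
    group: str
    name: str
    memory_mb: int
    memory_max_mb: int = 0  # 0 means memory_max was not specified


def apply_oversubscription_policy(
    enabled: bool, tasks: List[TaskMemory]
) -> Tuple[List[TaskMemory], List[str]]:
    """Return the effective memory settings plus any job warnings.

    When oversubscription is disabled, memory_max is dropped (set to 0)
    and a warning is recorded, but the task itself is still accepted.
    """
    effective: List[TaskMemory] = []
    warnings: List[str] = []
    for task in tasks:
        if task.memory_max_mb and not enabled:
            warnings.append(
                "Memory oversubscription is not enabled; "
                f"Task {task.group}.{task.name} memory_max value will be ignored"
            )
            effective.append(TaskMemory(task.group, task.name, task.memory_mb, 0))
        else:
            effective.append(task)
    return effective, warnings
```

With oversubscription disabled, a `cache.redis` task requesting 256 MB with a `memory_max` of 512 MB would keep its 256 MB reservation and lose the maximum, which matches the `MemoryMaxMB: 0` shown in the commit's example output.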
Luiz Aoqui 154b2105ac
docs: add FAQ for Docker Desktop for Windows and MacOS (#10390)
* docs: add FAQ for Docker Desktop for Windows and MacOS

* docs: add win

* docs: add docker desktop note to docker driver page
2021-04-29 19:53:12 -04:00
Michael Lange e8593ec1bb
ui: Update namespaces design (#10444)
This rethinks namespaces as a filter on list pages rather than a global setting.

The biggest net-new feature here is being able to select All (*) to list all jobs
or CSI volumes across namespaces.
2021-04-29 15:00:59 -05:00
Luiz Aoqui 2949a40ddf
changelog: add entry for blocked eval metrics (#10475) 2021-04-29 15:32:30 -04:00
Luiz Aoqui f1b9055d21
Add metrics for blocked eval resources (#10454)
* add metrics for blocked eval resources

* docs: add new blocked_evals metrics

* fix to call `pruneStats` instead of `stats.prune` directly
2021-04-29 15:03:45 -04:00
Michael Schurter 76e56254e1 docs: mention remote task drivers 2021-04-29 09:22:33 -07:00
Buck Doyle 1b1805e8d9
changelog: Add missed UI entries for 1.1 (#10467) 2021-04-29 09:11:06 -05:00
Buck Doyle b9f462fdc1
ui: Add optional memory max to task details ribbon (#10459)
This is the first step in #10268. If a maximum is not specified, the
task group sum uses the memory number instead. The maximum is only
shown when it’s higher than the memory sum.
2021-04-28 15:38:14 -05:00
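The display rule in commit b9f462fdc1 above amounts to a little arithmetic, sketched here in Python rather than the UI's own JavaScript. The function name `memory_ribbon` and the tuple layout are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def memory_ribbon(
    tasks: List[Tuple[int, Optional[int]]]
) -> Tuple[int, Optional[int]]:
    """Compute the task group's memory sum and the maximum to display.

    Each task is a (memory_mb, memory_max_mb) pair; memory_max_mb is None
    when no maximum was specified, in which case the memory number is
    used in the maximum sum instead.
    """
    memory_sum = sum(memory for memory, _ in tasks)
    max_sum = sum(
        memory if memory_max is None else memory_max
        for memory, memory_max in tasks
    )
    # The maximum is shown only when it is higher than the memory sum.
    shown_max = max_sum if max_sum > memory_sum else None
    return memory_sum, shown_max
```

For a group with one task at 256 MB (maximum 512 MB) and another at 128 MB with no maximum, the memory sum is 384 MB and the displayed maximum is 640 MB.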
Tim Gross 9e1d4981f0
docs: Enterprise licensing updates 2021-04-28 14:46:06 -04:00
Buck Doyle 6d037633da
ui: Change global search to use fuzzy search API (#10412)
This updates the UI to use the new fuzzy search API. It’s a drop-in
replacement so the / shortcut to jump to search is preserved, and
results can be cycled through and chosen via arrow keys and the
enter key.

It doesn’t use everything returned by the API:
* deployments and evaluations: these match by id, doesn’t seem like
  people would know those or benefit from quick navigation to them
* namespaces: doesn’t seem useful as they currently function
* scaling policies
* tasks: the response doesn’t include an allocation id, which means they
  can’t be navigated to in the UI without an additional query
* CSI volumes: aren’t actually returned by the API

Since there’s no API to check the server configuration and know whether
the feature has been disabled, this adds another query in
route:application#beforeModel that acts as feature detection: if the
attempt to query fails (500), the global search field is hidden.

Upon having added another query on load, I realised that beforeModel was
being triggered any time service:router#transitionTo was being called,
which happens upon navigating to a search result, for instance, because
of refreshModel being present on the region query parameter. This PR
adds a check for transition.queryParamsOnly and skips rerunning the
onload queries (token permissions check, license check, fuzzy search
feature detection).

Implementation notes:

* there are changes to unrelated tests to ignore the on-load feature
  detection query
* some lifecycle-related guards against undefined were required to
  address failures when navigating to an allocation
* the minimum search length of 2 characters is hard-coded as there’s
  currently no way to determine min_term_length in the UI
2021-04-28 13:31:05 -05:00
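The decision logic described in commit 6d037633da above can be summarised in a few lines. The following is a Python sketch of the rules only, not the Ember code in the UI; all function names are hypothetical, and the only values taken from the commit message are the 500 status, the `queryParamsOnly` check, and the minimum length of 2.

```python
MIN_TERM_LENGTH = 2  # hard-coded, per the commit message


def global_search_visible(probe_status: int) -> bool:
    """Feature detection: hide the search field if the probe query fails with 500."""
    return probe_status != 500


def should_run_onload_queries(query_params_only: bool) -> bool:
    """Skip the on-load queries when a transition only changes query parameters."""
    return not query_params_only


def should_search(term: str) -> bool:
    """Only issue a fuzzy search once the term reaches the minimum length."""
    return len(term) >= MIN_TERM_LENGTH
```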
Tim Gross 7fdfbfc0f0 license: remove "Terminates At" from license get command
The `Terminates At` field can't be removed from the struct for backwards
compatibility reasons, but there's no purpose to it anymore so we shouldn't be
showing it to end users of the command.
2021-04-28 12:00:30 -04:00
Tim Gross 4f9c5c4bac license: update 'license get' command 2021-04-28 12:00:30 -04:00
Seth Hoenig d54a606819
Merge pull request #10439 from hashicorp/pick-ent-acls-changes
e2e: add e2e tests for consul namespaces on ent with acls
2021-04-28 08:30:08 -06:00
Tim Gross 79f81d617e licensing: remove raft storage and sync
This changeset is the OSS portion of the work to remove the raft storage and
sync for Nomad Enterprise.
2021-04-28 10:28:23 -04:00
catinthetap b84cd7d61d
docs: update filesystem.mdx to fix typo 2021-04-28 08:11:05 -04:00
Michael Schurter d8f50ca20d ignore local e2e files
- nomad-driver-ecs is an optional plugin to packer into ami
- ecs.vars is generated by tf
- *.auto.tfvars is just a style I use for local var overrides
2021-04-27 15:07:03 -07:00
Michael Schurter 0eb5d5136f e2e: use public_ip in packer 2021-04-27 15:07:03 -07:00
Michael Schurter a6636723ee vendor: update aws-sdk-go and deps 2021-04-27 15:07:03 -07:00
Michael Schurter e62795798d core: propagate remote task handles
Add a new driver capability: RemoteTasks.

When a task is run by a driver with RemoteTasks set, its TaskHandle will
be propagated to the server in its allocation's TaskState. If the task
is replaced due to a down node or draining, its TaskHandle will be
propagated to its replacement allocation.

This allows tasks to be scheduled in remote systems whose lifecycles are
disconnected from the Nomad node's lifecycle.

See https://github.com/hashicorp/nomad-driver-ecs for an example ECS
remote task driver.
2021-04-27 15:07:03 -07:00
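As a rough illustration of the handle propagation described in commit e62795798d above, the Python sketch below copies task handles to a replacement allocation only when the driver declares the RemoteTasks capability. The `Allocation` class, its fields, and `replacement_for` are hypothetical simplifications; the real change stores the handle in the allocation's TaskState on the server.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Allocation:
    """Hypothetical, simplified allocation: task name -> opaque driver handle."""
    alloc_id: str
    task_handles: Dict[str, bytes] = field(default_factory=dict)


def replacement_for(
    old: Allocation, new_id: str, driver_remote_tasks: bool
) -> Allocation:
    """Create a replacement allocation for a down or draining node.

    With RemoteTasks set, the replacement inherits the old task handles so
    it can reattach to the remote task instead of starting a new one.
    """
    new = Allocation(new_id)
    if driver_remote_tasks:
        new.task_handles = dict(old.task_handles)
    return new
```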
Seth Hoenig 09cd01a5f3 e2e: add e2e tests for consul namespaces on ent with acls
This PR adds e2e tests for Consul Namespaces for Nomad Enterprise
with Consul ACLs enabled.

Needed to add support for Consul ACL tokens with `namespace` and
`namespace_prefix` blocks, which Nomad parses and validates before
tossing the token. These bits will need to be picked back to OSS.
2021-04-27 14:45:54 -06:00
Seth Hoenig d76bcf0e12
Merge pull request #10457 from hashicorp/b-igce-wildcard
consul/connect: fix bug where ingress gateways could not use wildcard services
2021-04-27 14:41:47 -06:00
Seth Hoenig a7a37af94b
Merge pull request #10455 from hashicorp/b-cc-uppercase
consul/connect: check connect group and service names for uppercase characters
2021-04-27 14:40:31 -06:00
Seth Hoenig 7579946235
Merge pull request #10453 from hashicorp/b-envoy-wait-longer
connect: use exp backoff when waiting on consul envoy bootstrap
2021-04-27 14:39:15 -06:00
Marcus Naughton 1597c5a164 Update libraries-and-sdks.mdx 2021-04-27 16:34:55 -04:00
Seth Hoenig 865c7a5841 consul/connect: fix bug where ingress gateways could not use wildcard services
This PR fixes a bug where Nomad was more restrictive on Ingress Gateway Configuration
Entry definitions than Consul. Before, Nomad would not allow for declaring IGCEs with
http listeners with service name "*", which is a special feature allowable by Consul.

Note: to make http protocol work, a service-default must be defined setting the
protocol to http for each service.

Fixes: #9729
2021-04-27 13:42:26 -06:00
Seth Hoenig f47c6d34f7 consul/connect: check connect group and service names for uppercase characters
This PR adds job-submission validation that checks for the use of uppercase characters
in group and service names for services that make use of Consul Connect. This prevents
attempting to launch services that Consul will not validate correctly, which in turn
causes tasks to fail to launch in Nomad.

Underlying Consul issue: https://github.com/hashicorp/consul/issues/6765

Closes #7581 #10450
2021-04-27 11:26:37 -06:00
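The validation added in commit f47c6d34f7 above might look roughly like the following Python sketch. It is not Nomad's Go code: the function name, the `(service_name, uses_connect)` input shape, and the error wording are all assumptions for illustration.

```python
from typing import List, Tuple


def connect_name_errors(group: str, services: List[Tuple[str, bool]]) -> List[str]:
    """Return validation errors for one task group.

    `services` holds (service_name, uses_connect) pairs. The uppercase
    check applies only when at least one service uses Consul Connect.
    """
    errors: List[str] = []
    connect_services = [name for name, uses_connect in services if uses_connect]
    if not connect_services:
        return errors
    if any(c.isupper() for c in group):
        errors.append(
            f'group "{group}" uses Connect and must not contain uppercase characters'
        )
    for name in connect_services:
        if any(c.isupper() for c in name):
            errors.append(
                f'service "{name}" uses Connect and must not contain uppercase characters'
            )
    return errors
```

Rejecting such names at job submission surfaces the problem immediately, rather than as a task that fails to launch later.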
Seth Hoenig 238ac718f2 connect: use exp backoff when waiting on consul envoy bootstrap
This PR wraps the use of the consul envoy bootstrap command in
an exponential backoff closure, configured to time out after 60
seconds. This is an increase over the current behavior of making
3 attempts over 6 seconds.

Should help with #10451
2021-04-27 09:21:50 -06:00
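The retry strategy in commit 238ac718f2 above can be sketched generically in Python. Only the 60-second timeout comes from the commit message; the base delay, the cap, and the function name `retry_with_backoff` are assumptions, and the actual Go implementation may differ in its details.

```python
import time
from typing import Callable


def retry_with_backoff(
    attempt: Callable[[], bool],
    timeout: float = 60.0,   # from the commit message
    base: float = 0.5,       # assumed initial delay in seconds
    cap: float = 10.0,       # assumed upper bound on a single delay
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `attempt` until it succeeds or `timeout` seconds have elapsed.

    The delay between attempts doubles each time, up to `cap`, and is
    never longer than the time remaining before the deadline.
    """
    deadline = now() + timeout
    delay = base
    while True:
        if attempt():
            return True
        remaining = deadline - now()
        if remaining <= 0:
            return False
        sleep(min(delay, remaining))
        delay = min(delay * 2, cap)
```

The `sleep` and `now` parameters exist so the loop can be exercised with a fake clock instead of waiting in real time.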
James Rasell 2dfdf99f91
Merge pull request #10448 from hashicorp/f-docs-add-autoscaler-plugin-launch-info
docs: add detail on launching autoscaler external plugins.
2021-04-27 08:29:31 +02:00
James Rasell 4e18e9ea8b
docs: add detail on launching autoscaler external plugins. 2021-04-26 11:19:15 +02:00
Mahmood Ali 2a66922daf
Merge pull request #10442 from kphunter/patch-1
Fix path tip
2021-04-24 08:34:59 -04:00
kphunter 59209f4165
Fix path tip 2021-04-23 22:03:53 -07:00
Mahmood Ali cf24a9eaaf
api: /v1/jobs always include namespaces (#10434)
Add Namespace as a top-level field in `/v1/jobs` stub.

The `/v1/jobs` endpoint already includes the namespace under `JobSummary`, though the API is odd, as typically the job ID and Namespace are at the same level, and the oddity complicates UI frontend development.

The downside of adding it is a redundant field that makes the response body a bit bigger, especially for clusters with many jobs. It should compress nicely, though, and I expect the overhead to be small relative to the overall response size. The benefit of a cleaner and more consistent API seems worth it.

Fixes #10431
2021-04-23 16:36:54 -04:00
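For a client consuming the stub changed in commit cf24a9eaaf above, reading the namespace might look like the Python sketch below. The `Namespace` and `JobSummary` field names come from the commit message; the helper name `job_namespace` and the fallback for responses that lack the top-level field are assumptions for illustration.

```python
from typing import Any, Dict


def job_namespace(stub: Dict[str, Any]) -> str:
    """Return the namespace of one entry from a `/v1/jobs` response.

    Prefers the top-level `Namespace` field and falls back to the copy
    nested under `JobSummary` when the top-level field is absent.
    """
    namespace = stub.get("Namespace")
    if namespace:
        return namespace
    summary = stub.get("JobSummary") or {}
    return summary.get("Namespace", "")
```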
Mahmood Ali ed4aad458c
api: Ignore User provided ParentID (#10424)
ParentID is an internal field that Nomad sets for dispatched or parameterized jobs. Job submitters should not be able to set it directly, as that messes up children tracking.

Fixes #10422. It specifically stops the scheduler from honoring the ParentID. The reason for the failure, and why the scheduler didn't schedule that job once it was created, is very interesting and requires follow-up in a more technical issue.
2021-04-23 16:22:17 -04:00
Luiz Aoqui 2011e15d63
changelog: add entry for #10348 (#10349) 2021-04-23 11:25:31 -04:00
Zachary Shilton b1bd139c3c
website: add check for unlinked content (#10425)
* website: add unlinked content check

* website: remove unused guides-navigation.js
2021-04-23 11:11:09 -04:00
Luiz Aoqui 29171be859
docs: reorganize autoscaling agent config (#10383) 2021-04-23 09:53:58 -04:00
changli0617 5c820bae59
Small typo fixes (#10427)
Co-authored-by: Charlie Voiselle <464492+angrycub@users.noreply.github.com>
2021-04-22 12:16:21 -07:00
changli0617 b74cb407bd
Merge pull request #10384 from hashicorp/mw.partnership-page
website: add partnerships page
2021-04-22 11:38:05 -07:00
Seth Hoenig 25d5cebd15
Merge pull request #10426 from hashicorp/up-aws-instances
aws_env: update ec2 instances
2021-04-22 12:19:27 -06:00