Commit graph

18251 commits

Author SHA1 Message Date
Chris Baker 2f7372d29d
Merge pull request #7788 from hashicorp/b-7716-scaling-policy-parsing
parsing should error if scaling block includes multiple policy blocks
2020-04-23 08:57:31 -05:00
Chris Baker beeccc26e4 changelog entries for 7772 and 7788 2020-04-23 12:45:52 +00:00
Chris Baker 8ea4a7e84b return parsing error if scaling policy includes more than one policy block
also, check that parsing a minimal scaling block doesn't throw any errors
2020-04-23 12:37:45 +00:00
Michael Lange 0dac605902
Merge pull request #7689 from hashicorp/ui/plumb-proxy-config-to-proxy
UI Plumb proxy config to proxy
2020-04-22 19:31:27 -07:00
Mahmood Ali 018e39b456
Merge pull request #7785 from hashicorp/b-http-fail-log-level
http: adjust log level for request failure
2020-04-22 17:03:11 -04:00
Mahmood Ali b8fb32f5d2 http: adjust log level for request failure
Failed requests due to API client errors are to be marked as DEBUG.

The Error log level should be reserved for problems with the cluster
that are actionable for nomad system operators.  Logs due to
misbehaving API clients don't represent a system-level problem and seem
spurious to nomad maintainers at best.  These log messages can also be
attack vectors for denial-of-service attacks by filling servers' disk
space with spurious log messages.
2020-04-22 16:19:59 -04:00
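
The level selection described in the commit above can be sketched as follows. This is only an illustration: it assumes that "API client errors" means HTTP 4xx status codes, and the names `level` and `levelForFailure` are hypothetical, not taken from Nomad.

```go
package main

import "fmt"

// level is a hypothetical log level type used only for this sketch.
type level string

const (
	levelDebug level = "DEBUG"
	levelError level = "ERROR"
)

// levelForFailure picks the log level for a failed request.
// Assumption: client errors are the 4xx status codes; every other
// failure is treated as a server-side problem and logged as ERROR.
func levelForFailure(statusCode int) level {
	if statusCode >= 400 && statusCode < 500 {
		return levelDebug
	}
	return levelError
}

func main() {
	for _, code := range []int{400, 404, 500, 503} {
		fmt.Printf("%d -> %s\n", code, levelForFailure(code))
	}
}
```

Keeping the decision in one small function makes it easy to see that only client-caused failures are demoted, so operator-facing Error logs stay actionable.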
Mahmood Ali 4559f95c58
Merge pull request #7780 from hashicorp/pre-0.11.2-dev-cycle
Pre 0.11.2 dev cycle
2020-04-22 15:28:34 -04:00
Seth Hoenig 334d2abb20 demo: create build scripts for our countdash demo
We use the education team's "countdash" demo in many places
to showcase Nomad's Consul Connect integration. This change
adds a Dockerfile for each of `counter-dashboard` and
`counter-api` that can be used to build artifacts to publish
to Nomad's Docker Hub organization.

The recent "0.0.3" release of the `countdash` demo includes
changes we want in order to demo task group service checks.
2020-04-22 12:30:26 -06:00
Mahmood Ali 9ad4498c08 prep for 0.11.2 dev cycle 2020-04-22 12:51:49 -04:00
Mahmood Ali 2ca487900d prepare for 0.11.1 and reorder changelog 2020-04-22 12:50:29 -04:00
Mahmood Ali 258ad82d4a
Merge pull request #7779 from hashicorp/docs-website-0.11.1
update website for nomad 0.11.1
2020-04-22 12:15:28 -04:00
Mahmood Ali 8b2b818aab update release to 0.11.1 2020-04-22 12:13:58 -04:00
Buck Doyle 6da959f0ed
UI: Update ember-fetch to 6.7.2 (#7713)
This gets rid of this warning in the console:
Browserslist: caniuse-lite is outdated. Please run next command `yarn upgrade`
2020-04-22 09:10:55 -05:00
Chris Baker a0f2d1fae1
Merge pull request #7772 from hashicorp/b-7768-remove-policies-for-stopped-jobs
delete/create autoscaling policies as job is stopped/started
2020-04-22 08:15:55 -05:00
Buck Doyle 0c2acb01b9 Remove superseded note
This closes #7465.
2020-04-21 19:52:45 -07:00
Michael Lange 82dc694c70 Disable the proxy when Mirage is enabled
This is to prevent max socket connection errors that can stop
the live reload server from responding.
2020-04-21 19:52:44 -07:00
Michael Lange 7a4852d44b Use existing ember proxy config within our custom proxy 2020-04-21 19:52:43 -07:00
Michael Lange 919b15a2db
Merge pull request #7685 from hashicorp/ui/upgrade-lint-staged
UI: Upgrade lint-staged and husky
2020-04-21 17:42:12 -07:00
Chris Baker 09d980be2b modify state store so that autoscaling policies are deleted from their
table as job is stopped (and recreated when job is started)
2020-04-21 23:01:26 +00:00
Tim Gross 5b607d7061
changelog entries for 0.11.1 bugfixes (#7763) 2020-04-21 10:04:13 -04:00
Mahmood Ali 3137d13bc5
Merge pull request #7762 from hashicorp/b-in-place-update-deviceids
Preserve device ids in in-place alloc updates
2020-04-21 09:31:10 -04:00
Mahmood Ali 534275448b add changelog
[ci skip]
2020-04-21 09:27:40 -04:00
Mahmood Ali 9f005201e2 Ensure that alloc updates preserve device offers
When an alloc is updated in-place, ensure that the allocated devices are
preserved and carried over to the new alloc.
2020-04-21 08:57:15 -04:00
Mahmood Ali 2ff2745374 test for allocated devices on job in-place update
When an alloc is updated in-place, test that the allocated devices are
preserved in the new alloc struct.
2020-04-21 08:56:05 -04:00
Buck Doyle 8cd5f798c4
Docs: correct search API (#7756)
This closes #7718. It corrects some inaccuracies and adds
an explanation of the truncations block.
2020-04-21 07:33:24 -05:00
Tim Gross bd74b593d0
csi: nil-check allocs for VolumeDenormalize and claim methods (#7760) 2020-04-21 08:32:24 -04:00
Charlie Voiselle c68c19f3cf
Use ExternalID in NodeStageVolume RPC (#7754) 2020-04-20 17:13:46 -04:00
Michael Dwan ba70c54340
fix panic while deleting CSI plugins for missing job (#7758) 2020-04-20 17:13:33 -04:00
Seth Hoenig dad4d58a1d
Merge pull request #7691 from hashicorp/docs-some-connect-bugs
docs: add bugfix notes for #7690 #7397 #7684 #7683 to changelog
2020-04-20 10:27:18 -06:00
Seth Hoenig cc59227a49 docs: add bugfix notes for #7690 #7397 #7684 #7683 to changelog 2020-04-20 10:25:57 -06:00
Seth Hoenig 40e0f8a346
Merge pull request #7690 from hashicorp/b-inspect-proxy-output
two fixes for inspect on connect proxy
2020-04-20 10:17:54 -06:00
Seth Hoenig 3d16d56fbb
Merge pull request #7705 from hashicorp/docs-remove-connect-limitation
fixup references in connect docs
2020-04-20 10:15:50 -06:00
Mahmood Ali 5b42796f1e
Merge pull request #7704 from hashicorp/b-agent-shutdown-order
agent: shutdown agent http server last
2020-04-20 10:37:26 -04:00
Mahmood Ali 3e741a0caa
Merge pull request #7748 from hashicorp/b-noisy-http-logs
agent: route http logs through hclog
2020-04-20 10:37:15 -04:00
Mahmood Ali 1c0e1cabc9 update changelog
[ci skip]
2020-04-20 10:36:39 -04:00
Mahmood Ali 4e1366f285 agent: route http logs through hclog
Pipe the http server log to hclog, so that it uses the same logging format
as the rest of nomad logs.  This also supports emitting them as json logs
when json formatting is set.

The http server logs are emitted at Trace level, as they typically
represent HTTP client errors (e.g. failed tls handshakes, invalid headers,
etc).

Panic logs, however, represent server errors and are relayed at Error
level.
2020-04-20 10:33:40 -04:00
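
A minimal, standard-library-only sketch of the routing described in the commit above. The `leveledLogger` interface, `httpLogWriter`, `newHTTPServerLogger`, and `recorder` are hypothetical stand-ins (the interface plays the role of an hclog logger), and matching on the `http: panic` prefix is an assumption about how panic lines could be recognized, not a statement about Nomad's code.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"strings"
)

// leveledLogger is a hypothetical stand-in for a structured logger
// such as hclog; only the two levels used here are modelled.
type leveledLogger interface {
	Trace(msg string)
	Error(msg string)
}

// httpLogWriter adapts a leveledLogger to io.Writer so it can back a
// standard *log.Logger.
type httpLogWriter struct{ logger leveledLogger }

func (w httpLogWriter) Write(p []byte) (int, error) {
	msg := string(bytes.TrimRight(p, "\r\n"))
	// Assumption: net/http prefixes recovered handler panics with
	// "http: panic"; everything else is treated as client noise.
	if strings.HasPrefix(msg, "http: panic") {
		w.logger.Error(msg)
	} else {
		w.logger.Trace(msg)
	}
	return len(p), nil
}

// newHTTPServerLogger returns a *log.Logger without prefix or timestamp,
// leaving formatting to the wrapped logger.
func newHTTPServerLogger(l leveledLogger) *log.Logger {
	return log.New(httpLogWriter{logger: l}, "", 0)
}

// recorder collects lines in memory so the routing can be inspected.
type recorder struct{ lines []string }

func (r *recorder) Trace(msg string) { r.lines = append(r.lines, "[TRACE] "+msg) }
func (r *recorder) Error(msg string) { r.lines = append(r.lines, "[ERROR] "+msg) }

func main() {
	rec := &recorder{}
	std := newHTTPServerLogger(rec)
	std.Print("http: TLS handshake error from 10.0.0.1:1234: EOF")
	std.Print("http: panic serving 10.0.0.1:1234: boom")
	for _, line := range rec.lines {
		fmt.Println(line)
	}
}
```

The returned logger could be assigned to the `ErrorLog` field of an `http.Server`, which is where the standard library sends its own server-side messages.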
Mahmood Ali 86aa8105b2
Merge pull request #7749 from hashicorp/b-docker-panic
driver/docker: protect against nil container
2020-04-20 10:31:46 -04:00
Mahmood Ali 6bfef2c945 add changelog
[ci skip]
2020-04-20 10:31:09 -04:00
Jeffrey 'jf' Lim 35418efb60
demo/vagrant/Vagrantfile: Update Nomad version (0.11.0) (#7579) 2020-04-20 09:29:12 -04:00
Anthony Scalisi 9664c6b270
fix spelling errors (#6985) 2020-04-20 09:28:19 -04:00
Charles Z e4a669598e
label csi as beta from 0.11 release notes (#7745) 2020-04-20 08:48:04 -04:00
Mahmood Ali dff071c3b9 driver/docker: protect against nil container
Protect against a panic when we attempt to start a container with a name
that conflicts with an existing one.  If the existing one is being
deleted while nomad first attempts to create the container,
createContainer will fail with `container already exists`, but we get a
nil container reference from the `containerByName` lookup, which causes a
crash.

I'm not certain how we get into this state, except for being very
unlucky.  I suspect that this case may be the result of a concurrent
restart or the docker engine API not being fully consistent (e.g. an
earlier call purged the container, but docker hasn't yet freed up the
resources needed to create a new container with the same name).

If that's the case, then re-attempting creation will hopefully succeed,
or we'd at least fail enough times for the alloc to be rescheduled to
another node.
2020-04-19 15:34:45 -04:00
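
The guard described in the commit above can be illustrated with a small sketch. Everything here is hypothetical: `engine`, `container`, `createWithRetry`, and `flakyEngine` are stand-ins for the Docker client and driver code, and returning the existing container when one is found is a simplification made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

type container struct{ ID string }

var errAlreadyExists = errors.New("container already exists")

// engine is a hypothetical stand-in for the Docker client.
type engine interface {
	create(name string) (*container, error)
	byName(name string) (*container, error)
}

// createWithRetry retries creation when the engine reports a name conflict
// but the lookup returns no container, instead of dereferencing nil.
func createWithRetry(e engine, name string, attempts int) (*container, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		c, err := e.create(name)
		if err == nil {
			return c, nil
		}
		if !errors.Is(err, errAlreadyExists) {
			return nil, err
		}
		existing, lookupErr := e.byName(name)
		if lookupErr != nil {
			return nil, lookupErr
		}
		if existing != nil {
			return existing, nil // simplification: reuse what is there
		}
		// The nil check: the conflicting container vanished between the
		// two calls, so record the error and try creating again.
		lastErr = fmt.Errorf("container %q reported as existing but not found: %w", name, err)
	}
	return nil, lastErr
}

// flakyEngine reports a conflict for the first `conflicts` create calls
// and never finds the conflicting container by name.
type flakyEngine struct{ conflicts, calls int }

func (f *flakyEngine) create(name string) (*container, error) {
	f.calls++
	if f.calls <= f.conflicts {
		return nil, errAlreadyExists
	}
	return &container{ID: name + "-1"}, nil
}

func (f *flakyEngine) byName(string) (*container, error) { return nil, nil }

func main() {
	if c, err := createWithRetry(&flakyEngine{conflicts: 2}, "web", 5); err != nil {
		fmt.Println("gave up:", err)
	} else {
		fmt.Println("created", c.ID)
	}
}
```

Bounding the attempts means a persistent conflict surfaces as an ordinary error, which matches the commit's expectation that repeated failures lead to the alloc being rescheduled rather than to a crash.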
Jeffrey 'jf' Lim eab600d3e1
Fix/improve "job plan" messaging (#7580) 2020-04-17 15:53:16 -04:00
Yishan Lin 164314f7fa
Merge pull request #7741 from hashicorp/yishan/docs-rebased-preemption-update
docs: update preemption page
2020-04-17 11:03:27 -07:00
Yishan Lin b95309dc4b docs: update preemption page
This page has not been updated (yet) to reflect support for all 3 job types (service, batch, system), which shipped in 0.9.2.

The current page implies that preemption is only available for system jobs.

This is early preparation for Nomad 0.12, where we plan to move Preemption from the Enterprise feature suite to OSS for all.
2020-04-17 09:34:07 -07:00
Michael Schurter 85999cbfab docs: add #7730 to changelog 2020-04-15 15:13:30 -07:00
Michael Schurter 4c5a0cae35 core: fix node reservation scoring
The BinPackIter accounted for node reservations twice when scoring nodes
which could bias scores toward nodes with reservations.

Pseudo-code for previous algorithm:
```
	proposed  = reservedResources + sum(allocsResources)
	available = nodeResources - reservedResources
	score     = 1 - (proposed / available)
```

The node's reserved resources are added to the total resources used by
allocations, and then the node's reserved resources are later
subtracted from the node's overall resources.

The new algorithm is:
```
	proposed  = sum(allocsResources)
	available = nodeResources - reservedResources
	score     = 1 - (proposed / available)
```

The node's reserved resources are no longer added to the total resources
used by allocations.

My guess as to how this bug happened is that the resource utilization
variable (`util`) is calculated and returned by the `AllocsFit` function
which needs to take reserved resources into account as a basic
feasibility check.

To avoid re-calculating alloc resource usage (because there may be a
large number of allocs), we reused `util` in the `ScoreFit` function.
`ScoreFit` properly accounts for reserved resources by subtracting them
from the node's overall resources. However since `util` _also_ took
reserved resources into account the score would be incorrect.

Prior to the fix the added test output:
```
Node: reserved     Score: 1.0000
Node: reserved2    Score: 1.0000
Node: no-reserved  Score: 0.9741
```

The scores being 1.0 for *both* nodes with reserved resources is a good
hint something is wrong as they should receive different scores. Upon
further inspection the double accounting of reserved resources caused
their scores to be >1.0 and clamped.

After the fix the added test outputs:
```
Node: no-reserved  Score: 0.9741
Node: reserved     Score: 0.9480
Node: reserved2    Score: 0.8717
```
2020-04-15 15:13:30 -07:00
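
The two pseudo-code variants in the commit above can be compared side by side with a short sketch. It follows the commit's simplified pseudo-code only and is not Nomad's actual `ScoreFit` implementation; the function names and the resource numbers are made up for illustration.

```go
package main

import "fmt"

// sum adds up the resources used by each allocation.
func sum(xs []float64) float64 {
	total := 0.0
	for _, x := range xs {
		total += x
	}
	return total
}

// scoreOld mirrors the previous pseudo-code: reserved resources are added
// to the proposed usage and also subtracted from the node's capacity.
func scoreOld(node, reserved float64, allocs []float64) float64 {
	proposed := reserved + sum(allocs)
	available := node - reserved
	return 1 - proposed/available
}

// scoreNew mirrors the fixed pseudo-code: reserved resources only reduce
// the node's available capacity.
func scoreNew(node, reserved float64, allocs []float64) float64 {
	proposed := sum(allocs)
	available := node - reserved
	return 1 - proposed/available
}

func main() {
	// Hypothetical numbers: a node with 4000 units, 1000 of them
	// reserved, running two allocations.
	allocs := []float64{500, 1000}
	fmt.Printf("old: %.4f\n", scoreOld(4000, 1000, allocs))
	fmt.Printf("new: %.4f\n", scoreNew(4000, 1000, allocs))
}
```

With no reserved resources the two functions agree; they diverge only when a reservation is present, which is the double accounting the commit removes.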
Brandon Romano 3b22f5aa72
Merge pull request #7717 from hashicorp/website-alert
website: Adjust the website alert to point to the blog post
2020-04-14 11:36:43 -07:00
Brandon Romano f520757617 Adjust the website alert to point to the blog post 2020-04-14 11:17:06 -07:00
Michael Schurter 165ddda744
Merge pull request #7682 from hashicorp/b-comment-fix
core: fix comment on system stack
2020-04-13 15:13:23 -07:00