Add an RPC timeout for logmon. In
https://github.com/hashicorp/nomad/issues/6461#issuecomment-559747758 ,
`logmonClient.Stop` locked up and indefinitely blocked the task runner
destroy operation.
This is an incremental improvement. We still need to follow up to
understand how we got to that state, and the full impact of locked-up
Stop and its link to pending allocations on restart.
Some code cleanup:
* Use a field for setting EC2 metadata instead of env-vars in testing;
but keep environment variables for backward compatibility reasons
* Update tests to use testify
* fix plugin launcher SetConfig msgpack params
The plugin launcher tool was passing the wrong byte array into
`SetConfig`, resulting in msgpack decoding errors. This was fixed in
a949050 (#6725) but accidentally reverted in 6aff18d (#6590).
Co-Authored-By: Chris Baker <1675087+cgbaker@users.noreply.github.com>
Extends the BasicAllocStats test to include a test for Windows
clients, exercising stats via a powershell `raw_exec` job.
Adds `ListLinuxClientNodes` and `ListWindowsClientNodes` utils so that
we can scope tests to run only when Linux or Windows clients are
available. This prevents waiting on timeouts when running a subset of
the tests against a development cluster (vs our nightly test
cluster).
* Adds a constraint to prevent tests from landing on Windows
* Improve Terraform output for mixed windows/linux clients
* Makes some Windows client config fixes from 0.10.2 testing
The provision shell script tries to install libssl1.1 package as dependency of ca-certificates package.
The installing of libssl requires to restart some services, and it asks for confirmation of this.
But there are no interactive session at this stage, so Vagrant provisioning just hungs.
This commit add a `libraries/restart-without-asking boolean true` setting before installing libssl,
so it doesn't ask confirmation anymore and the provisioning works again.
The test asserts that alloc counts get reported accurately in metrics by
inspecting the metrics endpoint directly. Sadly, the metrics as
collected by `armon/go-metrics` seem to be stateful and may contain info
from other tests.
This means that the test can fail depending on the order of returned
metrics.
Inspecting the metrics output of one failing run, you can see the
duplicate guage entries but for different node_ids:
```
{
"Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal",
"Value": 10,
"Labels": {
"datacenter": "dc1",
"node_class": "none",
"node_id": "67402bf4-00f3-bd8d-9fa8-f4d1924a892a"
}
},
{
"Name": "service-name.default-0a3ba4b6-2109-485e-be74-6864228aed3d.client.allocations.terminal",
"Value": 0,
"Labels": {
"datacenter": "dc1",
"node_class": "none",
"node_id": "a2945b48-7e66-68e2-c922-49b20dd4e20c"
}
},
```