Nomad inherited protocol version numbering configuration from Consul and
Serf, but unlike those projects Nomad has never used it. Nomad's
`protocol_version` has always been `1`.
While the code is effectively unused and therefore poses no runtime
risks to leave, I felt like removing it was best because:
1. Nomad's RPC subsystem has been able to evolve extensively without
needing to increment the version number.
2. Nomad's HTTP API has evolved extensively without increment
`API{Major,Minor}Version`. If we want to version the HTTP API in the
future, I doubt this is the mechanism we would choose.
3. The presence of the `server.protocol_version` configuration
parameter is confusing since `server.raft_protocol` *is* an important
parameter for operators to consider. Even more confusing is that
there is a distinct Serf protocol version which is included in `nomad
server members` output under the heading `Protocol`. `raft_protocol`
is the *only* protocol version relevant to Nomad developers and
operators. The other protocol versions are either deadcode or have
never changed (Serf).
4. If we were to need to version the RPC, HTTP API, or Serf protocols, I
don't think these configuration parameters and variables are the best
choice. If we come to that point we should choose a versioning scheme
based on the use case and modern best practices -- not this 6+ year
old dead code.
FailoverHeartbeatTTL is the amount of time to wait after a server leader failure
before considering reallocating client tasks. This TTL should be fairly long as
the new server leader needs to rebuild the entire heartbeat map for the
cluster. In deployments with a small number of machines, the default TTL (5m)
may be unnecessary long. Let's allow operators to configure this value in their
config files.
* remove event durability
temporarily removing go-memdb event durability until a new strategy is developed on how to best handled increased durability needs
* drop events table schema and state store methods
* fix neweventbuffer invocations
properly wire up durable event count
move newline responsibility
moves newline creation from NDJson to the http handler, json stream only encodes and sends now
ignore snapshot restore if broker is disabled
enable dev mode to access event steam without acl
use mapping instead of switch
use pointers for config sizes, remove unused ttl, simplify closed conn logic
This changeset implements a periodic garbage collection of CSI volumes
with missing allocations. This can happen in a scenario where a node
update fails partially and the allocation updates are written to raft
but the evaluations to GC the volumes are dropped. This feature will
cover this edge case and ensure that upgrades from 0.11.0 and 0.11.1
get any stray claims cleaned up.
This changeset implements a periodic garbage collection of unused CSI
plugins. Plugins are self-cleaning when the last allocation for a
plugin is stopped, but this feature will cover any missing edge cases
and ensure that upgrades from 0.11.0 and 0.11.1 get any stray plugins
cleaned up.
allow oss to parse sink duration
clean up audit sink parsing
ent eventer config reload
fix typo
SetEnabled to eventer interface
client acl test
rm dead code
fix failing test
This change provides an initial pass at setting up the configuration necessary to
enable use of Connect with Consul ACLs. Operators will be able to pass in a Consul
Token through `-consul-token` or `$CONSUL_TOKEN` in the `job run` and `job revert`
commands (similar to Vault tokens).
These values are not actually used yet in this changeset.
This exposes a client flag to disable nomad remote exec support in
environments where access to tasks ought to be restricted.
I used `disable_remote_exec` client flag that defaults to allowing
remote exec. Opted for a client config that can be used to disable
remote exec globally, or to a subset of the cluster if necessary.
This fixes a bug with JSON agent configuration parsing where the AST
for the plugin stanza had unnecessary flattening originating from hcl parsing
library. The workaround fixes the AST by popping off the flattened element and wrapping
it in a list. The workaround comes from similar code in terraform.
There were no existing test cases for json parsing so I added a few.
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.
This commit:
* Improves how we combine the old retry-* fields and the new stanza and
how it is validated
* Handles the new stanza setting start_join
* Fixes integration test to not bind to the standard port and instead be
randomized.
* Simplifies parsing of the old retry_interval
* Fixes the errors from retry join being masked
* Flags get parsed into new server_join stanza