This changeset implements the initial registration and fingerprinting
of CSI Plugins as part of #5378. At a high level, it introduces the
following:
* A `csi_plugin` stanza as part of a Nomad task configuration, to
allow a task to expose that it is a plugin.
* A new task runner hook: `csi_plugin_supervisor`. This hook does two
things. When the `csi_plugin` stanza is detected, it will
automatically configure the plugin task to receive bidirectional
mounts to the CSI intermediary directory. At runtime, it will then
perform an initial heartbeat of the plugin and handle submitting it to
the new `dynamicplugins.Registry` for further use by the client, and
then run a lightweight heartbeat loop that will emit task events
when health changes.
* The `dynamicplugins.Registry` for handling plugins that run
as Nomad tasks, in contrast to the existing catalog that requires
`go-plugin` type plugins and to know the plugin configuration in
advance.
* The `csimanager` which fingerprints CSI plugins, in a similar way to
`drivermanager` and `devicemanager`. It currently only fingerprints
the NodeID from the plugin, and assumes that all plugins are
monolithic.
Missing features
* We do not use the live updates of the `dynamicplugin` registry in
the `csimanager` yet.
* We do not deregister the plugins from the client when they shutdown
yet, they just become indefinitely marked as unhealthy. This is
deliberate until we figure out how we should manage deploying new
versions of plugins/transitioning them.
Consul provides a feature of Service Definitions where the tags
associated with a service can be modified through the Catalog API,
overriding the value(s) configured in the agent's service configuration.
To enable this feature, the flag enable_tag_override must be configured
in the service definition.
Previously, Nomad did not allow configuring this flag, and thus the default
value of false was used. Now, it is configurable.
Because Nomad itself acts as a state machine around the the service definitions
of the tasks it manages, it's worth describing what happens when this feature
is enabled and why.
Consider the basic case where there is no Nomad, and your service is provided
to consul as a boring JSON file. The ultimate source of truth for the definition
of that service is the file, and is stored in the agent. Later, Consul performs
"anti-entropy" which synchronizes the Catalog (stored only the leaders). Then
with enable_tag_override=true, the tags field is available for "external"
modification through the Catalog API (rather than directly configuring the
service definition file, or using the Agent API). The important observation
is that if the service definition ever changes (i.e. the file is changed &
config reloaded OR the Agent API is used to modify the service), those
"external" tag values are thrown away, and the new service definition is
once again the source of truth.
In the Nomad case, Nomad itself is the source of truth over the Agent in
the same way the JSON file was the source of truth in the example above.
That means any time Nomad sets a new service definition, any externally
configured tags are going to be replaced. When does this happen? Only on
major lifecycle events, for example when a task is modified because of an
updated job spec from the 'nomad job run <existing>' command. Otherwise,
Nomad's periodic re-sync's with Consul will now no longer try to restore
the externally modified tag values (as long as enable_tag_override=true).
Fixes#2057
Nomad jobs may be configured with a TaskGroup which contains a Service
definition that is Consul Connect enabled. These service definitions end
up establishing a Consul Connect Proxy Task (e.g. envoy, by default). In
the case where Consul ACLs are enabled, a Service Identity token is required
for these tasks to run & connect, etc. This changeset enables the Nomad Server
to recieve RPC requests for the derivation of SI tokens on behalf of instances
of Consul Connect using Tasks. Those tokens are then relayed back to the
requesting Client, which then injects the tokens in the secrets directory of
the Task.
This commit ensures that Alloc.AllocatedResources is properly populated
when read from persistence stores (namely Raft and client state store).
The alloc struct may have been written previously by an arbitrary old
version that may only populate Alloc.TaskResources.
Fixes#6853
Canonicalize jobs first before adding any sidecars. This fixes a bug
where sidecar tasks were added without interpolated names and broke
validation. Sidecar tasks must be canonicalized independently.
Also adds a group network to the mock connect job because it wasn't a
valid connect job before!
The existing version constraint uses logic optimized for package
managers, not schedulers, when checking prereleases:
- 1.3.0-beta1 will *not* satisfy ">= 0.6.1"
- 1.7.0-rc1 will *not* satisfy ">= 1.6.0-beta1"
This is due to package managers wishing to favor final releases over
prereleases.
In a scheduler versions more often represent the earliest release all
required features/APIs are available in a system. Whether the constraint
or the version being evaluated are prereleases has no impact on
ordering.
This commit adds a new constraint - `semver` - which will use Semver
v2.0 ordering when evaluating constraints. Given the above examples:
- 1.3.0-beta1 satisfies ">= 0.6.1" using `semver`
- 1.7.0-rc1 satisfies ">= 1.6.0-beta1" using `semver`
Since existing jobspecs may rely on the old behavior, a new constraint
was added and the implicit Consul Connect and Vault constraints were
updated to use it.
* client: improve group service stanza interpolation and check_restart support
Interpolation can now be done on group service stanzas. Note that some task runtime specific information
that was previously available when the service was registered poststart of a task is no longer available.
The check_restart stanza for checks defined on group services will now properly restart the allocation upon
check failures if configured.
Currently, using a Volume in a job uses the following configuration:
```
volume "alias-name" {
type = "volume-type"
read_only = true
config {
source = "host_volume_name"
}
}
```
This commit migrates to the following:
```
volume "alias-name" {
type = "volume-type"
source = "host_volume_name"
read_only = true
}
```
The original design was based due to being uncertain about the future of storage
plugins, and to allow maxium flexibility.
However, this causes a few issues, namely:
- We frequently need to parse this configuration during submission,
scheduling, and mounting
- It complicates the configuration from and end users perspective
- It complicates the ability to do validation
As we understand the problem space of CSI a little more, it has become
clear that we won't need the `source` to be in config, as it will be
used in the majority of cases:
- Host Volumes: Always need a source
- Preallocated CSI Volumes: Always needs a source from a volume or claim name
- Dynamic Persistent CSI Volumes*: Always needs a source to attach the volumes
to for managing upgrades and to avoid dangling.
- Dynamic Ephemeral CSI Volumes*: Less thought out, but `source` will probably point
to the plugin name, and a `config` block will
allow you to pass meta to the plugin. Or will
point to a pre-configured ephemeral config.
*If implemented
The new design simplifies this by merging the source into the volume
stanza to solve the above issues with usability, performance, and error
handling.
* nomad: add admission controller framework
* nomad: add admission controller framework and Consul Connect hooks
* run admission controllers before checking permissions
* client: add default node meta for connect configurables
* nomad: remove validateJob func since it has been moved to admission controller
* nomad: use new TaskKind type
* client: use consts for connect sidecar image and log level
* Apply suggestions from code review
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
* nomad: add job register test with connect sidecar
* Update nomad/job_endpoint_hooks.go
Co-Authored-By: Michael Schurter <mschurter@hashicorp.com>
Track the download status of each artifact independently so that if only
one of many artifacts fails to download, completed artifacts aren't
downloaded again.
IOPS have been modelled as a resource since Nomad 0.1 but has never
actually been detected and there is no plan in the short term to add
detection. This is because IOPS is a bit simplistic of a unit to define
the performance requirements from the underlying storage system. In its
current state it adds unnecessary confusion and can be removed without
impacting any users. This PR leaves IOPS defined at the jobspec parsing
level and in the api/ resources since these are the two public uses of
the field. These should be considered deprecated and only exist to allow
users to stop using them during the Nomad 0.9.x release. In the future,
there should be no expectation that the field will exist.