open-nomad/website/content/docs/job-specification/artifact.mdx
Seth Hoenig 51a2212d3d
client: sandbox go-getter subprocess with landlock (#15328)
* client: sandbox go-getter subprocess with landlock

This PR re-implements the getter package for artifact downloads as a subprocess.

Key changes include

On all platforms, run getter as a child process of the Nomad agent.
On Linux platforms running as root, run the child process as the nobody user.
On supporting Linux kernels, uses landlock for filesystem isolation (via go-landlock).
On all platforms, restrict environment variables of the child process to a static set.
notably TMP/TEMP now points within the allocation's task directory
kernel.landlock attribute is fingerprinted (version number or unavailable)
These changes make Nomad client more resilient against a faulty go-getter implementation that may panic, and more secure against bad actors attempting to use artifact downloads as a privilege escalation vector.

Adds new e2e/artifact suite for ensuring artifact downloading works.

TODO: Windows git test (need to modify the image, etc... followup PR)

* landlock: fixup items from cr

* cr: fixup tests and go.mod file
2022-12-07 16:02:25 -06:00

256 lines
8 KiB
Plaintext

---
layout: docs
page_title: artifact Stanza - Job Specification
description: |-
The "artifact" stanza instructs Nomad to fetch and unpack a remote resource,
such as a file, tarball, or binary, and permits downloading artifacts from a
variety of locations using a URL as the input source.
---
# `artifact` Stanza
<Placement groups={['job', 'group', 'task', 'artifact']} />
The `artifact` stanza instructs Nomad to fetch and unpack a remote resource,
such as a file, tarball, or binary. Nomad downloads artifacts using the popular
[`go-getter`][go-getter] library, which permits downloading artifacts from a
variety of locations using a URL as the input source.
```hcl
job "docs" {
group "example" {
task "server" {
artifact {
source = "https://example.com/file.tar.gz"
destination = "local/some-directory"
options {
checksum = "md5:df6a4178aec9fbdc1d6d7e3634d1bc33"
}
}
}
}
}
```
Nomad supports downloading `http`, `https`, `git`, `hg` and `S3` artifacts. If
these artifacts are archived (`zip`, `tgz`, `bz2`, `xz`), they are
automatically unarchived before the starting the task.
## `artifact` Parameters
- `destination` `(string: "local/")` - Specifies the directory path to
download the artifact, relative to the root of the [task's working
directory]. If omitted, the default value is to place the artifact in
`local/`. The destination is treated as a directory unless `mode` is set to
`file`. Source files will be downloaded into that directory path. For more
details on how the `destination` interacts with task drivers, see the
[Filesystem internals] documentation.
- `mode` `(string: "any")` - One of `any`, `file`, or `dir`. If set to `file`
the `destination` must be a file, not a directory. By default the
`destination` will be `local/<filename>`.
- `options` `(map<string|string>: nil)` - Specifies configuration parameters to
fetch the artifact. The key-value pairs map directly to parameters appended to
the supplied `source` URL. Please see the [`go-getter`
documentation][go-getter] for a complete list of options and examples.
- `headers` `(map<string|string>: nil)` - Specifies HTTP headers to set when
fetching the artifact using `http` or `https` protocol. Please see the
[`go-getter` headers documentation][go-getter-headers] for more information.
- `source` `(string: <required>)` - Specifies the URL of the artifact to download.
See [`go-getter`][go-getter] for details.
## Operation Limits
The client [`artifact`][client_artifact] configuration can set limits to
specific artifact operations to prevent excessive data download or operation
time.
If a task's `artifact` retrieval exceeds one of those limits, the task will be
interrupted and fail to start. Refer to the task events for more information.
## `artifact` Examples
The following examples only show the `artifact` stanzas. Remember that the
`artifact` stanza is only valid in the placements listed above.
### Download File
This example downloads the artifact from the provided URL and places it in
`local/file.txt`. The `local/` path is relative to the [task's working
directory].
```hcl
artifact {
source = "https://example.com/file.txt"
}
```
To set HTTP headers in the request for the source the optional `headers` field
can be configured.
```hcl
artifact {
source = "https://example.com/file.txt"
headers {
User-Agent = "nomad-[${NOMAD_JOB_ID}]-[${NOMAD_GROUP_NAME}]-[${NOMAD_TASK_NAME}]"
X-Nomad-Alloc = "${NOMAD_ALLOC_ID}"
}
}
```
To use HTTP basic authentication, preprend the username and password to the
hostname in the URL. All special characters, including the username and
password, must be URL encoded. For example, for a username `exampleUser` and
the password `pass/word!`:
```hcl
artifact {
source = "https://exampleUser:pass%2Fword%21@example.com/file.txt"
}
```
### Download using git
This example downloads the artifact from the provided GitHub URL and places it at
`local/repo`, as specified by the optional `destination` parameter.
```hcl
artifact {
source = "git::https://github.com/hashicorp/nomad-guides"
destination = "local/repo"
}
```
To download from a private repo, sshkey needs to be set. The key must be
base64-encoded string. On Linux, you can run `base64 -w0 <file>` to encode the
file. Or use [HCL2](https://www.nomadproject.io/docs/job-specification/hcl2)
expressions to read and encode the key from a file on your machine:
```hcl
artifact {
# The git:: prefix forces go-getter's protocol detection to use the git ssh
# protocol. It can also automatically detect the protocol from the domain of
# some git hosting providers (such as GitHub) without the prefix.
source = "git::git@bitbucket.org:example/nomad-examples"
destination = "local/repo"
options {
# Make sure that the system known hosts file is populated:
# ssh-keyscan github.com | sudo tee -a /etc/ssh/ssh_known_hosts
# https://github.com/hashicorp/go-getter/issues/55
sshkey = "${base64encode(file("/etc/ssh/ssh_known_hosts"))}"
}
}
```
To clone specific refs or at a specific depth, use the `ref` and `depth`
options:
```hcl
artifact {
source = "git::https://github.com/hashicorp/nomad-guides"
destination = "local/repo"
options {
ref = "main"
depth = 1
}
}
```
### Download and Unarchive
This example downloads and unarchives the result in `local/file`. Because the
source URL is an archive extension, Nomad will automatically decompress it:
```hcl
artifact {
source = "https://example.com/file.tar.gz"
}
```
To disable automatic unarchiving, set the `archive` option to false:
```hcl
artifact {
source = "https://example.com/file.tar.gz"
options {
archive = false
}
}
```
### Download and Verify Checksums
This example downloads an artifact and verifies the resulting artifact's
checksum before proceeding. If the checksum is invalid, an error will be
returned.
```hcl
artifact {
source = "https://example.com/file.zip"
options {
checksum = "md5:df6a4178aec9fbdc1d6d7e3634d1bc33"
}
}
```
### Download from an S3-compatible Bucket
These examples download artifacts from Amazon S3. There are several different
types of [S3 bucket addressing][s3-bucket-addr] and [S3 region-specific
endpoints][s3-region-endpoints]. As of Nomad 0.6 non-Amazon S3-compatible
endpoints like [Minio] are supported, but you must explicitly set the "s3::"
prefix.
This example uses path-based notation on a publicly-accessible bucket:
```hcl
artifact {
source = "s3://my-bucket-example.s3-us-west-2.amazonaws.com/my_app.tar.gz"
}
```
If a bucket requires authentication, you can avoid the use of credentials by
using [EC2 IAM instance profiles][iam-instance-profiles]. If this is not possible,
credentials may be supplied via the `options` parameter:
```hcl
artifact {
options {
aws_access_key_id = "<id>"
aws_access_key_secret = "<secret>"
aws_access_token = "<token>"
}
}
```
To force the S3-specific syntax, use the `s3::` prefix:
```hcl
artifact {
source = "s3::https://my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz"
}
```
Alternatively you can use virtual hosted style:
```hcl
artifact {
source = "s3://my-bucket-example.s3-eu-west-1.amazonaws.com/my_app.tar.gz"
}
```
[client_artifact]: /docs/configuration/client#artifact-parameters
[go-getter]: https://github.com/hashicorp/go-getter 'HashiCorp go-getter Library'
[go-getter-headers]: https://github.com/hashicorp/go-getter#headers 'HashiCorp go-getter Headers'
[minio]: https://www.minio.io/
[s3-bucket-addr]: http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro 'Amazon S3 Bucket Addressing'
[s3-region-endpoints]: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region 'Amazon S3 Region Endpoints'
[iam-instance-profiles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html 'EC2 IAM instance profiles'
[task's working directory]: /docs/runtime/environment#task-directories 'Task Directories'
[filesystem internals]: /docs/concepts/filesystem#templates-artifacts-and-dispatch-payloads