Commit Graph

2341 Commits

Author SHA1 Message Date
Mahmood Ali 61e66cb077
Merge pull request #6427 from hashicorp/b-fs-endpoint-errors
agent: report fs log errors as http errors
2019-10-15 20:12:59 -04:00
Mahmood Ali 88f8127820 tests: avoid using unnecessary pipe 2019-10-15 17:22:03 -04:00
Mahmood Ali e6d5635e1a
Merge pull request #6425 from hashicorp/f-cli-show-full-ids
cli: show full id for single node or alloc status
2019-10-15 10:54:25 -04:00
Danielle fee482ae6c
Merge pull request #6331 from hashicorp/dani/f-volume-mount-propagation
volumes: Add support for mount propagation
2019-10-14 14:29:40 +02:00
Danielle Lancashire 4fbcc668d0
volumes: Add support for mount propagation
This commit introduces support for configuring mount propagation when
mounting volumes with the `volume_mount` stanza on Linux targets.

Similar to Kubernetes, we expose 3 options for configuring mount
propagation:

- private, which is equivalent to `rprivate` on Linux, which does not allow the
           container to see any new nested mounts after the chroot was created.

- host-to-task, which is equivalent to `rslave` on Linux, which allows new mounts
                that have been created _outside of the container_ to be visible
                inside the container after the chroot is created.

- bidirectional, which is equivalent to `rshared` on Linux, which allows the
                 container to see new mounts created on the host and,
                 importantly, _allows the container to create mounts that are
                 visible in other containers and on the host_

private and host-to-task are safe, but bidirectional mounts can be
dangerous: if the code inside a container creates a mount and does
not clean it up before the container is torn down, it can cause bad
things to happen inside the kernel.

To add a layer of safety here, we require that the user has ReadWrite
permissions on the volume before allowing bidirectional mounts, as a
defense in depth / validation case, although creating mounts should also require
a privileged execution environment inside the container.
2019-10-14 14:09:58 +02:00
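The three options described in 4fbcc668d0 can be read as a small lookup plus one permission check. The following is a minimal sketch in Go, not the code from the commit: `resolvePropagation`, the `hasReadWrite` parameter, and the choice of `private` as the default for an empty mode are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// propagationModes maps the three user-facing options from the commit
// message to the Linux propagation flag each is described as equivalent to.
var propagationModes = map[string]string{
	"private":       "rprivate",
	"host-to-task":  "rslave",
	"bidirectional": "rshared",
}

// resolvePropagation is a hypothetical helper. It returns the Linux
// propagation flag for mode and rejects bidirectional mounts unless the
// caller holds read-write permission on the volume.
func resolvePropagation(mode string, hasReadWrite bool) (string, error) {
	if mode == "" {
		mode = "private" // assumed default, not stated in the commit message
	}
	flag, ok := propagationModes[mode]
	if !ok {
		return "", fmt.Errorf("unknown propagation mode %q", mode)
	}
	if mode == "bidirectional" && !hasReadWrite {
		return "", errors.New("bidirectional propagation requires read-write permission on the volume")
	}
	return flag, nil
}

func main() {
	flag, err := resolvePropagation("host-to-task", false)
	fmt.Println(flag, err)

	_, err = resolvePropagation("bidirectional", false)
	fmt.Println(err)
}
```

Keeping the permission check next to the lookup means a read-only claim on a volume can never resolve to `rshared`, which is the safety property the commit message describes.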
Danielle 2640155ae5
Merge pull request #6429 from hashicorp/f-log-to-file
Add support for logging to a file
2019-10-11 13:35:39 +02:00
Danielle Lancashire 5cedf6d024
logging: Correctly track number of written bytes
Currently this assumes that a short write will never happen. While short
writes are improbable in any case where rotation being off by a few bytes
would matter, this now correctly tracks the number of written bytes.
2019-10-10 14:02:14 +02:00
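The fix in 5cedf6d024 comes down to counting the `n` returned by `Write` instead of `len(p)`. A minimal sketch of that idea follows; `countingWriter` and `shortWriter` are hypothetical types written for this example and are not taken from the logger itself.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// countingWriter is a hypothetical wrapper that records how many bytes the
// underlying writer actually accepted, rather than assuming len(p).
type countingWriter struct {
	w       io.Writer
	written int64
}

func (c *countingWriter) Write(p []byte) (int, error) {
	n, err := c.w.Write(p)
	c.written += int64(n) // count n, not len(p): a short write returns n < len(p)
	return n, err
}

// shortWriter accepts at most limit bytes per call, simulating a short write.
type shortWriter struct {
	buf   bytes.Buffer
	limit int
}

func (s *shortWriter) Write(p []byte) (int, error) {
	if len(p) > s.limit {
		n, _ := s.buf.Write(p[:s.limit])
		return n, io.ErrShortWrite
	}
	return s.buf.Write(p)
}

func main() {
	cw := &countingWriter{w: &shortWriter{limit: 4}}
	n, err := cw.Write([]byte("hello world"))
	fmt.Println(n, cw.written, err)
}
```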
Danielle Lancashire b67215d4f8
logging: Sort files when pruning old logs
Currently this logging implementation depends on the order of files
returned by filepath.Glob. Although the internal methods it relies on are
documented to be lexicographical, filepath.Glob itself does not publicly
document this. Here we defensively re-sort.
2019-10-10 13:51:16 +02:00
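The defensive sort from b67215d4f8 can be sketched as below. This is an illustration under assumptions, not the logger's code: `pruneOldLogs` and `demoPrune` are hypothetical names, and the sketch assumes file names sort lexically in age order (for example because they embed a timestamp or sequence number).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// pruneOldLogs is a hypothetical helper. It removes the oldest files that
// match pattern so that at most maxFiles remain, and returns the removed paths.
func pruneOldLogs(pattern string, maxFiles int) ([]string, error) {
	if maxFiles < 0 {
		maxFiles = 0
	}
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	// Do not rely on the order Glob happens to return; sort explicitly.
	sort.Strings(matches)
	if len(matches) <= maxFiles {
		return nil, nil
	}
	stale := matches[:len(matches)-maxFiles]
	for _, path := range stale {
		if err := os.Remove(path); err != nil {
			return nil, err
		}
	}
	return stale, nil
}

// demoPrune creates three empty log files in a temporary directory, keeps
// the newest two, and returns the base names of the files that were removed.
func demoPrune() ([]string, error) {
	dir, err := os.MkdirTemp("", "prune-example")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"agent-3.log", "agent-1.log", "agent-2.log"} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			return nil, err
		}
	}
	removed, err := pruneOldLogs(filepath.Join(dir, "agent-*.log"), 2)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(removed))
	for _, path := range removed {
		names = append(names, filepath.Base(path))
	}
	return names, nil
}

func main() {
	names, err := demoPrune()
	fmt.Println(names, err)
}
```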
Mahmood Ali 4b2ba62e35 acl: check ACL against object namespace
Fix a bug where a malicious user can access or manipulate an alloc in a
namespace they don't have access to.  The allocation endpoints perform
ACL checks against the request namespace, not the allocation namespace,
and perform the allocation lookup independently from namespaces.

Here, we check that the requester can access the alloc namespace
regardless of the declared request namespace.

Ideally, we'd enforce that the declared request namespace matches
the actual allocation namespace.  Unfortunately, we haven't documented
alloc endpoints as namespaced functions; we suspect starting to enforce
this will be very disruptive and inappropriate for a nomad point
release.  As such, we maintain current behavior that doesn't require
passing the proper namespace in request.  A future major release may
start enforcing checking declared namespace.
2019-10-08 12:59:22 -04:00
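The check described in 4b2ba62e35 can be sketched as follows. All types and names here (`Allocation`, `ACL`, `AllowRead`, `canReadAlloc`) are hypothetical stand-ins invented for this example; they are not Nomad's own structures.

```go
package main

import "fmt"

// Allocation is a hypothetical stand-in for an allocation record.
type Allocation struct {
	ID        string
	Namespace string
}

// ACL is a hypothetical stand-in listing the namespaces a token may read.
type ACL struct {
	readable map[string]bool
}

// AllowRead reports whether the token may read objects in ns.
func (a ACL) AllowRead(ns string) bool { return a.readable[ns] }

// canReadAlloc authorizes against the namespace the allocation actually
// lives in. requestNamespace is accepted but deliberately not trusted and
// not required to match, mirroring the compatibility choice in the commit.
func canReadAlloc(acl ACL, requestNamespace string, alloc Allocation) bool {
	_ = requestNamespace
	return acl.AllowRead(alloc.Namespace)
}

func main() {
	acl := ACL{readable: map[string]bool{"dev": true}}
	alloc := Allocation{ID: "15ae54cd", Namespace: "prod"}

	// The caller declares "dev", which it can read, but the alloc is in "prod".
	fmt.Println(canReadAlloc(acl, "dev", alloc))
}
```

Checking the declared namespace instead would have allowed this request, which is the bug the commit closes.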
Mahmood Ali 3c0d8c7611
Merge pull request #6441 from hashicorp/b-agent-token
Redact replication tokens in /agent/self
2019-10-08 12:55:45 -04:00
Danielle Lancashire 9eaac48f25
agent: Refactor log setup to support log-to-file 2019-10-07 14:42:32 +02:00
Danielle Lancashire 442f4888b3
agent: Introduce File Logger
This commit introduces a rotating file logger for Nomad Agent Logs. The
logger implementation itself is a lift and shift from Consul, with tests
updated to fit with the Nomad pattern of using require, and not having a
testutil for creating tempdirs cleanly.
2019-10-07 14:37:31 +02:00
Danielle Lancashire d3614ea0a8
config: Add required configuration for logging to a file 2019-10-07 14:16:59 +02:00
Mahmood Ali d09355efe4 cli: show full id for single node or alloc status
Show full ID on individual alloc or node status views.  Shortening
the ID isn't very helpful in these cases, and makes looking up the full
id slightly more complicated when a user needs to interact with the API.

List views are unmodified and show short id unless `-verbose` flag is passed.

Before
```
$ nomad node status -self | head -n2
ID            = 21fc51f9
Name          = mars-2.local

$ nomad alloc status 15ae54cd | head -n3
ID                  = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3
Eval ID             = a6b15f86
Name                = example.cache[0]
```

After:
```
$ nomad node status -self | head -n2
ID            = 21fc51f9-fd39-0fa0-fb41-f34c7aa36101
Name          = mars-2.local

$ nomad alloc status 15ae54cd | head -n3
ID                  = 15ae54cd-08dd-3681-03cf-4c23ace7e7c3
Eval ID             = a6b15f86-ca8e-e536-b544-4bfb43137ff3
Name                = example.cache[0]
```
2019-10-04 16:36:18 -04:00
Mahmood Ali 317e0f9e44 agent: report fs log errors as http errors
This fixes two bugs:

First, the FS Logs API endpoint only propagated an error back to the user
if it was encoded with a code, which isn't common.  Other errors were
suppressed and callers got an empty response with a 200 status code.  Now,
these endpoints return a 500 status code along with the error message.

Before
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:47:21 GMT
< Content-Length: 0
<
* Connection #0 to host 127.0.0.1 left intact
```

After
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout"; echo
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera?follow=false&offset=0&origin=start&region=global&task=redis&type=stdout HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:48:12 GMT
< Content-Length: 60
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
alloc lookup failed: index error: UUID must be 36 characters
```

Second, we now return a 400 status code for request validation errors.

Before
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:47:29 GMT
< Content-Length: 22
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
must provide task name
```

After
```
$ curl -v "http://127.0.0.1:4646/v1/client/fs/logs/qwerqwera"; echo
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 4646 (#0)
> GET /v1/client/fs/logs/qwerqwera HTTP/1.1
> Host: 127.0.0.1:4646
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Vary: Accept-Encoding
< Vary: Origin
< Date: Fri, 04 Oct 2019 19:49:18 GMT
< Content-Length: 22
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 127.0.0.1 left intact
must provide task name
```
2019-10-04 16:33:58 -04:00
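The status mapping described in 317e0f9e44 can be sketched as a single function. This is an illustration only: `codedError`, `validationError`, and `statusForError` are hypothetical names, and the assumption that coded errors expose a `Code()` method is mine rather than something stated in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// codedError is an assumed shape for errors that carry their own HTTP status.
type codedError interface {
	error
	Code() int
}

// validationError is a hypothetical type marking request validation failures.
type validationError struct{ msg string }

func (e validationError) Error() string { return e.msg }

// errLookup reuses the lookup failure text shown in the curl output above.
var errLookup = errors.New("alloc lookup failed: index error: UUID must be 36 characters")

// statusForError picks the HTTP status for err: an explicit code wins,
// validation failures become 400, and everything else becomes 500 instead
// of being silently dropped with a 200.
func statusForError(err error) int {
	if err == nil {
		return http.StatusOK
	}
	var ce codedError
	if errors.As(err, &ce) {
		return ce.Code()
	}
	var ve validationError
	if errors.As(err, &ve) {
		return http.StatusBadRequest
	}
	return http.StatusInternalServerError
}

func main() {
	fmt.Println(statusForError(validationError{"must provide task name"}))
	fmt.Println(statusForError(errLookup))
}
```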
Lang Martin fb41dd86ba default raft protocol v2 2019-09-24 14:37:55 -04:00
Tim Gross cd9c23617f
client/connect: ConsulProxy LocalServicePort/Address (#6358)
Without a `LocalServicePort`, Connect services will try to use the
mapped port even when delivering traffic locally. A user can override
this behavior by pinning the port value in the `service` stanza but
this prevents us from using the Consul service name to reach the
service.

This commit configures the Consul proxy with its `LocalServicePort`
and `LocalServiceAddress` fields.
2019-09-23 14:30:48 -04:00
Danielle Lancashire 39fe07f66b
api: Redact tokens in /agent/self 2019-09-23 19:07:27 +02:00
Danielle Lancashire 8b44369073
api: Redact ACL Replication Token
Currently, hitting the /v1/agent/self API with ACL Replication
enabled results in the token being returned in the response. This commit
redacts that information, as it should be treated as a shared secret.
2019-09-22 14:35:53 +02:00
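The redaction in 8b44369073 can be sketched as copying the configuration and masking the secret before it is serialized. `ACLConfig`, `redactACL`, and the `<redacted>` placeholder string are assumptions for this example, not names confirmed by the commit message.

```go
package main

import "fmt"

// ACLConfig is a hypothetical, trimmed-down view of the agent's ACL settings.
type ACLConfig struct {
	Enabled          bool
	ReplicationToken string
}

// redacted is an assumed placeholder value.
const redacted = "<redacted>"

// redactACL returns a copy that is safe to include in an API response.
// cfg is passed by value, so the caller's configuration is left untouched.
// An empty token stays empty, so the response still shows whether one is set.
func redactACL(cfg ACLConfig) ACLConfig {
	if cfg.ReplicationToken != "" {
		cfg.ReplicationToken = redacted
	}
	return cfg
}

func main() {
	live := ACLConfig{Enabled: true, ReplicationToken: "example-secret"}
	fmt.Printf("%+v\n", redactACL(live))
	fmt.Println(live.ReplicationToken == "example-secret")
}
```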
Chris Baker 6f38cca15a
fixed incorrect CLI documentation in `job deployments`
listed `-all-allocs` instead of `-all`
2019-09-20 12:24:53 -05:00
Mahmood Ali b4a7585e5e
Merge pull request #6328 from hashicorp/b-gh-6269
cli: emit job version number proper
2019-09-17 19:06:44 -04:00
Tim Gross e3e30c15a9
remove resolved TODO from UpdateTTL docstring (#6336) 2019-09-16 16:26:06 -04:00
Mahmood Ali df8a168d06 cli: emit job version number proper
We must emit the alloc's job version number rather than the address of the field.
2019-09-13 19:04:32 -04:00
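A common way to end up printing an address instead of a number in Go is to format a pointer field without dereferencing it. The sketch below assumes that is the kind of bug df8a168d06 describes; `Job`, its `Version` field, and `versionString` are hypothetical and only illustrate the pattern.

```go
package main

import "fmt"

// Job is a hypothetical struct whose version is stored behind a pointer.
type Job struct {
	Version *uint64
}

// versionString dereferences the pointer so the number itself is emitted,
// and guards against a nil pointer.
func versionString(j Job) string {
	if j.Version == nil {
		return "unknown"
	}
	return fmt.Sprint(*j.Version)
}

func main() {
	v := uint64(3)
	job := Job{Version: &v}

	// Formatting the pointer prints a memory address, not the version.
	fmt.Println(fmt.Sprint(job.Version) == "3")

	// Dereferencing prints the version.
	fmt.Println(versionString(job))
}
```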
Danielle Lancashire 78b61de45f
config: Hoist volume.config.source into volume
Currently, using a Volume in a job uses the following configuration:

```
volume "alias-name" {
  type = "volume-type"
  read_only = true

  config {
    source = "host_volume_name"
  }
}
```

This commit migrates to the following:

```
volume "alias-name" {
  type = "volume-type"
  source = "host_volume_name"
  read_only = true
}
```

The original design was chosen because we were uncertain about the future of
storage plugins and wanted to allow maximum flexibility.

However, this causes a few issues, namely:
- We frequently need to parse this configuration during submission,
scheduling, and mounting
- It complicates the configuration from an end user's perspective
- It complicates the ability to do validation

As we understand the problem space of CSI a little more, it has become
clear that we won't need the `source` to be in config, as it will be
used in the majority of cases:

- Host Volumes: Always need a source
- Preallocated CSI Volumes: Always needs a source from a volume or claim name
- Dynamic Persistent CSI Volumes*: Always needs a source to attach the volumes
                                   to for managing upgrades and to avoid dangling.
- Dynamic Ephemeral CSI Volumes*: Less thought out, but `source` will probably point
                                  to the plugin name, and a `config` block will
                                  allow you to pass meta to the plugin. Or will
                                  point to a pre-configured ephemeral config.
*If implemented

The new design simplifies this by merging the source into the volume
stanza to solve the above issues with usability, performance, and error
handling.
2019-09-13 04:37:59 +02:00
Mahmood Ali 877260afd8 fix 'nomad namespace apply' help
Named arguments need to precede positional arguments.
2019-09-09 10:04:41 -07:00
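The ordering rule in 877260afd8 matches how Go's standard `flag` package behaves: parsing stops at the first non-flag argument. The sketch below assumes the CLI parses arguments this way; `parseApply` and the exact flag set are illustrative, not copied from the command.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseApply is a hypothetical parser for a command with one named
// argument (-description) followed by positional arguments.
func parseApply(args []string) (description string, positional []string, err error) {
	fs := flag.NewFlagSet("namespace apply", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the example quiet on parse errors
	fs.StringVar(&description, "description", "", "namespace description")
	if err = fs.Parse(args); err != nil {
		return "", nil, err
	}
	return description, fs.Args(), nil
}

func main() {
	// Named argument first: the flag is parsed.
	d, rest, _ := parseApply([]string{"-description", "QA", "qa"})
	fmt.Printf("%q %q\n", d, rest)

	// Positional argument first: parsing stops, the flag is left unparsed.
	d, rest, _ = parseApply([]string{"qa", "-description", "QA"})
	fmt.Printf("%q %q\n", d, rest)
}
```

In the second call the description stays empty and all three tokens come back as positional arguments, which is why the help text has to show flags before the namespace name.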
Nomad Release bot dc7d728a82 Generate files for 0.10.0-beta1 release 2019-09-06 18:47:09 +00:00
Michael Schurter 31eb8375e5
Merge pull request #6282 from hashicorp/f-connect-dev-path
connect: check if consul is on PATH
2019-09-05 12:25:23 -07:00
Michael Schurter 457684e34e connect: check if consul is on PATH
Only in -dev-connect mode for now since it's valid to install Consul
after Nomad has started in production.
2019-09-05 12:05:42 -07:00
Jasmine Dahilig e1c73cdab5
add validation for job_gc_interval (#6277) 2019-09-05 11:20:46 -07:00
Mahmood Ali 6d73ca0cfb
Merge pull request #6250 from hashicorp/f-raft-protocol-v3
Update default raft protocol to version 3
2019-09-04 09:34:41 -04:00
Tim Gross 0f29dcc935
support script checks for task group services (#6197)
In Nomad prior to Consul Connect, all Consul checks work the same
except for Script checks. Because the Task being checked is running in
its own container namespaces, the check is executed by Nomad in the
Task's context. If the Script check passes, Nomad uses the TTL check
feature of Consul to update the check status. This means in order to
run a Script check, we need to know what Task to execute it in.

To support Consul Connect, we need Group Services, and these need to
be registered in Consul along with their checks. We could push the
Service down into the Task, but this doesn't work if someone wants to
associate a service with a task's ports, but do script checks in
another task in the allocation.

Because Nomad is handling the Script check and not Consul anyways,
this moves the script check handling into the task runner so that the
task runner can own the script check's configuration and
lifecycle. This will allow us to pass the group service check
configuration down into a task without associating the service itself
with the task.

When tasks are checked for script checks, we walk back through their
task group to see if there are script checks associated with the
task. If so, we'll spin off script check tasklets for them. The
group-level service and any restart behaviors it needs are entirely
encapsulated within the group service hook.
2019-09-03 15:09:04 -04:00
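The "walk back through the task group" step from 0f29dcc935 can be sketched as a filter over group-level services. The types and the `TaskName` field on a check are hypothetical here; they stand in for whatever the task runner actually inspects.

```go
package main

import "fmt"

// Check, Service, and TaskGroup are hypothetical, trimmed-down stand-ins.
type Check struct {
	Name     string
	Type     string // e.g. "script", "http", "tcp"
	TaskName string // task the script should execute in (assumed field)
}

type Service struct {
	Name   string
	Checks []Check
}

type TaskGroup struct {
	Services []Service
}

// scriptChecksForTask walks the group's services and returns the script
// checks that should run inside the named task. The services themselves
// stay at the group level; only the checks are handed to the task.
func scriptChecksForTask(tg TaskGroup, task string) []Check {
	var out []Check
	for _, svc := range tg.Services {
		for _, c := range svc.Checks {
			if c.Type == "script" && c.TaskName == task {
				out = append(out, c)
			}
		}
	}
	return out
}

func main() {
	tg := TaskGroup{Services: []Service{{
		Name: "api",
		Checks: []Check{
			{Name: "alive", Type: "http"},
			{Name: "migrations", Type: "script", TaskName: "worker"},
		},
	}}}
	for _, c := range scriptChecksForTask(tg, "worker") {
		fmt.Println(c.Name)
	}
}
```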
Jasmine Dahilig 4edebe389a
add default update stanza and max_parallel=0 disables deployments (#6191) 2019-09-02 10:30:09 -07:00
Evan Ercolano fcf66918d0 Remove unused canary param from MakeTaskServiceID 2019-08-31 16:53:23 -04:00
Michael Schurter 4bd53deba9
Merge pull request #6236 from hashicorp/b-ignore-connect-services
consul: ignore connect services when syncing
2019-08-30 13:11:09 -07:00
Michael Schurter 67b7bc1e90 consul: ignore connect services when syncing
Consul registers Connect services automatically; however, Nomad thinks it
owns them due to the _nomad prefix. Since the services are managed by
Consul, Nomad needs to explicitly ignore them, or otherwise they will be
removed.
2019-08-30 11:53:41 -07:00
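The sync rule from 67b7bc1e90 can be sketched as a filter applied before deregistration. This is a sketch under assumptions: the `-sidecar-proxy` suffix for Consul-managed sidecar IDs and every function name below are mine, and only the `_nomad` prefix comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	nomadPrefix   = "_nomad"         // prefix named in the commit message
	sidecarSuffix = "-sidecar-proxy" // assumed naming for Consul-managed sidecars
)

// isManagedSidecar reports whether id looks like the sidecar of a service
// that is still known to Nomad.
func isManagedSidecar(id string, known map[string]bool) bool {
	if !strings.HasSuffix(id, sidecarSuffix) {
		return false
	}
	return known[strings.TrimSuffix(id, sidecarSuffix)]
}

// servicesToDeregister returns the IDs that should be removed from Consul:
// Nomad-prefixed, no longer known, and not a Consul-managed sidecar.
func servicesToDeregister(inConsul []string, known map[string]bool) []string {
	var stale []string
	for _, id := range inConsul {
		switch {
		case !strings.HasPrefix(id, nomadPrefix):
			// Not registered by Nomad; leave it alone.
		case known[id]:
			// Still wanted.
		case isManagedSidecar(id, known):
			// Registered by Consul on the service's behalf; ignore it.
		default:
			stale = append(stale, id)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	known := map[string]bool{"_nomad-task-web": true}
	inConsul := []string{
		"_nomad-task-web",
		"_nomad-task-web-sidecar-proxy",
		"_nomad-task-old",
		"external-db",
	}
	fmt.Println(servicesToDeregister(inConsul, known))
}
```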
Tim Gross b79021adfd cli: split -dev and -dev-connect flags 2019-08-30 09:33:30 -04:00
Mahmood Ali 6eabf53b91 Default raft protocol to version 3 2019-08-28 15:56:59 -04:00
Nick Ethier 9e96971a75
cli: display group ports and address in alloc status command output (#6189)
* cli: display group ports and address in alloc status command output

* add assertions for port.To = -1 case and convert assertions to testify
2019-08-27 23:59:36 -04:00
Jasmine Dahilig ffceab0879
remove network stanza from job init --short example jobspec (#6179) 2019-08-27 07:36:32 -07:00
Tim Gross 11030f7aa0 init: add generated assets into bindata 2019-08-26 14:24:15 -04:00
Tim Gross 4d4461d1f5 agent: -dev=connect mode bind to 0.0.0.0
The dev mode flag for connect was binding to the default interface's
IP, but this makes for a bad user experience for the CLI which will
default to 127.0.0.1. If we bind to 0.0.0.0 instead the CLI will work
without further configuration by the user.
2019-08-23 13:51:16 -04:00
Jerome Gravel-Niquet cbdc1978bf Consul service meta (#6193)
* adds meta object to service in job spec, sends it to consul

* adds tests for service meta

* fix tests

* adds docs

* better hashing for service meta, use helper for copying meta when registering service

* tried to be DRY, but looks like it would be more work to use the
helper function
2019-08-23 12:49:02 -04:00
Michael Schurter 95b8048553
Merge pull request #6121 from hashicorp/f-connect-bootstrap
connect: task hook for bootstrapping envoy sidecar
2019-08-22 10:58:31 -07:00
Michael Schurter 59e0b67c7f connect: task hook for bootstrapping envoy sidecar
Fixes #6041

Unlike all other Consul operations, bootstrapping requires Consul to be
available. This PR tries Consul 3 times with a backoff to account for
the group services being asynchronously registered with Consul.
2019-08-22 08:15:32 -07:00
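The "3 times with a backoff" behavior in 59e0b67c7f can be sketched as a small retry loop. The doubling delay, the function name `retryWithBackoff`, and the `errNotReady` sentinel are assumptions for this example; the commit message does not say which backoff schedule is used.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error for a service Consul has not seen yet.
var errNotReady = errors.New("service not registered yet")

// retryWithBackoff calls op up to attempts times, sleeping base, 2*base,
// 4*base, ... between failures (an assumed schedule). It returns nil on the
// first success, or the last error wrapped with the attempt count.
func retryWithBackoff(attempts int, base time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(base << uint(i))
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retryWithBackoff(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady // registration has not landed yet
		}
		return nil
	})
	fmt.Println(calls, err)
}
```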
Danielle Lancashire 2e5f28029f
remove hidden field from host volumes
We're not shipping support for "hidden" volumes in 0.10 any more. I'll
convert this to an issue+mini RFC for future enhancement.
2019-08-22 08:48:05 +02:00
Danielle c280e97619
Merge pull request #6184 from hashicorp/dani/fix-api
api: Fix definition of HostVolumeInfo
2019-08-22 00:13:28 +02:00
Danielle Lancashire 112b986736
api: Fix definition of HostVolumeInfo 2019-08-21 22:34:41 +02:00
Danielle Lancashire 9df7e0eb72
clientconfig: Fix parsing multiple host volumes 2019-08-21 22:19:58 +02:00
Michael Schurter 050cc32fde
Merge pull request #6157 from hashicorp/f-connect-register
Register connect enabled group services with Consul
2019-08-20 14:45:38 -07:00
Michael Schurter b008fd1724 connect: register group services with Consul
Fixes #6042

Add new task group service hook for registering group services like
Connect-enabled services.

Does not yet support checks.
2019-08-20 12:25:10 -07:00