open-nomad/client
Michael Schurter 27b8112123 connect: write envoy bootstrap debugging info
When Consul Connect just works, it's wonderful. When it doesn't work, it
can be exceedingly difficult to debug: operators have to check task
events, Nomad logs, Consul logs, and Consul APIs, and even then critical
information is missing.

Using Consul to generate a bootstrap config for Envoy is notoriously
difficult. Nomad doesn't even log stderr, so operators are left trying
to piece together what went wrong.

This patch attempts to provide *maximal* context, which unfortunately
includes secrets. **Secrets are always restricted to the secrets/
directory.** Keeping them there makes debugging a little harder, but the
files still let operators see exactly what operation Nomad was trying to
perform.

What's added:

- stderr is sent to alloc/logs/envoy_bootstrap.stderr.0
- the CLI is written to secrets/.envoy_bootstrap.cmd
- the environment is written to secrets/.envoy_bootstrap.env as JSON
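
As a rough illustration of the three items above, here is a minimal Go
sketch. It is not the code from this patch: the function names
(`envToMap`, `writeBootstrapDebug`), the file modes, and the directory
arguments are all assumptions made for the example. Only the three file
names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// envToMap converts KEY=VALUE pairs (the form returned by os.Environ)
// into a map so they can be serialized as a JSON object.
// Hypothetical helper, not taken from Nomad.
func envToMap(env []string) map[string]string {
	m := make(map[string]string, len(env))
	for _, kv := range env {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) != 2 {
			continue // skip malformed entries
		}
		m[parts[0]] = parts[1]
	}
	return m
}

// writeBootstrapDebug writes the command line and environment under
// secretsDir and the captured stderr under logsDir.
// Hypothetical function; the 0700/0600/0644 modes are assumptions.
func writeBootstrapDebug(secretsDir, logsDir string, args, env []string, stderr []byte) error {
	for _, dir := range []string{secretsDir, logsDir} {
		if err := os.MkdirAll(dir, 0o700); err != nil {
			return fmt.Errorf("creating %s: %w", dir, err)
		}
	}

	// The command line may contain tokens, so it stays in secrets/.
	cmdPath := filepath.Join(secretsDir, ".envoy_bootstrap.cmd")
	if err := os.WriteFile(cmdPath, []byte(strings.Join(args, " ")), 0o600); err != nil {
		return fmt.Errorf("writing command: %w", err)
	}

	// The environment is written as a JSON object, also in secrets/.
	envJSON, err := json.MarshalIndent(envToMap(env), "", "  ")
	if err != nil {
		return fmt.Errorf("encoding environment: %w", err)
	}
	envPath := filepath.Join(secretsDir, ".envoy_bootstrap.env")
	if err := os.WriteFile(envPath, envJSON, 0o600); err != nil {
		return fmt.Errorf("writing environment: %w", err)
	}

	// stderr goes next to the other task logs.
	errPath := filepath.Join(logsDir, "envoy_bootstrap.stderr.0")
	if err := os.WriteFile(errPath, stderr, 0o644); err != nil {
		return fmt.Errorf("writing stderr: %w", err)
	}
	return nil
}

func main() {
	// Use a throwaway directory in place of a real alloc dir.
	root, err := os.MkdirTemp("", "alloc-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	err = writeBootstrapDebug(
		filepath.Join(root, "secrets"),
		filepath.Join(root, "alloc", "logs"),
		[]string{"consul", "connect", "envoy", "-bootstrap"},
		[]string{"EXAMPLE_TOKEN=placeholder"},
		[]byte("example stderr output\n"),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println("debug files written under", root)
}
```
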

Accessing this information is unfortunately awkward:
```
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.env
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.cmd
nomad alloc fs b36a alloc/logs/envoy_bootstrap.stderr.0
```

The above assumes an alloc id that starts with `b36a` and a Connect
sidecar proxy for a service named `count-countdash`.

If the alloc is unable to start successfully, the debugging files are
only accessible from the host filesystem.
2022-02-18 12:02:36 -08:00
..
allocdir client: check escaping of alloc dir using symlinks 2022-02-09 19:50:13 -05:00
allochealth Fix health checking for ephemeral poststart tasks (#11945) 2022-02-02 16:29:49 -05:00
allocrunner connect: write envoy bootstrap debugging info 2022-02-18 12:02:36 -08:00
allocwatcher client: never embed alloc_dir in chroot 2021-10-18 09:22:01 -07:00
config Update IsEmpty to check for pre-1.2.4 fields (#11930) 2022-01-28 14:41:49 -05:00
consul consul: plubming for specifying consul namespace in job/group 2021-04-05 10:03:19 -06:00
devicemanager Fix some errcheck errors (#9811) 2021-01-14 12:46:35 -08:00
dynamicplugins Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
fingerprint fingerprint: remove metadata from digitalocean (#12032) 2022-02-09 07:31:45 -05:00
interfaces replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
lib client: change test to not poke cgroupv2 edge case 2022-01-27 08:27:40 -06:00
logmon logmon: Fix a memory leak on task restart 2021-10-05 12:11:53 -04:00
pluginmanager CSI: node unmount from the client before unpublish RPC (#11892) 2022-01-28 14:43:58 -05:00
servers client: drop unused DC field from servers list 2019-05-20 14:19:15 -07:00
state gofmt all the files 2021-10-01 10:14:28 -04:00
stats deps: bump gopsutil to v3.21.2 2021-03-30 16:02:51 -04:00
structs Generate files for 1.2.4 release 2022-01-18 23:43:00 +00:00
taskenv client: task env vars should take precendece over host env vars. 2021-09-20 09:15:28 +01:00
testutil cli: refactor operator debug capture (#11466) 2021-11-05 19:43:10 -04:00
vaultclient Add gosimple linter (#9590) 2020-12-09 11:05:18 -08:00
acl.go Audit config, seams for enterprise audit features 2020-03-23 13:47:42 -04:00
acl_test.go Event Stream: Track ACL changes, unsubscribe on invalidating changes (#9447) 2020-12-01 11:11:34 -05:00
agent_endpoint.go json handles were moved to a new package in #10202 2021-04-02 13:31:10 +00:00
agent_endpoint_test.go fix params for Agent.Host client RPC (#8795) 2020-08-31 17:14:26 -04:00
alloc_endpoint.go client: fix multiple imports (#10537) 2021-05-13 14:30:31 -04:00
alloc_endpoint_test.go tests: remove duplicate import statements. 2021-06-11 09:39:22 +02:00
alloc_watcher_e2e_test.go tests: swap lib/freeport for tweaked helper/freeport 2019-12-09 08:37:32 -06:00
client.go replace 'a alloc' with 'an alloc' where appropriate (#11792) 2022-01-10 11:59:46 -05:00
client_stats_endpoint.go Server side impl + touch ups 2018-02-15 13:59:02 -08:00
client_stats_endpoint_test.go tests: swap lib/freeport for tweaked helper/freeport 2019-12-09 08:37:32 -06:00
client_test.go client: defensively log reserved ports 2021-10-04 15:43:35 -07:00
csi_endpoint.go CSI: use AccessMode/AttachmentMode from CSIVolumeClaim 2021-04-07 11:24:09 -04:00
csi_endpoint_test.go CSI ListSnapshots secrets implementation 2021-07-28 11:30:29 -07:00
driver_manager_test.go tests: fix data race in client TestDriverManager_Fingerprint_Periodic 2019-05-21 09:49:56 -04:00
enterprise_client_oss.go gofmt all the files 2021-10-01 10:14:28 -04:00
fingerprint_manager.go chore: fixup inconsistent method receiver names. (#11704) 2021-12-20 11:44:21 +01:00
fingerprint_manager_test.go use allow/deny instead of the colored alternatives (#9019) 2020-10-12 08:47:05 -04:00
fs_endpoint.go Fix log streaming missing frames (#11721) 2022-01-04 14:07:16 -05:00
fs_endpoint_test.go Fix log streaming missing frames (#11721) 2022-01-04 14:07:16 -05:00
gc.go chore: fix incorrect docstring formatting. 2021-08-30 11:08:12 +02:00
gc_test.go Events/msgtype cleanup (#9117) 2020-10-19 09:30:15 -04:00
heartbeatstop.go Delayed evaluations for `stop_after_client_disconnect` can cause unwanted extra followup evaluations around job garbage collection (#8099) 2020-06-03 09:48:38 -04:00
heartbeatstop_test.go docs: s/hearbeat/heartbeat and fix link 2020-07-23 11:33:34 -07:00
node_updater.go client: use NewNodeEvent builder for consistency (#7559) 2020-03-31 10:02:16 -04:00
rpc.go pool: track usage of incoming streams (#10710) 2021-06-07 10:22:37 -04:00
rpc_test.go Simplify Bootstrap logic in tests 2020-03-02 13:47:43 -05:00
testing.go test infrastructure for mock client RPCs (#10193) 2021-03-31 16:37:09 -04:00
util.go Revert "client: defensive against getting stale alloc updates" 2020-06-19 15:39:44 -04:00
util_test.go Update state with server 2018-10-16 16:53:29 -07:00