open-nomad/.changelog
Michael Schurter 27b8112123 connect: write envoy bootstrap debugging info
When Consul Connect just works, it's wonderful. When it doesn't work, it
can be exceedingly difficult to debug: operators have to check task
events, Nomad logs, Consul logs, Consul APIs, and even then critical
information is missing.

Using Consul to generate a bootstrap config for Envoy is notoriously
difficult. Nomad doesn't even log stderr, so operators are left trying
to piece together what went wrong.

This patch attempts to provide *maximal* context which unfortunately
includes secrets. **Secrets are always restricted to the secrets/
directory.** This makes debugging a little harder, but allows operators
to know exactly what operation Nomad was trying to perform.

What's added:

- stderr is sent to alloc/logs/envoy_bootstrap.stderr.0
- the CLI is written to secrets/.envoy_bootstrap.cmd
- the environment is written to secrets/.envoy_bootstrap.env as JSON

Accessing this information is unfortunately awkward:
```
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.env
nomad alloc exec -task connect-proxy-count-countdash b36a cat secrets/.envoy_bootstrap.cmd
nomad alloc fs b36a alloc/logs/envoy_bootstrap.stderr.0
```

The above assumes an alloc id that starts with `b36a` and a Connect
sidecar proxy for a service named `count-countdash`.

If the alloc is unable to start successfully, the debugging files are
only accessible from the host filesystem.
2022-02-18 12:02:36 -08:00
9160.txt changelog: add entries for #9160 and #11078 (#11290) 2021-10-14 08:43:36 -04:00
9198.txt docs and changelog for `nomad config validate` (#12031) 2022-02-09 10:20:45 -05:00
10236.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
10237.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
10563.txt changelog: add entry for GH-10563 (#10894) 2021-07-13 16:12:41 -04:00
10666.txt edit hierarchy to lead with namespace before job 2021-08-10 10:35:36 -04:00
10675.txt changelog: add entry for #10675 (#10919) 2021-07-21 10:05:48 -04:00
10796.txt changelog: add entry for #10796 (#11312) 2021-10-14 09:01:43 -04:00
10804.txt Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00
10806.txt update changelog (#10963) 2021-07-28 16:02:04 -04:00
10818.txt Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00
10822.txt Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00
10823.txt Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00
10832.txt deps: upgrade docker and runc 2022-01-18 08:35:26 -06:00
10840.txt CSI: Snapshot volume create should use vol.Secrets (#10840) 2021-07-02 08:28:22 -04:00
10842.txt consul: avoid triggering unnecessary sync when removing workload 2021-07-06 14:08:42 -05:00
10848.txt CSI ListSnapshots secrets implementation 2021-07-28 11:30:29 -07:00
10849.txt docs: add changelog entry for #10849 2021-07-16 15:58:58 -07:00
10855.txt csi: account for nil volume_mount in API-to-structs conversion (#10855) 2021-07-07 08:06:39 -04:00
10859.txt cli: add -task flag to alloc signal, restart (#10859) 2021-07-07 09:58:16 -07:00
10861.txt changelog: add entry for #10861 2021-07-07 11:03:39 +02:00
10864.txt scheduler: datacenter updates should be destructive 2021-07-07 11:18:30 -04:00
10865.txt consul: avoid extra sync operations when no action required 2021-07-07 11:24:56 -05:00
10868.txt service: remove duplicate name check during validation (#10868) 2021-07-08 09:43:38 -04:00
10870.txt cni: respect default `cni_config_dir` and `cni_path` (#10870) 2021-07-08 09:56:57 -04:00
10872.txt consul/connect: Avoid assumption of parent service when filtering connect proxies 2021-07-08 09:43:41 -05:00
10873.txt consul/connect: remove sidecar proxy before removing parent service 2021-07-08 13:30:19 -05:00
10875.txt cli: `-namespace` should override job namespace 2021-07-08 13:17:27 -04:00
10876.txt client: interpolate meta blocks with task environment (#10876) 2021-07-08 16:03:15 -04:00
10883.txt consul/connect: fix bug causing high cpu with multiple connect sidecars in group 2021-07-09 14:34:44 -05:00
10892.txt changelog: add entry for #10892 2021-07-13 10:29:44 +02:00
10893.txt ui: fixes #10885 2021-07-22 11:44:25 -04:00
10896.txt docs: add changelog entry 2021-07-14 10:46:40 -05:00
10897.txt core: remove internalization of affinity strings 2021-07-15 15:15:39 -05:00
10898.txt docs: update changelog 2021-07-14 11:21:00 -05:00
10907.txt client: avoid acting on stale data after launch (#10907) 2021-07-20 15:13:28 -04:00
10916.txt docs: add changelog for #10916 2021-07-27 11:51:38 -07:00
10917.txt fix `nomad alloc signal` help message (#10917) 2021-07-21 11:02:44 -04:00
10926.txt cli: Use glint to determine if os.Stdout is tty (#10926) 2021-07-23 11:27:47 -04:00
10929.txt changelog: add entry for #10929 2021-08-05 10:48:36 +02:00
10930.txt add dispatch idempotency token support in the CLI (#10930) 2021-10-22 12:39:05 -04:00
10931.txt update changelog (#10963) 2021-07-28 16:02:04 -04:00
10934.txt changelog: add entry for #10934 (#11001) 2021-08-04 11:33:18 -04:00
10935.txt events: fix wildcard namespace handling (#10935) 2021-09-02 09:36:55 -07:00
10951.txt consul/connect: avoid warn messages on connect proxy errors 2021-08-05 11:27:35 +02:00
10952.txt nomad: only activate one-time auth tokens with 1.1.0 (#10952) 2021-07-27 13:17:55 -04:00
10958.txt api: revert to defaulting to http/1 (#10958) 2021-07-28 11:21:53 -04:00
10965.txt Fixed plan diffing to handle non-unique service names. (#10965) 2021-10-12 16:42:39 -04:00
10975.txt Added support for `-force-color` to the CLI. (#10975) 2021-10-06 10:02:42 -04:00
10990.txt Only initialize task.VolumeMounts when not-nil (#10990) 2021-08-02 13:08:10 -04:00
11005.txt Speed up client startup and registration (#11005) 2021-08-10 17:06:18 -04:00
11007.txt system: re-evaluate node on feasibility changes (#11007) 2021-08-10 17:17:44 -04:00
11013.txt deployments: canary=0 is implicitly autopromote (#11013) 2021-08-10 17:06:40 -04:00
11019.txt ui: only display "Dispatch Job" button on parameterized jobs (#11019) 2021-08-09 17:49:08 -04:00
11020.txt ui: add missing pipe separator in parameterized and periodic jobs (#11020) 2021-08-10 13:48:20 -04:00
11054.txt Consider all system jobs for a new node (#11054) 2021-08-18 09:50:37 -04:00
11065.txt Update x/sys to support go 1.17 (#11065) 2021-08-25 17:23:01 +02:00
11070.txt Don't timestamp active log file (#11070) 2021-08-23 11:27:34 -04:00
11078.txt changelog: add entries for #9160 and #11078 (#11290) 2021-10-14 08:43:36 -04:00
11083.txt update golang to 1.16.7 (#11083) 2021-08-25 11:56:46 -04:00
11084.txt link to cve listing in changelog 2021-08-27 10:42:39 -04:00
11110.txt ui: set namespace when looking for and displaying children jobs (#11110) 2021-09-01 14:40:25 -04:00
11111.txt docs: focus changelog entry for #11111 on the ux 2021-09-10 16:45:43 -07:00
11127.txt Add FailoverHeartbeatTTL to config (#11127) 2021-10-06 18:48:12 -04:00
11128.txt cli: include all possible scores in alloc status metric table (#11128) 2021-09-08 17:30:11 -04:00
11132.txt changelog batch (#11517) 2021-11-17 11:24:32 -05:00
11141.txt ui: set the job namespace when redirecting after the job is dispatched (#11141) 2021-09-07 12:27:33 -04:00
11143.txt changelog: add entry for #11143. 2021-09-07 09:51:17 +02:00
11144.txt Making RPC Upgrade mode reloadable. (#11144) 2021-11-01 16:30:53 -04:00
11165.txt changelog: add entry for #11165 2021-11-04 15:35:02 +01:00
11167.txt changelog: fixup entry extension for #11167 2021-11-04 15:28:34 +01:00
11168.txt Fixed usage of NOMAD_CLI_NO_COLOR env variable. (#11168) 2021-09-17 20:37:05 -04:00
11173.txt changelog: add entry for #11173. 2021-09-15 11:44:10 +02:00
11183.txt Disable PowerShell profile and simplify fingerprinting link speed on Windows (#11183) 2021-09-22 11:17:47 -04:00
11184.txt Log network device name during fingerprinting (#11184) 2021-09-16 10:48:31 -04:00
11185.txt deps: update go-memdb to 1.3.2 (#11185) 2021-09-14 20:26:45 -04:00
11187.txt connect: update allowed protocols in ingress gateway config (#11187) 2021-09-16 10:47:53 -04:00
11198.txt docs: add changelog entry for audit log naming 2021-09-16 16:21:57 -07:00
11206.txt changelog: add entry for #11206 2021-09-20 18:05:42 +01:00
11213.txt add `-show-url` option for `ui` command (#11213) 2021-10-05 20:08:42 -04:00
11215.txt client: add NOMAD_LICENSE to default env deny list 2021-09-21 13:51:17 -07:00
11238.txt Fixed creation of ControllerCreateVolumeRequest. (#11238) 2021-10-06 10:17:39 -04:00
11242.txt cli: Add nomad job allocs command (#11242) 2021-10-12 16:30:36 -04:00
11249.txt changelog: add entry for #11249 2021-10-01 12:50:51 +01:00
11251.txt update docs and changelog 2021-10-04 13:50:42 -04:00
11257.txt fix panic when Connect mesh gateway doesn't have a proxy block (#11257) 2021-10-04 15:52:07 -04:00
11261.txt add changelog 2021-10-05 13:01:19 -04:00
11269.txt debug: Improve namespace and region support (#11269) 2021-10-12 16:58:41 -04:00
11270.txt cli: Improved autocomplete support for job dispatch and operator debug (#11270) 2021-10-12 20:01:54 -04:00
11273.txt executor: suppress spurious log messages (#11273) 2021-10-06 12:42:35 -04:00
11280.txt Log error if there are no event handlers registered 2021-10-11 19:44:52 +00:00
11283.txt changelog: add entry for #11283 2021-10-07 08:16:05 +01:00
11284.txt Added a `-hcl2-strict` flag to allow for lenient hcl variable parsing. (#11284) 2021-11-04 16:33:09 +01:00
11287.txt executor: set CpuWeight in cgroup-v2 (#11287) 2021-10-14 08:46:07 -04:00
11291.txt wrap `log` messages with `hclog` (#11291) 2021-10-12 14:38:44 -04:00
11307.txt cli: rename paths in debug bundle for clarity (#11307) 2021-10-13 18:00:55 -04:00
11323.txt changelog batch (#11517) 2021-11-17 11:24:32 -05:00
11325.txt changelog batch (#11517) 2021-11-17 11:24:32 -05:00
11331.txt docs: add #11331 to changelog 2021-10-19 16:30:06 -07:00
11334.txt docs: add #11334 to changelog 2021-10-18 09:22:01 -07:00
11346.txt document GH-11346 fix (#11350) 2021-10-20 22:03:19 -04:00
11353.txt logging: Log the cause behind agent startup failure (#11353) 2021-10-27 10:41:17 -07:00
11357.txt ui: set * as the default namespace selector (#11357) 2021-10-21 10:24:07 -04:00
11358.txt ui: add client name tooltip when displaying client ID in tables (#11358) 2021-10-21 10:23:06 -04:00
11364.txt cli: allow setting namespace and region in the `nomad ui` command (#11364) 2021-10-21 16:24:39 -04:00
11366.txt ui: display Nomad version in the Clients and Servers table (#11366) 2021-10-22 10:33:06 -04:00
11368.txt ui: persist node drain settings (#11368) 2021-10-22 10:51:31 -04:00
11371.txt ui: update favicon (#11371) 2021-10-22 09:40:38 -04:00
11386.txt prevent active log from being overwritten when agent starts (#11386) 2021-10-26 20:57:07 -04:00
11391.txt scheduler: stop allocs in unrelated nodes (#11391) 2021-10-27 07:04:13 -07:00
11396.txt Fix arm64 panics by updating google/snappy library to latest, 0.0.4 (#11396) 2021-10-27 06:39:16 -07:00
11397.txt vault: set JobID in Vault metadata (#11397) 2021-10-27 07:20:29 -07:00
11398.txt debug: update default node-id and docs (#11398) 2021-10-27 13:43:56 -04:00
11400.txt Allow using specific object ID on diff (#11400) 2021-11-01 15:16:31 -04:00
11411.txt changelog: add entry for #11411. 2021-10-29 09:08:10 +02:00
11416.txt core: bump rejected plans from debug -> info 2021-10-31 12:51:42 -07:00
11423.txt Parse `job > group > consul` block in HCL1 (#11423) 2021-11-03 13:49:32 -04:00
11426.txt changelog: add entry for #11426 2021-11-02 11:43:13 +01:00
11429.txt changelog: add entry for #11429 2021-11-02 12:58:10 +01:00
11432.txt cli: show `host_network` in `nomad status` (#11432) 2021-11-05 09:02:46 -04:00
11461.txt build: bump go version to 1.17.3 (#11461) 2021-11-05 15:34:24 -04:00
11466.txt cli: refactor operator debug capture (#11466) 2021-11-05 19:43:10 -04:00
11481.txt deps: update go-getter to 1.5.9 (#11481) 2021-11-17 11:14:44 -05:00
11482.txt api: return 404 for alloc FS list/stat endpoints (#11482) 2021-11-17 11:15:07 -05:00
11490.txt changelog batch (#11517) 2021-11-17 11:24:32 -05:00
11500.txt metrics: added `mapped_file` metric (#11500) 2022-01-10 15:35:19 -05:00
11504.txt changelog: add entry for #11504 2021-11-15 12:01:52 +01:00
11507.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
11526.txt Add support for setting pids_limit in docker plugin config. (#11526) 2021-12-21 13:31:34 -05:00
11532.txt core: allow setting and propagation of eval priority on job de/registration (#11532) 2021-11-23 09:23:31 +01:00
11542.txt qemu: add `args_allowlist` to sandbox VM command line inputs 2021-11-19 11:11:52 -05:00
11544.txt changelog: fix entry for #11544 2021-12-17 18:57:54 -05:00
11545.txt changelog: add entry for #11545 2021-12-17 18:49:56 -05:00
11550.txt CLI: Return non-zero exit code when deployment fails in `nomad run` (#11550) 2021-12-09 09:09:28 -05:00
11555.txt changelog: add entries for #11555, #11557, and #11687 (#11706) 2021-12-20 13:45:20 -05:00
11556.txt Don't emit scaling event error when a deployment is underway (#11556) 2021-11-23 10:20:18 -05:00
11557.txt changelog: add entries for #11555, #11557, and #11687 (#11706) 2021-12-20 13:45:20 -05:00
11565.txt scheduler: fix panic in system jobs when nodes filtered by class (#11565) 2021-11-24 12:59:15 -05:00
11572.txt raft: default to protocol v3 (#11572) 2022-02-03 15:03:12 -05:00
11579.txt changelog: add entry for #11579 2021-11-26 11:16:17 +01:00
11582.txt docs/tests for multiple HTTP address config (#11760) 2022-01-03 10:17:13 -05:00
11585.txt client: respect `client_auto_join` after connection loss (#11585) 2021-11-30 13:20:42 -05:00
11590.txt chore: edit mirage scenario to populate csi 2021-12-21 07:42:23 -05:00
11592.txt Add change log entry for PR 11592 (#11609) 2021-12-02 16:18:56 -05:00
11593.txt Make number of scheduler workers reloadable (#11593) 2022-01-06 11:56:13 -05:00
11596.txt provide `-no-shutdown-delay` flag for job/alloc stop (#11596) 2021-12-13 14:54:53 -05:00
11606.txt Expose Consul template configuration parameters (#11606) 2022-01-10 10:19:07 -05:00
11610.txt scheduler: config option to reject job registration (#11610) 2021-12-06 15:20:34 -05:00
11631.txt cli: Add var args to plan output. (#11631) 2021-12-07 10:43:52 -05:00
11644.txt api: return error when `LicenseGet` status is not `200` (#11644) 2021-12-14 19:47:09 -05:00
11648.txt evaluations list pagination and filtering (#11648) 2021-12-10 13:43:03 -05:00
11662.txt golang security update 1.17.5 2021-12-10 13:50:22 -05:00
11672.txt Fix ACL requirements for job details UI (#11672) 2022-01-12 21:26:02 -05:00
11675.txt `nomad eval list` command (#11675) 2021-12-15 11:58:38 -05:00
11678.txt cli: ensure `-stale` flag is respected by `nomad operator debug` (#11678) 2021-12-15 10:44:03 -05:00
11682.txt cli: unhide advanced operator raft debugging commands (#11682) 2021-12-16 10:32:11 -05:00
11687.txt changelog: add entries for #11555, #11557, and #11687 (#11706) 2021-12-20 13:45:20 -05:00
11710.txt api: respect wildcard in evaluations list API (#11710) 2021-12-20 12:23:50 -05:00
11712.txt scheduler: fix quadratic performance with spread blocks (#11712) 2021-12-21 10:10:01 -05:00
11721.txt Fix log streaming missing frames (#11721) 2022-01-04 14:07:16 -05:00
11724.txt api: support namespace wildcard in CSI volume list (#11724) 2021-12-21 17:19:45 -05:00
11728.txt fix host network reserved port fingerprint (#11728) 2021-12-22 15:29:54 -05:00
11741.txt task runner: fix goroutine leak in prestart hook (#11741) 2021-12-23 11:50:51 -05:00
11743.txt api: paginate deployment list and accept wildcard namespace (#11743) 2022-01-03 08:36:02 -05:00
11744.txt add changelog for Node.Copy fix 2021-12-23 12:34:05 -08:00
11776.txt csi: reap unused volume claims at leadership transitions (#11776) 2022-01-05 11:40:20 -05:00
11783.txt docs: improve changelog for PR #11783 (#11818) 2022-01-11 11:54:12 -05:00
11793.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
11820.txt chore: add changelog 2022-01-14 10:23:09 -05:00
11830.txt doc: add changelog for #11830 2022-01-12 14:21:47 -08:00
11838.txt Fix log level parsing from lines that include a timestamp (#11838) 2022-01-13 09:56:35 -05:00
11843.txt deps: adjust to gzip handler zero length response body 2022-01-19 11:52:19 -06:00
11848.txt changelog: add entry for #11848 2022-01-14 13:40:50 +01:00
11852.txt csi: volume deregistration should require exact ID (#11852) 2022-01-14 12:26:03 -05:00
11853.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
11854.txt changelog: fix entry markdown (#11911) 2022-01-24 11:04:14 -05:00
11855.txt deps: pty has new home 2022-01-19 12:33:05 -06:00
11856.txt drivers: set world-readable permissions on copied resolv.conf (#11856) 2022-01-14 12:25:23 -05:00
11865.txt cli: Add event stream capture to nomad operator debug (#11865) 2022-01-17 21:35:51 -05:00
11878.txt docs: add changelog for #11878 2022-01-31 12:21:31 -08:00
11889.txt build: upgrade and speedup circleci configuration 2022-01-24 08:28:14 -06:00
11890.txt CSI: resolve invalid claim states (#11890) 2022-01-28 14:43:35 -05:00
11891.txt csi: update leader's ACL in volumewatcher (#11891) 2022-01-28 14:43:27 -05:00
11892.txt docs: missing changelog for #11892 (#11959) 2022-01-28 15:08:48 -05:00
11896.txt ui: move volume link to the source column and fix the link target (#11896) 2022-01-26 14:17:29 -05:00
11902.txt Update IsEmpty to check for pre-1.2.4 fields (#11930) 2022-01-28 14:41:49 -05:00
11910.txt deps: add missing cl note 2022-01-24 10:13:13 -06:00
11918.txt deps: update api go version and dependencies 2022-01-24 12:23:26 -06:00
11927.txt changelog: use pr number not issue number 2022-01-26 06:32:10 -06:00
11945.txt Fix health checking for ephemeral poststart tasks (#11945) 2022-02-02 16:29:49 -05:00
11956.txt Verify TLS certificate on endpoints that are used between agents only (#11956) 2022-02-02 15:03:18 -05:00
11975.txt connect: write envoy bootstrap debugging info 2022-02-18 12:02:36 -08:00
11980.txt correct task row memory unit (#11980) 2022-02-02 17:00:25 -05:00
11983.txt cleanup: prevent leaks from time.After 2022-02-02 14:32:26 -06:00
11995.txt style: fix up very long tag word breaking the allocation service table width (#11995) 2022-02-04 19:40:03 -05:00
12008.txt scheduler: seed random shuffle nodes with eval ID (#12008) 2022-02-08 12:16:33 -05:00
12009.txt scheduler: recover from panic (#12009) 2022-02-07 11:47:53 -05:00
12012.txt improve error message on service length (#12012) 2022-02-04 19:39:34 -05:00
12015.txt cl: fix DO name 2022-02-08 10:28:57 -06:00
12036.txt client: fix race condition in use of go-getter 2022-02-09 19:48:28 -05:00
12037.txt client: check escaping of alloc dir using symlinks 2022-02-09 19:50:13 -05:00
12038.txt api: prevent excessive CPU load on job parse 2022-02-09 19:51:47 -05:00
12039.txt scheduler: prevent panic in spread iterator during alloc stop 2022-02-09 19:53:06 -05:00
12040.txt changelog: add entry for #12040 2022-02-10 08:36:32 +01:00
12050.txt csi: provide `CSI_ENDPOINT` env var to plugins (#12050) 2022-02-11 08:46:21 -05:00
12051.txt csi: volume cli prefix matching should accept exact match (#12051) 2022-02-11 08:53:03 -05:00
_664.txt changelog: new metrics in Nomad Enterprise (#11591) 2021-12-01 09:15:12 -05:00
changelog.tmpl Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00
note.tmpl Adopt go-changelog in Nomad (#10825) 2021-07-06 10:46:53 -04:00