test_checks.sh was removed in 2019 and now just breaks if VERBOSE is
set when running tests via make targets
in GHA, use verbose mode to display what tests are running
Part 2 of breaking up https://github.com/hashicorp/nomad/pull/12255
This PR makes it so gotestsum is invoked only in CircleCI. Also the
HCLogger(t) is plumbed more correctly in TestServer and TestAgent so
that they respect NOMAD_TEST_LOG_LEVEL.
The reason for these is we'll want to disable logging in GHA,
where spamming the disk with logs really drags performance.
This PR updates GNUMakefile to respect $GOBIN if it is set in the
environment or via an $GOENV file. Previously we hard-coded the output
to $GOPATH/bin, which is not necessarily the desired behavior.
Since switching to `golangci-lint` we have set the `-j 1` flag, which
restricts the tool to using 1 CPU thread.
This PR removes the flag so `make check` takes less time on good
computers.
This PR upgrades our CI images and fixes some affected tests.
- upgrade go-machine-image to premade latest ubuntu LTS (ubuntu-2004:202111-02)
- eliminate go-machine-recent-image (no longer necessary)
- manage GOPATH in GNUMakefile (see https://discuss.circleci.com/t/gopath-is-set-to-multiple-directories/7174)
- fix tcp dial error check (message seems to be OS specific)
- spot check values measured instead of specifically 'RSS' (rss no longer reported in cgroups v2)
- use safe MkdirTemp for generating tmpfiles
NOT applied: (too flakey)
- eliminate setting GOMAXPROCS=1 (build tools were also affected by this setting)
- upgrade resource type for all imanges to large (2C -> 4C)
## Development Environment Changes
* Added stringer to build deps
## New HTTP APIs
* Added scheduler worker config API
* Added scheduler worker info API
## New Internals
* (Scheduler)Worker API refactor—Start(), Stop(), Pause(), Resume()
* Update shutdown to use context
* Add mutex for contended server data
- `workerLock` for the `workers` slice
- `workerConfigLock` for the `Server.Config.NumSchedulers` and
`Server.Config.EnabledSchedulers` values
## Other
* Adding docs for scheduler worker api
* Add changelog message
Co-authored-by: Derek Strickland <1111455+DerekStrickland@users.noreply.github.com>
Meant for development purposes only, so one can compile binary on a
macos host then start a Docker container or scp the binary to a linux
host easily.
The resulting binary is statically linked and has very subtle
differences. e.g. static binaries use go native network stack that
honor /etc/hosts and /etc/resolve differently from the glibc
implementation. In development environment, I don't expect these to
materially change our experience.
Also format terraform scripts with hclfmt, equivalent to terraform fmt.
I opted not to use terraform fmt, because I didn't want to introduce dev dependency on the terraform CLI.
Also, I've optimized the find command to ignore spurious directories (e.g. .git, node_modules) that seem to be populated with too many files! make hclfmt takes 0.3s on my mac down from 7 seconds!
This PR removes the vendor directory from the Nomad repository.
Contributers will no longer need to deal with our `make sync`
step when working on Nomad, which was suprising when making changes
to the api. It also causes huge diffs in PRs that nobody looks at.
Adopts [`go-changelog`](https://github.com/hashicorp/go-changelog) for managing Nomad's changelog. `go-changelog` is becoming the HashiCorp defacto standard tool for managing changelog, e.g. [Consul](https://github.com/hashicorp/consul/pull/8387), [Vault](https://github.com/hashicorp/vault/pull/10363), [Waypoint](https://github.com/hashicorp/waypoint/pull/1179). [Consul](https://github.com/hashicorp/consul/pull/8387) seems to be the first product to adopt it, and its PR has the most context - though I've updated `.changelog/README.md` with the relevant info here.
## Changes to developers workflow
When opening PRs, developers should add a changelog entry in `.changelog/<PR#>.txt`. Check [`.changelog/README.md`](https://github.com/hashicorp/nomad/blob/docs-adopt-gochangelog/.changelog/README.md#developer-guide).
For the WIP release, entries can be amended even after the PR merged, and new files may be added post-hoc (e.g. during transition period, missed accidentally, community PRs, etc).
### Transitioning
Pending PRs can start including the changelog entry files immediately.
For 1.1.3/1.0.9 cycle, the release coordinator should create the entries for any PR that gets merged without a changelog entry file. They should also move any 1.1.3 entry in CHANGELOG.md to a changelog entry file, as this PR done for GH-10818.
## Changes to release process
Before cutting a release, release coordinator should update the changelog by inserting the output of `make changelog` to CHANGELOG.md with appropriate headers. See [`.changelog/README.md`](https://github.com/hashicorp/nomad/blob/docs-adopt-gochangelog/.changelog/README.md#how-to-generate-changelog-entries-for-release) for more details.
## Details
go-changelog is a basic templating engine for maintaining changelog in HashiCorp environment.
It expects the changelog entries as files indexed by their PR number. The CLI generates the changelog section for a release by comparing two git references (e.g. `HEAD` and the latest release, e.g. `v1.1.2`), and still requires manual process for updating CHANGELOG.md and final formatting.
The approach has many nice advantages:
* Avoids changelog related merge conflicts: Each PR touches different file!
* Copes with amendments and post-PR updates: Just add or update a changelog entry file using the original PR numbers.
* Addresses the release backporting scenario: Cherry-picking PRs will cherry-pick the relevant changelog entry automatically!
* Only relies on data available through `git` - no reliance on GitHub metadata or require GitHub credentials
The approach has few downsides though:
* CHANGELOG.md going stale during development and must be updated manually before cutting the release
* Repository watchers can no longer glance at the CHANGELOG.md to see upcoming changes
* We can periodically update the file, but `go-changelog` tool does not aid with that
* `go-changelog` tool does not offer good error reporting. If an entry is has an invalid tag (e.g. uses `release-note:bugfix` instead of `release-note:bug`), the entry will be dropped silently
* We should update go-changelog to warn against unexpected entry tags
* TODO: Meanwhile, PR reviewers and release coordinators should watch out
## Potential follow ups
We should follow up with CI checks to ensure PR changes include a warning. I've opted not to include that now. We still make many non-changelog-worth PRs for website/docs, for large features that get merged in multiple small PRs. I did not want to include a check that fails often.
Also, we should follow up to have `go-changelog` emit better warnings on unexpected tag.
Previously installing buf was left out of `make bootstrap` because it
had conflicts with the `tools/go.mod` file and dependencies used by
other tools. With Go 1.16 we eliminated that `go.mod` file, and can
now just install `buf` with `go install` like everything else.
This change disables using msgpack generated serializers in dev by
default.
In released binaries, we use code-generated msgpack serializers to
improve performance. However, in development, code generated
serializers are a pain. If a developer forgets to re-generate code, the
code generated gets out of sync with the go structs, and result into
subtle bugs where some values appear not to persist as expected.
The CI and release scripts will continue to use the msgpack
code-generation. Devs who want to test locally can set
`GO_TAGS=codegen_generated` as well.
This is required because Go does not pull CC from the make variable. This uses
whatever Go's default CC unless CC is overridden, as it is for the ARM targets.
This also makes it easier to build Nomad on a native ARM device, via:
```
make CC= pkg/linux_arm/nomad
```
* Set 'only' ALL_TARGETS rather than append
This is functionally no different than before, but it's more correct.
* Re-scope VERBOSE=true
Previously this was only set when the OS was Linux; this was added in
805ade7d3.
* Warn about unsupported OS rather than error
Also:
* Only print the warning when trying to build Nomad
* Print correct list of supported OSes
This removes small differences between the targets, like the statement
about what's being built.
The CGO/Windows related comments were deleted as being not relevant.
See https://github.com/hashicorp/nomad/pull/9643 for context.
Add a build target for Apple Silicon (m1) macs.
Note that Go must have been built with c4f497da6f for
Nomad to work on darwin/arm64 (i.e. wait for go1.16).
Closes#9408
Parameterize it so we can arbitrary target other versions, if we
are doing some manual checking, specially in the beginning when we may
want to validate compatibilities for skip release upgrades.
Also, introduce `checkbuf` target so we can run buf linter without the
rest.
use beta
Previously, it was required that you `go get github.com/hashicorp/nomad` to be
able to build protos, as the protoc invocation added an include directive that
pointed to `$GOPATH/src`, which is how dependent protos were discovered. As
Nomad now uses Go modules, it won't necessarily be cloned to `$GOPATH`.
(Additionally, if you _had_ go-gotten Nomad at some point, protoc compilation
would have possibly used the _wrong_ protos, as those wouldn't necessarily be
the most up-to-date ones.)
This change modifies the proto files and the `protoc` invocation to handle
discovering dependent protos via protoc plugin modifier statements that are
specific to the protoc plugin being used.
In this change, `make proto` was run to recompile the protos, which results in
changes only to the gzipped `FileDescriptorProto`.
-I ../../.. is meant to navigate from `GOPATH/src/github.com/hashicorp/nomad` to `GOPATH/src`
This is fine but it assumes a few things about how the dev has setup nomad, which is also fine if that is the expected dev environment, however the `../../..` is not as explicit as "GOPATH/src" and it would also enable a few more scenarios so it seems strictly better to me.
Random example: nomad is a subrepo of ours, but with this change we can symlink from GOPATH/src/github.com/hashicorp/nomad and `make proto` will work.
Currently we compile (but don't run) the e2e tests as part of `test-other`,
which is skipped for branches named `e2e-*`. Move this check into the
`test-e2e` job. Split out the vault compatibility integration check as its own
makefile target for clarity.
With Go modules, `go mod tidy` supplants `vendorfmt`. Unfortunately,
`tidy` will try to reach out to the network and download modules, and
there is no way to disable that behavior (e.g. the -mod=vendor) option
does not apply. This means we cannot use the `tidy` target in nomad
enterprise, which will be unable to reach private repositories like
consul-enterprise.
This isn't a big deal, since `vendorfmt` served the purpose of rewriting
the output of `govendor`, wheras `tidy` is a part of the `sync` target
that is required to be run when modifying dependencies anyway.
This PR switches the Nomad repository from using govendor to Go modules
for managing dependencies. Aspects of the Nomad workflow remain pretty
much the same. The usual Makefile targets should continue to work as
they always did. The API submodule simply defers to the parent Nomad
version on the repository, keeping the semantics of API versioning that
currently exists.
We have been using fatih/hclfmt which is long abandoned. Instead, switch
to HashiCorp's own hclfmt implementation. There are some trivial changes in
behavior around whitespace.
Use v1.1.5 of go-msgpack/codec/codecgen, so go-msgpack codecgen matches
the library version.
We branched off earlier to pick up
f51b518921
, but apparently that's not needed as we could customize the package via
`-c` argument.
Examples for HTTP based task-group service healthchecks are
covered by the `countdash` demo, but gRPC checks currently
have no runnable examples.
This PR adds a trivial gRPC enabled application that provides
a Service implementing the standard gRPC healthcheck interface.
Running `make dev` runs `hclfmt`, but this isn't checked as part of
CI. That makes it possible to merge un-formatted HCL and Nomad
jobspecs that later will make for dirty git staging areas when
developers pull master.
This changeset adds HCL linting to the `make check` target.
Use go mod for github.com/hashicorp/go-bindata/go-bindata and
github.com/elazarl/go-bindata-assetfs/go-bindata-assetfs but use
`@master` to pull the latest master. These packages don't have release
tags so `@master` worksaround it.
Adding `-trimpath` to builds removes the local working directory from
the goroutine stack traces, which makes our builds more reproducible
and doesn't leak information about our local development workstations
or CI environment.
This allows using https download and go mod cache proxies, over using
git and downloading entire dependencies git history, hopefully,
resulting into a faster installation process.
You'd think since golangci-lint embeds misspell we could use that,
but it fails to run if it finds no Go source files, which is the
case in our website/ directory that we want to check.
gometalinter has been deprecated, with golangci-lint as its spiritual
and recommended successor. Here we switch to using it with an equivalent
configuration, albeit with newer versions of some linters.
To maintain compatibility with existing settings, we have a couple of
things disabled here, specifically:
- tests
We have a lot of unused code in our tests that choke deadcode.
We should attempt to clean these up soon so that we can lint our
testcode.
- govet.check-shadowing = false
This breaks on redefining `err` which we do all over the nomad
codebase.
This is an attempt to ease dependency management for external driver
plugins, by avoiding requiring them to compile ugorji/go generated
files. Plugin developers reported some pain with the brittleness of
ugorji/go dependency in particular, specially when using go mod, the
default go mod manager in golang 1.13.
Context
--------
Nomad uses msgpack to persist and serialize internal structs, using
ugorji/go library. As an optimization, we use ugorji/go code generation
to speedup process and aovid the relection-based slow path.
We commit these generated files in repository when we cut and tag the
release to ease reproducability and debugging old releases. Thus,
downstream projects that depend on release tag, indirectly depends on
ugorji/go generated code.
Sadly, the generated code is brittle and specific to the version of
ugorji/go being used. When go mod picks another version of ugorji/go
then nomad (go mod by default uses release according to semver),
downstream projects face compilation errors.
Interestingly, downstream projects don't commonly serialize nomad
internal structs. Drivers and device plugins use grpc instead of
msgpack for the most part. In the few cases where they use msgpag (e.g.
decoding task config), they do without codegen path as they run on
driver specific structs not the nomad internal structs. Also, the
ugorji/go serialization through reflection is generally backward
compatible (mod some ugorji/go regression bugs that get introduced every
now and then :( ).
Proposal
---------
The proposal here is to keep committing ugorji/go codec generated files
for releases but to use a go tag for them.
All nomad development through the makefile, including releasing, CI and
dev flow, has the tag enabled.
Downstream plugin projects, by default, will skip these files and life
proceed as normal for them.
The downside is that nomad developers who use generated code but avoid
using make must start passing additional go tag argument. Though this
is not a blessed configuration.
We currently use an container image for `test-devices` job only; while
all other jobs use machine executor.
This allows us to switch golang and protoc verions easily without
manually managing Docker images (which requires building them manually
on a dev machines, etc). All that while, we install dependencies on
every build in all other jobs..
`test-devices` now is one of the fastest jobs and isn't a constraint or
a bottleneck, so increasing its overhead by few seconds doesn't hurt the
overall developer iteration.
If we split tests effectively later, we can revisit.