Commit graph

110 commits

Author SHA1 Message Date
Matt Keeler df507a4a55 A few misc fixes found by go vet 2018-10-19 12:28:36 -04:00
Matt Keeler 99e0a124cb
New ACLs (#4791)
This PR is almost a complete rewrite of the ACL system within Consul. It brings the features more in line with other HashiCorp products. Obviously there is quite a bit left to do here but most of it is related docs, testing and finishing the last few commands in the CLI. I will update the PR description and check off the todos as I finish them over the next few days/week.
Description

At a high level this PR is mainly to split ACL tokens from Policies and to split the concepts of Authorization from Identities. A lot of this PR is mostly just to support CRUD operations on ACLTokens and ACLPolicies. These in and of themselves are not particularly interesting. The bigger conceptual changes are in how tokens get resolved, how backwards compatibility is handled and the separation of policy from identity which could lead the way to allowing for alternative identity providers.

On the surface and with a new cluster the ACL system will look very similar to that of Nomads. Both have tokens and policies. Both have local tokens. The ACL management APIs for both are very similar. I even ripped off Nomad's ACL bootstrap resetting procedure. There are a few key differences though.

    Nomad requires token and policy replication where Consul only requires policy replication with token replication being opt-in. In Consul local tokens only work with token replication being enabled though.
    All policies in Nomad are globally applicable. In Consul all policies are stored and replicated globally but can be scoped to a subset of the datacenters. This allows for more granular access management.
    Unlike Nomad, Consul has legacy baggage in the form of the original ACL system. The ramifications of this are:
        A server running the new system must still support other clients using the legacy system.
        A client running the new system must be able to use the legacy RPCs when the servers in its datacenter are running the legacy system.
        The primary ACL DC's servers running in legacy mode needs to be a gate that keeps everything else in the entire multi-DC cluster running in legacy mode.

So not only does this PR implement the new ACL system but has a legacy mode built in for when the cluster isn't ready for new ACLs. Also detecting that new ACLs can be used is automatic and requires no configuration on the part of administrators. This process is detailed more in the "Transitioning from Legacy to New ACL Mode" section below.
2018-10-19 12:04:07 -04:00
Jack Pearkes 197d62c6ca New command: consul debug (#4754)
* agent/debug: add package for debugging, host info

* api: add v1/agent/host endpoint

* agent: add v1/agent/host endpoint

* command/debug: implementation of static capture

* command/debug: tests and only configured targets

* agent/debug: add basic test for host metrics

* command/debug: add methods for dynamic data capture

* api: add debug/pprof endpoints

* command/debug: add pprof

* command/debug: timing, wg, logs to disk

* vendor: add gopsutil/disk

* command/debug: add a usage section

* website: add docs for consul debug

* agent/host: require operator:read

* api/host: improve docs and no retry timing

* command/debug: fail on extra arguments

* command/debug: fixup file permissions to 0644

* command/debug: remove server flags

* command/debug: improve clarity of usage section

* api/debug: add Trace for profiling, fix profile

* command/debug: capture profile and trace at the same time

* command/debug: add index document

* command/debug: use "clusters" in place of members

* command/debug: remove address in output

* command/debug: improve comment on metrics sleep

* command/debug: clarify usage

* agent: always register pprof handlers and protect

This will allow us to avoid a restart of a target agent
for profiling by always registering the pprof handlers.

Given this is a potentially sensitive path, it is protected
with an operator:read ACL and enable debug being
set to true on the target agent. enable_debug still requires
a restart.

If ACLs are disabled, enable_debug is sufficient.

* command/debug: use trace.out instead of .prof

More in line with golang docs.

* agent: fix comment wording

* agent: wrap table driven tests in t.run()
2018-10-19 08:41:03 -07:00
Freddy 54e4cd4801 Improve resilience of api pkg tests (#4676)
* Add function to wait for serfHealth in api tests

* Disable connect when creating semaphore test clients

* Wait for serfHealth when creating sessions in their tests

* Add helper functions to create lock/semaphore sessions without checks

* Log passing tests to prevent timeout in Travis due to lack of output
2018-09-18 17:47:01 +01:00
Pierre Souchay 22bacabf7e Fixed message: testutil/server.go:317: missing ... in args forwarded to print-like function 2018-07-10 11:04:25 +02:00
Paul Banks ba6e909ed7 Misc test fixes 2018-06-25 12:25:39 -07:00
Paul Banks 530d4acc57 Misc rebase and test fixes 2018-06-25 12:25:38 -07:00
Paul Banks 834ed1d25f
Fixed many tests after rebase. Some still failing and seem unrelated to any connect changes. 2018-06-14 09:42:16 -07:00
Paul Banks 02ab461dae
TLS watching integrated into Service with some basic tests.
There are also a lot of small bug fixes found when testing lots of things end-to-end for the first time and some cleanup now it's integrated with real CA code.
2018-06-14 09:42:07 -07:00
Josh Soref 1dd8c378b9 Spelling (#3958)
* spelling: another

* spelling: autopilot

* spelling: beginning

* spelling: circonus

* spelling: default

* spelling: definition

* spelling: distance

* spelling: encountered

* spelling: enterprise

* spelling: expands

* spelling: exits

* spelling: formatting

* spelling: health

* spelling: hierarchy

* spelling: imposed

* spelling: independence

* spelling: inspect

* spelling: last

* spelling: latest

* spelling: client

* spelling: message

* spelling: minimum

* spelling: notify

* spelling: nonexistent

* spelling: operator

* spelling: payload

* spelling: preceded

* spelling: prepared

* spelling: programmatically

* spelling: required

* spelling: reconcile

* spelling: responses

* spelling: request

* spelling: response

* spelling: results

* spelling: retrieve

* spelling: service

* spelling: significantly

* spelling: specifies

* spelling: supported

* spelling: synchronization

* spelling: synchronous

* spelling: themselves

* spelling: unexpected

* spelling: validations

* spelling: value
2018-03-19 16:56:00 +00:00
James Phillips 9572e504d3
Bumps the standard retry up. 2018-02-07 18:03:39 -08:00
Frank Schroeder 74859ff3c0 test: replace porter tool with freeport lib
This patch removes the porter tool which hands out free ports from a
given range with a library which does the same thing. The challenge for
acquiring free ports in concurrent go test runs is that go packages are
tested concurrently and run in separate processes. There has to be some
inter-process synchronization in preventing processes allocating the
same ports.

freeport allocates blocks of ports from a range expected to be not in
heavy use and implements a system-wide mutex by binding to the first
port of that block for the lifetime of the application. Ports are then
provided sequentially from that block and are tested on localhost before
being returned as available.
2017-10-21 22:01:09 +02:00
James Phillips 28c960f207
Adds a canary test to fail Consul's test w/o porter. 2017-10-20 17:05:25 -07:00
Alex Dadgar 098d71c04d Testutil falls back to random ports w/o porter (#3604)
* Testutil falls back to random ports w/o porter

This PR allows the testutil server to be used without porter.

* Adds sterner-sounding fallback comments.
2017-10-20 16:46:13 -07:00
Frank Schröder c7cc62ab5a agent: consolidate handling of 405 Method Not Allowed (#3405)
* agent: consolidate http method not allowed checks

This patch uses the error handling of the http handlers to handle HTTP
method not allowed errors across all available endpoints. It also adds a
test for testing whether the endpoints respond with the correct status
code.

* agent: do not panic on metrics tests

* agent: drop other tests for MethodNotAllowed

* agent: align /agent/join with reality

/agent/join uses PUT instead of GET as documented.

* agent: align /agent/check/{fail,warn,pass} with reality

/agent/check/{fail,warn,pass} uses PUT instead of GET as documented.

* fix some tests

* Drop more tests for method not allowed

* Align TestAgent_RegisterService_InvalidAddress with reality

* Changes API client join to use PUT instead of GET.

* Fixes agent endpoint verbs and removes obsolete tests.

* Updates the change log.
2017-09-25 23:11:19 -07:00
Frank Schröder 69a088ca85 New config parser, HCL support, multiple bind addrs (#3480)
* new config parser for agent

This patch implements a new config parser for the consul agent which
makes the following changes to the previous implementation:

 * add HCL support
 * all configuration fragments in tests and for default config are
   expressed as HCL fragments
 * HCL fragments can be provided on the command line so that they
   can eventually replace the command line flags.
 * HCL/JSON fragments are parsed into a temporary Config structure
   which can be merged using reflection (all values are pointers).
   The existing merge logic of overwrite for values and append
   for slices has been preserved.
 * A single builder process generates a typed runtime configuration
   for the agent.

The new implementation is more strict and fails in the builder process
if no valid runtime configuration can be generated. Therefore,
additional validations in other parts of the code should be removed.

The builder also pre-computes all required network addresses so that no
address/port magic should be required where the configuration is used
and should therefore be removed.

* Upgrade github.com/hashicorp/hcl to support int64

* improve error messages

* fix directory permission test

* Fix rtt test

* Fix ForceLeave test

* Skip performance test for now until we know what to do

* Update github.com/hashicorp/memberlist to update log prefix

* Make memberlist use the default logger

* improve config error handling

* do not fail on non-existing data-dir

* experiment with non-uniform timeouts to get a handle on stalled leader elections

* Run tests for packages separately to eliminate the spurious port conflicts

* refactor private address detection and unify approach for ipv4 and ipv6.

Fixes #2825

* do not allow unix sockets for DNS

* improve bind and advertise addr error handling

* go through builder using test coverage

* minimal update to the docs

* more coverage tests fixed

* more tests

* fix makefile

* cleanup

* fix port conflicts with external port server 'porter'

* stop test server on error

* do not run api test that change global ENV concurrently with the other tests

* Run remaining api tests concurrently

* no need for retry with the port number service

* monkey patch race condition in go-sockaddr until we understand why that fails

* monkey patch hcl decoder race condidtion until we understand why that fails

* monkey patch spurious errors in strings.EqualFold from here

* add test for hcl decoder race condition. Run with go test -parallel 128

* Increase timeout again

* cleanup

* don't log port allocations by default

* use base command arg parsing to format help output properly

* handle -dc deprecation case in Build

* switch autopilot.max_trailing_logs to int

* remove duplicate test case

* remove unused methods

* remove comments about flag/config value inconsistencies

* switch got and want around since the error message was misleading.

* Removes a stray debug log.

* Removes a stray newline in imports.

* Fixes TestACL_Version8.

* Runs go fmt.

* Adds a default case for unknown address types.

* Reoders and reformats some imports.

* Adds some comments and fixes typos.

* Reorders imports.

* add unix socket support for dns later

* drop all deprecated flags and arguments

* fix wrong field name

* remove stray node-id file

* drop unnecessary patch section in test

* drop duplicate test

* add test for LeaveOnTerm and SkipLeaveOnInt in client mode

* drop "bla" and add clarifying comment for the test

* split up tests to support enterprise/non-enterprise tests

* drop raft multiplier and derive values during build phase

* sanitize runtime config reflectively and add test

* detect invalid config fields

* fix tests with invalid config fields

* use different values for wan sanitiziation test

* drop recursor in favor of recursors

* allow dns_config.udp_answer_limit to be zero

* make sure tests run on machines with multiple ips

* Fix failing tests in a few more places by providing a bind address in the test

* Gets rid of skipped TestAgent_CheckPerformanceSettings and adds case for builder.

* Add porter to server_test.go to make tests there less flaky

* go fmt
2017-09-25 11:40:42 -07:00
James Phillips 6a6eadd8c7
Adds open source side of network segments (feature is Enterprise-only). 2017-08-30 11:58:29 -07:00
James Phillips 788dd255a1 Adds new config to make script checks opt-in, updates documentation. (#3284) 2017-07-17 11:20:35 -07:00
Frank Schroeder f0f00a1401 api: do not block in waitForLeader since the retry loop is handling this 2017-07-07 09:22:34 +02:00
James Phillips 26c8a9ad4c Adds a temporary band-aid to the API tests.
Once we can run the agent directly in-process with random ports from the kernel,
and the special retry just in memberlist, we shouldn't need this.
2017-07-07 09:22:34 +02:00
Frank Schroeder e203e10031 retry: relax timing for test a bit 2017-07-07 09:22:34 +02:00
Frank Schroeder 0f67ea550e
test: shutdown server properly 2017-05-31 00:29:22 +02:00
Frank Schroeder 111e197462
test: sanitize temp dir names since subtest names have slashes 2017-05-31 00:29:22 +02:00
Frank Schroeder bab25b1c84 test: fix go vet issue 2017-05-12 22:12:47 +02:00
Frank Schroeder afaf4431ee test: bump ReadyTimeout to 10s 2017-05-12 22:12:47 +02:00
Frank Schroeder eedf0f3ac5 test: add helper for ioutil.TempDir/TempFile
This creates a simplified helper for temporary directories and files.
All path names are prefixed with the name of the current test.
All files and directories are stored either in /tmp/consul-test
or /tmp if the former could not be created.

Using the system temp dir breaks some tests on macOS where the unix
socket path becomes too long.
2017-05-12 22:12:47 +02:00
Frank Schroeder 15590a8446 test: include test name in temp file/dir
This helps identifying hanging tests by looking
at the process list.
2017-05-12 22:12:47 +02:00
Cyril Gaudin cef53ee906 test: Make testserver start timeout configurable. 2017-05-09 17:48:53 +02:00
James Phillips 6103198732
Tweaks some tests that were having a hard time in Travis CI and
bumps up the default retry time.
2017-05-09 06:48:26 -07:00
Frank Schroeder 0a7898f039
test: simplify tests that check for peers 2017-05-05 17:07:04 +02:00
Frank Schroeder 16a7b52026
test: Drop WaitForResult from testutil and testrpc 2017-05-05 17:07:03 +02:00
James Phillips 7007c89870
retry: Removes the description parameter. 2017-05-05 17:07:03 +02:00
Frank Schroeder 9f369c93c7
retry: add retry package for retriable tests
The current retry framework in testutil/testprc.WaitForResult uses
a func() (bool, error) callback until it succeeds or times out.
It captures the last error and returns it.

    if err := testutil.WaitForResult(t, func() (bool, error) {
	if err := foo(); err != nil {
	    return false, err
	}
	...
	return true, nil
    }); err != nil {
	t.Fatal(err)
    }

This makes the test functions more complex than they need to be since
both the boolean and the error indicate a success or a failure.

The retry.Run framework uses a an approach similar to t.Run()
from the testing framework.

    retry.Run(t, func(r *retry.R) {
	if err := foo(); err != nil {
	    r.Fatal(err)
	}
    })

The behavior of the Run function is configurable so that different
timeouts can be used for different tests.
2017-05-05 17:07:02 +02:00
Ethan Chu 90b67f115c
Add AddAccessibleService to testutil
`AddAccessibleService` works just like `AddService` but also passing
"address" and "port". It is helpfu when you need to prepare a
fakeService that will be accessed later in target source code.
2017-05-01 13:56:48 -07:00
Kyle Havlovitz b5ed2ba536 Add separate option for verifying incoming HTTPS traffic (#2974)
* Add separate option for verifying incoming HTTPS traffic
2017-04-28 16:15:55 -07:00
Frank Schroeder 36380826b0 test: Retry more aggressively 2017-04-27 10:34:30 -07:00
Frank Schroeder 9e2332f6a1 golint: Fix existing comments
This needs more work.
2017-04-25 09:26:13 -07:00
Frank Schroeder 9de4555c0c golint: Untangle if blocks with return in else 2017-04-25 09:26:13 -07:00
Frank Schroeder 8c7bb7b65a golint: Rename fields and structs 2017-04-25 09:26:13 -07:00
Frank Schroeder 9f8f258d4d Remove duplicate constants
This patch removes duplicate internal copies of constants in the structs
package which are also defined in the api package. The api.KVOp type
with all its values for the TXN endpoint and the api.HealthXXX constants
are now used throughout the codebase.

This resulted in some circular dependencies in the testutil package
which have been resolved by copying code and constants and moving the
WaitForLeader function into a separate testrpc package.
2017-04-20 09:54:49 -07:00
Kyle Havlovitz cdd8b5b880
Add tls client options to api/cli 2017-04-14 13:37:29 -07:00
James Phillips 424d70ba5d
Cleans up a stray mark and fixes unit tests.
Ended up removing the leader_test.go server address change test as part
of this. The join was failing becase we were using a new node name with
the new logic here, but realized this was hitting some of the memberlist
conflict logic and not working as we expected. We need some additional
work to fully support address changes, so removed the test for now.
2017-03-27 01:28:54 -07:00
James Phillips c553e1d93a
Changes verson 8 ACLs to opt-out. 2017-03-24 12:12:24 -07:00
James Phillips 4c1d31e934
Fixes test server startup returning wrong error.
Seen in https://travis-ci.org/hashicorp/consul/builds/214521317.
2017-03-23 21:40:14 -07:00
James Phillips 55b4446fbb
Makes WaitForResult always return an error if it times out. 2017-03-23 20:04:39 -07:00
Seth Vargo 6ffb801f1e
Add RPC 2017-03-23 18:41:20 -04:00
Seth Vargo 3077d0f68c
Reduce coupling with testing.T
This reduces the coupling with testing.T, allowing many of the
server's startup-related functions to return an error. This makes them
more re-usable.
2017-03-23 18:41:19 -04:00
Kyle Havlovitz 8130f9b1c1
Cleaned up and reorganized some autopilot-related code 2017-03-09 18:21:40 -08:00
Kyle Havlovitz 01fe3fb399
Fix up command and api tests 2017-02-28 14:12:55 -08:00
Jeff Mitchell 726590367f Update unix dial functions to use DialContext with new go-cleanhttp 2017-02-10 21:11:21 -05:00