open-nomad/contributing/checklist-rpc-endpoint.md
Michael Schurter f8884d8b52
client/metadata: fix crasher caused by AllowStale = false (#16549)
Fixes #16517

Given a 3 Server cluster with at least 1 Client connected to Follower 1:

If a NodeMeta.{Apply,Read} for the Client request is received by
Follower 1 with `AllowStale = false` the Follower will forward the
request to the Leader.

The Leader, not being connected to the target Client, will forward the
RPC to Follower 1.

Follower 1, seeing AllowStale=false, will forward the request to the
Leader.

The Leader, not being connected to... well hopefully you get the
picture: an infinite loop occurs.
2023-03-20 16:32:32 -07:00


New/Updated RPC Endpoint Checklist

Prefer adding a new message to changing any existing RPC messages.

Code

  • Request struct and *RequestType constant in nomad/structs/structs.go. Append the constant; old constant values must remain unchanged.

  • In nomad/fsm.go, add a dispatch case to the switch statement in (n *nomadFSM) Apply

    • *nomadFSM method to decode the request and call the state method
  • State method for modifying objects in a Txn in nomad/state/state_store.go

    • nomad/state/state_store_test.go
  • Handler for the request in nomad/foo_endpoint.go

    • RPCs are resolved by matching the method name for bound structs (see net/rpc)
    • Check ACLs to enforce security; list endpoints must filter results by ACL
    • Register new RPC struct in nomad/server.go
  • Wrapper for the HTTP request in command/agent/foo_endpoint.go

    • Backwards compatibility requires a new endpoint: an upgraded client or server may forward this request to an old server that lacks support for the new RPC
    • RPCs triggered by an internal process may not need support
    • Check ACLs as an optimization
  • Endpoint added/updated in the nomad-openapi repository.

    • New endpoints will need to be configured in that repository's generator package.
    • Updated endpoints may require the generator configuration to change, especially if parameters or headers change.
    • If the accepted or returned struct schema changes, the Nomad version references in generator/go.mod will need to be updated. Once the version is updated, regenerate the spec and all clients so that the new schema is reflected in the spec and thus the generated models.
    • If QueryOptions, QueryMeta, WriteOptions, or WriteMeta change, the v1 framework will need to be updated to support the change.
  • nomad/core_sched.go sends many RPCs

    • ServersMeetMinimumVersion asserts that the server cluster is upgraded; use it to guard sending the new RPC, and send the old RPC otherwise
    • Version must match the actual release version!
  • If implementing a Client RPC...

    • Use QueryOptions instead of WriteRequest in the Request struct as WriteRequest is only for Raft writes.
    • Set QueryOptions.AllowStale = true in the Server RPC forwarder to avoid an infinite loop between leaders and followers when a Client RPC is forwarded through a follower. See https://github.com/hashicorp/nomad/issues/16517

Docs

  • Changelog
  • Metrics
  • API docs for RPCs with an HTTP endpoint; include ACLs, params, and an example response body.