---
layout: docs
page_title: 'Autoscaling Plugins: Nomad API'
description: The "nomad-apm" APM plugin queries the Nomad API for metrics.
---

# Nomad APM Plugin

The Nomad APM plugin allows querying the Nomad API for metric data. This provides
an immediate starting point without additional applications but comes at the price
of efficiency. When using this APM, monitor Nomad carefully to ensure it is not
put under excessive load.

~> The Nomad APM plugin should only be used when scaling based on CPU and
memory usage. For more advanced scenarios, such as scaling a cluster to
zero clients, you should use a different APM plugin.

## Agent Configuration Options

```hcl
apm "nomad-apm" {
  driver = "nomad-apm"
}
```

When using a Nomad cluster with ACLs enabled, the following ACL policy provides the
permissions required to obtain task group metrics:

```hcl
namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

To obtain cluster-level metrics, the following ACL policy is required:

```hcl
node {
  policy = "read"
}

namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

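
A token carrying one of the policies above has to reach the Autoscaler agent somehow. The fragment below is a minimal sketch that assumes the agent's `nomad` block accepts `address` and `token` parameters; the address and the token value are placeholders, not values taken from this page.

```hcl
# Assumed agent settings for talking to an ACL-enabled Nomad cluster.
nomad {
  # Placeholder address of the Nomad API.
  address = "http://127.0.0.1:4646"

  # Placeholder: an ACL token created with one of the policies shown above.
  token = "REPLACE_WITH_ACL_TOKEN"
}

apm "nomad-apm" {
  driver = "nomad-apm"
}
```
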
## Policy Configuration Options - Task Groups

The Nomad APM allows querying Nomad to understand the current resource usage of
a task group.

```hcl
check {
  source = "nomad-apm"
  query  = "avg_cpu"
  # ...
}
```

Querying Nomad task group metrics is done using the `operation_metric` syntax,
where valid operations are:

- `avg` - returns the average of the metric value across allocations in the task
  group.

- `min` - returns the lowest metric value among the allocations in the task group.

- `max` - returns the highest metric value among the allocations in the task
  group.

- `sum` - returns the sum of all the metric values for the allocations in the
  task group.

The metric value can be:

- `cpu` - CPU usage as reported by the `nomad.client.allocs.cpu.total_percent`
  metric.

- `memory` - Memory usage as reported by the `nomad.client.allocs.memory.usage`
  metric.

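
As a sketch of how the operation and metric names combine, the check below uses `max_cpu`, which joins the `max` operation and the `cpu` metric. The check label, the `strategy "target-value"` block, and the target of `70` are illustrative assumptions based on the check syntax of recent Nomad Autoscaler releases; they may need adjusting for the version you run.

```hcl
# Hypothetical check label; any descriptive name would do.
check "max_cpu" {
  source = "nomad-apm"

  # "max" operation + "cpu" metric: the highest CPU usage among the
  # allocations in the task group.
  query = "max_cpu"

  # Assumed strategy and target value, shown for illustration only.
  strategy "target-value" {
    target = 70
  }
}
```

Swapping the operation changes how the per-allocation values are combined: `avg_cpu` would smooth over a single busy allocation, while `max_cpu` reacts to it.
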
## Policy Configuration Options - Client Nodes

The Nomad APM allows querying Nomad to determine how much of a resource is
currently allocated, as a percentage of the total available.

```hcl
check {
  source = "nomad-apm"
  query  = "percentage-allocated_cpu"
  # ...
}
```

Querying Nomad client node metrics is done using the `operation_metric` syntax,
where valid operations are:

- `percentage-allocated` - returns the allocated percentage of the desired
  resource.

The metric value can be:

- `cpu` - allocated CPU, calculated by comparing the total allocated by the
  scheduler against the total allocatable.

- `memory` - allocated memory, calculated by comparing the total allocated by the
  scheduler against the total allocatable.

[nomad_telemetry_stanza]: /docs/configuration/telemetry#inlinecode-publish_allocation_metrics
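
The client node query can be used for memory in the same way as for CPU. The fragment below is a minimal sketch of such a check, joining the `percentage-allocated` operation and the `memory` metric. The check label, the `strategy "target-value"` block, and the target of `70` are assumptions for illustration, following the check syntax of recent Nomad Autoscaler releases; the rest of the cluster scaling policy is left out.

```hcl
# Hypothetical check label; any descriptive name would do.
check "memory_allocated_percentage" {
  source = "nomad-apm"

  # "percentage-allocated" operation + "memory" metric: the share of
  # allocatable memory that the scheduler has already allocated.
  query = "percentage-allocated_memory"

  # Assumed strategy: aim to keep roughly 70% of memory allocated.
  strategy "target-value" {
    target = 70
  }
}
```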