Overview
This page documents the public APIs available to AutoOps users. The APIs are accessed over an HTTP REST interface and authenticated with tokens created in the AutoOps dashboard.
There are two APIs: metrics and events. The metrics API provides access to the metrics collected and calculated in AutoOps. These are raw metrics, provided per node or per index. The events API provides access to the high-level events that are detected and computed in AutoOps.
Metrics API
The metrics API lets users fetch metrics for specific time ranges. It has two endpoints: one for node metrics and one for index metrics.
Path: public-api.opster.com/monitoring/v1/metrics/nodes
Method: POST
Headers:
X-OPSTER-IDENTITY-API-KEY – provide your API token from the dashboard. You can create a token here: https://autoops.opster.com/settings/tokens
Request:
- clusterId – ID of cluster, string.
- metricName – the name of the metric to query, string. Only the metrics in the closed list of supported node metrics are exposed (see list below).
- nodeIds – node IDs, list of strings.
- aggType – aggregation type (possible values: AVG, MAX). Default – MAX.
- from – the start time from which the metrics should be collected, long (milliseconds since epoch, as in the example below).
- to – the end time until which the metrics should be collected, long (milliseconds since epoch, as in the example below).
- numberOfBuckets – the number of date histogram buckets to retrieve, int.
- sizeOfBucket – the size of each date histogram bucket in time, string (example 1d).
*Provide either numberOfBuckets or sizeOfBucket.
Request example:
{ "clusterId": "clusterId", "nodeIds": [ "iku_N0gDRxG4_EaW06XSVg", "31Xoac-qROWR00pxku9mRA", "GsjdPIHGRmq2gp-rJpqQIw" ], "from": 1658448000000, "to": 1661299200000, "metricName": "SEARCH_QUEUE", "numberOfBuckets": 5, "aggType": "MAX" }
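As a rough illustration of the request described above, the following Python sketch builds (but does not send) a node metrics request using only the standard library. The function name `build_node_metrics_request`, the `https://` scheme, the `Content-Type` header, and the "exactly one bucket parameter" check are assumptions made for this example, not details confirmed by the API documentation.

```python
import json
import urllib.request

# Assumption: the endpoint is served over HTTPS (the path above has no scheme).
NODES_METRICS_URL = "https://public-api.opster.com/monitoring/v1/metrics/nodes"


def build_node_metrics_request(api_key, cluster_id, node_ids, metric_name,
                               start_ms, end_ms, number_of_buckets=None,
                               size_of_bucket=None, agg_type="MAX"):
    """Return an unsent urllib Request for the node metrics endpoint."""
    # Assumption: exactly one of the two bucket parameters is expected.
    if (number_of_buckets is None) == (size_of_bucket is None):
        raise ValueError("provide exactly one of number_of_buckets or size_of_bucket")
    if agg_type not in ("AVG", "MAX"):
        raise ValueError("agg_type must be AVG or MAX")

    body = {
        "clusterId": cluster_id,
        "nodeIds": list(node_ids),
        "metricName": metric_name,
        "from": start_ms,
        "to": end_ms,
        "aggType": agg_type,
    }
    if number_of_buckets is not None:
        body["numberOfBuckets"] = number_of_buckets
    else:
        body["sizeOfBucket"] = size_of_bucket

    return urllib.request.Request(
        NODES_METRICS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-OPSTER-IDENTITY-API-KEY": api_key,
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the body easy to inspect before any network call.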
Response:
- metricSeries – list of objects, each with a metric name and a list of metric values per node for the given time range. Each entry in data is a [timestamp, value] pair.
Response example:
{ "metricSeries": [ { "metricName": "SEARCH_QUEUE", "metrics": [ { "name": "GsjdPIHGRmq2gp-rJpqQIw", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 1.0 ], [ 1660435200000, 1.0 ], [ 1660953600000, 0.0 ] ] }, { "name": "31Xoac-qROWR00pxku9mRA", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 0.0 ], [ 1660435200000, 0.0 ], [ 1660953600000, 0.0 ] ] }, { "name": "iku_N0gDRxG4_EaW06XSVg", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 1.0 ], [ 1660435200000, 1.0 ], [ 1660953600000, 0.0 ] ] } ] } ] }
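The response shape shown in the example can be flattened into a plain dictionary for further processing. The sketch below is one way to do that; the helper name `series_by_name` is hypothetical, and it assumes every data point is a two-element [timestamp, value] pair with a numeric value, as in the example.

```python
def series_by_name(response, metric_name):
    """Map each series name (a node ID here) to a list of (timestamp, value) tuples."""
    result = {}
    for series in response.get("metricSeries", []):
        if series.get("metricName") != metric_name:
            continue
        for metric in series.get("metrics", []):
            # Assumption: each data point is [timestamp, numeric value].
            result[metric["name"]] = [(int(ts), float(value))
                                      for ts, value in metric["data"]]
    return result


# A shortened version of the response example above.
sample_response = {
    "metricSeries": [{
        "metricName": "SEARCH_QUEUE",
        "metrics": [
            {"name": "GsjdPIHGRmq2gp-rJpqQIw",
             "data": [[1659398400000, 0.0], [1659916800000, 1.0]]},
            {"name": "31Xoac-qROWR00pxku9mRA",
             "data": [[1659398400000, 0.0], [1659916800000, 0.0]]},
        ],
    }]
}

parsed = series_by_name(sample_response, "SEARCH_QUEUE")
```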
Path: public-api.opster.com/monitoring/v1/metrics/indices
Method: POST
Headers:
X-OPSTER-IDENTITY-API-KEY – provide your API token from the dashboard. You can create a token here: https://autoops.opster.com/settings/tokens
Request:
- clusterId – ID of cluster, string.
- metricName – the name of the metric to query, string. Only the metrics in the closed list of supported index metrics are exposed (see list below).
- indices – index names, list of strings.
- aggType – aggregation type (possible values: AVG, MAX). Default – MAX.
- from – the start time from which the metrics should be collected, long (milliseconds since epoch, as in the example below).
- to – the end time until which the metrics should be collected, long (milliseconds since epoch, as in the example below).
- numberOfBuckets – the number of date histogram buckets to retrieve, int.
- sizeOfBucket – the size of each date histogram bucket in time, string (example 1d).
*Provide either numberOfBuckets or sizeOfBucket.
Request example:
{ "clusterId": "clusterId", "indices": [ "index1", "index2", "index3" ], "from": 1661247159000, "to": 1661333559000, "metricName": "SEARCH_QUEUE", "numberOfBuckets": 20, "aggType": "MAX" }
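The from and to values in this example are 24 hours apart and appear to be expressed in milliseconds since the Unix epoch. A minimal sketch for computing such a window from Python datetimes follows; the helper names `to_epoch_ms` and `window_ending_at` are hypothetical, and the millisecond unit is inferred from the example values rather than stated by the API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def window_ending_at(end, length):
    """Return (from, to) in epoch milliseconds for a window of `length` ending at `end`."""
    return to_epoch_ms(end - length), to_epoch_ms(end)


# Reproduces the from/to pair used in the request example above.
start_ms, end_ms = window_ending_at(
    datetime(2022, 8, 24, 9, 32, 39, tzinfo=timezone.utc), timedelta(days=1)
)
```

For a rolling window, `datetime.now(timezone.utc)` can be passed as the end of the range.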
Response:
- metricSeries – list of objects, each with a metric name and a list of metric values per index for the given time range. Each entry in data is a [timestamp, value] pair.
Response example:
{ "metricSeries": [ { "metricName": "SEARCH_QUEUE", "metrics": [ { "name": "index1", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 1.0 ], [ 1660435200000, 1.0 ], [ 1660953600000, 0.0 ] ] }, { "name": "index2", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 0.0 ], [ 1660435200000, 0.0 ], [ 1660953600000, 0.0 ] ] }, { "name": "index3", "data": [ [ 1658361600000, 0.0 ], [ 1658880000000, 0.0 ], [ 1659398400000, 0.0 ], [ 1659916800000, 1.0 ], [ 1660435200000, 1.0 ], [ 1660953600000, 0.0 ] ] } ] } ] }
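One common use of an index metrics response is to find when each index peaked. The sketch below does this for the structure shown in the example; `peak_per_series` is a hypothetical helper, and it assumes all values are numeric (the documentation does not say whether null values can occur).

```python
def peak_per_series(response):
    """Return {series name: (timestamp, value)} for the highest value in each series."""
    peaks = {}
    for series in response.get("metricSeries", []):
        for metric in series.get("metrics", []):
            points = metric.get("data", [])
            if not points:
                continue  # nothing to report for an empty series
            # max() keeps the earliest point when several share the top value.
            timestamp, value = max(points, key=lambda point: point[1])
            peaks[metric["name"]] = (timestamp, value)
    return peaks


# A shortened version of the response example above.
index_response = {
    "metricSeries": [{
        "metricName": "SEARCH_QUEUE",
        "metrics": [
            {"name": "index1",
             "data": [[1659398400000, 0.0], [1659916800000, 1.0], [1660435200000, 1.0]]},
            {"name": "index2",
             "data": [[1659398400000, 0.0], [1659916800000, 0.0], [1660435200000, 0.0]]},
        ],
    }]
}

peaks = peak_per_series(index_response)
```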
Events API
The events API returns information about opened and closed events.
Path: public-api.opster.com/analyzer/v1/analyses
Method: POST
Headers:
X-OPSTER-IDENTITY-API-KEY – provide your API token from the dashboard. You can create a token here: https://autoops.opster.com/settings/tokens
Request:
- clusterId – ID of cluster, string.
- types – names of the analyses to query, list of strings (see list below).
- from – the start of the time range for which events should be returned, long (milliseconds since epoch, as in the example below).
- to – the end of the time range for which events should be returned, long (milliseconds since epoch, as in the example below).
- isOpen – flag that indicates whether the event is still open or already closed, boolean (true for an open event, false for a closed event).
- size – the number of events to return, int (default 20, maximum 100).
- offset – the number of events to skip, int.
Request example:
{ "clusterId": "clusterId", "types": [ "STATUS_YELLOW", "DISK_WATERMARK_LOW", "LOADED_DATA_NODES" ], "from": 1661247159000, "to": 1661333559000, "isOpen": false, "size": 20, "offset": 0 }
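Because size is capped at 100, retrieving a longer list of events means issuing several requests with increasing offset values. The following sketch shows one possible paging loop. The names `build_events_body` and `iter_events` are hypothetical, `fetch_page` stands for any caller-supplied function that posts a body and returns the decoded JSON response, and stopping on a short page is an assumption about how the API signals the end of the results.

```python
def build_events_body(cluster_id, types, start_ms, end_ms, is_open, size=20, offset=0):
    """Build the JSON-serializable request body for the events endpoint."""
    # The documented maximum is 100; requiring at least 1 is an assumption.
    if not 1 <= size <= 100:
        raise ValueError("size must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must not be negative")
    return {
        "clusterId": cluster_id,
        "types": list(types),
        "from": start_ms,
        "to": end_ms,
        "isOpen": bool(is_open),
        "size": size,
        "offset": offset,
    }


def iter_events(fetch_page, cluster_id, types, start_ms, end_ms, is_open, size=20):
    """Yield events page by page until a page comes back shorter than `size`."""
    offset = 0
    while True:
        body = build_events_body(cluster_id, types, start_ms, end_ms,
                                 is_open, size=size, offset=offset)
        analyses = fetch_page(body).get("analyses", [])
        yield from analyses
        # Assumption: a short (or empty) page means there is nothing left.
        if len(analyses) < size:
            break
        offset += size
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP client, so it can be exercised with a stub before being pointed at the real endpoint.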
Response:
- analyses – list of Analysis Objects
Analysis Object
- clusterId – ID of cluster, string
- type – event type, string
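- severity – event severity, string (possible values: LOW, MEDIUM, HIGH)
- title – title of the event, string
- description – description of the event, string
- affectedContext – context of the event, string
- actions – list of Action Objects (can be empty)
- startTime – start time of the event, long (milliseconds since epoch)
- endTime – end time of the event, long (milliseconds since epoch, Nullable)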
Action Object
- type – type of the action, string
- title – title of the action, string
- description – description of the action, string
- command – command, string (Nullable)
Response example:
{ "analyses": [ { "clusterId": "clusterId", "type": "STATUS_YELLOW", "severity": "MEDIUM", "title": "The cluster status is Yellow", "description": "Impact: When cluster status is Yellow, it means there is a higher risk of permanent or temporary loss of data.", "affectedContext": "The current number of unassigned replica shards is: 9.</br></br>The current number of total(replica and primary) unassigned shards is: 9.</br></br>The maximum number of total(replica and primary) unassigned shards was: 9.</br></br>The number of initializing shards is: 6.</br></br>The Unassigned reasons with sample indices: </br><li>INDEX_CREATED: indexSample, indexSample</li></br></br>See list: [[indexSample](/indexView?cluster=clusterId&indices=indexSample&searchIndices=indexSample)].", "startTime": 1661247159000, "endTime": 1661333449000, "actions": [] }, { "clusterId": "clusterId", "type": "DISK_WATERMARK_LOW", "severity": "MEDIUM", "title": "The low disk watermark has been exceeded in the following node/s:nodeSample", "description": "Impact: Shards will no longer be allocated on the node. If no other nodes are available, this may prevent primary shards or replicas being created, resulting in data loss.</br>The cluster uses several parameters to enable it to manage hard disk storage across the cluster. There are various “watermark” thresholds on each cluster. As the disk fills up on a node, the first threshold to be crossed will be the “low disk watermark”.</br>The master node will not allocate replica shards to nodes that have exceeded the low disk watermark threshold, which could result in the cluster becoming Yellow or even Red. Passing this threshold is a warning and it should be addressed in order to avoid the risk of data loss.</br>It’s important to note that existing shards on the node will continue to receive data normally, so disk usage on the node may continue to increase.", "affectedContext": "The low disk watermark has been exceeded by: 46.5 GB</br>Disk space left until the high watermark: 34.0 GB</br>The affected nodes are: [nodeSample](/nodeView?cluster=clusterId&from=1661247169000&to=1661333469000&nodes=nodeSample)", "startTime": 1661247169000, "endTime": 1661333469000, "actions": [ { "type": "INCREASE_WATERMARK", "title": "Increase watermark", "description": "It's recommended to increase the temporary watermark", "command": "curl -X PUT \"10.1.1.1:9200/_cluster/settings\" -H 'Content-Type: application/json' -d' { \"transient\": { \"cluster.routing.allocation.disk.watermark.low\": \"<value>\" } }'" }, { "type": "MOVE_SHARD", "title": "Move shard node", "description": "It's recommended to move shard 0 of index indexSample from node nodeSample to node nodeSample2. Use this command:", "command": "curl -X POST \"10.1.1.1:9200/_cluster/reroute\" -H 'Content-Type: application/json' -d' { \"commands\" : [ { \"move\" : { \"index\" : \"indexSample\", \"shard\" : 0, \"from_node\" : \"nodeSample\", \"to_node\" : \"nodeSample2\" } } ] }'" } ] } ] }
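The event fields in the response example contain HTML fragments such as `</br>` and `<li>`, and endTime may be null for events that are still open. The sketch below shows one way to turn an Analysis Object into a compact summary. The helper names are hypothetical, and the tag handling only covers the two tags that appear in the example above.

```python
import re


def plain_text(text):
    """Replace break tags with newlines and drop list-item tags."""
    text = re.sub(r"</?br\s*/?>", "\n", text)
    return re.sub(r"</?li>", "", text).strip()


def duration_ms(analysis):
    """Return the event duration in milliseconds, or None while the event is open."""
    end = analysis.get("endTime")
    return None if end is None else end - analysis["startTime"]


def summarize(analysis):
    """Collect the fields most useful for a log line or alert."""
    return {
        "type": analysis["type"],
        "severity": analysis["severity"],
        "context": plain_text(analysis.get("affectedContext", "")),
        "duration_ms": duration_ms(analysis),
        # command is nullable, so keep only the actions that carry one.
        "commands": [action["command"] for action in analysis.get("actions", [])
                     if action.get("command")],
    }


# A shortened event in the shape of the response example above.
event = {
    "type": "STATUS_YELLOW",
    "severity": "MEDIUM",
    "affectedContext": "Unassigned shards: 9.</br></br>Initializing shards: 6.",
    "startTime": 1661247159000,
    "endTime": 1661333449000,
    "actions": [],
}
summary = summarize(event)
```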
List of supported analysis types:
STATUS_YELLOW,
STATUS_RED,
DISK_WATERMARK_LOW_THRESHOLD,
DISK_WATERMARK_LOW,
DISK_WATERMARK_HIGH,
DISK_WATERMARK_FLOOD_STAGE,
DISK_WATERMARKS_WRONG_CONFIGURATION,
COORDINATING_NODE_DISCONNECTED,
MASTER_NODE_DISCONNECTED,
DATA_NODE_DISCONNECTED,
SHARD_TOO_LARGE,
SHARD_TOO_SMALL,
MASTER_NOT_DISCOVERED,
CLUSTER_BLOCKS_READ_ONLY,
CLUSTER_BLOCKS_READ_ONLY_ALLOW_DELETE,
MAX_SHARD_PER_NODE,
TOTAL_SHARD_PER_NODE,
NODE_CONTAINS_TOO_MANY_SHARDS,
MIXED_MASTER_NODES,
TOTAL_SHARD_PER_NODE_UNLIMITED,
MIN_MASTER_NODE_HIGHER_THAN_ELIGIBLE,
MIN_MASTER_NODE_LESS_THAN_QUORUM,
MIN_MASTER_NODE_HIGHER_THAN_QUORUM,
DEDICATED_MASTER_NODES,
DEDICATED_CLIENT_NODES,
CLUSTER_CONCURRENT_REBALANCE_HIGH,
CLUSTER_CONCURRENT_REBALANCE_LOW,
NODE_CONCURRENT_RECOVERIES_HIGH,
NODE_CONCURRENT_RECOVERIES_LOW,
LOADED_DATA_NODES,
LOADED_MASTER_NODES,
LOADED_CLIENT_NODES,
REJECTED_INDEXING,
REJECTED_SEARCH,
NODE_INDEXING_FAILED,
INDEX_INDEXING_FAILED,
MAX_HEAP_SIZE_REACHED,
SHARD_ALLOCATION_ENABLE_ALL,
SHARD_REBALANCE_ENABLE_ALL,
NUMBER_OF_MASTER_NODES,
SEARCH_REJECTED_QUEUE,
REPOSITORY_SNAPSHOT,
MANAGEMENT_QUEUE_SIZE,
SEARCH_QUEUE_SIZE,
CIRCUIT_BREAKER,
INDEX_QUEUE_SIZE,
UNBALANCED_SHARDS,
NO_SHARDS_IN_DATA_NODE,
COORDINATING_NODE_NOT_UTILIZED,
HIGH_CLUSTER_PENDING_TASKS,
SLOW_SEARCH,
SLOW_INDEXING,
CIRCUIT_BREAKER_USED_IS_HIGH,
DETECTED_EMPTY_INDICES,
DETECTED_EMPTY_REPLICAS,
LONG_RUNNING_SEARCH_TASK,
LONG_RUNNING_INDEX_TASK,
LONG_RUNNING_SHARD_TASK,
LONG_RUNNING_SNAPSHOT_TASK
List of supported metrics for nodes:
WRITE_QUEUE,
WRITE_THREADS,
WRITE_REJECTED,
WRITE_COMPLETED,
SEARCH_QUEUE,
SEARCH_THREADS,
SEARCH_REJECTED,
SEARCH_COMPLETED,
MANAGEMENT_QUEUE,
MANAGEMENT_THREADS,
MANAGEMENT_REJECTED,
MANAGEMENT_COMPLETED,
SNAPSHOT_QUEUE,
SNAPSHOT_THREADS,
SNAPSHOT_REJECTED,
SNAPSHOT_COMPLETED,
HTTP_CURRENT_OPEN,
HTTP_TOTAL_OPENED,
INDEX_FAILED,
SHARDS_COUNT,
INITIALIZING_SHARDS,
SEGMENTS_COUNT,
DOC_COUNT,
SIZE_IN_BYTES,
INDEX_RATE_IN_SEC,
QUERY_RATE_IN_SEC,
MERGE_RATE_IN_SEC,
DELETE_RATE_IN_SEC,
INDEX_LATENCY_IN_MILLIS,
QUERY_LATENCY_IN_MILLIS,
MERGE_LATENCY_IN_MILLIS,
DELETE_LATENCY_IN_MILLIS,
INDEX_FAILED_IN_SEC,
GET_DOC_MISSING_RATE_IN_SEC,
GET_DOC_MISSING_LATENCY_IN_MILLIS
List of supported metrics for indices:
SEGMENTS_COUNT,
DOC_COUNT,
SIZE_IN_BYTES,
INDEX_RATE_IN_SEC,
QUERY_RATE_IN_SEC,
MERGE_RATE_IN_SEC,
DELETE_RATE_IN_SEC,
INDEX_LATENCY_IN_MILLIS,
QUERY_LATENCY_IN_MILLIS,
MERGE_LATENCY_IN_MILLIS,
DELETE_LATENCY_IN_MILLIS,
INDEX_FAILED_IN_SEC,
GET_DOC_MISSING_RATE_IN_SEC,
GET_DOC_MISSING_LATENCY_IN_MILLIS