All the cat commands accept a query string parameter help to see all the headers and info they provide, and the /_cat command alone lists all the available commands.
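
For example, hitting the bare endpoint returns the list of cat commands (output abbreviated; the exact list depends on your Elasticsearch version):

% curl 'localhost:9200/_cat'
=^.^=
/_cat/allocation
/_cat/shards
/_cat/master
/_cat/nodes
/_cat/health
/_cat/indices
/_cat/count
/_cat/recovery
...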

verbose

Each of the commands accepts a query string parameter v to turn on verbose output.

% curl 'localhost:9200/_cat/master?v'
id  ip  node
EGtKWZlWQYWDmX29fUnp3Q 127.0.0.1 Grey, Sara

help

Each of the commands accepts a query string parameter help which will output its available columns.

% curl 'localhost:9200/_cat/master?help'
id | node id
ip | node transport ip address
node | node name

headers

Each of the commands accepts a query string parameter h which forces only those columns to appear.

% curl 'n1:9200/_cat/nodes?h=ip,port,heapPercent,name'
192.168.56.40 9300 40.3 Captain Universe
192.168.56.20 9300 15.3 Kaluu
192.168.56.50 9300 17.0 Yellowjacket
192.168.56.10 9300 12.3 Remy LeBeau
192.168.56.30 9300 43.9 Ramsey, Doug

Numeric formats

Many commands provide a few types of numeric output, either a byte value or a time value. By default, these types are human-formatted, for example, 3.5mb instead of 3763212. The human values are not sortable numerically, so in order to operate on these values where order is important, you can change the format. Say you want to find the largest index in your cluster (storage used by all the shards, not number of documents). The /_cat/indices API is ideal. We only need to tweak two things. First, we want to turn off human mode and use byte-level resolution instead. Then we'll pipe our output into sort using the appropriate column, which in this case is the eighth one.

% curl '192.168.56.10:9200/_cat/indices?bytes=b' | sort -rnk8
green wiki2 3 0 10000 0 105274918 105274918
green wiki1 3 0 10000 413 103776272 103776272
green foo 1 0 227 0 2065131 2065131

cat allocation

allocation provides a snapshot of how shards are allocated across the cluster's nodes and the state of disk usage.

% curl '192.168.56.10:9200/_cat/allocation?v'
shards diskUsed diskAvail diskRatio ip  node
1  5.6gb  72.2gb  7.8% 192.168.56.10 Jarella
1  5.6gb  72.2gb  7.8% 192.168.56.30 Solarr
1  5.5gb  72.3gb  7.6% 192.168.56.20 Adam II

Here we can see that each node has been allocated a single shard and that they’re all using about the same amount of space.
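
If you only care about a subset of columns, the generic h parameter from above works here as well (a sketch; assumes the header names match those shown in the output):

% curl '192.168.56.10:9200/_cat/allocation?v&h=node,shards,diskUsed,diskAvail'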

cat count

count provides quick access to the document count of the entire cluster, or individual indices.

% curl 192.168.56.10:9200/_cat/indices
green wiki1 3 0 10000 331 168.5mb 168.5mb
green wiki2 3 0 428 0 8mb 8mb

% curl 192.168.56.10:9200/_cat/count
1384314124582 19:42:04 10428

% curl 192.168.56.10:9200/_cat/count/wiki2
1384314139815 19:42:19 428
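
Because each response is a single line, count also works well in a delayed loop, for example to watch a bulk load progress (a sketch reusing the loop pattern shown for health below):

% while true; do curl -s 192.168.56.10:9200/_cat/count/wiki2; sleep 10; done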

cat fielddata

fielddata shows information about currently loaded fielddata on a per-node basis.

% curl '192.168.56.10:9200/_cat/fielddata?v'
id  host  ip  node  total body  text
c223lARiSGeezlbrcugAYQ myhost1 10.20.100.200 Jessica Jones 385.6kb 159.8kb 225.7kb
waPCbitNQaCL6xC8VxjAwg myhost2 10.20.100.201 Adversary 435.2kb 159.8kb 275.3kb
yaDkp-G3R0q1AJ-HUEvkSQ myhost3 10.20.100.202 Microchip 284.6kb 109.2kb 175.3kb

Fields can be specified either as a query parameter, or in the URL path:

% curl '192.168.56.10:9200/_cat/fielddata?v&fields=body'
id  host  ip  node  total body
c223lARiSGeezlbrcugAYQ myhost1 10.20.100.200 Jessica Jones 385.6kb 159.8kb
waPCbitNQaCL6xC8VxjAwg myhost2 10.20.100.201 Adversary 435.2kb 159.8kb
yaDkp-G3R0q1AJ-HUEvkSQ myhost3 10.20.100.202 Microchip 284.6kb 109.2kb

% curl '192.168.56.10:9200/_cat/fielddata/body,text?v'
id  host  ip  node  total body  text
c223lARiSGeezlbrcugAYQ myhost1 10.20.100.200 Jessica Jones 385.6kb 159.8kb 225.7kb
waPCbitNQaCL6xC8VxjAwg myhost2 10.20.100.201 Adversary 435.2kb 159.8kb 275.3kb
yaDkp-G3R0q1AJ-HUEvkSQ myhost3 10.20.100.202 Microchip 284.6kb 109.2kb 175.3kb

The output shows the total fielddata and then the individual fielddata for the body and text fields.

cat health

health is a terse, one-line representation of the same information from /_cluster/health. It has one option ts to disable the timestamping.

% curl 192.168.56.10:9200/_cat/health
1384308967 18:16:07 foo green 3 3 3 3 0 0 0

% curl '192.168.56.10:9200/_cat/health?v&ts=0'
cluster status nodeTotal nodeData shards pri relo init unassign tasks
foo green  3  3  3 3  0  0  0 0

A common use of this command is to verify the health is consistent across nodes:

% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/health
[1] 20:20:52 [SUCCESS] es3.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[2] 20:20:52 [SUCCESS] es1.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0
[3] 20:20:52 [SUCCESS] es2.vm
1384309218 18:20:18 foo green 3 3 3 3 0 0 0 0

A less obvious use is to track recovery of a large cluster over time. With enough shards, starting a cluster, or even recovering after losing a node, can take time (depending on your network & disk). A way to track its progress is by using this command in a delayed loop:

% while true; do curl 192.168.56.10:9200/_cat/health; sleep 120; done
1384309446 18:24:06 foo red 3 3 20 20 0 0 1812 0
1384309566 18:26:06 foo yellow 3 3 950 916 0 12 870 0
1384309686 18:28:06 foo yellow 3 3 1328 916 0 12 492 0
1384309806 18:30:06 foo green 3 3 1832 916 4 0 0 0
^C

In this scenario, we can tell that recovery took roughly four minutes. If this were going on for hours, we would be able to watch the UNASSIGNED shards drop precipitously. If that number remained static, we would know there is a problem.

Why the timestamp?

You typically use the health command when a cluster is malfunctioning. During this period, it's extremely important to correlate activities across log files, alerting systems, and so on. There are two outputs. The HH:MM:SS output is simply for quick human consumption. The epoch time retains more information, including the date, and is machine sortable if your recovery spans days.
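
For example, with GNU date (an assumption; BSD date uses different flags) you can turn the epoch column back into a full date when correlating with logs:

% date -ud @1384309446
Wed Nov 13 02:24:06 UTC 2013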

cat indices

The indices command provides a cross-section of each index. This information spans nodes.

% curl 'localhost:9200/_cat/indices/twi*?v'
health status index  pri rep docs.count docs.deleted store.size pri.store.size
green  open twitter  5 1  11434  0 64mb 32mb
green  open twitter2 2 0 2030  0  5.8mb  5.8mb

We can tell quickly how many shards make up an index, the number of docs, deleted docs, primary store size, and total store size (all shards including replicas).

Primaries

The index stats by default are shown for all of an index's shards, including replicas. A pri flag can be supplied to view the relevant stats in the context of only the primaries.
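
A minimal sketch using the twitter index from above (with pri, stats that have a primary-only variant gain a pri.* column, as the merge-count example below shows):

% curl 'localhost:9200/_cat/indices/twitter?pri&v&h=health,index,docs.count,store.size'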

Examples

Which indices are yellow?

% curl localhost:9200/_cat/indices | grep ^yell
yellow open  wiki 2 1  6401 1115 151.4mb 151.4mb
yellow open  twitter  5 1 11434  0  32mb  32mb

What’s my largest index by disk usage not including replicas?

% curl 'localhost:9200/_cat/indices?bytes=b' | sort -rnk8
green open  wiki 2 0  6401 1115 158843725 158843725
green open  twitter  5 1 11434  0  67155614  33577857
green open  twitter2 2 0  2030  0 6125085 6125085

How many merge operations have the shards for the wiki completed?

% curl 'localhost:9200/_cat/indices/wiki?pri&v&h=health,index,docs.count,mt'
health index docs.count mt pri.mt
green  wiki  9646 16 16

How much memory is used per index?

% curl 'localhost:9200/_cat/indices?v&h=i,tm'
i tm
wiki  8.1gb
test  30.5kb
user  1.9mb

cat master

master doesn’t have any extra options. It simply displays the master’s node ID, bound IP address, and node name.

% curl 'localhost:9200/_cat/master?v'
id  ip  node
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr

This information is also available via the nodes command, but this is slightly shorter when all you want to do, for example, is verify all nodes agree on the master:

% pssh -i -h list.of.cluster.hosts curl -s localhost:9200/_cat/master
[1] 19:16:37 [SUCCESS] es3.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
[2] 19:16:37 [SUCCESS] es2.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr
[3] 19:16:37 [SUCCESS] es1.vm
Ntgn2DcuTjGuXlhKDUD4vA 192.168.56.30 Solarr

cat nodes

The nodes command shows the cluster topology.

% curl 192.168.56.10:9200/_cat/nodes
SP4H 4727 192.168.56.30 9300 1.5.2 1.8.0_25 72.1gb 35.4 93.9mb 79 239.1mb 0.45 3.4h d m Boneyard
_uhJ 5134 192.168.56.10 9300 1.5.2 1.8.0_25 72.1gb 33.3 93.9mb 85 239.1mb 0.06 3.4h d * Athena
HfDp 4562 192.168.56.20 9300 1.5.2 1.8.0_25 72.2gb 74.5 93.9mb 83 239.1mb 0.12 3.4h d m Zarek

The first few columns tell you where your nodes live. For sanity it also tells you what version of ES and the JVM each one runs.

nodeId pid  ip  port version jdk
u2PZ 4234 192.168.56.30 9300 1.5.2 1.8.0_25
URzf 5443 192.168.56.10 9300 1.5.2 1.8.0_25
ActN 3806 192.168.56.20 9300 1.5.2 1.8.0_25

The next few give a picture of your heap, memory, and load.

diskAvail heapPercent heapMax ramPercent  ramMax load
72.1gb  31.3  93.9mb 81 239.1mb 0.24
72.1gb  19.6  93.9mb 82 239.1mb 0.05
72.2gb  64.9  93.9mb 84 239.1mb 0.12

The last columns provide ancillary information that can often be useful when looking at the cluster as a whole, particularly large ones. How many master-eligible nodes do I have? How many client nodes? It looks like someone restarted a node recently; which one was it?

uptime data/client master name
 3.5h d m  Boneyard
 3.5h d *  Athena
 3.5h d m  Zarek
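
To answer the master-eligibility question concretely, you could request just the master and name columns and count the m and * rows (a sketch; assumes the master header and the m/* markers shown above):

% curl -s '192.168.56.10:9200/_cat/nodes?h=master,name' | awk '$1=="m" || $1=="*"' | wc -l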

Columns

Headers passed to nodes?h= retrieve the relevant details in ordered columns (the help parameter described above lists all of them, along with their aliases). If no headers are specified, the default columns appear; if any header is specified, the defaults are dropped. Aliases can be used in place of the full header name for brevity. Columns appear in the order they are listed unless a different order is specified (e.g., h=pid,id versus h=id,pid). When specifying headers, the header row is not placed in the output by default; to have it appear, use verbose mode (v). The header name will match the supplied value (e.g., pid versus p). For example:

% curl '192.168.56.10:9200/_cat/nodes?v&h=id,ip,port,v,m'
id ip  port version m
pLSN 192.168.56.30 9300 1.5.2 m
k0zy 192.168.56.10 9300 1.5.2 m
6Tyi 192.168.56.20 9300 1.5.2 *

% curl '192.168.56.10:9200/_cat/nodes?h=id,ip,port,v,m'
pLSN 192.168.56.30 9300 1.5.2 m
k0zy 192.168.56.10 9300 1.5.2 m
6Tyi 192.168.56.20 9300 1.5.2 *

cat pending tasks

pending_tasks provides the same information as the /_cluster/pending_tasks API in a convenient tabular format.

% curl 'localhost:9200/_cat/pending_tasks?v'
insertOrder timeInQueue priority source
1685 855ms HIGH update-mapping [foo][t]
1686 843ms HIGH update-mapping [foo][t]
1693 753ms HIGH refresh-mapping [foo][[t]]
1688 816ms HIGH update-mapping [foo][t]
1689 802ms HIGH update-mapping [foo][t]
1690 787ms HIGH update-mapping [foo][t]
1691 773ms HIGH update-mapping [foo][t]

cat plugins

The plugins command provides a view per node of running plugins. This information spans nodes.

% curl 'localhost:9200/_cat/plugins?v'
name  component version  type isolation url
Abraxas cloud-azure 2.1.0-SNAPSHOT j  x
Abraxas lang-groovy 2.0.0  j  x
Abraxas lang-javascript 2.0.0-SNAPSHOT j  x
Abraxas marvel  NA j/s  x /_plugin/marvel/
Abraxas lang-python 2.0.0-SNAPSHOT j  x
Abraxas inquisitor  NA s  /_plugin/inquisitor/
Abraxas kopf  0.5.2  s  /_plugin/kopf/
Abraxas segmentspy  NA s  /_plugin/segmentspy/

We can tell quickly how many plugins per node we have and which versions.
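
On a larger cluster, a quick tally per node is handy (a sketch using standard shell tools; for the sample above it would print 8 for Abraxas):

% curl -s 'localhost:9200/_cat/plugins' | awk '{print $1}' | sort | uniq -c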

cat recovery

The recovery command is a view of index shard recoveries, both ongoing and previously completed. It is a more compact view of the JSON recovery API. A recovery event occurs any time an index shard moves to a different node in the cluster. This can happen during a snapshot recovery, a change in replication level, node failure, or on node startup. This last type is called a local gateway recovery and is the normal way for shards to be loaded from disk when a node starts up. As an example, here is what the recovery state of a cluster may look like when there are no shards in transit from one node to another:

> curl -XGET 'localhost:9200/_cat/recovery?v'
index shard time type  stage source target files percent bytes percent
wiki  0 73 gateway done  hostA  hostA  36  100.0%  24982806 100.0%
wiki  1 245  gateway done  hostA  hostA  33  100.0%  24501912 100.0%
wiki  2 230  gateway done  hostA  hostA  36  100.0%  30267222 100.0%

In the above case, the source and target nodes are the same because the recovery type was gateway, i.e. the shards were read from local storage on node start. Now let's see what a live recovery looks like: by increasing the replica count of our index and bringing another node online to host the replicas, we can watch shards stream from one node to another.

> curl -XPUT 'localhost:9200/wiki/_settings' -d'{"number_of_replicas":1}'
{"acknowledged":true}

> curl -XGET 'localhost:9200/_cat/recovery?v'
index shard time type  stage source target files percent bytes  percent
wiki  0 1252 gateway done  hostA  hostA  4 100.0%  23638870 100.0%
wiki  0 1672 replica index hostA  hostB  4 75.0% 23638870 48.8%
wiki  1 1698 replica index hostA  hostB  4 75.0% 23348540 49.4%
wiki  1 4812 gateway done  hostA  hostA  33  100.0%  24501912 100.0%
wiki  2 1689 replica index hostA  hostB  4 75.0% 28681851 40.2%
wiki  2 5317 gateway done  hostA  hostA  36  100.0%  30267222 100.0%

We can see in the above listing that our 3 initial shards are in various stages of being replicated from one node to another. Notice that the recovery type is shown as replica. The files and bytes copied are real-time measurements. Finally, let’s see what a snapshot recovery looks like. Assuming I have previously made a backup of my index, I can restore it using the snapshot and restore API.

> curl -XPOST 'localhost:9200/_snapshot/imdb/snapshot_2/_restore'
{"acknowledged":true}

> curl -XGET 'localhost:9200/_cat/recovery?v'
index shard time type stage repository snapshot files percent bytes percent
imdb  0 1978 snapshot done  imdb snap_1 79  8.0%  12086 9.0%
imdb  1 2790 snapshot index imdb snap_1 88  7.7%  11025 8.1%
imdb  2 2790 snapshot index imdb snap_1 85  0.0%  12072 0.0%
imdb  3 2796 snapshot index imdb snap_1 85  2.4%  12048 7.2%
imdb  4 819  snapshot init  imdb snap_1 0 0.0%  0 0.0%
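
Since the files and bytes percentages are real-time measurements, restore progress can be tracked with the same delayed-loop trick used for health (a sketch):

> while true; do curl -s 'localhost:9200/_cat/recovery?v'; sleep 30; done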

cat thread pool

The thread_pool command shows cluster-wide thread pool statistics per node. By default the active, queue, and rejected statistics are returned for the bulk, index, and search thread pools.

% curl 192.168.56.10:9200/_cat/thread_pool
host1 192.168.1.35 0 0 0 0 0 0 0 0 0
host2 192.168.1.36 0 0 0 0 0 0 0 0 0

The first two columns contain the host and ip of a node.

host  ip
host1 192.168.1.35
host2 192.168.1.36

The next three columns show the active, queue, and rejected statistics for the bulk thread pool.

bulk.active bulk.queue bulk.rejected
 0  0 0

The remaining columns show the active, queue, and rejected statistics of the index and search thread pools respectively. Statistics for other thread pools can be retrieved by using the h (header) parameter.

% curl 'localhost:9200/_cat/thread_pool?v&h=host,suggest.active,suggest.rejected,suggest.completed'
host  suggest.active suggest.rejected suggest.completed
host1  0  0 0
host2  0  0 0

Here the host column and the active, rejected, and completed statistics for the suggest thread pool are displayed. The suggest thread pool isn't displayed by default, so you always need to be explicit about which statistics you want to display.
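
To see which thread pool columns your version supports, the generic help parameter from the top of this page works here too:

% curl 'localhost:9200/_cat/thread_pool?help'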

cat shards

The shards command is the detailed view of what nodes contain which shards. It will tell you if it’s a primary or replica, the number of docs, the bytes it takes on disk, and the node where it’s located. Here we see a single index, with three primary shards and no replicas:

% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED 3014 31.1mb 192.168.56.10 Stiletto
wiki1 1 p STARTED 3013 29.6mb 192.168.56.30 Frankie Raye
wiki1 2 p STARTED 3973 38.1mb 192.168.56.20 Commander Kraken

index pattern

If you have many shards, you may wish to limit which indices show up in the output. You can always do this with grep, but you can save some bandwidth by supplying an index pattern to the end.

% curl 192.168.56.20:9200/_cat/shards/wiki2
wiki2 0 p STARTED 197 3.2mb 192.168.56.10 Stiletto
wiki2 1 p STARTED 205 5.9mb 192.168.56.30 Frankie Raye
wiki2 2 p STARTED 275 7.8mb 192.168.56.20 Commander Kraken
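
The grep approach mentioned above yields the same rows, at the cost of pulling the full shard table first:

% curl -s 192.168.56.20:9200/_cat/shards | grep '^wiki2'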

Relocating

Let’s say you’ve checked your health and you see two relocating shards. Where are they from and where are they going?

% curl 192.168.56.10:9200/_cat/health
1384315316 20:01:56 foo green 3 3 12 6 2 0 0
% curl 192.168.56.10:9200/_cat/shards | fgrep RELO
wiki1 0 r RELOCATING 3014 31.1mb 192.168.56.20 Commander Kraken -> 192.168.56.30 Frankie Raye
wiki1 1 r RELOCATING 3013 29.6mb 192.168.56.10 Stiletto -> 192.168.56.30 Frankie Raye

shard states

Before a shard can be used, it goes through an INITIALIZING state. The shards command can show you which ones.

% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":1}'
{"acknowledged":true}

% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED  3014 31.1mb 192.168.56.10 Stiletto
wiki1 0 r INITIALIZING  0 14.3mb 192.168.56.30 Frankie Raye
wiki1 1 p STARTED  3013 29.6mb 192.168.56.30 Frankie Raye
wiki1 1 r INITIALIZING  0 13.1mb 192.168.56.20 Commander Kraken
wiki1 2 r INITIALIZING  0 14mb 192.168.56.10 Stiletto
wiki1 2 p STARTED  3973 38.1mb 192.168.56.20 Commander Kraken

If shards cannot be assigned, for example because you've overallocated the number of replicas relative to the number of nodes in the cluster, they will remain UNASSIGNED.

% curl -XPUT 192.168.56.20:9200/_settings -d'{"number_of_replicas":3}'
% curl 192.168.56.20:9200/_cat/health
1384316325 20:18:45 foo yellow 3 3 9 3 0 0 3

% curl 192.168.56.20:9200/_cat/shards
wiki1 0 p STARTED  3014 31.1mb 192.168.56.10 Stiletto
wiki1 0 r STARTED  3014 31.1mb 192.168.56.30 Frankie Raye
wiki1 0 r STARTED  3014 31.1mb 192.168.56.20 Commander Kraken
wiki1 0 r UNASSIGNED
wiki1 1 r STARTED  3013 29.6mb 192.168.56.10 Stiletto
wiki1 1 p STARTED  3013 29.6mb 192.168.56.30 Frankie Raye
wiki1 1 r STARTED  3013 29.6mb 192.168.56.20 Commander Kraken
wiki1 1 r UNASSIGNED
wiki1 2 r STARTED  3973 38.1mb 192.168.56.10 Stiletto
wiki1 2 r STARTED  3973 38.1mb 192.168.56.30 Frankie Raye
wiki1 2 p STARTED  3973 38.1mb 192.168.56.20 Commander Kraken
wiki1 2 r UNASSIGNED

cat segments

The segments command provides low level information about the segments in the shards of an index. It provides information similar to the _segments endpoint.

% curl 'http://localhost:9200/_cat/segments?v'
index shard prirep ip  segment generation docs.count docs.deleted size  size.memory committed searchable version compound
test  4 p 192.168.2.105 _0 0 1 0 2.9kb 7818 false true 4.10.2 true
test1 2 p 192.168.2.105 _0 0 1 0 2.9kb 7818 false true 4.10.2 true
test1 3 p 192.168.2.105 _2 2 1 0 2.9kb 7818 false true 4.10.2 true

The output shows information about index names and shard numbers in the first two columns. If you only want to get information about segments in one particular index, you can add the index name in the URL, for example /_cat/segments/test. Several indices can also be queried, like /_cat/segments/test,test1. The following columns provide additional monitoring information:

prirep

Whether this segment belongs to a primary or replica shard.

ip

The IP address of the segment's shard.

segment

A segment name, derived from the segment generation. The name is internally used to generate the file names in the directory of the shard this segment belongs to.

generation

The generation number is incremented with each segment that is written. The name of the segment is derived from this generation number.

docs.count

The number of non-deleted documents that are stored in this segment.

docs.deleted

The number of deleted documents that are stored in this segment. It is perfectly fine if this number is greater than 0; space will be reclaimed when this segment gets merged.

size

The amount of disk space that this segment uses.

size.memory

Segments store some data in memory in order to be searchable efficiently. This column shows the number of bytes in memory that are used.

committed

Whether the segment has been synced to disk. Committed segments would survive a hard reboot. There is no need to worry if this is false: the data from uncommitted segments is also stored in the transaction log, so Elasticsearch is able to replay the changes on the next start.

searchable

True if the segment is searchable. A value of false would most likely mean that the segment has been written to disk but no refresh occurred since then to make it searchable.

version

The version of Lucene that has been used to write this segment.

compound

Whether the segment is stored in a compound file. When true, this means that Lucene merged all files from the segment into a single one in order to save file descriptors.