Ceph operation, maintenance and repair
How to see the status of the cluster
On any cluster node, you can run the ceph health, ceph health detail or ceph status commands to get an increasingly detailed overview of the cluster's status.
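All of these commands normally print human-readable text. If you want machine-readable output for a monitoring script instead, the ceph CLI's general --format option can be used (a standard Ceph option, not anything specific to this cluster); a minimal sketch:

 # Same health/status information, but as JSON for scripts and monitoring tools
 ceph health --format json-pretty
 ceph status --format json-pretty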
ceph health
ceph health gives a very condensed status of the cluster. Ideally, its output will look like this:
 root@storage00:~# ceph health
 HEALTH_OK
 root@storage00:~#
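Because the healthy output is just the single word HEALTH_OK, it is easy to wrap ceph health in a small scripted check. Below is a minimal sketch of such a check; the script itself is hypothetical and not part of Ceph, and how you alert on a failure is up to you:

 #!/bin/bash
 # check-ceph-health.sh (hypothetical): exit non-zero and print details if the cluster is not HEALTH_OK
 status=$(ceph health)
 if [ "$status" != "HEALTH_OK" ]; then
     echo "Ceph cluster is not healthy: $status" >&2
     ceph health detail >&2
     exit 1
 fi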
Less ideally, it will display a HEALTH_WARN or HEALTH_ERR status, which might look like this:
 [root@vmfram1 ~]# ceph health
 HEALTH_WARN Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
 [root@vmfram1 ~]#
The above warning indicates that the cluster is not able to shuffle some objects around (backfilling) due to a lack of disk space. This state might be temporary: other backfilling that is still in progress may free up enough space for these PGs to proceed on their own. More likely, though, the "too full" state will persist, and you will need to either add storage (best) or make architectural or OSD weight changes (less than ideal) to force space to become available.
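To see which OSDs are actually short on space before deciding between adding storage and reweighting, the standard ceph osd df and ceph osd reweight commands are useful. The snippet below is only a sketch: the OSD id and weight are example values, and lowering an OSD's reweight merely shifts data to other OSDs rather than adding capacity.

 # Show per-OSD utilisation (and the CRUSH tree) to find the full OSDs behind the backfill_toofull PGs
 ceph osd df tree
 
 # Last resort: temporarily lower the reweight of an overfull OSD so some of its PGs
 # move elsewhere and backfill can proceed (osd.406 and 0.90 are example values only)
 ceph osd reweight osd.406 0.90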
ceph health detail
Below is a more detailed view, from ceph health detail, of the same warning shown above.
 [root@vmfram1 ~]# ceph health detail
 HEALTH_WARN Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
 PG_BACKFILL_FULL Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
     pg 9.5 is active+remapped+backfill_wait+backfill_toofull, acting [305,406,103]
     pg 9.e is active+remapped+backfill_wait+backfill_toofull, acting [306,406,104]
 [root@vmfram1 ~]#
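The PG ids in this output (9.5 and 9.e here) can be inspected further with the standard PG subcommands; a short sketch:

 # Full JSON state of one problematic placement group, including the OSDs in its acting set
 ceph pg 9.5 query
 
 # List PGs stuck in a non-clean state (optionally filtered, e.g. "ceph pg dump_stuck unclean")
 ceph pg dump_stuck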
ceph status
ceph status (or its short form, ceph -s) gives the most complete single-command overview, covering the monitor quorum, the manager and MDS daemons, the OSDs, the pools and placement groups, and current client I/O:
 root@storage00:~# ceph status
   cluster:
     id:     db5b6a5a-1080-46d2-974a-80fe8274c8ba
     health: HEALTH_OK
 
   services:
     mon: 3 daemons, quorum storage00,storage01,compute01 (age 12d)
     mgr: storage01(active, since 12d), standbys: storage00
     mds: vm:1 {0=storage01=up:active} 1 up:standby
     osd: 10 osds: 8 up (since 12d), 8 in (since 3M)
 
   data:
     pools:   4 pools, 448 pgs
     objects: 15.60k objects, 59 GiB
     usage:   177 GiB used, 14 TiB / 14 TiB avail
     pgs:     448 active+clean
 
   io:
     client:   341 B/s wr, 0 op/s rd, 0 op/s wr
 
 root@storage00:~#
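When the cluster is recovering or backfilling, it can be handy to watch the status change rather than re-run the command by hand. Two common ways of doing that, shown only as a sketch (the 5-second interval is arbitrary):

 # Stream cluster status and log messages as they happen
 ceph -w
 
 # Or simply re-run the status summary every 5 seconds
 watch -n 5 ceph status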