Ceph operation, maintenance and repair

== How to see the status of the cluster ==

On any cluster node, you can run the <code>ceph health</code>, <code>ceph health detail</code> or <code>ceph status</code> commands to get an increasingly detailed overview of the cluster's status.
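
For scripting or monitoring, all three commands also accept the <code>ceph</code> CLI's standard <code>--format</code> option to emit machine-readable output instead of the plain-text summaries shown below, for example:

 ceph health --format json
 ceph status --format json-pretty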

=== <code>ceph health</code> ===

<code>ceph health</code> gives a very condensed status of the cluster. Ideally, its output will look like this:

 root@storage00:~# '''ceph health'''
 HEALTH_OK
 root@storage00:~#

Less ideally, it will display a HEALTH_WARN or HEALTH_ERR status, which might look like this:

 [root@vmfram1 ~]# '''ceph health'''
 HEALTH_WARN Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
 [root@vmfram1 ~]#

The above warning indicates that the cluster is not able to shuffle some objects around (backfilling) due to a lack of disk space. This state '''might''' be temporary while other backfilling is still in progress, the result of which may be that some extra space becomes available. More likely, the "too full" state will persist, and you will need to either add storage (best) or make architectural or OSD weight changes (less ideal) to force space to become available.
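
Before deciding, it helps to see which OSDs are actually short on space: <code>ceph osd df tree</code> shows per-OSD utilization, and <code>ceph osd reweight</code> can nudge data away from an overly full OSD. A sketch only; the OSD id and weight below are illustrative, not a recommendation for this cluster:

 ceph osd df tree
 ceph osd reweight 305 0.90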

=== <code>ceph health detail</code> ===

<code>ceph health detail</code> repeats each warning and adds per-item detail; here it lists the individual PGs that are backfill_toofull, together with the OSDs in their acting sets:

 [root@vmfram1 ~]# '''ceph health detail'''
 HEALTH_WARN Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
 PG_BACKFILL_FULL Low space hindering backfill (add storage if this doesn't resolve itself): 2 pgs backfill_toofull
     pg 9.5 is active+remapped+backfill_wait+backfill_toofull, acting [305,406,103]
     pg 9.e is active+remapped+backfill_wait+backfill_toofull, acting [306,406,104]
 [root@vmfram1 ~]#
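
To dig further into one of the PGs named above, <code>ceph pg {pgid} query</code> dumps that PG's full state, including its recovery and backfill progress. For example, for the first PG in the list:

 ceph pg 9.5 query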

=== <code>ceph status</code> ===

<code>ceph status</code> gives the fullest overview: overall health, the running services (monitors, managers, MDS and OSD daemons), pool, object and usage figures, and current client I/O:

 root@storage00:~# '''ceph status'''
   cluster:
     id:     db5b6a5a-1080-46d2-974a-80fe8274c8ba
     health: HEALTH_OK
 
   services:
     mon: 3 daemons, quorum storage00,storage01,compute01 (age 12d)
     mgr: storage01(active, since 12d), standbys: storage00
     mds: vm:1 {0=storage01=up:active} 1 up:standby
     osd: 10 osds: 8 up (since 12d), 8 in (since 3M)
 
   data:
     pools:   4 pools, 448 pgs
     objects: 15.60k objects, 59 GiB
     usage:   177 GiB used, 14 TiB / 14 TiB avail
     pgs:     448 active+clean
 
   io:
     client:   341 B/s wr, 0 op/s rd, 0 op/s wr
 
 root@storage00:~#
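
<code>ceph -s</code> is a shorthand for <code>ceph status</code>, and <code>ceph -w</code> prints the same overview once and then follows the cluster log, which is handy when waiting for a warning like the one above to clear:

 ceph -w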