ceph osd set-group
If you don’t want to set flags such as noout or noup for the whole cluster, you can use ceph osd set-group and ceph osd unset-group to set the appropriate flag for a group of OSDs or even for whole hosts.
ceph osd set-group <flags> <who>
ceph osd unset-group <flags> <who>
For example, set noout for a whole host with OSDs:
ceph osd set-group noout clyso-ceph-node3
root@clyso-ceph-node1:~# ceph health detail
HEALTH_WARN 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
[WRN] OSD_FLAGS: 1 OSDs or CRUSH {nodes, device-classes} have {NOUP,NODOWN,NOIN,NOOUT} flags set
host clyso-ceph-node3 has flags noout
ceph osd unset-group noout clyso-ceph-node3
root@clyso-ceph-node1:~# ceph health detail
HEALTH_OK
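The flags can also be combined and applied to more than one host or OSD in a single call. A sketch, assuming the host names from this cluster and the comma-separated flag syntax described in the Ceph documentation linked below:
ceph osd set-group noout,nodown clyso-ceph-node2 clyso-ceph-node3
ceph osd unset-group noout,nodown clyso-ceph-node2 clyso-ceph-node3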
Sources:
https://docs.ceph.com/en/quincy/rados/operations/health-checks/#osd-flags
ceph unlock/enable a locked dashboard user
ceph dashboard ac-user-enable <username>
Example with the admin user:
ceph dashboard ac-user-enable admin
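Whether a user is currently locked, e.g. after too many failed login attempts, can be checked before and after with ac-user-show; the enabled field in the output should be true again once the user has been re-enabled (the exact output fields may differ between releases):
ceph dashboard ac-user-show admin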
Sources:
https://docs.ceph.com/en/quincy/mgr/dashboard/#enable-a-locked-user
ceph warning – HEALTH_WARN pools have many more objects per pg than average
You might have wondered how to get rid of the warning “pools have many more objects per pg than average”, because you want to see your cluster in HEALTH_OK status.
The option to change the threshold for this warning is mon_pg_warn_max_object_skew.
Especially when putting a Ceph cluster or a new pool into production, you can set the threshold high. After some time you should check the value again and adjust it if necessary.
An important note on this option: it must be set on the ceph mgr. You can often find posts that set it on the ceph mon and then see no effect on the ceph status.
The cluster status commands
ceph status
or
ceph health detail
show the following warning:
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than average
pool test objects per pg (2079) is more than 11.6798 times cluster average (178)
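The numbers in this message can be reproduced by hand: the skew is simply the objects-per-PG of a pool divided by the cluster-wide average. With the pool name test from the example above, the inputs can be checked with:
ceph df detail
ceph osd pool get test pg_num
Here 2079 / 178 ≈ 11.68, which exceeds the default mon_pg_warn_max_object_skew of 10 and therefore triggers the warning.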
Note: to disable the warning completely, the value of mon_pg_warn_max_object_skew must be set to 0 or a negative number.
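Based on this note, the warning could for example be switched off persistently with:
ceph config set mgr mon_pg_warn_max_object_skew 0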
Verify the default value:
ceph config get mgr mon_pg_warn_max_object_skew
10.000000
Inject the value:
ceph tell mgr.a injectargs '--mon_pg_warn_max_object_skew 50'
Verify the value:
ceph tell mgr.a config get mon_pg_warn_max_object_skew
{
"mon_pg_warn_max_object_skew": "50.000000"
}
Set the value persistently, e.g. to 50 (i.e. 50 times the cluster average):
ceph config set mgr mon_pg_warn_max_object_skew 50
ceph config get mgr mon_pg_warn_max_object_skew
50.000000
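If you later want to go back to the default of 10, the override can simply be removed again:
ceph config rm mgr mon_pg_warn_max_object_skew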
ceph-volume – create WAL/DB on separate device for existing OSD
ceph-volume can be used to create a new WAL/DB on a faster device for an existing OSD, without the need to recreate the OSD.
- https://docs.ceph.com/en/latest/ceph-volume/lvm/migrate/
- https://docs.ceph.com/en/latest/ceph-volume/lvm/newdb/
- https://docs.ceph.com/en/latest/ceph-volume/lvm/newwal/
ceph-volume lvm new-db --osd-id 15 --osd-fsid FSID --target cephdb/cephdb1
--> NameError: name 'get_first_lv' is not defined
This is a bug in ceph-volume v16.2.7 that will be fixed in v16.2.8: https://github.com/ceph/ceph/pull/44209
First create a new logical volume on the device that will hold the new WAL/DB:
vgcreate cephdb /dev/sdb
Volume group "cephdb" successfully created
lvcreate -L 100G -n cephdb1 cephdb
Logical volume "cephdb1" created.
Now stop the running OSD and, if it was deactivated (cephadm), activate it on the host:
systemctl stop ceph-FSID@osd.0.service
ceph-volume lvm activate --all --no-systemd
Create the new WAL/DB on the new device:
ceph-volume lvm new-db --osd-id 0 --osd-fsid OSD-FSID --target cephdb/cephdb1
--> Making new volume at /dev/cephdb/cephdb1 for OSD: 0 (/var/lib/ceph/osd/ceph-0)
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/block.db
Running command: /bin/chown -R ceph:ceph /dev/dm-1
--> New volume attached.
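Optionally verify that the new DB volume is really attached to the OSD. ceph-volume lvm list should now show a separate [db] entry for osd.0 pointing to cephdb/cephdb1 (the exact output layout differs between releases):
ceph-volume lvm list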
Migrate the existing WAL/DB to the new device:
ceph-volume lvm migrate --osd-id 0 --osd-fsid OSD-FSID --from data --target cephdb/cephdb1
--> Migrate to existing, Source: ['--devs-source', '/var/lib/ceph/osd/ceph-0/block'] Target: /var/lib/ceph/osd/ceph-0/block.db
--> Migration successful.
Deactivate the OSD and start it again:
ceph-volume lvm deactivate 0
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-0
stderr: umount: /var/lib/ceph/osd/ceph-0 unmounted
systemctl start ceph-FSID@osd.0.service
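As a final check, once the OSD is up again, its metadata should reference the dedicated DB device. A sketch, assuming osd.0 and noting that the bluefs_* field names vary between Ceph releases:
ceph osd metadata 0 | grep -i bluefs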
ceph-volume – ceph osd migrate DB to larger ssd/flash device
First we wanted to use ceph-bluestore-tool bluefs-bdev-new-wal. However, it turned out that it is not possible to ensure that the second DB is actually used.
For this reason, we decided to migrate the entire bluefs of the osd to an ssd/flash.

Verify the current osd bluestore setup
ceph-bluestore-tool show-label --dev <device> [...]
Verify the current size of the osd bluestore DB
ceph-bluestore-tool bluefs-bdev-sizes --path <osd path>
Migrate the bluefs data to the new device
ceph-bluestore-tool bluefs-bdev-migrate --path <osd path> --dev-target <new-device> --devs-source <device1> [--devs-source <device2>]
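A concrete sketch of the same command for osd.0, assuming the OSD is stopped and the new logical volume is cephdb/cephdb1 as in the ceph-volume post above (unit names and paths depend on your deployment):
systemctl stop ceph-osd@0
ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-0 --devs-source /var/lib/ceph/osd/ceph-0/block --dev-target /dev/cephdb/cephdb1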
Verify the size of the osd bluestore DB after the migration
ceph-bluestore-tool bluefs-bdev-sizes --path <osd path>
If the size does not correspond to the new target size, execute the following command:
ceph-bluestore-tool bluefs-bdev-expand --path <osd path>
Instruct BlueFS to check the size of its block devices and, if they have expanded, make use of the additional space. Please note that only the new files created by BlueFS will be allocated on the preferred block device if it has enough free space, and the existing files that have spilled over to the slow device will be gradually removed when RocksDB performs compaction. In other words, if there is any data spilled over to the slow device, it will be moved to the fast device over time.
https://docs.ceph.com/en/octopus/man/8/ceph-bluestore-tool/#commands
Verify the new osd bluestore setup
ceph-bluestore-tool show-label --dev <device> [...]
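To see whether data is still spilling over to the slow device after the migration, watch for the BLUEFS_SPILLOVER health warning or compare the bluefs counters of the OSD; a sketch assuming osd.0 (counter names can differ slightly per release):
ceph daemon osd.0 perf dump bluefs
The interesting values are db_used_bytes versus slow_used_bytes; the latter should shrink over time as RocksDB compaction moves spilled data to the fast device.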
Update
You might be interested in a migration method on a higher layer with ceph-volume lvm.
Appendix
I’m trying to figure out the appropriate process for adding a separate SSD block.db to an existing OSD. From what I gather the two steps are:
1. Use ceph-bluestore-tool bluefs-bdev-new-db to add the new db device
2. Migrate the data ceph-bluestore-tool bluefs-bdev-migrate
I followed this and got both executed fine without any error. Yet when the OSD got started up, it keeps on using the integrated block.db instead of the new db. The block.db link to the new db device was deleted. Again, no error, just not using the new db
https://www.spinics.net/lists/ceph-users/msg62357.html
Sources:
https://docs.ceph.com/en/octopus/man/8/ceph-bluestore-tool/
https://tracker.ceph.com/attachments/download/4478/bluestore.png
OpenInfra 2022 Berlin
Wed, June 8, 2:50pm – 3:20pm | Berlin Congress Center – B – B09
Ceph on Windows – Private & Hybrid Cloud
Ceph RADOS, RBD and CephFS have been ported to Microsoft Windows, a community effort led by SUSE and Cloudbase Solutions. The goal consisted of porting librados and librbd to Windows Server, providing a kernel driver for exposing RBD devices natively as Windows volumes, support for Hyper-V VMs and, last but not least, even CephFS.
During this session we will talk about the architectural differences between Windows and Linux from a storage standpoint and how we retained the same CLI so that long-time Ceph users will feel at home regardless of the underlying operating system.
Performance is a key aspect of this port, with Ceph on Windows significantly outperforming the iSCSI gateway, previously the main option for accessing RBD images from Windows nodes. There will be no lack of live demos, including automating the installation of the Windows binaries, setting up and managing a Ceph cluster across Windows and Linux nodes, spinning up Hyper-V VMs from RBD, and CephFS.
https://openinfra.dev/summit-schedule