I ran into a problem on a VM setup with multiple network interfaces, where only one interface should be used for the nodes to communicate with each other. MicroCeph kept using the external IP and external subnet during setup. The following command overrides the defaults to use the correct IP (here the 10.10.103.0/24 subnet is used for internal communication):
sudo microceph cluster bootstrap --microceph-ip 10.10.103.218 --mon-ip 10.10.103.218 --public-network 10.10.103.0/24 --cluster-network 10.10.103.0/24
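If you are not sure which address a node holds on the internal subnet, a small helper can pick it out of the ip output. This is only a sketch: pick_ip is a hypothetical name (not part of MicroCeph), and it does a plain string-prefix match such as 10.10.103. rather than real CIDR matching.

```shell
# Hypothetical helper: print the first IPv4 address read from stdin that
# starts with the given prefix. Expects `ip -o -4 addr show` style lines,
# where the 4th field is ADDRESS/PREFIXLEN (e.g. 10.10.103.218/24).
pick_ip() {
  prefix="$1"
  awk -v p="$prefix" '{
    split($4, a, "/")              # drop the /PREFIXLEN part
    if (index(a[1], p) == 1) {     # address begins with the prefix
      print a[1]
      exit
    }
  }'
}

# Example usage (commented out so nothing runs by accident):
# MY_IP=$(ip -o -4 addr show | pick_ip 10.10.103.)
# echo "internal IP: $MY_IP"
```

The resulting value can then be passed to --microceph-ip and --mon-ip instead of typing the address by hand.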
After that, simply run the usual microceph init or microceph add-node; the correct IP will be assigned to the token.
To join, you also need to specify the IP to use:
sudo microceph cluster join --microceph-ip 10.10.103.222 THE_TOKEN_FROM_MASTER
WARNING: Before doing anything, please read the documentation and understand the various features of Ceph. Manipulating the database directly is HIGH risk: back up your VMs and, if you are working with rook-ceph and Kubernetes, DRAIN YOUR NODES.
When removing a node, I had issues with the database getting corrupted and a row stuck at 'PENDING' even after uninstalling Ceph on that specific node. Simple manipulations on the database solved that:
Delete the member from the internal_cluster_members table:
Sometimes the token needs to be regenerated, which throws an error on the UNIQUE constraint on that table. For the tokens, I personally just deleted all generated tokens (they are one-time use). If you have more, you can do the same as in the note above; otherwise simply delete them all:
sudo microceph cluster sql 'delete from internal_token_records;'
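Because direct edits are high risk, it can help to look at the table before deleting from it. A minimal sketch, assuming a hypothetical wrapper named mc_sql (not part of MicroCeph) that only prints the command unless DRY_RUN=0 is set explicitly:

```shell
# Hypothetical wrapper around `microceph cluster sql`.
# With DRY_RUN=1 (the default) it prints the command instead of running it.
mc_sql() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf "sudo microceph cluster sql '%s'\n" "$1"
  else
    sudo microceph cluster sql "$1"
  fi
}

# Example usage (commented out so nothing runs by accident):
# DRY_RUN=0 mc_sql 'select * from internal_token_records;'   # inspect first
# DRY_RUN=0 mc_sql 'delete from internal_token_records;'     # then delete
```

Running the select first shows which tokens exist, so you can decide whether deleting all of them is acceptable.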
To get a better understanding of the state, you can inspect the whole database with the following command and then run a few select * queries:
sudo microceph cluster sql '.schema'
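As a sketch of the "few select *" step, the snippet below generates one query per table. The helper name dump_queries is hypothetical, and the only table names used are the two mentioned in this post; other tables would have to be taken from the .schema output.

```shell
# Hypothetical helper: print a "select *" statement for each table name given.
dump_queries() {
  for t in "$@"; do
    printf 'select * from %s;\n' "$t"
  done
}

# Example usage (commented out so nothing runs by accident):
# dump_queries internal_cluster_members internal_token_records |
#   while read -r q; do sudo microceph cluster sql "$q"; done
```

These are read-only queries, so they are a safe way to check for a stuck 'PENDING' row or leftover tokens before changing anything.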