Phase 6 – Install Ceph on the 3-node Proxmox cluster

Goal: turn the unused storage in each Dell T5500 into a shared storage cluster that Proxmox, OpenStack, Kubernetes, and Slurm can later consume.

Target design:

pve01 ─ Ceph MON + MGR + OSD
pve02 ─ Ceph MON + MGR + OSD
pve03 ─ Ceph MON + MGR + OSD

Use Ceph for:

VM disks         → Ceph RBD
OpenStack Cinder → Ceph RBD
Glance images    → Ceph RBD
Kubernetes PVs   → Ceph CSI later
Shared files     → CephFS later
Object storage   → RGW later

1. Pre-check every node

Run on all three Proxmox nodes:

hostname
ip -br addr
lsblk
pveversion
timedatectl

Check that:

pve01 / pve02 / pve03 can resolve each other
time is synced
cluster quorum is healthy
the intended Ceph disks are unused

Check cluster:

pvecm status

You want:

Quorate: Yes
Nodes: 3
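
The checks above can be scripted so they run the same way on every node. The following is a minimal sketch, not an official Proxmox tool: the function names are made up for illustration, and the patterns assume the "Quorate:" line printed by pvecm status and the "System clock synchronized:" line printed by recent timedatectl versions (older systemd releases word that line differently).

```shell
#!/bin/sh
# Hypothetical pre-check helpers for section 1. All names are illustrative.

# Succeeds only if every hostname given as an argument resolves.
all_resolve() {
    rc=0
    for h in "$@"; do
        if ! getent hosts "$h" >/dev/null 2>&1; then
            echo "cannot resolve: $h" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Reads `pvecm status` output on stdin; succeeds if the cluster is quorate.
is_quorate() {
    grep -Eq '^[[:space:]]*Quorate:[[:space:]]+Yes'
}

# Reads `timedatectl` output on stdin; succeeds if the clock is synchronized.
# Assumes the "System clock synchronized: yes" wording of recent systemd.
clock_synced() {
    grep -Eq 'System clock synchronized:[[:space:]]+yes'
}

# Example, run on each node:
#   all_resolve pve01 pve02 pve03 \
#     && pvecm status | is_quorate \
#     && timedatectl | clock_synced \
#     && echo "pre-checks passed"
```

The helpers only cover name resolution, quorum, and time sync. Whether the intended Ceph disks are unused still needs a look at the lsblk output.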

2. Decide the Ceph network

Because you only have one NIC per node, keep it simple first.

Use your existing Proxmox management network initially.

Example:

pve01 192.168.1.10
pve02 192.168.1.11
pve03 192.168.1.12

Later, if you add VLANs or a second NIC, you can separate:

public_network  = client / VM / OpenStack access
cluster_network = OSD replication / recovery traffic

For now:

Ceph public network = 192.168.1.0/24
Ceph cluster network = same network

Not ideal for production, but fine for learning.


3. Install Ceph from Proxmox UI or CLI

Option A — Proxmox UI

On each node:

Datacenter
→ Node
→ Ceph
→ Install Ceph

Use the same version on all nodes.

Then initialise Ceph on the first node.


Option B — CLI

On each node:

pveceph install

Then initialise Ceph on pve01:

pveceph init --network 192.168.1.0/24

Then create a monitor on every node. Run this on pve01 first, then on pve02 and pve03:

pveceph mon create

Then create managers:

pveceph mgr create

Again, run on each node.

Verify:

ceph -s
ceph mon stat
ceph mgr stat
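
The order of the Option B steps matters more than the commands themselves, so it can help to print the whole plan before running anything. The sketch below only echoes commands, it does not execute them. The function name is hypothetical, and the ssh root@node form assumes root SSH access between the cluster nodes; pveceph install may also ask for confirmation, so running it interactively on each node is the safer choice.

```shell
#!/bin/sh
# Print (do not run) the Option B commands in the order used above.
# Usage: bootstrap_plan <ceph-network-cidr> <first-node> [other-node...]
bootstrap_plan() {
    network=$1
    shift
    first=$1

    # Packages go on every node before anything is initialised.
    for node in "$@"; do
        echo "ssh root@$node pveceph install"
    done

    # The Ceph configuration is initialised once, from the first node.
    echo "ssh root@$first pveceph init --network $network"

    # One monitor and one manager per node, first node first.
    for node in "$@"; do
        echo "ssh root@$node pveceph mon create"
    done
    for node in "$@"; do
        echo "ssh root@$node pveceph mgr create"
    done
}

# Example:
#   bootstrap_plan 192.168.1.0/24 pve01 pve02 pve03
```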

4. Prepare disks for OSDs

List disks:

lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT

Example:

sda  150G  Proxmox OS
sdb  2.5T  empty disk for Ceph OSD

Important: do not use the Proxmox OS disk as an OSD unless you deliberately partitioned it for that.

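Wipe the intended Ceph disk. This destroys everything on it, so confirm the device name with lsblk on each node first; /dev/sdb is only the example used here: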

wipefs -a /dev/sdb
sgdisk --zap-all /dev/sdb
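
Because the wipe is irreversible, a small guard in front of it is worth having. The sketch below refuses to continue if lsblk reports anything mounted from the disk. The function names are hypothetical, and the check is deliberately narrow: it only sees mounted filesystems and active swap, so it will not notice ZFS pool members or unmounted data. Treat it as a complement to reading the lsblk output, not a replacement.

```shell
#!/bin/sh
# Hypothetical guard for the wipe step. Function names are illustrative.

# Reads `lsblk -nr -o NAME,TYPE,MOUNTPOINT <dev>` output on stdin.
# Succeeds if any row has a third column, i.e. something is mounted.
has_mountpoint() {
    awk 'NF >= 3 { found = 1 } END { exit !found }'
}

# Succeeds only if <dev> is a block device with nothing mounted from it.
safe_to_wipe() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "refusing: $dev is not a block device" >&2
        return 1
    fi
    if lsblk -nr -o NAME,TYPE,MOUNTPOINT "$dev" | has_mountpoint; then
        echo "refusing: $dev has a mounted filesystem or active swap" >&2
        return 1
    fi
    return 0
}

# Example:
#   safe_to_wipe /dev/sdb && wipefs -a /dev/sdb && sgdisk --zap-all /dev/sdb
```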

Then create one OSD per node:

pveceph osd create /dev/sdb

Repeat on pve01, pve02, and pve03.

Verify:

ceph osd tree
ceph -s

You want to see:

3 osds: 3 up, 3 in

5. Create a Ceph pool for VM disks

Create an RBD pool:

ceph osd pool create vm-rbd 32
ceph osd pool application enable vm-rbd rbd
rbd pool init vm-rbd

For a small 3-node lab, 32 PGs is fine to start.
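
To see why 32 is a comfortable starting point, count how many PG replicas each OSD ends up holding: pg_num times the pool's replica count, divided by the number of OSDs. The sketch below does that arithmetic. The function name is hypothetical, and the example assumes the pool keeps 3 replicas, which is worth confirming in the size field of ceph osd pool ls detail.

```shell
#!/bin/sh
# PG replicas per OSD = pg_num * replica_count / osd_count (integer division).
# Usage: pg_replicas_per_osd <pg_num> <replica_count> <osd_count>
pg_replicas_per_osd() {
    pg_num=$1
    replicas=$2
    osds=$3
    echo $(( pg_num * replicas / osds ))
}

# Example for this lab (32 PGs, 3 replicas, 3 OSDs):
#   pg_replicas_per_osd 32 3 3    # prints 32
```

With one pool, each OSD carries 32 PG replicas, which leaves plenty of headroom for the CephFS pools added later.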

Add it to Proxmox storage:

pvesm add rbd ceph-vm-rbd \
--pool vm-rbd \
--content images,rootdir \
--krbd 0

Check:

pvesm status

You should see ceph-vm-rbd.


6. Test VM storage on Ceph

Create or clone a small VM and place its disk on:

ceph-vm-rbd

Then test live migration:

qm migrate <vmid> pve02 --online

If the VM disk is on Ceph, migration should not require copying the disk.

That is one of the main reasons Ceph is valuable.
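
Before migrating, it can help to confirm that every disk of the VM really lives on the Ceph storage, since a single disk left on local storage forces a copy. The sketch below filters qm config output for disk lines that are not on a given storage. The function name is hypothetical, and the pattern assumes the usual "scsi0: storage:volume,options" line format; CD-ROM entries are ignored.

```shell
#!/bin/sh
# Reads `qm config <vmid>` output on stdin and prints every disk line
# that is NOT on the given storage. Prints nothing if all disks are on it.
# Usage: qm config <vmid> | disks_off_storage <storage-id>
disks_off_storage() {
    storage=$1
    grep -E '^(scsi|virtio|sata|ide|efidisk|tpmstate)[0-9]+:' \
        | grep -v 'media=cdrom' \
        | grep -Ev "^[a-z]+[0-9]+: ${storage}:" \
        || true
}

# Example (100 is a placeholder VM ID):
#   qm config 100 | disks_off_storage ceph-vm-rbd
```

Empty output means the VM is ready for the shared-storage migration test.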


7. Create CephFS later, not immediately

Do RBD first.

After RBD works, add CephFS.

Create metadata servers:

pveceph mds create

Create CephFS:

pveceph fs create --name cephfs

Add to Proxmox storage:

pvesm add cephfs cephfs \
--content iso,vztmpl,backup,snippets

Use CephFS for:

ISO images
container templates
backups
shared files
snippets

Use RBD for:

VM disks
OpenStack Cinder volumes
Glance images

8. Learn the important Ceph commands

Run these repeatedly until you understand them:

ceph -s
ceph health detail
ceph osd tree
ceph osd df
ceph df
ceph mon stat
ceph mgr stat
ceph pg stat
ceph pg dump_stuck

For OSD detail:

ceph osd metadata
ceph osd perf

For pools:

ceph osd pool ls detail
rbd ls vm-rbd
rbd du --pool vm-rbd

9. Break it deliberately

This is where the real learning happens.

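Safely test by stopping one OSD. Run the systemctl commands on the node that hosts osd.0 (ceph osd tree shows which one):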

systemctl stop ceph-osd@0
ceph -s
systemctl start ceph-osd@0

Observe:

HEALTH_WARN
OSD down
PG degraded
recovery starts once the OSD is back up
HEALTH_OK returns

Then test node-level failure:

Shut down pve03
Watch ceph -s
Restart pve03
Watch recovery

Do not panic when Ceph shows warnings. Learn what they mean.
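
During these failure tests, a polling loop saves retyping ceph -s while the cluster recovers. The sketch below calls ceph health repeatedly until it reports HEALTH_OK or a timeout expires. The function name and the default timeout and interval are arbitrary choices for illustration, not recommended values.

```shell
#!/bin/sh
# Poll `ceph health` until it reports HEALTH_OK or the timeout expires.
# Usage: wait_for_health_ok [timeout-seconds] [interval-seconds]
wait_for_health_ok() {
    timeout=${1:-600}
    interval=${2:-5}
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        status=$(ceph health 2>/dev/null)
        case $status in
            HEALTH_OK*)
                echo "healthy after ${waited}s"
                return 0
                ;;
        esac
        # Show the current state so warnings can be read while waiting.
        echo "${waited}s: ${status:-no response from ceph}"
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "still not HEALTH_OK after ${timeout}s" >&2
    return 1
}

# Example, after restarting the OSD or the node:
#   wait_for_health_ok 900 10
```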


10. What healthy looks like

A healthy 3-node lab should show roughly:

  cluster:
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum pve01,pve02,pve03
    mgr: pve01(active), standbys: pve02,pve03
    osd: 3 osds: 3 up, 3 in

  data:
    pools: 1 pools
    pgs:   active+clean
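
The same summary can be checked mechanically. The sketch below reads ceph -s output and succeeds only when health is HEALTH_OK and every OSD is both up and in. The function name is hypothetical, and the parsing assumes the "health:" and "osd: N osds: N up, N in" line shapes shown above, with or without the "(since ...)" annotations that newer releases add.

```shell
#!/bin/sh
# Reads `ceph -s` output on stdin. Succeeds only if health is HEALTH_OK
# and the number of up and in OSDs both equal the total OSD count.
summary_is_healthy() {
    awk '
        $1 == "health:" { health = $2 }
        $1 == "osd:" {
            total = $2
            # The count sits in the field just before "up" and "in".
            for (i = 3; i <= NF; i++) {
                if ($i ~ /^up,?$/) up = $(i - 1)
                if ($i ~ /^in,?$/) in_count = $(i - 1)
            }
        }
        END {
            ok = (health == "HEALTH_OK" && total > 0 && up == total && in_count == total)
            exit !ok
        }
    '
}

# Example:
#   ceph -s | summary_is_healthy && echo "cluster looks healthy"
```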

11. Key concepts to understand

Focus on these:

MON    = cluster map and quorum
MGR    = metrics, dashboard, management modules
OSD    = stores the actual data
Pool   = logical storage namespace
PG     = placement group
CRUSH  = decides where data lives
RBD    = block devices for VMs
CephFS = shared filesystem
RGW    = S3-compatible object storage

The biggest one is CRUSH.

CRUSH determines where replicas are placed. In your 3-node lab, you want data spread across different hosts, not multiple copies on the same host.
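
Whether replicas are spread across hosts can be read from the CRUSH rule the pool uses. The sketch below looks for a step of type "host" in the JSON printed by ceph osd crush rule dump. The function name is hypothetical, and the example assumes the pool uses a rule named replicated_rule; ceph osd pool get vm-rbd crush_rule shows the actual rule name.

```shell
#!/bin/sh
# Reads `ceph osd crush rule dump <rule>` JSON on stdin.
# Succeeds if some step selects across buckets of type "host".
rule_spreads_across_hosts() {
    grep -Eq '"type":[[:space:]]*"host"'
}

# Example (replicated_rule is an assumed rule name):
#   ceph osd crush rule dump replicated_rule | rule_spreads_across_hosts \
#     && echo "replicas are placed on different hosts"
```

A rule whose step type is "osd" instead would allow several copies on one host, which is what this lab wants to avoid.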


Recommended order

Do not try to configure everything at once.

Use this sequence:

1. Install Ceph packages
2. Create MONs
3. Create MGRs
4. Add one OSD per node
5. Confirm HEALTH_OK
6. Create RBD pool
7. Add RBD to Proxmox
8. Put VM disk on Ceph
9. Test live migration
10. Break and recover one OSD
11. Add CephFS
12. Later integrate with OpenStack

For your homelab, the first real success milestone is:

A VM running on pve01 with its disk on Ceph RBD,
live migrated to pve02 without copying the disk.

That proves the Proxmox + Ceph foundation is working.