Ceph



Warning.png Warning

Available to users for testing purposes

Introduction

Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability. See http://ceph.com/ for more details.

Ceph architecture

File:Ceph stack.png

Grid'5000 Deployment

Sites     Size     Configuration        Rados      RBD        CephFS    RadosGW
Rennes    ~ 9 TB   16 OSDs on 4 nodes   Check.png  Check.png  Fail.png  Fail.png

Configuration

Generate your key

In order to access the object store, you will need a Cephx key. See: https://api.grid5000.fr/sid/storage/ceph/ui/

Your key is also available from the frontends:

[client.jdoe]
  key = AQBwknVUwAPAIRAACddyuVTuP37M55s2aVtPrg==

Note: Replace jdoe with your login.

Create/Update/Delete Ceph pools

Requirement: Generate your key

Manage your Ceph pools from the Grid'5000 Ceph frontend: https://api.grid5000.fr/sid/storage/ceph/ui/

Currently, each user is limited to 3 pools.

Ceph g5k frontend.png

Replication size

  • 1: no replication (not secure, but most efficient for write operations)
  • n: one primary object + n-1 replicas (more secure, but less efficient for write operations)

You can edit the replication size by clicking on the replication size value. If the new value is greater than the old one, objects will automatically be replicated. Depending on the number of objects, this operation can stress the Ceph cluster and take more or less time.

Quota

The quota is currently limited to 2 TB per pool. You can change this quota at any time by clicking on the quota value, so prefer to start with a small, realistic quota.

Expiration date

The expiration date is limited to 2 months ahead. You can extend it at any time by clicking on the expiration date value.

Configure Ceph on clients

On a deployed environment

Terminal.png frennes:
oarsub -I -t deploy
Terminal.png frennes:
kadeploy3 -e ubuntu-x64-1404 -k -f $OAR_NODEFILE
Note.png Note

This version of Ubuntu provides a Ceph version that works with RBD support in Qemu/KVM.

Terminal.png frennes:
ssh root@node
Terminal.png node:
apt-get install ceph

Create a Ceph configuration file /etc/ceph/ceph.conf:

[global]
  mon initial members = ceph0,ceph1,ceph2
  mon host = 172.16.111.30,172.16.111.31,172.16.111.32

Create a Ceph keyring file /etc/ceph/ceph.client.jdoe.keyring containing your key:

[client.jdoe]
  key = AQBwknVUwAPAIRAACddyuVTuP37M55s2aVtPrg==
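
To check that the client is correctly configured, you can query the usage statistics of your pool (an example; replace jdoe and jdoe_pool with your own login and pool name):

Terminal.png node:
rados --id jdoe --pool jdoe_pool df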

On the frontend or on a node with the production environment

Note.png Note

The Ceph version on the frontend and in the production environment is old. Object store access works, but RBD support in Qemu/KVM does not.

Create a Ceph configuration file ~/.ceph/config:

[global]
  mon initial members = ceph0,ceph1,ceph2
  mon host = 172.16.111.30,172.16.111.31,172.16.111.32

Create a Ceph keyring file ~/.ceph/ceph.client.jdoe.keyring containing your key:

[client.jdoe]
  key = AQBwknVUwAPAIRAACddyuVTuP37M55s2aVtPrg==

Usage

Rados Object Store access

Requirements: Create a Ceph pool, Configure Ceph on clients

From command line

Note.png Note

Add -c ~/.ceph/config to the rados command line when running as a regular user (not root) from the frontend or the production environment.

Put an object into a pool

Terminal.png node:
rados --id jdoe --pool jdoe_pool put <infile> <objectname>

List objects of a pool

Terminal.png node:
rados --id jdoe --pool jdoe_pool ls

Get object from a pool

Terminal.png node:
rados --id jdoe --pool jdoe_pool get <objectname> <outfile>

Remove an object

Terminal.png node:
rados --id jdoe --pool jdoe_pool rm <objectname>
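
Stat an object

To display an object's size and modification time, you can use the rados stat subcommand (same --id and --pool conventions as the commands above):

Terminal.png node:
rados --id jdoe --pool jdoe_pool stat <objectname>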


Usage information

Terminal.png frontend:
rados -c ~/.ceph/config --id pmorillo --pool pmorillo_objectstore df
pool name       category                 KB      objects       clones     degraded      unfound           rd        rd KB           wr        wr KB
pmorillo_objectstore -                    1563027            2            0            0           0            0            0          628      2558455
  total used       960300628       295991
  total avail     7800655596
  total space     9229804032

From your application (C/C++, Python, Java, Ruby, PHP...)

See: http://ceph.com/docs/master/rados/api/librados-intro/
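
As an illustration, here is a minimal Python sketch using the python-rados bindings (this assumes the python-rados/python-ceph package is installed, and reuses the example names client.jdoe and jdoe_pool from above; adapt the conffile and keyring paths if you run it from the frontend):

import rados

# Connect to the Grid'5000 Ceph cluster with your client key
# (example paths for a deployed node; adapt them to your setup)
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                      name='client.jdoe',
                      conf=dict(keyring='/etc/ceph/ceph.client.jdoe.keyring'))
cluster.connect()

# Open an I/O context on your pool and do a small write/read round trip
ioctx = cluster.open_ioctx('jdoe_pool')
ioctx.write_full('hello_object', b'Hello Ceph!')
print(ioctx.read('hello_object'))
ioctx.remove_object('hello_object')

ioctx.close()
cluster.shutdown()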

Rados benchmarks

On a 1 Gbps Ethernet client

Write (16 concurrent operations)
Terminal.png frontend:
rados -c ~/.ceph/config --id pmorillo --pool pmorillo_bench bench 60 write --no-cleanup
Read
Terminal.png frontend:
rados -c ~/.ceph/config --id pmorillo --pool pmorillo_bench bench 60 seq

On a 10 Gbps Ethernet client

Write (16 concurrent operations)
  • Pool replication size: 1
Terminal.png frontend:
rados -c ~/.ceph/config --id pmorillo --pool pmorillo_bench bench 60 write --no-cleanup
Maintaining 16 concurrent writes of 4194304 bytes for at least 60 seconds.
Object prefix: benchmark_data_paranoia-3.rennes.grid5000.fr_5626
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1      16       146       130   519.915       520  0.488794  0.103537
    2      16       247       231   461.933       404  0.164705  0.126076
    3      16       330       314   418.612       332  0.036563  0.148763
    4      16       460       444   443.941       520  0.177378  0.141696
...
Total time run:         60.579488
Total writes made:      5965
Write size:             4194304
Bandwidth (MB/sec):     393.863 

Stddev Bandwidth:       83.789
Max bandwidth (MB/sec): 520
Min bandwidth (MB/sec): 0
Average Latency:        0.162479
Stddev Latency:         0.195071
Max latency:            1.7959
Min latency:            0.033313


Read
Terminal.png frontend:
rados -c ~/.ceph/config --id pmorillo --pool pmorillo_bench bench 60 seq

RBD (Rados Block Device)

Requirements: Create a Ceph pool, Configure Ceph on clients

Create a Rados Block Device

Terminal.png node:
rbd --id jdoe --pool jdoe_pool create <rbd_name> --size <MB>
Terminal.png node:
rbd --id jdoe --pool jdoe_pool ls
Terminal.png node:
rbd --id jdoe --pool jdoe_pool info <rbd_name>

Create filesystem and mount RBD

Terminal.png node:
modprobe rbd
Terminal.png node:
rbd --id jdoe --pool jdoe_pool map <rbd_name>
Terminal.png node:
rbd --id jdoe --pool jdoe_pool showmapped
id pool          image snap device    
1  jdoe_pool <rbd_name>  -    /dev/rbd1 
Terminal.png node:
mkfs.ext4 /dev/rbd1
Terminal.png node:
mkdir /mnt/rbd
Terminal.png node:
mount /dev/rbd1 /mnt/rbd
Terminal.png node:
df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        15G  1.6G   13G  11% /
...
/dev/sda5       525G   70M  498G   1% /tmp
/dev/rbd1        93M  1.6M   85M   2% /mnt/rbd
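
Unmount and unmap the RBD

When you are done, unmount the filesystem and unmap the device (here /dev/rbd1, as reported by showmapped above):

Terminal.png node:
umount /mnt/rbd
Terminal.png node:
rbd unmap /dev/rbd1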

Resize, snapshots, copy, etc.

See the rbd manual page (man rbd) for the full list of image operations.
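
For example, resizing an image, snapshotting it and copying it can be done with standard rbd subcommands (a sketch reusing <rbd_name> and jdoe_pool from above; <snap_name> and <rbd_copy> are placeholders):

Terminal.png node:
rbd --id jdoe resize jdoe_pool/<rbd_name> --size <MB>
Terminal.png node:
rbd --id jdoe snap create jdoe_pool/<rbd_name>@<snap_name>
Terminal.png node:
rbd --id jdoe cp jdoe_pool/<rbd_name> jdoe_pool/<rbd_copy>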

QEMU/RBD

Requirements: Create a Ceph pool, Configure Ceph on clients

Convert a qcow2 file into RBD

Terminal.png frennes:
scp /grid5000/virt-images/wheezy-x64-base.qcow2 root@node:/tmp
Terminal.png node:
apt-get install qemu qemu-kvm
Terminal.png node:
qemu-img convert -f qcow2 -O raw /tmp/wheezy-x64-base.qcow2 rbd:pool_name/debian7:id=jdoe
Terminal.png node:
rbd --id jdoe --pool pool_name ls
debian7

Start a KVM virtual machine from a Rados Block Device

Terminal.png node:
screen kvm -m 1024 -drive format=raw,file=rbd:pool_name/debian7:id=jdoe -nographic