In this howto we will describe in detail how to install and configure GlusterFS 3.3.1 (the latest stable release at the time of writing) on CentOS 6.3.

GlusterFS is an open source, powerful clustered file system capable of scaling to several petabytes of storage, all available to users under a single mount point. It uses existing disk filesystems such as ext3, ext4 or xfs to store data, and clients can access the storage as if it were a local filesystem. A GlusterFS cluster aggregates storage bricks over InfiniBand RDMA and/or TCP/IP interconnects into a single global namespace.

We will use the following terms throughout this howto, so make sure you understand them before proceeding.

brick
A brick is a storage filesystem (or a directory on one) that has been assigned to a volume, e.g. /data on a server.
client
The machine which mounts the volume (this may also be a server).
server
The machine (physical or virtual) which hosts the actual filesystem in which the data will be stored.
volume
A volume is a logical collection of bricks, where each brick is an export directory on a server. A volume can be one of several types, and you can create volumes of different types within the same storage pool. The types used in this howto are:
Distributed – Distributed volumes spread files across the bricks in the volume. Use distributed volumes where the requirement is to scale storage and redundancy is either not important or is provided by other hardware/software layers.
Replicated – Replicated volumes replicate files across the bricks in the volume. Use replicated volumes in environments where high availability and high reliability are critical.
Striped – Striped volumes stripe data across the bricks in the volume. For best results, use striped volumes only in high-concurrency environments accessing very large files.

1. Setup

Hardware

I will use three servers and one client for my GlusterFS installation/configuration. These can be physical or virtual machines. I will be using my virtual environment for this, with the following hostnames and IP addresses.

host1.example.com 192.168.0.101
host2.example.com 192.168.0.102
host3.example.com 192.168.0.103
client1.example.com 192.168.0.1

Two partitions are required on each server: the first is used for the OS installation and the second for our storage (a brick-preparation sketch follows below).
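
The exact partition layout is up to you. As a minimal sketch, assuming the storage partition on each server is /dev/sdb1 and will be mounted at /data (xfs is used here, but ext3/ext4 work equally well):

yum -y install xfsprogs                               # provides mkfs.xfs on CentOS 6
mkfs.xfs -f /dev/sdb1                                 # format the storage partition
mkdir -p /data                                        # mount point for the brick filesystem
echo "/dev/sdb1 /data xfs defaults 0 0" >> /etc/fstab
mount /data

For brevity, the brick paths used later in this howto (e.g. /dist1, /rep1, /strip1) are plain directories; on a real setup they should live on this dedicated storage partition.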

Software

For the OS I will be using CentOS 6.3, together with GlusterFS 3.3.1. The EPEL repository only provides version 3.2.7, but we will go with the latest version, i.e. 3.3.1, which is available through GlusterFS's own repository.

2. Installation

First we will add the GlusterFS repo to our yum repositories. To do this, execute the following command.

wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
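
You can confirm the repository was added by listing the configured repos (the repo id may differ slightly from what is shown here):

yum repolist | grep -i gluster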

 2.1 Installation on Servers:

On the servers (host1, host2, host3), execute the following command to install the GlusterFS server-side packages.

yum -y install glusterfs glusterfs-fuse glusterfs-server

Start the glusterd service on all servers and enable it to start automatically on boot.

/etc/init.d/glusterd start
chkconfig glusterd on
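
To verify that the daemon is running and enabled at boot, the standard CentOS 6 service tools can be used:

service glusterd status
chkconfig --list glusterd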

 2.2 Installation on Client:

On the client, execute the following command to install the GlusterFS client-side packages.

yum -y install glusterfs glusterfs-fuse

We will use these later to mount the GlusterFS volumes on the client.
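
You can verify the client packages are in place; the glusterfs-fuse package provides the mount.glusterfs helper used further below:

rpm -q glusterfs glusterfs-fuse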

 3. Creating Trusted Storage Pool.

The trusted storage pool consists of the servers that run as gluster servers and provide bricks for volumes. From host1 you need to probe all of the other servers (do not probe host1 itself or localhost).

Note: turn off your firewall using the iptables -F command (see the sketch below).
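
For a lab environment the simplest approach is to flush and disable iptables entirely; on production systems you would open the required GlusterFS ports instead of disabling the firewall:

iptables -F                  # flush all current rules
service iptables stop        # stop the firewall for this session
chkconfig iptables off       # keep it off across reboots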

We will now join all three servers into a trusted storage pool; the probing is done from host1.

gluster peer probe host2
Probe successful

gluster peer probe host3
Probe successful

Confirm your server status.

gluster peer status
Number of Peers: 2

Hostname: host2
Uuid: b65874ab-4d06-4a0d-bd84-055ff6484efd
State: Peer in Cluster (Connected)

Hostname: host3
Uuid: 182e3214-44a2-46b3-ae79-769af40ec160
State: Peer in Cluster (Connected)

4. Creating GlusterFS Server Volumes

Now it's time to create the GlusterFS server volumes. A volume is a logical collection of bricks, where each brick is an export directory on a server in the trusted storage pool.

GlusterFS offers several volume types. I will demonstrate the three defined above, which will give you enough knowledge to create the remaining types by yourself.

4.1 Distributed

Use distributed volumes where you need to scale storage; in a distributed volume, files are spread across the bricks in the volume (placement is based on a hash of the file name).

gluster volume create dist-volume host1:/dist1 host2:/dist2 host3:/dist3
Creation of volume dist-volume has been successful. Please start the volume to access data.

Start the dist-volume

gluster volume start dist-volume
Starting volume dist-volume has been successful

Check the status of the volume.

gluster volume info

Volume Name: dist-volume
Type: Distribute
Volume ID: b842b03f-f8db-47e9-920a-a04c2fb24458
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: host1:/dist1
Brick2: host2:/dist2
Brick3: host3:/dist3

 4.1.1 Accessing the Distributed Volume and Testing.

Now, on client1.example.com, we will access and test the distributed volume. To mount a gluster volume, we will first mount it manually and then add it to /etc/fstab so it is mounted automatically whenever the client reboots.

Use the mount command to access the gluster volume.

mkdir /mnt/distributed
mount.glusterfs host1.example.com:/dist-volume /mnt/distributed/

Check it using the mount command.

mount
/dev/sda on / type ext4 (rw,errors=remount-ro)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
host1.example.com:/dist-volume on /mnt/distributed type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

Now add the following line at the end of the /etc/fstab file so that the volume is mounted again on every reboot.

host1.example.com:/dist-volume /mnt/distributed glusterfs defaults,_netdev 0 0

Save the file.
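
To check the fstab entry without rebooting, you can unmount the volume and let mount -a pick it up again (a quick sanity test, not strictly required):

umount /mnt/distributed
mount -a
mount | grep distributed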

Now, to test, create the following files in the mounted directory.

touch /mnt/distributed/file1
touch /mnt/distributed/file2
touch /mnt/distributed/file3
touch /mnt/distributed/file4
touch /mnt/distributed/file5
touch /mnt/distributed/file6
touch /mnt/distributed/file7
touch /mnt/distributed/file8

Check the bricks on the servers to verify the distributed behavior.

[root@host1 ~]# ls -l /dist1
total 0
-rw-r--r-- 2 root root 0 May 9 10:50 file5
-rw-r--r-- 2 root root 0 May 9 10:51 file6
-rw-r--r-- 2 root root 0 May 9 10:51 file8

[root@host2 ~]# ls -l /dist2
total 0
-rw-r--r-- 2 root root 0 May 9 10:50 file3
-rw-r--r-- 2 root root 0 May 9 10:50 file4
-rw-r--r-- 2 root root 0 May 9 10:51 file7

[root@host3 ~]# ls -l /dist3
total 0
-rw-r--r-- 2 root root 0 May 9 10:50 file1
-rw-r--r-- 2 root root 0 May 9 10:50 file2

All of the files created in the mounted volume have been distributed across the servers.
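
A quick way to see the effect on capacity: the size of the distributed volume reported on the client should be roughly the sum of the three brick filesystems.

df -h /mnt/distributed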

4.2 Replicated

Use replicated volumes where high availability and high reliability are critical; replicated volumes keep identical copies of files across multiple bricks in the volume.

gluster volume create rep-volume replica 3 host1:/rep1 host2:/rep2 host3:/rep3
Creation of volume rep-volume has been successful. Please start the volume to access data.

Here, replica 3 specifies the number of copies to keep across the servers; in this case we want the same copy on all three servers.
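
The replica count has to match the number of bricks supplied (or divide it evenly, which creates a distributed-replicated volume). As a hypothetical illustration, a two-way replica across only host1 and host2 could be created like this (volume name and brick paths are just examples):

gluster volume create rep2-volume replica 2 host1:/rep2a host2:/rep2b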

Start the rep-volume

gluster volume start rep-volume
Starting volume rep-volume has been successful

Check the status of the volume.

gluster volume info rep-volume

Volume Name: rep-volume
Type: Replicate
Volume ID: 0dcf51bc-376a-4bd2-8759-3d47bba49c3d
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host1:/rep1
Brick2: host2:/rep2
Brick3: host3:/rep3

4.2.1 Accessing the Replicated Volume and Testing.

Now access it the same way as the distributed volume, using the mount command. We will first mount the replicated volume manually and then add it to /etc/fstab so it is mounted automatically whenever the client reboots.

Use the mount command to access the gluster volume.

mkdir /mnt/replicated
mount.glusterfs host1.example.com:/rep-volume /mnt/replicated/

Check it using the mount command.

mount
/dev/sda on / type ext4 (rw,errors=remount-ro)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
host1.example.com:/dist-volume on /mnt/distributed type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
host1.example.com:/rep-volume on /mnt/replicated type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

Now add the following line at the end of the /etc/fstab file so that the volume is mounted again on every reboot.

host1.example.com:/rep-volume /mnt/replicated glusterfs defaults,_netdev 0 0

Now, to test, create the following files in the mounted directory.

touch /mnt/replicated/file1
touch /mnt/replicated/file2
touch /mnt/replicated/file3
touch /mnt/replicated/file4
touch /mnt/replicated/file5
touch /mnt/replicated/file6
touch /mnt/replicated/file7
touch /mnt/replicated/file8

Check the bricks on the servers to verify the replication.

[root@host1 ~]# ls -l /rep1
total 0
-rw-r--r-- 2 root root 0 May 9 11:16 file1
-rw-r--r-- 2 root root 0 May 9 11:16 file2
-rw-r--r-- 2 root root 0 May 9 11:16 file3
-rw-r--r-- 2 root root 0 May 9 11:16 file4
-rw-r--r-- 2 root root 0 May 9 11:16 file5
-rw-r--r-- 2 root root 0 May 9 11:16 file6
-rw-r--r-- 2 root root 0 May 9 11:16 file7
-rw-r--r-- 2 root root 0 May 9 11:16 file8

[root@host2 ~]# ls -l /rep2
total 0
-rw-r--r-- 2 root root 0 May 9 11:16 file1
-rw-r--r-- 2 root root 0 May 9 11:16 file2
-rw-r--r-- 2 root root 0 May 9 11:16 file3
-rw-r--r-- 2 root root 0 May 9 11:16 file4
-rw-r--r-- 2 root root 0 May 9 11:16 file5
-rw-r--r-- 2 root root 0 May 9 11:16 file6
-rw-r--r-- 2 root root 0 May 9 11:16 file7
-rw-r--r-- 2 root root 0 May 9 11:16 file8

[root@host3 ~]# ls -l /rep3
total 0
-rw-r--r-- 2 root root 0 May 9 11:16 file1
-rw-r--r-- 2 root root 0 May 9 11:16 file2
-rw-r--r-- 2 root root 0 May 9 11:16 file3
-rw-r--r-- 2 root root 0 May 9 11:16 file4
-rw-r--r-- 2 root root 0 May 9 11:16 file5
-rw-r--r-- 2 root root 0 May 9 11:16 file6
-rw-r--r-- 2 root root 0 May 9 11:16 file7
-rw-r--r-- 2 root root 0 May 9 11:16 file8

All of the files created in the mounted volume have been replicated to all the servers.

4.3 Striped

Use striped volumes only in high-concurrency environments accessing very large files; striped volumes split the data of each file across the bricks in the volume.

gluster volume create strip-volume stripe 3 host1:/strip1 host2:/strip2 host3:/strip3
Creation of volume strip-volume has been successful. Please start the volume to access data.

Start the strip-volume

gluster volume start strip-volume
Starting volume strip-volume has been successful

Check the status of the volume.

gluster volume info strip-volume

Volume Name: strip-volume
Type: Stripe
Volume ID: 2b21ad4b-d464-408b-82e8-df762ef89bcf
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host1:/strip1
Brick2: host2:/strip2
Brick3: host3:/strip3

4.3.1 Accessing the Striped Volume and Testing.

Now, just as with the distributed and replicated volumes, access the striped volume using the mount command. We will first mount it manually and then add it to /etc/fstab so it is mounted automatically whenever the client reboots.

Use the mount command to access the gluster volume.

mkdir /mnt/stripped
mount.glusterfs host1.example.com:/strip-volume /mnt/stripped/

Check it using the mount command.

mount
/dev/sda on / type ext4 (rw,errors=remount-ro)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
host1.example.com:/dist-volume on /mnt/distributed type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
host1.example.com:/rep-volume on /mnt/replicated type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
host1.example.com:/strip-volume on /mnt/stripped type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)

Now add the following line at the end of the /etc/fstab file so that the volume is mounted again on every reboot.

host1.example.com:/strip-volume /mnt/stripped glusterfs defaults,_netdev 0 0

Save the file.

Now, to test, create the following large file in the mounted directory on client1.

dd if=/dev/zero of=/mnt/stripped/file.img bs=1024k count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 61.5011 s, 17.0 MB/s
ls -l /mnt/stripped/
total 1024120
-rw-r--r-- 1 root root 1048576000 May 9 11:31 file.img

Check the bricks on the servers to verify the striping.

[root@host1 ~]# ls -l /strip1/
total 341416
-rw-r--r-- 2 root root 1048444928 May 9 11:31 file.img

[root@host2 ~]# ls -l /strip2/
total 341416
-rw-r--r-- 2 root root 1048576000 May 9 11:31 file.img

[root@host3 ~]# ls -l /strip3/
total 341288
-rw-r--r-- 2 root root 1048313856 May 9 11:31 file.img

The large file has been striped across the volume successfully.

5. Managing Gluster Volumes.

Now we will look at some of the common operations and maintenance tasks you might perform on gluster volumes.

5.1 Expanding Volumes.

As needed, we can add bricks to volumes that are already online. As an example, we will add a new brick to our distributed volume. To do this, complete the following steps:

First probe a new server, which will provide the new brick for our volume. This has to be done from host1.

gluster peer probe host4
Probe successful

Now add the new brick from the newly probed host4.

gluster volume add-brick dist-volume host4:/dist4
Add Brick successful

Check the volume information using the following command.

gluster volume info

Volume Name: dist-volume
Type: Distribute
Volume ID: b842b03f-f8db-47e9-920a-a04c2fb24458
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: host1:/dist1
Brick2: host2:/dist2
Brick3: host3:/dist3
Brick4: host4:/dist4

5.2 Shrinking Volume

You can also shrink volumes while GlusterFS is online and available. If, due to a hardware failure or an unreachable network, one of the bricks in a volume becomes unavailable and you need to remove it, first start the removal process:

gluster volume remove-brick dist-volume host2:/dist2 start
Remove Brick start successful

Check the status; it should show completed for the brick being removed.

gluster volume remove-brick dist-volume host2:/dist2 status

Node      Rebalanced-files size        scanned     failures    status
--------- -----------      ----------- ----------- ----------- ------------
localhost 0                0Bytes      0           0           not started
host3     0                0Bytes      0           0           not started
host2     0                0Bytes      4           0           completed

Commit the remove-brick operation.

gluster volume remove-brick dist-volume host2:/dist2 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick commit successful

Check the volume information for confirmation.

gluster volume info dist-volume

Volume Name: dist-volume
Type: Distribute
Volume ID: b842b03f-f8db-47e9-920a-a04c2fb24458
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: host1:/dist1
Brick2: host3:/dist3

5.3 Rebalancing Volume

Rebalancing needs to be done after expanding or shrinking a volume; this redistributes the data among the remaining servers. To do this, issue the following command (its progress can be checked as shown below):

gluster volume rebalance dist-volume start
Starting rebalance on volume dist-volume has been successful
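
Rebalancing runs in the background; you can watch its progress with the status sub-command:

gluster volume rebalance dist-volume status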

5.4 Stopping the Volume.

To stop a volume

gluster volume stop dist-volume
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
Stopping volume dist-volume has been successful

To delete a volume

gluster volume delete dist-volume
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
Deleting volume dist-volume has been successful

Remember to unmount the mounted directories on your clients first (see the example below).
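
On the client, that means unmounting the volume and dropping its /etc/fstab entry, for example:

umount /mnt/distributed
sed -i '/dist-volume/d' /etc/fstab    # remove the matching fstab line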

By Sohail Riaz

I am the first Red Hat Certified Architect (RHCA, ID # 110-082-666) from Pakistan, with over 14 years of industry experience in several disciplines including Linux/UNIX system administration, virtualization, networking, storage, load balancers, HA clusters and high performance computing.

14 thoughts on “GlusterFS HowTo on CentOS 6.x”
  1. Nice article. I am testing GlusterFS for possible use in my datacenter as I need to implement shared storeage for virtualized servers. my 2 options are NFS with GlusterFS or an iSCSI scan to support my VMWare servers that currently use local storage.

    Cheers!

  2. Very Nice tutorial Sohail Riaz Appreciate that, there is one basic question i want to ask, in replication, you mount the volume on client as “mount.glusterfs host1.sohailriaz.com:/rep-volume /mnt/replicated/” and also create volume on host1, what would happen if host1 fails physically, does it mean your copy of data will remain on host2. but client got disconnected and you have to mount host2 in client again to access copy of data. ???? it mean down time right but minor???

    prompt reply will be highly appreciated. Thanks

    Regards,
    Adeel Ahmad

  3. I got a few bare metal servers for my experiment and i was trying to install glusterfs packages on them. But while installing i run into a dependency issue.
    :Error: Package: glusterfs-3.5git-1.el6.x86_64 (exogeni)
    Requires: libcrypto.so.10(libcrypto.so.10)(64bit)
    Error: Package: glusterfs-3.5git-1.el6.x86_64 (exogeni)
    Requires: libssl.so.10(libssl.so.10)(64bit)
    Error: Package: glusterfs-libs-3.5git-1.el6.x86_64 (exogeni)
    Requires: libcrypto.so.10(libcrypto.so.10)(64bit)
    I tried finding the latest libcrypto libs but there is no libcrypto.so.10 version. Can some help me.

  4. Hi Sohail,

    Do we need a shared disk for setting up glusterfs between 2 nodes or need to provide separate disk to each node and proceed with the above setup?

  5. Hi,
    I was trying to install glusterfs in centos minimal version on a virtual machine. After running the command:
    yum install glusterfs-server
    there is an error shown saying that “no package glusterfs-server is available”.
    What should i do now?
    thanks

  6. Hi Sohail,

    I have existing 2 server and 1 client. what I’m planning to make is a fail-over/load balancing in client and I need to provide 1 more client server to make this happen. now my question is this recommended or a best practice, is there any problem with this setup especially when 2 users with the same account do read/write ?

    Thanks

  7. @ALDEN: Using glusterfs to use failover you can use replicated volumes as they contain same information across all bricks.
    For loadbalancing you can create different volume using different bricks and read/write same data using different volume. As volume will be present as shared filesystem to client you can instruct your application to do load balancing on different mounted volumes.

  8. Hello Sohail,

    In my client server I have installed vsftpd or ftp for some users to easily upload the data in my server rep-volume using filezilla. but whenever I upload any file it always respond error (553 Could not create file.), but I can download the files. what do you think is the problem? is it better if I setup my server1&2 as a server client no more dedicated server for client?

    Thanks.

  9. Hi Alden,

    Could not create file only lead to permission issue’s. You need to check permission from top to bottom level till your brick. You might also check ACL for it too.
    Regarding your setup it depends on your’s client need. You need to serve well, so if server client serve you better, I am with you.

    Regards,

    Sohail Riaz

  10. HI Sohail,
    Please add one more this in this tutorial. How to migrate Volume and its data into new Hard Drive.
