                               GSTATUS
                               =======

Overview
========
A gluster trusted pool (aka cluster), consists of several key components; 
nodes, volumes and bricks. In glusterfs 3.4/3.5, there isn't a single command
that can provide an overview of the cluster's health. This means that admins
currently assess the cluster health by looking at several commands to piece
together a picture of the cluster's state.

This isn't ideal - so 'gstatus' is an attempt to provide an easy to use,
highlevel view of a cluster's health through a single command. The tool gathers
information by issuing gluster commands, to build an object model covering
Nodes, Volumes and Bricks. With this data in place, checks are performed across
this meta data and errors reported back to the user. In later releases, this
data could be analysed in different ways to add further checks, incorporating
deployment best practices, freespace triggers etc.

Pre-Requisites
==============
+ glusterfs 3.4 or above
  The tool issues commands to gluster using the --xml parameter. Scraping the
  output is another possibility, but by using the xml option, the program 
  should be more stable and more easily extendable
  
+ volume types
  the initial release of gstatus supports distributed and 
  distributed-replicated volumes ONLY


Dependencies
============
- python 2.6 or above
- gluster CLI
- gluster version 3.4 or above


Installation
============
Installing the tool can be done in two ways, through python-seuptools or simply
by running the main script (gstatus.py) directly.

* Using python-setuptools
1. To install using this method extract the archive file
   - tar xvzf gstatus.tar.gz
2. cd to the gstatus directory
3. run the setup.py program
   - python setup.py install
4. invoke the script
   gstatus <options>

This will install a hook for gstatus in /usr/bin so running the script doesn't
need the .py extension.

* running from the archive
1. Extract the archive file
   - tar xvzf gstatus.tar.gz
2. cd to the gstatus directory
3. invoke the script using
   ./gstatus.py <options>



Volume States
=============
A gluster volume is made from individual filesystems (gluster bricks) across 
multiple nodes. From an end/user of application perspective, this complexity
is abstracted but the status of individual bricks does have a bearing on the
data availability of the volume. For example, even without replication the
loss of a single brick in the volume will not cause the volume itself to be
unavilable, but instead manifest as inaccessible files in the filesystem.
 
To help understand the impact a brick or node outage can have on data access,
gstatus introduces some additional volume status descriptions.

UP
Volume is started and available, all bricks are up

UP(Degraded)
This state is specific to replicated volumes, where at least one brick is down
within a replica set. Data is still 100% available due to the alternate
replica(s),  but the resilience of the volume to further failures within the
same replica set flags this volume as 'degraded'. 

UP(Partial)
Effectively this means that allthough some bricks in the volume are online, 
there are others that are down to a point where areas of the filesystem will be
missing. For a distrubuted volume, this state is seen if ANY brick is down, 
whereas for a replicated volume a complete replica set needs to be down before
the volume state transitions to PARTIAL. 

DOWN
Bricks are down, or the volume has not yet been started 


Running the tool
================

Running gstatus with a -h flag shows the following options;

[root@glfs35-1 gstatus]# gstatus -h
Usage: gstatus [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -s, --state           show highlevel health of the cluster
  -v, --volume          volume info (default is ALL, or supply a volume name)
  -n, --no-selfheal     turn of self heal backlog checks (faster)
  -a, --all             show all cluster information (-s with -v)
  -u UNITS, --units=UNITS
                        display capacity units in DECimal or BINary format (GB
                        vs GiB)
  -l, --layout          show brick layout when used with -v, or -a
  -o OUTPUT_MODE, --output-mode=OUTPUT_MODE
                        produce output in different formats - json, keyvalue
                        or console(default)
  -D, --debug           turn on debug mode



* Understanding the output

- Simply invoking gstatus without any options, will give a 3 line summary of
  the cluster. This will tell you overall health, capacity and gluster version.
  If there are issues detected in the cluster, the UNHEALTHY state will be 
  suffixed by a number in brackets - the number denotes the number of errors
  detected within the cluster.

- Using -a will show high level cluster elements, as well as volume information
  You'll notice  that there are a number of what appear to be fractions listed
  against items like Nodes, Bricks etc.

  These fractions actually show you the current 'up' state against the 
  potential up state. So a 'Node' field that shows 2/4, actually means the
  cluster has 4 nodes, but only 2 are in an up state.

  e.g.
[root@rhs1-1 gstatus-tests]# gstatus
   
      Status: HEALTHY           Capacity: 32.00 GiB(raw bricks)
   Glusterfs: 3.4.0.59rhs                 10.00 GiB(raw used)
  OverCommit: Yes                         48.00 GiB(usable from volumes)

[root@rhs1-1 gstatus-tests]# gstatus -a
 
      Status: HEALTHY           Capacity: 32.00 GiB(raw bricks)
   Glusterfs: 3.4.0.59rhs                 10.00 GiB(raw used)
  OverCommit: Yes                         48.00 GiB(usable from volumes)

   Nodes    :  5/ 5		Volumes:  2 Up
   Self Heal:  5/ 4		          0 Up(Degraded)
   Bricks   :  8/ 8		          0 Up(Partial)
   Clients  :     5                       0 Down

<snip>

- Capacity information is derived from the brick information taken from a 
  'vol status detail' command. The accuracy of this number therefore depends
  on the nodes/bricks being all up - elements missing from the configuration
  will simply be missing from the calculation.

- A 'flag' called 'OverCommit' is provided that indicates whether a brick is
  used by multiple volumes. Although technically valid, this exposes the 
  user to capacity conflicts across different volumes when quota's are not used.

- running the tool with the -l flag (with -a, or -v) will show a line diagram
  depicting the brick layout within the volume. This feature was is inspired 
  by work of Fred van Zwieten and Niels De Vos in the 'lsgvt' project.

TIP: Normally for a volume listing, gstatus will issue a "vol heal X info"to show the 
user any heal backlog. This takes more time, so using -n (e.g. gstatus -an) 
can be used to skip this check, making the command faster to run.


The examples directory included in the archive contains a number of scenarios
that show the behaviour of the script under different error conditions.

 
Tested on
=========
- 4 node vm cluster, running Red Hat Storage 2.1 (RHEL 6.4, glusterfs 3.4)
- 4 node vm cluster, based on RHEL 6.5 with upstream glusterfs 3.5 beta 3


Known Issues
============
1) After a node is taken down, gluster uses a node timeout to confirm that the
   node has actually gone. This causes the tool to appear to hang as this 
   timeout is honoured in gluster, when the tool is invoked within 30secs of a
   node shutdown.
   
2) Normally every node will be capacity nodes. However, when nodes are added to 
   the cluster for quorum purposes, the Self Heal field shows more nodes with Self 
   Heal active than needed. This is due to the arbitration nodes also running the 
   self head daemon.

 
