Installing an ElasticSearch 2 Cluster on CentOS 7
-
Whether you are building an ELK logging server, a GrayLog2 logging server or want to use the powerful ElasticSearch NoSQL database platform for some other task, we will generally want to build a high performance, highly reliable ElasticSearch cluster before beginning any of those projects. And, in fact, a single cluster can easily support many different projects at the same time, so you could run ELK and GrayLog2 side by side, for example, on a single database cluster.
First we can start with a clean, vanilla CentOS 7 build from my stock 1511 template. (That is pure vanilla 1511 with firewall installed.)
We are going to need to add some additional storage. I have 16GB by default in my template for the OS; that's great. I'm going to add a 200GB secondary VirtIO drive here which I will attach and use for log storage. 200GB is quite large; remember that this is just one cluster member, so perhaps 30GB is better for a more normal lab (whatever size you choose will be tripled, at a minimum, across the cluster.)
On my Scale HC3 tiered storage cluster (HC 2150) I downtune the flash priority of my OS drive to zero. I don't want any SSD tiering going on with my OS. That's just wasteful.
But as logs can sometimes require a lot of performance, I'm going to tweak the data drive up just a little bit to give it some priority over other, random VM workloads. Only from a four (the default) to a five, to give it a small advantage.
And finally I tweak my vCPU from one (generic in my template) to two and give the VM 8GB of RAM. Logging can be pretty intensive.
Now we can log into the VM and get started:
yum -y update
echo "prd-lnx-elastic2" > /etc/hostname
yum -y install java
reboot
Many people choose to use the official Oracle JRE, but the included, maintained OpenJDK Java 1.8 should work fine and we are going to use it here. This is advantageous as it is now maintained by the OS repos automatically.
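If you want to double check which Java ended up on the system before going further, a quick version query will show the OpenJDK build that the repo delivered:

java -version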
Now we can get down to actually installing ElasticSearch 2:
rpm --import http://packages.elastic.co/GPG-KEY-elasticsearch

echo '[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1' | sudo tee /etc/yum.repos.d/elasticsearch.repo

yum -y install elasticsearch
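Before moving on, it does not hurt to confirm that the package actually landed and see which 2.x release the repo delivered; a simple query of the RPM database does the trick:

rpm -q elasticsearch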
We should now have an installed ElasticSearch server. Now for the fun part: we need to edit our ES configuration file to set it up for our purposes.
vi /etc/elasticsearch/elasticsearch.yml
And here is the resultant output of my file:
grep -v ^# /etc/elasticsearch/elasticsearch.yml
cluster.name: ntg-graylog2
node.name: ${HOSTNAME}
network.host: [_site_, _local_]
discovery.zen.ping.unicast.hosts: ["prd-lnx-elastic1", "prd-lnx-elastic2", "prd-lnx-elastic3"]
That's right, just four lines need to be modified for most use cases. You can name your cluster whatever makes sense for you. And my host names are those on my network; be sure to modify these if you do not use the same names that I do. The ${HOSTNAME} variable option allows the node to name itself at run time, which means a single, uniform configuration file can be used on every node.
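If you would rather script this than edit the file by hand, the stock elasticsearch.yml ships with everything commented out, so simply appending the settings works as a rough sketch; swap in your own cluster name and host list, and note that the quoted heredoc keeps ${HOSTNAME} literal so that ElasticSearch, not the shell, expands it:

cat >> /etc/elasticsearch/elasticsearch.yml << 'EOF'
cluster.name: ntg-graylog2
node.name: ${HOSTNAME}
network.host: [_site_, _local_]
discovery.zen.ping.unicast.hosts: ["prd-lnx-elastic1", "prd-lnx-elastic2", "prd-lnx-elastic3"]
EOF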
Next we need to set up the storage so that the database leverages our second block device for its data. You will need to adjust these commands for your block device. If you are on a Scale cluster or using KVM with PV drivers, you will likely have the same device name that I do.
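If you are not sure what the secondary drive is called inside the guest, lsblk will list the block devices; on KVM with VirtIO drivers the second disk typically shows up as /dev/vdb, which is what I assume in the commands below:

lsblk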
pvcreate /dev/vdb
vgcreate vg1 /dev/vdb
lvcreate -l 100%FREE -n lv_data vg1
mkfs.xfs /dev/vg1/lv_data
mkdir /data
echo "/dev/vg1/lv_data /data xfs defaults 0 0" >> /etc/fstab
mount /data
rmdir /var/lib/elasticsearch/
mkdir /data/elasticsearch
ln -s /data/elasticsearch/ /var/lib/elasticsearch
chown -R elasticsearch:elasticsearch /data/elasticsearch/
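A quick sanity check, optional but cheap, confirms that the new filesystem is mounted and that the symlink points where we expect before ElasticSearch ever starts writing data:

df -h /data
ls -ld /var/lib/elasticsearch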
Next we just need to add our three cluster nodes to /etc/hosts so that they can be discovered by name. We could have skipped this step and done things by IP address, but it is so much nicer with hostnames. Make sure to put in the right entries for your hosts; don't just copy mine.
echo "192.168.1.51 prd-lnx-elastic1" >> /etc/hosts echo "192.168.1.52 prd-lnx-elastic2" >> /etc/hosts echo "192.168.1.53 prd-lnx-elastic3" >> /etc/hosts systemctl disable firewalld systemctl enable elasticsearch shutdown -h now
Now that we have built our first node and made it essentially stateless, as much as possible, we get to use cloning to make our other nodes! So easy. The systemctl enable line sets the ElasticSearch service to start automatically when the system comes back online, and the final shutdown leaves the VM powered off and ready to be cloned.
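On any node, including the clones once they are built, a quick check that all three names resolve as intended can save some discovery headaches; getent reads /etc/hosts the same way the resolver will:

getent hosts prd-lnx-elastic1 prd-lnx-elastic2 prd-lnx-elastic3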
Firewall: This is going to take some additional research. Traditionally ElasticSearch is run without a firewall. This is, of course, silly. Determining a best practices firewall setup is the next step and will be revisited.
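As a starting point for that research, and strictly as a sketch rather than a vetted best practice, something along these lines would open the HTTP API (9200) and inter-node transport (9300) ports in firewalld on the default zone; which zone to use and whether to restrict sources to the other cluster members are decisions you would want to make for your own network:

firewall-cmd --permanent --add-port=9200/tcp
firewall-cmd --permanent --add-port=9300/tcp
firewall-cmd --reload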
Cloning:
Once cloned, just change the hostname of the two new hosts and set their static IPs to be different from the parent and to match what you put in the /etc/hosts file.
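As a minimal sketch of that per-clone fix-up, assuming the second node and a NIC whose config file is ifcfg-eth0 (the interface name will vary on your platform), it would look something like this:

hostnamectl set-hostname prd-lnx-elastic2
vi /etc/sysconfig/network-scripts/ifcfg-eth0    # set IPADDR to 192.168.1.52
reboot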
Now we can verify, on any one of the three hosts, that things are running correctly:
curl -XGET 'http://localhost:9200/_cluster/state?pretty'
{
  "cluster_name" : "ntg-graylog2",
  "version" : 10,
  "state_uuid" : "lvkvXYuyTFun-RYWto6RgQ",
  "master_node" : "Z3r9bHOgRrGzxzm9J6zJfA",
  "blocks" : { },
  "nodes" : {
    "xpVv8zfYRjiZss8mmW8esw" : {
      "name" : "prd-lnx-elastic2",
      "transport_address" : "192.168.1.52:9300",
      "attributes" : { }
    },
    "d8viH-OpQ-unG6l3HOJcyg" : {
      "name" : "prd-lnx-elastic3",
      "transport_address" : "192.168.1.53:9300",
      "attributes" : { }
    },
    "Z3r9bHOgRrGzxzm9J6zJfA" : {
      "name" : "prd-lnx-elastic1",
      "transport_address" : "192.168.1.51:9300",
      "attributes" : { }
    }
  },
  "metadata" : {
    "cluster_uuid" : "Li_OEbRwQ9OA0VLBYXb-ow",
    "templates" : { },
    "indices" : { }
  },
  "routing_table" : {
    "indices" : { }
  },
  "routing_nodes" : {
    "unassigned" : [ ],
    "nodes" : {
      "Z3r9bHOgRrGzxzm9J6zJfA" : [ ],
      "d8viH-OpQ-unG6l3HOJcyg" : [ ],
      "xpVv8zfYRjiZss8mmW8esw" : [ ]
    }
  }
}
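For a quicker pass/fail style check than the full cluster state, the health endpoint is also handy; with all three nodes joined and no indices created yet you should see a green status and a node count of three:

curl -XGET 'http://localhost:9200/_cluster/health?pretty'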