
    Installing an ElasticSearch 2 Cluster on CentOS 7

    IT Discussion
    elasticsearch, elasticsearch 2, graylog, graylog2, elk, logging, nosql, clustering, how to, scale, scale hc3
    scottalanmiller

      Whether you are building an ELK logging server, a GrayLog2 logging server, or want to use the powerful ElasticSearch NoSQL database platform for some other task, we will generally want to build a high performance, highly reliable ElasticSearch cluster before beginning any of those projects. In fact, a single cluster can easily support many different projects at the same time, so you can run ELK and GrayLog2 side by side, for example, on a single database cluster.

      First we can start with a clean, vanilla CentOS 7 build from my stock 1511 template. (That is pure vanilla 1511 with firewall installed.)

      [Screenshot from 2016-08-18 18-19-32.png]

      We are going to need to add some additional storage. I have 16GB by default in my template for the OS, which is plenty. I'm going to add a 200GB secondary VirtIO drive here which I will attach and use for log storage. 200GB is quite large; remember this is just one cluster member, so perhaps 30GB is better for a more typical lab (whatever size you pick will be tripled, at a minimum, across the cluster).

      [Screenshot from 2016-08-18 19-04-05.png]

      On my Scale HC3 tiered storage cluster (HC 2150) I down-tune the tiering priority of my OS drive to zero. I don't want any SSD tiering going on with the OS; that's just wasteful.

      [Screenshot from 2016-08-18 19-04-34.png]

      But as logs can sometimes require a lot of performance, I'm going to tweak the new data drive up just a little to give it some priority over other, random VM workloads: only from a four (the default) to a five, enough for a small advantage.

      [Screenshot from 2016-08-18 19-04-54.png]

      And finally I tweak my vCPU from one (generic in my template) to two and give the VM 8GB of RAM. Logging can be pretty intensive.

      [Screenshot from 2016-08-18 19-05-31.png]

      Now we can log into the VM and get started:

      yum -y update                                  # patch the template fully
      echo "prd-lnx-elastic2" > /etc/hostname        # set this node's hostname
      yum -y install java                            # OpenJDK 1.8 from the CentOS repos
      reboot
      

      Many people choose to use the official Oracle JRE, but the included, maintained OpenJDK Java 1.8 works fine and is what we are going to use here. This is advantageous as it is kept up to date automatically through the OS repos.
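
      If you want to double check which Java the node ended up with after the reboot, a quick version check should confirm that it is the OpenJDK 1.8 build from the repos:

      java -version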

      Now we can get down to actually installing ElasticSearch 2:

      rpm --import http://packages.elastic.co/GPG-KEY-elasticsearch
      echo '[elasticsearch-2.x]
      name=Elasticsearch repository for 2.x packages
      baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
      gpgcheck=1
      gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
      enabled=1' | sudo tee /etc/yum.repos.d/elasticsearch.repo
      yum -y install elasticsearch
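
      Before touching the configuration it is worth confirming what actually landed on the box; a quick package query should show an ElasticSearch 2.x build:

      rpm -q elasticsearch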
      

      We should now have an installed ElasticSearch server. Now for the fun part: we need to edit our ES configuration file to set it up for our purposes.

      vi /etc/elasticsearch/elasticsearch.yml
      

      And here are the resulting active settings in my file:

      grep -v ^# /etc/elasticsearch/elasticsearch.yml 
      
      cluster.name: ntg-graylog2
      node.name: ${HOSTNAME}
      network.host: [_site_, _local_]
      discovery.zen.ping.unicast.hosts: ["prd-lnx-elastic1", "prd-lnx-elastic2", "prd-lnx-elastic3"]
      

      That's right, just four lines need to be modified for most use cases. You can name your cluster whatever makes sense for you. My host names are the ones on my network, so be sure to modify these if you do not use the same names that I do. The ${HOSTNAME} variable allows each node to name itself at run time, which means every node in the cluster can share a uniformly defined configuration file.
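
      If you would rather script those changes than open vi on every build, appending the same settings is one option; this works because the stock 2.x file ships with these keys commented out (verify that on your copy first), and the single quotes keep ${HOSTNAME} literal so ElasticSearch can expand it at start up:

      echo 'cluster.name: ntg-graylog2' >> /etc/elasticsearch/elasticsearch.yml
      echo 'node.name: ${HOSTNAME}' >> /etc/elasticsearch/elasticsearch.yml
      echo 'network.host: [_site_, _local_]' >> /etc/elasticsearch/elasticsearch.yml
      echo 'discovery.zen.ping.unicast.hosts: ["prd-lnx-elastic1", "prd-lnx-elastic2", "prd-lnx-elastic3"]' >> /etc/elasticsearch/elasticsearch.yml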

      Now we need to set up the storage so that the database leverages our second block device for its data. You will need to adjust these commands for your block device; if you are on a Scale cluster or using KVM with PV (VirtIO) drivers, you will likely have the same device name that I do.
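
      If you are not sure what name the new drive picked up, lsblk will list the block devices before you start carving anything up; on this build the new disk shows up as vdb, which is what the commands below assume:

      lsblk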

      pvcreate /dev/vdb
      vgcreate vg1 /dev/vdb
      lvcreate -l 100%FREE -n lv_data vg1
      mkfs.xfs /dev/vg1/lv_data
      mkdir /data
      echo "/dev/vg1/lv_data	/data					xfs	defaults	0 0" >> /etc/fstab
      mount /data
      rmdir /var/lib/elasticsearch/
      mkdir /data/elasticsearch
      ln -s /data/elasticsearch/ /var/lib/elasticsearch
      chown -R elasticsearch:elasticsearch /data/elasticsearch/
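
      A quick sanity check here does not hurt; the new filesystem should be mounted and the symlink should point into it:

      df -h /data
      ls -ld /var/lib/elasticsearch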
      

      Next we just need to add our three cluster nodes to /etc/hosts so that they can be discovered by name. We could have skipped this step and done things by IP address, but it is so much nicer with hostnames. Make sure to put in the right entries for your hosts; don't just copy mine.

      echo "192.168.1.51     prd-lnx-elastic1" >> /etc/hosts
      echo "192.168.1.52     prd-lnx-elastic2" >> /etc/hosts
      echo "192.168.1.53     prd-lnx-elastic3" >> /etc/hosts
      
      systemctl disable firewalld
      systemctl enable elasticsearch
      shutdown -h now
      

      Now that we have made our first node and made it essentially stateless, as much as possible, we get to use cloning to make our other nodes! So easy. The systemctl enable elasticsearch line sets the ElasticSearch server to start automatically whenever the system comes back online.

      Firewall: This is going to take some additional research. Traditionally ElasticSearch is run without a firewall, which is, of course, silly. Determining a best practice firewall setup is the next step and will be revisited.
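
      As a rough starting point only (not the vetted best practice that the above research still needs to produce), if you would rather keep firewalld running than disable it as we did above, opening the two ElasticSearch ports would look something like this; 9200 is the REST API and 9300 is node-to-node transport, and ideally you would also restrict the sources to your cluster and client addresses:

      firewall-cmd --permanent --add-port=9200/tcp
      firewall-cmd --permanent --add-port=9300/tcp
      firewall-cmd --reload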

      Cloning:

      [Screenshot from 2016-08-19 13-27-38.png]

      [Screenshot from 2016-08-19 13-28-05.png]

      [Screenshot from 2016-08-19 13-28-39.png]

      Once cloned, just change the hostname on the two new hosts and set their static IPs to be different from the parent's and to match what you put in the /etc/hosts file.
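
      A minimal sketch of that rename on one of the clones, assuming the NetworkManager connection is named eth0 (check yours with nmcli con show) and using the third node's address from /etc/hosts:

      hostnamectl set-hostname prd-lnx-elastic3
      nmcli con mod eth0 ipv4.method manual ipv4.addresses 192.168.1.53/24
      nmcli con up eth0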

      Now we can verify, on any one of the three hosts, that things are running correctly:

      curl -XGET 'http://localhost:9200/_cluster/state?pretty'
      
      
      {
        "cluster_name" : "ntg-graylog2",
        "version" : 10,
        "state_uuid" : "lvkvXYuyTFun-RYWto6RgQ",
        "master_node" : "Z3r9bHOgRrGzxzm9J6zJfA",
        "blocks" : { },
        "nodes" : {
          "xpVv8zfYRjiZss8mmW8esw" : {
            "name" : "prd-lnx-elastic2",
            "transport_address" : "192.168.1.52:9300",
            "attributes" : { }
          },
          "d8viH-OpQ-unG6l3HOJcyg" : {
            "name" : "prd-lnx-elastic3",
            "transport_address" : "192.168.1.53:9300",
            "attributes" : { }
          },
          "Z3r9bHOgRrGzxzm9J6zJfA" : {
            "name" : "prd-lnx-elastic1",
            "transport_address" : "192.168.1.51:9300",
            "attributes" : { }
          }
        },
        "metadata" : {
          "cluster_uuid" : "Li_OEbRwQ9OA0VLBYXb-ow",
          "templates" : { },
          "indices" : { }
        },
        "routing_table" : {
          "indices" : { }
        },
        "routing_nodes" : {
          "unassigned" : [ ],
          "nodes" : {
            "Z3r9bHOgRrGzxzm9J6zJfA" : [ ],
            "d8viH-OpQ-unG6l3HOJcyg" : [ ],
            "xpVv8zfYRjiZss8mmW8esw" : [ ]
          }
        }
      }
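
      For a quicker pass/fail style check, the cluster health endpoint works as well; with all three nodes joined it should report a green status and a node count of three:

      curl -XGET 'http://localhost:9200/_cluster/health?pretty'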
      