Hi,
New Data Nodes can be added easily to existing, live clusters. Remember that they are subject to licensing under the standard licensing model, so before you do anything, check whether your license allows additional nodes.
The procedure is really simple. The two most important things are that the new data node has to be empty of any data, so it can receive data from the cluster it is joining, and that it has to be able to connect to all master nodes of the cluster.
- Prepare a new machine for the additional Data Node.
- Install the Data Node. You can do it directly from the rpm package that comes with ELS. The packages are usually located in /root/install/, so just go with rpm -i <data_node_package>.
- If the service started automatically, stop it with systemctl stop elasticsearch. It shouldn't have any data, but to be sure, verify that /var/lib/elasticsearch is empty. That's the default data location for new instances.
- We'll operate on the /etc/elasticsearch/elasticsearch.yml file. It is the main configuration file, and everything we have to change is there. Once we're done editing, we can start the service. What we need to edit is described below.
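If you want to script the "is the data directory really empty" check from the steps above, here is a minimal sketch. The function name data_dir_is_empty is made up for this example, and it assumes you run it as root so the directory is readable.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the directory does not exist
# or contains no entries at all (hidden files included), fails otherwise.
# Assumes the caller can read the directory (e.g. runs as root).
data_dir_is_empty() {
    dir="$1"
    # A missing directory holds no data, so treat it as empty.
    [ -d "$dir" ] || return 0
    # ls -A lists every entry except "." and ".."; empty output means empty dir.
    [ -z "$(ls -A "$dir" 2>/dev/null)" ]
}

# Example use on the new node:
#   systemctl stop elasticsearch
#   data_dir_is_empty /var/lib/elasticsearch \
#       && echo "empty, safe to join" \
#       || echo "not empty, clean it up first"
```

If the check fails, look at what is in there before removing anything; the node should not be started with leftover data from another cluster.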
In the configuration file, /etc/elasticsearch/elasticsearch.yml, we are mostly interested in the Cluster section, where we define where the new node will connect. Keep in mind that connectivity to the other nodes in the cluster is essential.
Additionally, to make sure the data structure is the same across all nodes, you can change path.data to your desired location. For ELS deployments it's usually the /data mount point.
For reference on what the configuration looks like, it's a good idea to look at the config file of one of your working nodes. All we'll change are the node name and IP addresses. Let's do it.
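To make that comparison with a working node easier, a small sketch like the one below prints only the active settings of a config file, skipping comments and blank lines. The function name active_settings is just an illustration, not something shipped with ELS.

```shell
#!/bin/sh
# Hypothetical helper: print only the active (non-comment, non-blank)
# lines of the configuration file given as $1.
active_settings() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example use: dump the settings on a working node and on the new node,
# then compare the two outputs side by side.
#   active_settings /etc/elasticsearch/elasticsearch.yml
```

Apart from node.name and any node-specific addresses, the two outputs should normally look the same.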
## Cluster
cluster.name: logserver #1
node.name: my-node-name-123 #2
node.master: false #3
node.data: true #4
#5
discovery.seed_hosts: ["10.4.5.100:9300", "10.4.5.20:9300", "10.4.5.30:9300"]
cluster.initial_master_nodes: ["node-1", "node-2", "node-3" ]
*1: Cluster name has to be the same across all nodes of the same cluster. Make sure it's right.
*2: node.name is the name of the node, and it has to be unique among all nodes within the cluster.
*3: Defines whether the node can become a cluster master. When adding a data node to an existing cluster, it's recommended to set it to false.
*4: This parameter defines whether the node will store data within the cluster. This has to be true.
*5: These are the parameters where you specify the other master nodes of the cluster: their addresses in discovery.seed_hosts and their node names in cluster.initial_master_nodes.
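Since the new node must reach every master node, it can be worth testing the transport port from the new machine before starting the service. Below is a rough sketch; check_seed_hosts is a made-up name, and it assumes bash (for its /dev/tcp feature) and the coreutils timeout command are available on the machine.

```shell
#!/bin/sh
# Hypothetical helper: try a TCP connection to each "host:port" argument.
# Prints one OK/FAIL line per target and returns non-zero if any failed.
# Assumes bash and coreutils `timeout` are installed.
check_seed_hosts() {
    rc=0
    for target in "$@"; do
        host=${target%:*}    # everything before the last colon
        port=${target##*:}   # everything after the last colon
        # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>
        if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
            echo "OK   $target"
        else
            echo "FAIL $target"
            rc=1
        fi
    done
    return $rc
}

# Example use with the addresses from discovery.seed_hosts above:
#   check_seed_hosts 10.4.5.100:9300 10.4.5.20:9300 10.4.5.30:9300
```

A FAIL line usually points to a firewall rule or routing problem between the new node and that master, which is worth fixing before you start the service.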
Once that's done, you can start the service, and it should join the cluster automatically.
You can verify that the operation was successful by looking at the logs, by querying the API with the curl command, or by just looking at the cluster status in the UI.
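For the curl route, a sketch along these lines should do. It relies on the _cat/nodes and _cluster/health endpoints of the Elasticsearch API; the host, port 9200, and the lack of credentials are assumptions, so add -u user:password or https as your deployment requires. The node_listed helper is just an illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads a list of node names on stdin (one per line,
# as returned by GET _cat/nodes?h=name) and succeeds only if the exact
# name given as $1 is among them.
node_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example use against a live cluster (address and auth are assumptions):
#   curl -s "http://localhost:9200/_cat/nodes?h=name" \
#       | node_listed my-node-name-123 && echo "node has joined"
#   curl -s "http://localhost:9200/_cluster/health?pretty"
```

In the health output, number_of_data_nodes should have gone up by one after the new node joins.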
Let me know if that helps!