Introduction
This tutorial will teach you how to configure a multi-node cluster with Cassandra on a VPS. Cassandra is a highly scalable open source database system that achieves great performance when set up with multiple nodes, even across different data centers.
Installing Cassandra on Each Node
Before we begin configuring each node, you need to have Cassandra installed on every one of them. We have an easy tutorial on how to do that with a VPS. After you've installed Cassandra on every node, you need to make sure it isn't running. To check for a running Cassandra process, type in:
sudo ps auwx | grep cassandra
If a process other than the "grep" one appears, copy its process ID and kill it:
sudo kill -9 PID
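The check above can also be sketched in Python. This is only an illustration, assuming the usual `ps auwx` column layout (PID in the second column, command in the last); the function name `cassandra_pids` and the sample output are made up for this example.

```python
from __future__ import annotations

# Invented sample of what `ps auwx | grep cassandra` might print (not real output).
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      1234  2.0 10.1 999999 88888 ?        Sl   10:00   0:42 java org.apache.cassandra.service.CassandraDaemon
root      5678  0.0  0.0   9999   888 pts/0    S+   10:05   0:00 grep cassandra
"""


def cassandra_pids(ps_output: str) -> list[int]:
    """Return PIDs of Cassandra processes, skipping the grep process itself."""
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or unexpected format
        command = fields[10]
        is_grep = command.lstrip().startswith("grep")
        if "cassandra" in command.lower() and not is_grep:
            pids.append(int(fields[1]))
    return pids


if __name__ == "__main__":
    for pid in cassandra_pids(SAMPLE):
        print(f"sudo kill -9 {pid}")
```

The script only prints the kill commands rather than running them, so you can review the process IDs first.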


You'll also need to clear any existing data. Note that this deletes everything Cassandra has stored on the node. Do so by running:
sudo rm -rf /var/lib/cassandra/*
Configuring Cassandra
To configure Cassandra for multiple nodes, you'll need to know beforehand how many nodes you're going to use, and calculate a token number for each. We've developed a tool to do this, and you can get it here. Simply enter the number of nodes you're dealing with and you'll get a token for each node. For example, if you have three nodes, you'd have these numbers:
Node 0: 0
Node 1: 3074457345618258602
Node 2: 6148914691236517205
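If you'd rather compute the tokens yourself, the three example values are consistent with splitting the range 0 to 2^63 evenly between the nodes. The formula below is inferred from those example numbers, not taken from the tool, so treat it as an assumption; `initial_tokens` is a name made up for this sketch.

```python
from __future__ import annotations


def initial_tokens(node_count: int) -> list[int]:
    """Evenly spaced tokens: node i gets floor(i * 2**63 / node_count).

    Assumption: this matches the example values for three nodes in this
    tutorial; it is not confirmed to be what the token tool does.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * 2**63 // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(initial_tokens(3)):
        print(f"Node {node}: {token}")
```

Python integers don't overflow, which keeps the arithmetic exact for values this large.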
Now you'll need to edit your configuration file for each node. To do so, open the nano text editor by running:
nano ~/cassandra/conf/cassandra.yaml
Some of the settings you'll need to edit are the same on every node (cluster_name, seed_provider, rpc_address and endpoint_snitch), while others are different for each one (initial_token and listen_address). Choose one node to be your seed node, then look in the configuration file for the lines that refer to each of these attributes and modify them to your needs:
cluster_name: 'Name'
initial_token: Token
seed_provider:
    - seeds: "Seed IP"
listen_address: Droplet's IP
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
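Replace Name with your cluster's name, Token with the token calculated for that node, Seed IP with the IP address of your seed node, and Droplet's IP with the IP address of the droplet you are configuring.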
Node 0
cluster_name: 'MyDigitalOceanCluster'
initial_token: 0
seed_provider:
    - seeds: "198.211.xxx.0"
listen_address: 198.211.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Node 1
cluster_name: 'MyDigitalOceanCluster'
initial_token: 3074457345618258602
seed_provider:
    - seeds: "198.211.xxx.0"
listen_address: 192.241.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch

Node 2
cluster_name: 'MyDigitalOceanCluster'
initial_token: 6148914691236517205
seed_provider:
    - seeds: "198.211.xxx.0"
listen_address: 37.139.xxx.0
rpc_address: 0.0.0.0
endpoint_snitch: RackInferringSnitch
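With more than a few nodes, writing these values out by hand gets error-prone. The following is a rough sketch that prints the per-node values in the same abbreviated form used in this tutorial. It does not write a complete cassandra.yaml, the helper name `node_settings` is made up for the example, and the token formula is an assumption inferred from the three example tokens.

```python
from __future__ import annotations


def node_settings(cluster_name: str, seed_ip: str, node_ips: list[str]) -> list[str]:
    """Return one block of settings per node, in the order the IPs are given."""
    count = len(node_ips)
    blocks = []
    for index, ip in enumerate(node_ips):
        # Assumed formula: evenly spaced tokens over the range 0 to 2**63.
        token = index * 2**63 // count
        blocks.append("\n".join([
            f"Node {index}",
            f"cluster_name: '{cluster_name}'",   # same on every node
            f"initial_token: {token}",           # different on every node
            "seed_provider:",
            f'    - seeds: "{seed_ip}"',         # same on every node
            f"listen_address: {ip}",             # different on every node
            "rpc_address: 0.0.0.0",
            "endpoint_snitch: RackInferringSnitch",
        ]))
    return blocks


if __name__ == "__main__":
    # Placeholder addresses copied from the example; replace with your own.
    ips = ["198.211.xxx.0", "192.241.xxx.0", "37.139.xxx.0"]
    print("\n\n".join(node_settings("MyDigitalOceanCluster", ips[0], ips)))
```

The first IP is passed as the seed here only because Node 0 is the seed in the example; any node can be chosen.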
To start Cassandra, run the following on the seed node first:
sudo sh ~/cassandra/bin/cassandra
When it has finished starting, repeat this step on each of the other nodes. If you don't see any errors, your multi-node Cassandra setup should be successfully deployed.