Vertica cluster migration using copycluster – Part 1


We had a requirement to migrate an HP Vertica cluster to another physical location for archival purposes.

 

The first limitation is that we can’t simply move all of the servers; we have to shrink the physical footprint of the cluster considerably.  The cluster we have to migrate has 14 servers (Dell R720xd, each with local 6TB disks for data).  If you are not familiar with Dell hardware, each R720xd is 2 RU (rack units), so the cluster takes up 28 RU, which isn’t small at all.  The actual used disk space across the cluster is almost 40TB (roughly 2.75TB per server).

The second limitation came from HP Vertica: all of the tools we examined (backup/restore, copycluster, full replication) require the same number of nodes on the target cluster as on the source. From here on I will refer to the original cluster as the source cluster and the reduced-footprint cluster as the target cluster.

 

Luckily there was no requirement for the target cluster to keep up with the performance of the source cluster, because activity on the target cluster should be very low.

In order to satisfy the first requirement we decided to use a single Dell R720xd server with 224GB of RAM and a local MD1200 shelf with 4TB disks in RAID 6, which gives us 50TB of disk space. This reduces the physical footprint of the target cluster to 4 RU instead of the original 28 RU.

We decided to use XenServer for virtualization and create a virtual Vertica cluster of 14 VMs, which means each physical R720xd node on the source cluster is reduced to a very small VM with only 12GB of RAM. (If you are paying attention to the numbers, I know 224/14 is more than 12GB, but we also needed two additional VMs to move with this cluster, whose details I won’t share here.)

 

So we examined the different ways we can do this:

  1. Backup/restore using the vbr utility – we considered this method first but ruled it out because it requires keeping exactly the same IP address configuration on the source and target cluster servers (which means both clusters can’t be connected to the same network at the same time, making validation harder). In addition, it requires another storage location with enough space to hold a backup of 40TB of data.
  2. Object-level replication (https://my.vertica.com/blog/tag/replication/) – this looked interesting, but the feature was introduced in Vertica 7.2 SP2; our cluster is on version 7.1.2 and we didn’t want to add the extra time it would take to upgrade the cluster.
  3. Finally we found the copycluster option of the vbr utility – this looked very promising since no additional storage is required and both clusters can exist at the same time, so this is the method we chose.

 

In this post (Part 1) I will go over the preparations we had to do: essentially a Vertica cluster installation and database setup. The next post (Part 2) will cover the copycluster process itself.

These are the requirements the target cluster has to meet in order to use the copycluster method:

  • Have the same number of nodes as the source cluster.
  • Have a database with the same name as the source database. The target database can be completely empty.
  • Have the same node names as the source cluster. The node names listed in the NODES system tables on both clusters must match (see the query sketch after this list).
  • Be accessible from the source cluster.
  • Have the same database administrator account, and all nodes must allow the database administrator of the source cluster to log in through SSH without a password.
  • Have adequate disk space for the vbr.py --task copycluster command to complete.
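
Once the target database exists (later in this post), the node-name requirement can be checked by comparing the output of a query like this on both clusters (a minimal sketch, assuming it is run locally as the dbadmin user with vsql on the PATH):

    # Run on one node of each cluster and compare the results
    vsql -U dbadmin -w '<password>' -c \
      "SELECT node_name, node_address FROM nodes ORDER BY node_name;"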

 

The Vertica installation guide specifies the required OS settings for a cluster installation; I will mention only what we had to change (this is done on all target cluster nodes; we use CentOS 6 on our servers):

  • Nice Limits Configuration
  • User Max Open Files Limit
  • Pam Limits – add “session required pam_limits.so” to /etc/pam.d/su
  • Transparent Hugepages (disable) – via grub.conf, add transparent_hugepage=never at the end of the kernel line (this and the limits settings above are sketched after this list)
  • Disk setup, disk read-ahead and I/O Scheduling (detailed below)
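
A rough sketch of those changes on CentOS 6 (the limit values are illustrative and assume the database administrator is dbadmin; the installer and the Vertica 7.1 install guide list the exact values it expects):

    # /etc/security/limits.conf - nice and open-file limits for dbadmin
    dbadmin  -  nice    0
    dbadmin  -  nofile  65536

    # /etc/pam.d/su - make su sessions honour limits.conf
    session required pam_limits.so

    # /boot/grub/grub.conf - append transparent_hugepage=never to the kernel
    # line so transparent hugepages stay disabled across reboots, e.g.:
    #   kernel /vmlinuz-2.6.32-xxx ro root=... transparent_hugepage=never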

This is our disk setup; the comments above each command section explain what each step does.

XenServer disks are named xvd* – xvda is our OS disk; xvdb, xvdc and xvde will be used for data.

I mention this because if you thought the installer would be forgiving when the OS disk isn’t set up according to the HP requirements, you were wrong – it will complain loudly and fail the relevant checks.
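
The general shape of the data-disk preparation on each VM was along these lines (a sketch assuming ext4 and the device names above; the mount point is illustrative, and the read-ahead and scheduler settings follow the Vertica 7.x recommendations and should also be applied to the OS disk xvda):

    # Create and mount an ext4 filesystem on each data disk (xvdb shown;
    # repeat for xvdc and xvde, and add matching /etc/fstab entries)
    mkfs.ext4 /dev/xvdb
    mkdir -p /data1
    mount /dev/xvdb /data1

    # Disk read-ahead - Vertica requires at least 2048 sectors
    blockdev --setra 2048 /dev/xvdb

    # I/O scheduler - deadline (or noop) instead of the default cfq
    echo deadline > /sys/block/xvdb/queue/scheduler

    # Both settings need to be re-applied at boot, e.g. from /etc/rc.local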

This isn’t a requirement, but I preferred to edit the hosts file on the target cluster to make things easier after we move it to its new home.
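
Something along these lines on every target node; the hostnames and addresses below are made up for illustration and should match whatever the cluster will have at its new location:

    # /etc/hosts (example entries only - one line per target cluster node)
    10.0.0.11   vertica-node01
    10.0.0.12   vertica-node02
    # ... and so on for all 14 nodes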

 

Once the requirements are in place, the cluster installation itself is very simple: copy the RPM and your license key to one of the target cluster nodes.

It’s important that the Vertica version on the target cluster matches the source cluster. Then run the install command, specifying a comma-delimited host list, the local RPM path, the local license path, the data directory and a failure threshold.
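
The command looks roughly like this (host names, paths and the threshold value are examples; the exact flags are documented in the install_vertica reference for your version):

    # Run as root on the node that has the RPM and license file
    /opt/vertica/sbin/install_vertica \
      --hosts node01,node02,node03,...,node14 \
      --rpm /tmp/vertica-7.1.2.x86_64.RHEL5.rpm \
      --license /tmp/license.dat \
      --data-dir /data1 \
      --failure-threshold FAIL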

If everything was prepared correctly the installation should complete successfully; if something fails, the installer will usually tell you what is missing.

 

Now we need to create a database with the same name as the one on the source cluster.

In admintools, go to Configuration Menu > Create Database

The database name should match the source cluster (in our case it’s dwvertica)

Choose a password for the DB; it doesn’t have to match the source cluster’s DB password.

Select all of the hosts in the cluster

Place the Catalog and Data directories on the data disk we created earlier. The requirements don’t say this has to match the source cluster, but for convenience I matched it.

Confirm the Database creation

If everything is OK, admintools will confirm that the database was created successfully.
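
For reference, the same database can also be created non-interactively with admintools in tool mode (a sketch; the flag letters are assumptions based on the 7.x create_db tool help and should be verified with admintools -t create_db --help on your version):

    # Run as dbadmin on one target node
    /opt/vertica/bin/admintools -t create_db \
      -d dwvertica \
      -s node01,node02,...,node14 \
      -p '<db-password>' \
      -c /data1/catalog \
      -D /data1/data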

That almost completes the preparations; the last step is to allow passwordless SSH access for the dbadmin user from the source cluster to the target cluster.

Connect to the source cluster server you will run the vbr script from and copy dbadmin’s public key to all of the target cluster nodes.
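
A minimal sketch of that step (node names are placeholders; ssh-keygen is only needed if dbadmin doesn’t already have a key pair):

    # As dbadmin on the source node that will run vbr
    ssh-keygen -t rsa                  # accept the defaults, no passphrase
    for h in node01 node02 node14; do  # list all 14 target nodes here
        ssh-copy-id dbadmin@$h
    done

    # Verify - this should return the remote hostname without a password prompt
    ssh dbadmin@node01 hostname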

 

Let’s review our copycluster preparation checklist from before:

  • Have the same number of nodes as the source cluster.
    Status: both clusters have 14 nodes.
  • Have a database with the same name as the source database. The target database can be completely empty.
    Status: both clusters have a database named dwvertica.
  • Have the same node names as the source cluster. The node names listed in the NODES system tables on both clusters must match.
    Status: node names match (will be shown in the next post).
  • Be accessible from the source cluster.
    Status: connectivity is in place.
  • Have the same database administrator account, and all nodes must allow the database administrator of the source cluster to log in through SSH without a password.
    Status: the dbadmin SSH key was copied from the source cluster to all target nodes.
  • Have adequate disk space for the vbr.py --task copycluster command to complete.
    Status: enough disk space on the target cluster.

 

In part 2 of this post I will cover the copycluster process itself.
