CockroachDB – how to build a 4-node SQL cluster on Ubuntu and Hyper-V

CockroachDB Overview

Description: CockroachDB is an open source, survivable, strongly consistent, scale-out SQL database. If you wonder where Google engineers go when they leave Google, they go out on their own and build unbelievably great scalable and distributed open source software. Essentially, if you want to run your own fault-tolerant SQL database across multiple datacenters and cloud services, using your own servers and keeping complete control of your database, without paying hefty licensing fees, then run CockroachDB. This post is not a review of CockroachDB, but rather a demonstration of a lab setup and POC.

To get started in our lab, first we want to build 3 or 4 test clone servers or “nodes”. I use Ubuntu on top of Hyper-V, but you can use any flavor of Linux or macOS you want. CockroachDB can also run on Windows via Docker.

If you’re like me and use Hyper-V on Win10, make 4 x Ubuntu 16.04 “clones” – first build a ‘goldmaster’ image, and clone it 4 times – guide here: https://4sysops.com/archives/clone-a-ubuntu-server-in-hyper-v-2012-r2/ – or use something like virtualboxes.org.

Create 4 virtual machines, each having its own IP address:
Node1: inet addr:10.0.10.169
Node2: inet addr:10.0.10.170
Node3: inet addr:10.0.10.171
Node4: inet addr:10.0.10.172

Make sure each node is up to date and has ntp installed and synchronized. Install ntp with the command:

sudo apt-get install ntp

Use the command

timedatectl

To ensure that…

NTP synchronized: yes
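
If you want to script this check across all 4 nodes, here is a minimal sketch. The function name is_clock_synced is my own invention, and it assumes the timedatectl output contains either “NTP synchronized: yes” (as on Ubuntu 16.04) or “System clock synchronized: yes” (the wording on newer systemd releases).

```shell
# Hypothetical helper: reads timedatectl-style text on stdin and
# succeeds only if it reports that the clock is synchronized.
is_clock_synced() {
    grep -Eq '^[[:space:]]*(NTP|System clock) synchronized: yes[[:space:]]*$'
}
```

On a node you would run it as: timedatectl | is_clock_synced && echo "clock OK"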

At this point, before you install/run cockroach, it’s wise to export each node VM with Hyper-V as a backup.

On Nodes 1,2,3,4 download the latest binary (see https://www.cockroachlabs.com/docs/install-cockroachdb.html) with the command:

sudo wget https://binaries.cockroachdb.com/cockroach-latest.linux-amd64.tgz

Extract the binary with the command:

tar -xvf cockroach-latest.linux-amd64.tgz

Move the binary to a location in your PATH or add the directory location to your PATH. You can review the system-wide PATH setting in /etc/environment with the command:

sudo vi /etc/environment

And then move your extracted cockroach to /usr/sbin with the command:

sudo mv cockroach-latest.linux-amd64/cockroach /usr/sbin/
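
A quick way to confirm that the directory you picked is actually on the PATH of your current shell is a small helper like the one below. This is only a sketch, and dir_in_path is a hypothetical name, not something that ships with Ubuntu.

```shell
# Hypothetical helper: succeeds if directory $1 is one of the
# colon-separated entries in the current shell's PATH.
dir_in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example: dir_in_path /usr/sbin && echo "/usr/sbin is on the PATH"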

Do a sanity check with the command:

cockroach version

Start cockroach in insecure mode in the background on Node1 (the first node, which the other nodes will join) with the command:

sudo cockroach start --background --insecure --host=10.0.10.169

Result should be something like below:

CockroachDB node starting at 2017-03-15 23:16:23.118419329 -0700 PDT
 build: CCL beta-20170309 @ 2017/03/09 16:31:10 (go1.8)
 admin: http://10.0.10.169:8080
 sql: postgresql://root@10.0.10.169:26257?sslmode=disable
 logs: cockroach-data/logs
 store[0]: path=cockroach-data
 status: restarted pre-existing node
 clusterID: 08b6bfe6-4886-466b-a9c6-bc58a3809113
 nodeID: 1

Go ahead and browse to the admin page http://10.0.10.169:8080

On your other nodes:

sudo cockroach start --background --insecure --host=10.0.10.170 --join=10.0.10.169:26257

*where --host= is the IP address of the current node, which you are joining to the first node at 10.0.10.169

Your results should look something like the following:

CockroachDB node starting at 2017-03-15 23:23:43.783097234 -0700 PDT
 build: CCL beta-20170309 @ 2017/03/09 16:31:10 (go1.8)
 admin: http://10.0.10.170:8080
 sql: postgresql://root@10.0.10.170:26257?sslmode=disable
 logs: cockroach-data/logs
 store[0]: path=cockroach-data
 status: initialized new node, joined pre-existing cluster
 clusterID: 08b6bfe6-4886-466b-a9c6-bc58a3809113
 nodeID: 2
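
Since the join command only differs in the --host value from node to node, a small sketch like this can print the exact line to run on each remaining node. The variable FIRST_NODE and the function join_command are names I made up; the IP addresses and flags are the ones used in this lab.

```shell
# Address of the node that was started first (Node1 in this lab).
FIRST_NODE=10.0.10.169

# Hypothetical helper: prints the start command for the node whose IP is $1.
join_command() {
    printf 'sudo cockroach start --background --insecure --host=%s --join=%s:26257\n' \
        "$1" "$FIRST_NODE"
}

# Print one command per remaining node; nothing is executed here.
for ip in 10.0.10.170 10.0.10.171 10.0.10.172; do
    join_command "$ip"
done
```

Copy each printed line to the matching node and run it there.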

Your web interface should provide you with performance graphs:

Identify the new nodes in the View Nodes List link:

Go on and add the remaining Nodes to the cluster.

???

Profit! – just kidding

Now you can go on to learn about cockroach SQL, create some databases and tables, and test how pulling the plug on one of your nodes doesn’t bring down the DB and how the data is replicated across the nodes (three copies by default, spread over the 4 nodes). It’s recommended you don’t run this lab on a single workstation-class system, but on something that meets the CockroachDB minimum system requirements. This product is still in beta and features are subject to change. Regardless, CockroachDB is an incredible addition to the open-source community and I’m sure it will be very useful to a lot of systems admins and application developers.
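
As a starting point for that experiment, here is a rough sketch that writes a row through one node so you can read it back through another. The database name lab, the table name notes, and the helper run_sql are made up for illustration. The cockroach sql flags (--insecure, --host, -e) are the ones I would expect to work with the beta build used above, but they may differ in other releases.

```shell
# Hypothetical helper: runs SQL text $2 against the node at IP $1.
run_sql() {
    cockroach sql --insecure --host="$1" -e "$2"
}

# Statements that create a small test table and write one row into it.
LAB_SQL="CREATE DATABASE IF NOT EXISTS lab;
CREATE TABLE IF NOT EXISTS lab.notes (id INT PRIMARY KEY, body STRING);
UPSERT INTO lab.notes VALUES (1, 'written via node 1');"
```

Run run_sql 10.0.10.169 "$LAB_SQL" to write through Node1, then run_sql 10.0.10.170 "SELECT * FROM lab.notes;" to read the same row through Node2. Shut down one VM and repeat the SELECT against a surviving node to see the cluster keep serving the data.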

Fix Ubuntu when the OS will not boot – kernel panic not syncing vfs unable to mount root fs on unknown-block 0 0 – /boot full, remove old kernels from the command line

To begin, it will probably take at least 30 minutes to resolve this issue…

This fix solved my problem with the “vfs unable to mount root fs” error, but of course your results may vary. As always, first back up your system or do an export of the VM so you have a copy of the system as it existed before you started screwing around with it 😉

After running apt-get update / apt-get upgrade and then a reboot, you may receive the following error on Ubuntu 16.04: kernel panic not syncing vfs unable to mount root fs on unknown-block 0 0.

In many cases this will be due to the /boot partition becoming 100% full because many updates have been made to the kernel. By default, Ubuntu will retain the old kernels and add them to the list of available kernels you can boot into in the Grub2 boot loader menu. You can confirm that your partition is full by issuing the command:

df -h

The result will likely show /boot at or near 100% use.
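
The figure to look at in the df output is the Use% column for /boot. If you want just that number, for example to check several machines, a minimal sketch could look like this. The function name boot_use_percent is hypothetical, and it assumes the POSIX-format output you get from df -P, where the percentage is the fifth column of the second line.

```shell
# Hypothetical helper: reads `df -P <mountpoint>` output on stdin and
# prints the use percentage of that filesystem as a bare number.
boot_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

On the affected system you would run: df -P /boot | boot_use_percent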

To resolve this issue and boot successfully, restart the VM or computer from the console where you are looking at the error and get into the Grub2 menu, then choose the “Advanced options for Ubuntu” view, where you can see a list of old kernels you can boot into. Some report you can reach the menu by booting with the Shift key held down; on a virtual machine, you should be able to arrow down in the Grub start screen and choose Advanced options for Ubuntu on startup:

Grub2 boot menu.

Once you go into the advanced boot menu you will likely see several kernels listed. Choose the next kernel down from the top (highest) version. In my case I booted into the version labeled Ubuntu, with Linux 4.4.0-57-generic (my boot menu screenshot below is clean, but you’ll likely see several kernels listed).

Cross your fingers and hope you get to your login prompt. From here I jumped on PuTTY and connected from that client, as I prefer it over the console.

Next, log in and follow the directions that I found here:

http://askubuntu.com/questions/2793/how-do-i-remove-old-kernel-versions-to-clean-up-the-boot-menu

To save you the search, here are the instructions I used to first list and then remove the old kernels:

Open terminal and check your current kernel:

uname -a

DO NOT REMOVE THIS KERNEL! Make a note of the version in notepad or something.

Next, type the command below to view/list all installed kernels on your system.

dpkg --list | grep linux-image

Find all the kernels that are lower than your current kernel. When you know which kernels to remove, continue below. Run the command below to remove each kernel you selected.

sudo apt-get purge linux-image-x.x.x-xx-generic

Or:

sudo apt-get purge linux-image-extra-x.x.x-xx-generic
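
If the list is long, picking out the older packages by eye gets tedious. Below is a minimal sketch that prints only the installed linux-image and linux-image-extra packages whose version is lower than the running kernel. The function name old_kernel_packages is hypothetical, and it assumes package names of the form linux-image-A.B.C-N-flavor, comparing just the four numeric parts. It only prints names; review the list before purging anything.

```shell
# Hypothetical helper.
#   $1    = running kernel release, e.g. the output of `uname -r`
#   stdin = output of `dpkg --list`
# Prints installed kernel image packages older than the running kernel.
old_kernel_packages() {
    awk -v current="$1" '
        # Turn "4.4.0-57-generic" into a zero-padded key that sorts by version.
        function key(v,    p) {
            split(v, p, /[.-]/)
            return sprintf("%05d%05d%05d%05d", p[1], p[2], p[3], p[4])
        }
        # Only fully installed ("ii") versioned kernel image packages.
        $1 == "ii" && $2 ~ /^linux-image-(extra-)?[0-9]/ {
            ver = $2
            sub(/^linux-image-(extra-)?/, "", ver)
            if (key(ver) < key(current)) print $2
        }'
}
```

Usage: dpkg --list | old_kernel_packages "$(uname -r)"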

Finally, run the command below to update grub2

sudo update-grub2

Reboot your system.

sudo reboot

As you can see from my terminal history, I had to remove a few:

uname -a
dpkg --list | grep linux-image
sudo apt-get purge linux-image-4.4.0-21-generic
sudo apt-get purge linux-image-4.4.0-22-generic
sudo apt-get purge linux-image-4.4.0-24-generic
df -h
sudo apt-get purge linux-image-4.4.0-24-generic
sudo apt-get purge linux-image-4.4.0-28-generic
sudo apt-get purge linux-image-4.4.0-31-generic
sudo apt-get purge linux-image-4.4.0-34-generic
sudo apt-get purge linux-image-4.4.0-36-generic
sudo apt-get purge linux-image-4.4.0-38-generic
df -h
sudo apt-get purge linux-image-4.4.0-42-generic
sudo apt-get purge linux-image-4.4.0-45-generic
sudo apt-get purge linux-image-4.4.0-47-generic
sudo apt-get purge linux-image-4.4.0-51-generic
sudo apt-get purge linux-image-4.4.0-53-generic
sudo update-grub2
dpkg --list | grep linux-image
df -h
sudo apt-get purge linux-image-extra-4.4.0-21-generic
sudo apt-get purge linux-image-extra-4.4.0-22-generic
sudo apt-get purge linux-image-extra-4.4.0-24-generic
sudo apt-get purge linux-image-extra-4.4.0-28-generic
sudo apt-get purge linux-image-extra-4.4.0-31-generic
sudo update-grub2
df -h
sudo reboot
dpkg --list | grep linux-image
uname -a
sudo reboot

After the reboot, you can see my /boot partition returned to a manageable size:

I hope this post helps someone save some time and fix their Ubuntu boot problems. Please leave a comment if this helped resolve your issue or if there is a smarter/faster way to fix this problem.