Wednesday, 11 September 2013

Enterprise Monitor-ing the Raspberry Pi & MySQL Cluster

So, now that I've got my Raspberry Pi's tested and running MySQL Cluster, we'll need some way of checking it's up and running, as with the rest of our MySQL servers.

Monitoring via a Remote Agent

First issue, of course: I already have a MEM console, so there's no need to re-install MEM. I just want to deploy an agent so that I can monitor the MySQL Cluster.
This poses its first problem, as there isn't an ARM-ready agent available. Remember, it's not a supported platform. So what can we do? Set up a remote Enterprise Monitor agent, so that we can monitor the MySQL Cluster, albeit at the cost of not having an agent local to each Raspberry Pi, and hence not being able to capture the o.s. data.

Config change

So, on my Ubuntu server, I go to the agent install directory:

  cd /opt/mysql/enterprise/agent/etc
  vi mysql-mypi01-agent.ini
  :1,$ s/ol63uek01/mypi01/g

  # Proxy Parameters
  proxy-address=:3317
  proxy-backend-addresses = mypi01:3306


We've changed the proxy port so that it's unique on the MEM server.
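
The same edit (swap the hostname, pick a free proxy port) could also be scripted. This is only a rough sketch, assuming the ini layout shown above; clone_agent_ini is a made-up helper name, and it reads the ini on stdin and writes to stdout so nothing gets overwritten by accident:

```shell
# clone_agent_ini OLD_HOST NEW_HOST PROXY_PORT   (hypothetical helper)
# Reads an agent .ini on stdin and writes the adapted copy to stdout:
#  - every OLD_HOST becomes NEW_HOST (same as :1,$ s/OLD/NEW/g in vi)
#  - proxy-address is set to :PROXY_PORT so each agent listens on its own port
clone_agent_ini() {
  sed -e "s/$1/$2/g" \
      -e "s/^proxy-address[ ]*=.*/proxy-address=:$3/"
}
```

For a second agent it would be something like: clone_agent_ini mypi01 mypi02 3318 < mysql-mypi01-agent.ini > mysql-mypi02-agent.ini. The agent-uuid still has to be regenerated and replaced by hand.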

Also need to change the agent-uuid for this remote agent:
First, generate a new UUID:
  /opt/mysql/enterprise/agent/bin/mysql-proxy --plugins=agent --agent-generate-uuid
  a5e420b3-02d5-4000-ad6f-faeb9f9b80a4
And replace the agent-uuid entry in mysql-mypi01-agent.ini.

Now for the directory structure and details needed to connect to the sqlnode:
  cp -r instances-ol6uek01 instances-mypi01
  cd instances-mypi01/agent
  vi agent-instance.ini


This reminds us to create the memagent user on each sqlnode.
Go on.. do it, on both mypi01 & mypi02:
  grant all on *.* to 'memagent'@'141.144.12.45' identified by 'oracle';
And test it too, from ubuvlc01:
  mypi01:  mysql -umemagent -poracle -h141.144.12.41 -P3306
  mypi02:  mysql -umemagent -poracle -h141.144.12.40 -P3306

Ok.

Now, repeat the same steps for mypi02.
  cp mysql-mypi01-agent.ini mysql-mypi02-agent.ini
  vi mysql-mypi02-agent.ini
  :1,$ s/mypi01/mypi02/g
  # Proxy Parameters
  proxy-address=:3318
  proxy-backend-addresses = 141.144.12.40:3306


  /opt/mysql/enterprise/agent/bin/mysql-proxy --plugins=agent --agent-generate-uuid
  2013-09-11 10:43:54: (critical) plugin agent 2.3.12.2174 started
  052fcb7b-89e7-4b16-8b66-7475c40a2b0f

  vi mysql-mypi02-agent.ini
Change the agent-uuid entry.

Now the directory structure:
  cp -r instances-mypi01 instances-mypi02
  cd instances-mypi02/agent/
  vi agent-instance.ini

And change the IP address so it connects to mypi02, and not mypi01.

We've created the users and confirmed remote access.
Now, to start them both up.

Remote Agent Startup

On the Ubuntu server:
mypi01 agent:
  /opt/mysql/enterprise/agent/etc/init.d/mysql-monitor-agent start /opt/mysql/enterprise/agent/etc/mysql-mypi01-agent.ini
  Starting MySQL Enterprise agent service...
   *

mypi02 agent:
  /opt/mysql/enterprise/agent/etc/init.d/mysql-monitor-agent start /opt/mysql/enterprise/agent/etc/mysql-mypi02-agent.ini
  Starting MySQL Enterprise agent service...
   *

Double checking:
  ps -ef | grep agent | grep mypi
root     11488     1  0 10:47 ?        00:00:00 /opt/mysql/enterprise/agent/libexec/mysql-monitor-agent --defaults-file=/opt/mysql/enterprise/agent/etc/mysql-mypi01-agent.ini --daemon --pid-file=/opt/mysql/enterprise/agent/mysql-mypi01-agent.pid
root     11489 11488  0 10:47 ?        00:00:00 /opt/mysql/enterprise/agent/libexec/mysql-monitor-agent --defaults-file=/opt/mysql/enterprise/agent/etc/mysql-mypi01-agent.ini --daemon --pid-file=/opt/mysql/enterprise/agent/mysql-mypi01-agent.pid
root     11576     1  0 10:50 ?        00:00:00 /opt/mysql/enterprise/agent/libexec/mysql-monitor-agent --defaults-file=/opt/mysql/enterprise/agent/etc/mysql-mypi02-agent.ini --daemon --pid-file=/opt/mysql/enterprise/agent/mysql-mypi02-agent.pid
root     11577 11576  1 10:50 ?        00:00:00 /opt/mysql/enterprise/agent/libexec/mysql-monitor-agent --defaults-file=/opt/mysql/enterprise/agent/etc/mysql-mypi02-agent.ini --daemon --pid-file=/opt/mysql/enterprise/agent/mysql-mypi02-agent.pid
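
Reading that listing by eye is fine for two agents, but a small filter makes the check quicker. A sketch only, assuming the --defaults-file=... argument shows up in the ps output as it does above; agents_running is a made-up name:

```shell
# agents_running   (hypothetical helper)
# Reads `ps -ef` output on stdin and prints one line per agent .ini file
# together with the number of processes started from it.
agents_running() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^--defaults-file=/) {
        f = $i
        sub(/^--defaults-file=/, "", f)   # keep only the path of the .ini
        n[f]++
      }
  }
  END { for (f in n) print f, n[f] }' | sort
}
```

Used as: ps -ef | agents_running. In the listing above each agent shows up as a parent and a child process, so a healthy pair of agents gives a count of 2 for each ini file.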


Now check the MEM dashboard.

On the dashboard, now refreshed, we can see both servers mypi01 & mypi02.
Let's create their own group:
- Go to Settings tab - Manage Servers - Create Group button "RaspberryPiCluster"
- Add both servers by passing the mouse over the down-pointing arrow, "add to group" and select both servers.
Go back to the Monitor tab, and click on the RaspberryPiCluster group.

MySQL Enterprise Monitor with the Raspberry Pi MySQL Cluster group added.

Now that we can see them specifically, let's enable the Cluster-specific advisor for this group.
- Go to Advisors tab - Add to Schedule - click on the "Cluster (10)" top level box, and then hit the "schedule" button just under "Current Schedule". Accept the default frequency, and we're now collecting cluster data.

with the Cluster specific Advisor enabled.




If we kill both ndbd processes on the datanodes, and then go back to the Monitor tab, we can see that it's showing Critical Events on the first page: "Cluster Has Stopped / Nodes Not Running"

Showing the Cluster down / stopped events.
Ok, so we're seeing some Cluster info on MySQL Enterprise Monitor 2.3.12.


to be continued...

Tuesday, 10 September 2013

From 2 Management nodes down to 1 (R.Pi, Cluster n Cream spin-off)

From my testing MySQL Cluster on the Raspberry Pi's I thought I'd share this little extract, just in case someone tries the same, some day.. somewhere.. why? I don't know.

Ok, so when we pull the plug on one of the pi's, we have one of each component falling down, but because one of them is the arbitrator (node-id=2), the cluster falls over.

Before the 'accident':

  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Whoops, who pulled that plug?

Everything on mypi02 is instantly down, as seen by mgmt_node on mypi01:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

but... a few seconds later:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from mypi01)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10 (not connected, accepting connect from mypi01)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)
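
Comparing several of these listings by eye gets tedious. Here's a rough sketch that boils the output down to connected/configured counts per section. It assumes the "show" layout seen above (section lines starting with "[", node lines starting with "id="); show_summary is a made-up name:

```shell
# show_summary   (hypothetical helper)
# Reads `ndb_mgm -e show` output on stdin and prints, per section,
# how many of the configured nodes are actually connected.
show_summary() {
  awk '
    /^\[/  { sec = $1; sub(/\(.*/, "", sec); sub(/^\[/, "", sec); order[++k] = sec }
    /^id=/ { total[sec]++; if ($0 !~ /not connected/) up[sec]++ }
    END    { for (i = 1; i <= k; i++) printf "%s %d/%d\n", order[i], up[order[i]], total[order[i]] }
  '
}
```

Used as: ndb_mgm -e show | show_summary. For the last listing above it would print ndbd 0/2, ndb_mgmd 2/2 and mysqld 0/4, one per line.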

Check the mgmt_node log:
  vi /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log
...
..
2013-09-04 14:06:44 [MgmtSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 2
..
..
2013-09-04 14:07:02 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.

So the physical architecture is too limited. Let's limit the logical architecture to 1 Management Node:
  vi config.ini
--> comment out nodeid=2 entry in the [ndb_mgmd] section.
  vi my.cnf
--> remove mypi02:1186 from the ndb-connectstring entries in both sections [mysqld] & [mysql_cluster].
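
The my.cnf part of that change is a one-line substitution, if you'd rather not open vi on both Pi's. A minimal sketch, assuming the ndb-connectstring lines look like the ones in my my.cnf; drop_mgm_host is a made-up name, and it works on stdin/stdout so the original file is left alone:

```shell
# drop_mgm_host HOST:PORT   (hypothetical helper)
# Reads my.cnf on stdin and removes HOST:PORT from every active
# ndb-connectstring line. Commented-out lines are left untouched.
drop_mgm_host() {
  sed -e "/^[[:space:]]*ndb-connectstring/ s/,$1//" \
      -e "/^[[:space:]]*ndb-connectstring/ s/$1,//"
}
```

Used as: drop_mgm_host mypi02:1186 < my.cnf > my.cnf.new, then compare the two files before swapping them over.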

Now to start it all back up, but this time with --reload, to make sure the changes are accepted.

On mypi01:
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf --ndb-nodeid=1 --reload
  ndbd -c mypi01
On mypi02:
  ndbd -c mypi01
Check status, make sure the data nodes have started ok.
Then on mypi01:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
On mypi02:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &

And to make sure we're all connected up fine:
From mypi02 (remember, with only 1 mgmtnode on mypi01 now):
  ndb_mgm -e show -c mypi01
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

That's better.

Wednesday, 4 September 2013

Raspberry Pi, MySQL Cluster 'n' Cream.

Ok, so I've been playing around with the idea of setting up MySQL cluster on a couple of Raspberry Pi's and this is how it has been going.

References

First of all, for anyone else who's reading this, it's not a new thing, I know, and I highly recommend reading A. Morgan's blog, http://www.clusterdb.com/mysql-cluster/mysql-cluster-running-on-raspberry-pi/ as well as someone else's blog: http://markswarbrick.wordpress.com/, cheers Mark.

So, to make it all possible, here's what I bought:
Product                                               Model                     Quantity
Raspberry Pi - Model B (Made in UK, 512MB)            RASPIB                    2
16GB Samsung SD Card Pre-Loaded with NOOBS            SAM-16                    2
HDMI (Male) to DVI Converter (Female)                 HDM-DVF                   1
New Link 4 Port USB Hub (USB 2.0 with Mains Adaptor)  New Link USB              1
Noodle - USB to Micro USB Cable (2m Red)              NOODLE-RED                2
RJ45 Cat5e Ethernet LAN Cable 2m (Blue)               RJ45-BLUE                 2
NETGEAR 8 Port Fast Ethernet Switch                   FS308-100UKS/M01 Netgear  1

as well as having other things lying around that came in useful.

So, to get it all set up, you'll need a USB keyboard & mouse, the HDMI-DVI converter and a monitor / tv too.

Setup

Then I needed to set it all up, using the following software programs.
- SDFormatter_4e
- win32diskimager-v0.8-binary.zip
Basically, you don't really need to format the sd card or anything if you buy the NOOBS pre-loaded cards. But if you're reusing an old one, then it's handy to know. If, like me, you had to repartition and reformat an old card whilst testing, then make sure you've also bought a fair load of patience.

Using "2013-05-29-wheezy-armel.img" Raspbian soft-float image.
 I tried Pidora but 'cos of the soft-float issue, it wasn't working for MySQL Cluster.


- source: mysql-cluster-gpl-7.3.0.tar.gz (get it here: http://dev.mysql.com -> downloads -> MySQL Cluster (GPL) -> top left, just under Dev Zone/Downloads/Documentation tab bar, you'll see "Current" & "Archives". In Archives, you'll find the 7.3.0 source bundle.)
- MySQL patch for Bug 17637

Plugged in, turned on, engines running:

  login: pi
  password: raspberry (default password)

sudo raspi-config
- change hostname
- enlarge f/s
- set locale & TZ
- update
You might also want to change the 'pi' user password.. for security.

Some additional o.s. config

 

Using USB as SWAP:

Originally done on a 4Gb sd card.. later I used a 16Gb one.. but hey.. it might come in handy..
:: as root
  free
  swapoff /swap01
  ls -lrt /dev/disk/by-uuid
Ok, it's a 1Gb pen drive.. but better than nothing:
  dd if=/dev/zero of=/dev/sda1 bs=1024 count=1000000
  mkswap /dev/sda1
  swapon /dev/sda1
  swapon /swap01
  free

SWAP on the SD card

When you've got a 16Gb sd card, try using a larger swapfile than the original / default 100M:
:: as root
  free
  dphys-swapfile swapoff
  vi /etc/dphys-swapfile  =>>  CONF_SWAPSIZE=2048
  dphys-swapfile setup
  dphys-swapfile swapon
  free
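
The vi step in the middle is a single-setting change, so it can be done non-interactively too. A small sketch, assuming the setting is written as CONF_SWAPSIZE=<MB> (commented out or not) in /etc/dphys-swapfile; set_swapsize is a made-up name:

```shell
# set_swapsize MB   (hypothetical helper)
# Reads the dphys-swapfile config on stdin and writes it to stdout with
# CONF_SWAPSIZE set to MB, uncommenting the line if it was commented out.
set_swapsize() {
  sed "s/^#*[[:space:]]*CONF_SWAPSIZE=.*/CONF_SWAPSIZE=$1/"
}
```

Used as: set_swapsize 2048 < /etc/dphys-swapfile > /tmp/dphys-swapfile.new, then check the result and copy it back before running dphys-swapfile setup.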

apt

And for apt-get to work:
  vi /etc/apt/apt.conf.d/01proxy
This means a new file creation:
  Acquire::http::proxy "http://my.proxy.address.com:80/";

NTP date & time

It's nice to have all members of a cluster with the right time, and as the Raspberry Pi hasn't got a built in clock (it's a hard life isn't it) let's use the network:
  vi /etc/ntp.conf
Comment out all the "server 0.debian.pool.ntp.org iburst" entries.
+ add "server pool.ntp.org" in that section.
+ add "restrict 192.168.1.2 mask 255.255.255.0 nomodify notrap" (with your own IP address) in the "Clients from this subnet.." section too.

Restart the service, to get it all working:
  service ntp stop
  service ntp start

Or, if the proxy doesn't let it sync properly, "date -s <right_date_in_right_format>" always helps. I did end up having to execute "apt-get install ntpdate --fix-missing" and then it started working properly.

MySQL Cluster compile time

Now that the environment is as we want it, it's time to get MySQL working.
Remember, Oracle currently does not include ARM as a supported platform for MySQL Cluster, so we'll have to compile it ourselves. This means that there are some pre-req's we'll be needing.

Pre-requisites:

:: as root / using 'sudo' ('cos this is MY Pi and MY MySQL, I consider that there's no one more powerful than I here, so I'll be doing everything as root, that's if you don't mind. ;-) )

  apt-get install -y cmake
  apt-get install -y libncurses5-dev   # it's "yum install ncurses-devel" on Pidora, btw
  apt-get install -y openjdk-7-jdk


Following a typical MySQL install:

  groupadd mysql
  useradd -r -g mysql mysql
  tar zxvf mysql-cluster-gpl-7.3.0.tar.gz
  cd mysql-cluster-gpl-7.3.0

Download & apply MySQL patch for Bug 17637

Download: http://bugs.mysql.com/file.php?id=17637
Apply:       cd mysql-cluster../sql-common
                 patch -l -f --verbose -i mysql-va-list.patch client_plugin.c

eg.
  root@mypi01:/home/mysql/mysql-cluster-gpl-7.3.0/sql-common# patch -l -f --verbose -i mysql-va-list.patch client_plugin.c
  Hmm...  Looks like a unified diff to me...
  The text leading up to this was:
  --------------------------
  |diff -Naur mysql-5.5.16.orig/sql-common/client_plugin.c mysql-5.5.16/sql-common/client_plugin.c
  |--- mysql-5.5.16.orig/sql-common/client_plugin.c       2011-09-09 11:56:39.000000000 -0400
  |+++ mysql-5.5.16/sql-common/client_plugin.c    2011-10-16 23:00:00.708799138 -0400
  --------------------------
  Patching file client_plugin.c using Plan A...
  Hunk #1 succeeded at 228.
  Hunk #2 succeeded at 246.
  Hunk #3 succeeded at 290.
  Hunk #4 succeeded at 308.
  done

Compile time

As per the documentation, we need to execute 3 commands now. Please be warned, and review the time they took me with the Model B Raspberry Pi.

cmake .
# nohup cmake . > cmake_20130801.log &
    - Started 19:12, Ended 19:21 = 9 minutes.
make
# nohup make > make_20130801.log &
    - Started 19:39, Ended 00:39 = 5 hours.

make install
# nohup make install > make_install_20130802.log &
    - Started 00:47, Ended 00:49 = 2 minutes.

Total time compiling:
    5 hrs 11 mins.
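
The total is just the three durations added up, with the make run crossing midnight. A quick sketch of the arithmetic in plain shell; to_minutes and elapsed are made-up helper names, and it assumes each step finished in under 24 hours:

```shell
# to_minutes HH:MM -> minutes since midnight
to_minutes() {
  h=${1%%:*}; m=${1##*:}
  h=${h#0}; m=${m#0}            # strip one leading zero so 08/09 aren't read as octal
  echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# elapsed START END -> minutes between two wall-clock times,
# wrapping over midnight when END is "earlier" than START
elapsed() {
  s=$(to_minutes "$1"); e=$(to_minutes "$2")
  echo $(( (e - s + 1440) % 1440 ))
}

total=$(( $(elapsed 19:12 19:21) + $(elapsed 19:39 00:39) + $(elapsed 00:47 00:49) ))
echo "$(( total / 60 )) hrs $(( total % 60 )) mins"   # prints "5 hrs 11 mins"
```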

Finish install procedures

Set up the my.cnf file.

  cd /usr/local/mysql
  chown -R mysql:mysql .
  mkdir conf
  cd conf
  vi my.cnf

   ---- my.cnf ----
   [client]
   socket                          =/tmp/mysql.sock
 

   [mysql]
   prompt                          =\R:\m \d>\_
 

   [mysqld]
   ndbcluster
   # ndb-connectstring               =mypi01:1186,mypi02:1186
   ndb-connectstring               =mypi01:1186
   datadir                         =/opt/mysql/mysql/data
   user                            =mysql
   port                            =3306
   socket                          =/tmp/mysql.sock
   general-log                     =1
   log-output                      =FILE
   log-error                       =mypi02_cluster730.err
   slow-query-log                  =1
   max_connections                 =200
   innodb_log_buffer_size          =4M
   innodb_buffer_pool_size         =50M
   innodb_log_file_size            =10M
   innodb_flush_log_at_trx_commit  =2
   innodb_file_per_table           =1
   innodb_data_home_dir            =/opt/mysql/mysql/data
   innodb_data_file_path           =ibdata1:10M;ibdata2:10M:autoextend

   [mysql_cluster]
   # ndb-connectstring               =mypi01:1186,mypi02:1186
   ndb-connectstring               =mypi01:1186
   ---- EOF ----


  ln -s /usr/local/mysql/conf/my.cnf /etc/my.cnf
The symlink is by no means obligatory, but I like making my housekeeping as easy as possible and keeping error exposure to a minimum, i.e. edit once. You'll see why later (DemoKit).

Careful here: as I was travelling around to different networks, I ended up using hostnames rather than IP addresses in all the config files, and then commenting out / uncommenting the appropriate entry in /etc/hosts for each network. REMEMBER THIS, should you move around with the raspberry pi 'n' cluster.

  scripts/mysql_install_db --user=mysql --datadir=/opt/mysql/mysql/data
  chown -R root .
  cd /opt/mysql/mysql/data
  chown -R mysql:mysql .

One last thing before we start it all up:

Setup the config.ini: 
  cd /usr/local/mysql/conf
  vi config.ini
   ---- config.ini ----
   [ndb_mgmd default]
   PortNumber                      =1186
   DataDir                         =/opt/mysql/mysql/mgmd_data

   [ndb_mgmd]
   NodeId                          =1
   HostName                        =mypi01

   #[ndb_mgmd]
   #NodeId                          =2
   #HostName                        =mypi02

   [ndbd default]
   noofreplicas                    =2
   DataDir                         =/opt/mysql/mysql/ndbd_data
   DataMemory                      =2M
   IndexMemory                     =1M
   DiskPageBufferMemory            =4M
   StringMemory                    =5
   MaxNoOfConcurrentOperations     =1K
   MaxNoOfConcurrentTransactions   =500
   SharedGlobalMemory              =500K
   LongMessageBuffer               =512K
   MaxParallelScansPerFragment     =16
   MaxNoOfAttributes               =100
   MaxNoOfTables                   =20
   MaxNoOfOrderedIndexes           =20

   [ndbd]
   NodeId                          =3
   HostName                        =mypi01
   DataDir                         =/opt/mysql/mysql/ndbd_data
   [ndbd]
   NodeId                          =4
   HostName                        =mypi02
   DataDir                         =/opt/mysql/mysql/ndbd_data

   [mysqld]
   NodeId                          =10
   HostName                        =mypi01
   [mysqld]
   NodeId                          =11
   HostName                        =mypi02

   [API]
   NodeId                          =12
   [API]
   NodeId                          =13
   ---- EOF ----
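
With this many sections it's easy to reuse a NodeId or leave a stale entry in by mistake. A rough sketch for listing what's actually active in the file, assuming the "Key = value" layout used above; list_node_ids is a made-up name:

```shell
# list_node_ids   (hypothetical helper)
# Reads config.ini on stdin and prints "NodeId section hostname" for every
# active (uncommented) node entry. Entries without a HostName print "any".
list_node_ids() {
  awk '
    function emit() {
      if (id != "") print id, sec, (host == "" ? "any" : host)
      id = ""; host = ""
    }
    /^[ \t]*#/  { next }                               # commented-out entries are ignored
    /^[ \t]*\[/ { emit(); sec = $1; sub(/^\[/, "", sec); sub(/\].*/, "", sec); next }
    {
      line = $0; gsub(/[ \t]/, "", line)               # "NodeId   =3" -> "NodeId=3"
      n = index(line, "="); if (n == 0) next
      key = tolower(substr(line, 1, n - 1)); val = substr(line, n + 1)
      if (key == "nodeid")   id = val
      if (key == "hostname") host = val
    }
    END { emit() }
  '
}
```

Used as: list_node_ids < /usr/local/mysql/conf/config.ini. Piping the result through cut -d' ' -f1 | sort | uniq -d prints any NodeId that appears more than once.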

  export PATH=$PATH:/usr/local/mysql/bin


MySQL Cluster Startup

Management Node First (INITIAL) Startup

REMEMBER: if you get any errors here, double check your IP's and /etc/hosts.

I played around a little with this, and configured 2 management nodes, one on each Raspberry Pi. It is not going to give anyone any more HA (in fact, once you pull the plug on one of the Pi's, ndb_mgm operations can't be done), but I thought I'd leave it all in, just in case someone is interested and wants to use it later in an architecture that will provide HA (i.e. buys another 2 Raspberry Pi's for the sqlnode & management node).
So, that being said, on both the (temporarily 'both') ndb_mgm nodes (INITIAL only used on very first startup):
1st:
    ndb_mgmd --initial --config-dir=/usr/local/mysql/conf -f /usr/local/mysql/conf/config.ini
2nd:
    ndb_mgmd --config-dir=/usr/local/mysql/conf -f /usr/local/mysql/conf/config.ini
Once you've done all this and revisit / restart, INITIAL isn't needed.

Also, make sure you use the --ndb-nodeid= option for each management node, so no entries appear with 127.0.0.1 IP's.
eg.
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf --ndb-nodeid=1

NOTE:
Now, I'm going to make all contact via the management node on mypi01, i.e. now the commented out my.cnf:ndb-connectstring and the 2nd entry for config.ini:ndb_mgmd makes more sense. Apologies if it generates any complications.

Let's check it's working:
  ndb_mgm -c mypi01:1186 -e show
  ndb_mgm -e show
If we had both management servers up n running, we could do the following too:
  ndb_mgm -c mypi02:1186 -e show

Datanode startup (FIRST / 1st startup):
On mypi01:
  ndbd --initial
On mypi02:
  ndbd --initial

Let's check from mypi02:
  ndb_mgm -e show

On mypi01:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
On mypi02:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &

Remember, security, on both sqlnodes: 
  mysqladmin -uroot password 'pass'
  mysql -uroot -ppass

So what do we get? This is what we'd see if we had both mypi01 & mypi02 as management nodes:
  ndb_mgm -e show

Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @192.168.1.17  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @192.168.1.16  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @192.168.1.17  (mysql-5.5.25 ndb-7.3.0)
id=2    @192.168.1.16  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @192.168.1.17  (mysql-5.5.25 ndb-7.3.0)
id=11   @192.168.1.16  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)


As our situation will be that we only have 1 management node configured, this is what we'll really see:
  ndb_mgm -e show

Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

If you want to see what happens when I test the HA of Cluster by pulling the plug on half of the architecture, including 1 of the 2 management nodes, have a look at the other post from-2-management-nodes-down-to-1.

For Demo Purposes: Getting the Java app "DemoKit" running

For demo's sake, I've used a home-made java app, "DemoKit", authored by Mario Beck.
For it to work, an exported X Windows display is needed, run from the sqlnode / datanode / management node.

From Windows, I use Cygwin.
 1. Install Cygwin (Release 6.8.2-4)
 2. execute "StartXwin" from Start - Cygwin menu. This allows remote access with an X windows to the MSWin desktop.
 3. then execute "Cygwin Bash shell". This allows us to communicate with the proper environment variables and execute X commands.
 4. set the display "export DISPLAY=:0.0"
 5. ssh -Y root@mypi01
    Once connected to mypi01, make sure DISPLAY is set: env | grep DISPLAY. Should say DISPLAY=localhost:10.0 or similar.
    execute x-terminal-emulator and an xterm style window opens.
    Test the X capability and run xpdf.
    Once in, if you do a "su -" to another user, X doesn't work, as it's the user you did the "ssh" with who's allowed to connect. Try reconnecting as that user.

Copy (WinSCP) the DemoKit.zip file to one of the pi's.
Left it in the ~root directory on mypi01.
unzipped, it leaves the DemoKit dir.

The java app "DemoKit":

Time to revamp demokit.cfg config, layout, etc.  :: Adapt for a nice layout, no overlaps, etc.
Getting load balance - Readonly error.  :: So update Connector J driver, just in case:
    /home/mysql/DemoKit/lib
    Copy mysql-connector-java-5.1.26-bin.jar into this dir.
    /home/mysql/DemoKit/nbproject
    Update the lines in "project.properties" using the old connector.

Sort out why the cluster part of the architecture wasn't working:
  :: After installing NetBeans and opening the project, and going into the code details, it was far simpler.
    It runs ndb_mgm mypi01:1186 -t 1 -e "3 status" and gets a 2-line result for that particular nodeid. All without me telling it any particular PATH or anything...
  :: DemoKit.zip comes with a "ndb_mgm" "Mach-O executable i386" executable.
    Looks like this might come from an Apple / MacOS X platform, or similar.
    So, seeing as none of this was being detected, I renamed it to ndb_mgm.orig, and made a symbolic link to the platform-specific binary, in this case with ln -s /usr/local/mysql/bin/ndb_mgm /home/mysql/DemoKit/ndb_mgm:

root@mypi01:/home/mysql/DemoKit# ls -lrt
total 3012
drwxr-xr-x 2 mysql mysql    4096 Jun 22  2010 test
-rw-r--r-- 1 mysql mysql      82 Jun 22  2010 manifest.mf
-rwxr-xr-x 1 mysql mysql 3032612 Aug 31  2010 ndb_mgm.orig
-rw-r--r-- 1 mysql mysql    1621 Sep  3  2010 demokit-cluster4node.cfg
-rw-r--r-- 1 mysql mysql    3642 Aug  6 22:53 build.xml
drwxr-xr-x 3 mysql mysql    4096 Aug  8 06:52 src
drwxr-xr-x 2 mysql mysql    4096 Aug  8 07:56 ressources
-rw-r--r-- 1 mysql mysql    2177 Aug  8 08:18 demokit-original.cfg
drwxr-xr-x 3 mysql mysql    4096 Aug  8 08:22 dist
drwxr-xr-x 4 mysql mysql    4096 Aug  8 08:22 build
drwxr-xr-x 2 mysql mysql    4096 Sep  6 12:34 lib
drwxr-xr-x 3 mysql mysql    4096 Sep  6 12:37 nbproject
-rw-r--r-- 1 mysql mysql    2758 Sep  6 13:48 demokit.cfg
lrwxrwxrwx 1 root  root       28 Sep  6 13:52 ndb_mgm -> /usr/local/mysql/bin/ndb_mgm

Run it:
  :: as root:
cd /root/DemoKit
java -jar dist/DemoKit.jar

So, now that it's showing me all 5 components of Cluster, time to look at the High Availability scenarios.

DemoKit java app, deployed from mypi01, showing all MySQL Cluster components status.

Cluster robustness: Do we have a Cluster?

Now, let's test the HA properly. Considering nodeid=1 is the sole management node (even though it's on the same server as ndbd nodeid=3 and mysqld nodeid=10), we can kill any ndbd process as if it were a power failure:
Show them the java app DemoKit and that all 5 components are working fine.

Killing id=4 on mypi02:
  ps -ef | grep ndbd
  root      2300     1  0 15:58 ?        00:00:00 ndbd -c mypi01
  root      2301  2300  4 15:58 ?        00:00:16 ndbd -c mypi01

  kill -9 2301
  ps -ef | grep ndbd
  root      2368  2126  0 16:05 pts/0    00:00:00 grep ndbd

  
  ndb_mgm -e show
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)


Check NDBD log, on mypi02:
  vi + /opt/mysql/mysql/ndbd_data/ndb_4_out.log:
  2013-09-04 16:05:36 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Occured during startphase 0. Initiated by signal 9.

Check MYSQLD error log, on mypi02:
  vi + /opt/mysql/mysql/data/mypi02_cluster730.err:
  130904 16:05:37 [Note] NDB Binlog: Node: 4, down, Subscriber bitmask 00


Check NDB_MGMT log, on mypi01:
2013-09-04 16:05:36 [MgmtSrvr] ALERT    -- Node 3: Node 4 Disconnected
2013-09-04 16:05:36 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 closed
2013-09-04 16:05:36 [MgmtSrvr] ALERT    -- Node 3: Network partitioning - arbitration required
2013-09-04 16:05:36 [MgmtSrvr] INFO     -- Node 3: President restarts arbitration thread [state=7]
2013-09-04 16:05:36 [MgmtSrvr] ALERT    -- Node 1: Node 4 Disconnected
2013-09-04 16:05:36 [MgmtSrvr] ALERT    -- Node 3: Arbitration won - positive reply from node 1
2013-09-04 16:05:36 [MgmtSrvr] ALERT    -- Node 4: Forced node shutdown completed. Occured during startphase 0. Initiated by signal 9.
2013-09-04 16:05:36 [MgmtSrvr] INFO     -- Node 3: Started arbitrator node 1 [ticket=0aef0002012295fe]
2013-09-04 16:05:40 [MgmtSrvr] INFO     -- Node 3: Communication to Node 4 opened


Imagine we've recovered the NDBD nodeid=4, and now restart:
On mypi02:
  ndbd -c mypi01
  2013-09-04 16:14:40 [ndbd] INFO     -- Angel connected to 'mypi01:1186'
  2013-09-04 16:14:40 [ndbd] INFO     -- Angel allocated nodeid: 4


Status check:
  ndb_mgm -e show
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)


Cluster didn't go down when we lost half of our Node Group 0, i.e. 1 of our 2 datanodes.
Look at DemoKit monitor application:


Datanode 2 down
Datanode 2 starting up
All back to normal.

We've lost half our Data Centre: datanode & sqlnode

This time, unplugging mypi02.

So, someone has pulled half of the data centre (the datanode & sqlnode):

  ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)


And let's confirm from within the cluster:
On mypi01:
  mysql -uroot -p
  mysql> status;  <-- shows uptime is good, i.e. longer than since we lost mypi02.
  mysql> show global status like 'ndb_number_%';
  +--------------------------------+-------+
  | Variable_name                  | Value |
  +--------------------------------+-------+
  | Ndb_number_of_data_nodes       | 2     |
  | Ndb_number_of_ready_data_nodes | 1     |
  +--------------------------------+-------+
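
If you want that check in a script rather than by eye, the two counters can be compared directly. A sketch, assuming the table-style output shown above (what the mysql client prints interactively, or with -t in batch mode); data_nodes_ready is a made-up name:

```shell
# data_nodes_ready   (hypothetical helper)
# Reads the table output of
#   show global status like 'ndb_number_%';
# on stdin, prints "ready/total", and returns non-zero unless every
# configured data node is ready.
data_nodes_ready() {
  awk -F'|' '
    { name = $2; val = $3; gsub(/ /, "", name); gsub(/ /, "", val) }
    name == "Ndb_number_of_data_nodes"       { total = val }
    name == "Ndb_number_of_ready_data_nodes" { ready = val }
    END {
      printf "%d/%d\n", ready, total
      exit !(total + 0 > 0 && ready + 0 == total + 0)
    }
  '
}
```

With the output above it prints 1/2 and returns a failure status, which is what you'd want a monitoring script to pick up on.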


Now for a logical data test

With only 1 datanode active:
  mysql> create database clustertest;
  Query OK, 1 row affected (0.05 sec)

  mysql> use clustertest;
  Database changed
  mysql> create table ndbtest (i int) engine=NDBCLUSTER;
  Query OK, 0 rows affected (0.27 sec)

  mysql> insert into ndbtest () values (1), (2), (3), (4);
  Query OK, 4 rows affected (0.01 sec)
  Records: 4  Duplicates: 0  Warnings: 0

  mysql> select i from ndbtest;
  +------+
  | i    |
  +------+
  |    1 |
  |    4 |
  |    3 |
  |    2 |
  +------+
  4 rows in set (0.01 sec)


Remember, datanode nodeid=4 is down. Let's start it up.
  ndbd -c mypi01
  2013-09-04 16:34:06 [ndbd] INFO     -- Angel connected to 'mypi01:1186'
  2013-09-04 16:34:06 [ndbd] INFO     -- Angel allocated nodeid: 4

mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &


Let's use sqlnode nodeid=11 on mypi02:
  mysql -uroot -p
  mysql> status;

Uptime shows how many seconds this sqlnode has been running.

Now double check from within the cluster:
  mysql> show databases;
We can see the newly created database "clustertest".

  mysql> use clustertest; select * from ndbtest;
  Database changed
  +------+
  | i    |
  +------+
  |    1 |
  |    4 |
  |    3 |
  |    2 |
  +------+
  4 rows in set (0.00 sec)


Let's complicate things:

Now let's kill data node nodeid=3, without exiting the mysql session on the other sqlnode, so that only the freshly started datanode nodeid=4 is left. Then check:
  ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from mypi01)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
...
..

Just in case the data is cached, exit and log in again:
  mysql -uroot -p clustertest -e ' use clustertest; select * from ndbtest;'
  +------+
  | i    |
  +------+
  |    3 |
  |    2 |
  |    1 |
  |    4 |
  +------+


Notice the rows have swapped around, i.e. the first half of the original data (numbers 1 & 4) is now the latter half, and vice versa. That's fine: without an ORDER BY the row order isn't guaranteed, and all 4 rows are still there.
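Since the order changes from one query to the next, comparing results by eye gets tedious. A minimal sketch of an order-insensitive comparison follows; same_rows is a hypothetical helper name, and the /tmp file names in the usage comment are just examples.

```shell
# Hypothetical helper: succeeds (exit 0) when two files hold the same rows,
# ignoring the order in which the rows were returned.
same_rows() {
  a=$(sort "$1") && b=$(sort "$2") && [ "$a" = "$b" ]
}

# Usage (assumed login details; -N -B prints bare values, one per line):
#   mysql -uroot -p -N -B clustertest -e 'select i from ndbtest' > /tmp/rows_before
#   ... kill or restart a data node ...
#   mysql -uroot -p -N -B clustertest -e 'select i from ndbtest' > /tmp/rows_after
#   same_rows /tmp/rows_before /tmp/rows_after && echo "same data"
```

Alternatively, adding "order by i" to the select gives a stable order to compare directly.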

Kill the Management node, with active sessions on the sqlnodes.

NOTE: killing the management node will affect the DemoKit's ability to detect things...
Let's see (watch the time displayed):

sqlnode 10 (mypi01):
  16:19 clustertest> select * from ndbtest;
  +------+
  | i    |
  +------+
  |    3 |
  |    4 |
  |    2 |
  |    1 |
  +------+
  4 rows in set (0.01 sec)

sqlnode 11 (mypi02):
  16:18 clustertest> select * from ndbtest;
  +------+
  | i    |
  +------+
  |    1 |
  |    3 |
  |    4 |
  |    2 |
  +------+
  4 rows in set (0.01 sec)

Leaving those sessions there...

ps -ef | grep ndb_mgm
root      2181     1  3 14:41 ?        00:03:13 ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf/
root      4590  2677  8 16:20 pts/2    00:00:01 ./ndb_mgm mypi01 -t 1 -e 4 status
root      4591  2677  8 16:20 pts/2    00:00:01 ./ndb_mgm mypi01 -t 1 -e 1 status
root      4598  2677  8 16:20 pts/2    00:00:00 ./ndb_mgm mypi01 -t 1 -e 3 status

kill -9 2181

ps -ef | grep ndb_mgm
root      4647  2677 11 16:21 pts/2    00:00:01 ./ndb_mgm mypi01 -t 1 -e 3 status
root      4651  2677  9 16:21 pts/2    00:00:00 ./ndb_mgm mypi01 -t 1 -e 1 status
root      4655  2677  9 16:21 pts/2    00:00:00 ./ndb_mgm mypi01 -t 1 -e 4 status


DemoKit shows:

No datanodes in the architecture. They are still up and running, though: it's just that the ndb_mgm command can't get in contact with ndb_mgmd.

Back to our user sessions:

sqlnode 10 (mypi01):
16:23 clustertest> select * from ndbtest;
+------+
| i    |
+------+
|    3 |
|    4 |
|    2 |
|    1 |
+------+
4 rows in set (0.01 sec)

sqlnode 11 (mypi02):
16:23 clustertest> select * from ndbtest;
+------+
| i    |
+------+
|    1 |
|    3 |
|    4 |
|    2 |
+------+
4 rows in set (0.01 sec)

Now, any command sent to the management node process will fail with:
  root@mypi01:~# ndb_mgm -e show
  Unable to connect with connect string: nodeid=0,mypi01:1186
  Retrying every 5 seconds. Attempts left: 2 1, failed.


We can exit and reconnect to the cluster:
  mysql -uroot -poracle clustertest -e "select sysdate(); select * from ndbtest"
  +---------------------+
  | sysdate()           |
  +---------------------+
  | 2013-09-06 16:27:03 |
  +---------------------+
  +------+
  | i    |
  +------+
  |    1 |
  |    3 |
  |    4 |
  |    2 |
  +------+


So the Cluster is still working...
mypi01:
  root@mypi01:~# ps -ef | grep ndbd
  root      4464     1  0 16:17 ?        00:00:00 ndbd
  root      4465  4464  6 16:17 ?        00:00:39 ndbd
  
  root@mypi01:~# ps -ef | grep mysqld
  mysql     2279  2159 12 14:42 pts/0    00:13:18 mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql

mypi02:
  root@mypi02:~# ps -ef | grep ndbd
  root      2143     1  0 15:43 ?        00:00:01 ndbd
  root      2144  2143  5 15:43 ?        00:02:16 ndbd

  root@mypi02:~# ps -ef | grep mysqld
  mysql     2190  2127  4 15:43 pts/0    00:02:05 mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql


Restart the management node:
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf
  MySQL Cluster Management Server mysql-5.5.25 ndb-7.3.0


And see if we can check the Cluster status now:

root@mypi02:~# ndb_mgm -e show

Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

DemoKit displays the datanode & management node again. We're back up and running.
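If you don't have the DemoKit to hand, the "show" output can be summarised with a few lines of shell. This is only a sketch: count_data_nodes is a made-up name, and it assumes the output layout seen throughout this post (a "[ndbd(NDB)]" section header followed by one "id=" line per data node).

```shell
# Hypothetical helper: reads `ndb_mgm -e show` output on stdin and prints how
# many data nodes in the [ndbd(NDB)] section are connected.
count_data_nodes() {
  awk '
    # A new section header: remember whether it is the data node section.
    $1 ~ /^\[/              { in_ndbd = ($1 == "[ndbd(NDB)]"); next }
    # One line per data node; disconnected ones say "not connected".
    in_ndbd && $1 ~ /^id=/  { total++; if ($0 !~ /not connected/) up++ }
    END { printf "%d/%d data nodes connected\n", up, total }'
}

# Usage (management node host as used in this post):
#   ndb_mgm -c mypi01 -e show | count_data_nodes
```

While ndb_mgmd itself is down there is no "show" output to parse, so a result of 0/0 means "couldn't ask", not "no data nodes running".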