Tuesday 10 September 2013

From 2 Management nodes down to 1 (R.Pi, Cluster n Cream spin-off)

From my testing of MySQL Cluster on the Raspberry Pis, I thought I'd share this little extract, just in case someone tries the same, some day.. somewhere.. why? I don't know.

Ok, so when we pull the plug on one of the Pis, one of each component goes down with it, and because one of them is the arbitrator (node-id=2), the whole cluster falls over.

Before the 'accident':

  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Whoops, who pulled that plug?

Everything on mypi02 is instantly down, as seen by mgmt_node on mypi01:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

but... a few seconds later:
  ndb_mgm -e show

Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3 (not connected, accepting connect from mypi01)
id=4 (not connected, accepting connect from mypi02)

[ndb_mgmd(MGM)] 2 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=2    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10 (not connected, accepting connect from mypi01)
id=11 (not connected, accepting connect from mypi02)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

Check the mgmt_node log:
  vi /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log
...
..
2013-09-04 14:06:44 [MgmtSrvr] WARNING  -- Node 3: Node 2 missed heartbeat 2
..
..
2013-09-04 14:07:02 [MgmtSrvr] ALERT    -- Node 3: Forced node shutdown completed. Caused by error 2305: 'Node lost connection to other nodes and can not form a unpartitioned cluster, please investigate if there are error(s) on other node(s)(Arbitration error). Temporary error, restart node'.
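
Rather than paging through the whole log in vi, the interesting lines can be pulled out with grep. This is only a minimal sketch: the function name show_cluster_failures is made up for illustration, and the two patterns are simply copied from the log lines shown above, so other failure messages would need their own patterns.

```shell
#!/bin/sh
# Hypothetical helper: print the missed-heartbeat and forced-shutdown lines
# from a cluster log file given as the first argument.
# The patterns are taken from the log extract in this post.
show_cluster_failures() {
    grep -E 'missed heartbeat|Forced node shutdown' "$1"
}

# Example call, using the log path from this post:
# show_cluster_failures /opt/mysql/mysql/mgmd_data/ndb_1_cluster.log
```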

So the physical architecture is too limited. Let's limit the logical architecture to 1 Management Node:
  vi config.ini
--> comment out nodeid=2 entry in the [ndb_mgmd] section.
  vi my.cnf
--> remove mypi02:1186 from the ndb-connectstring entries in both sections [mysqld] & [mysql_cluster].
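
After editing, a quick sanity check can confirm that no active line in my.cnf still points at the second management node. A minimal sketch, assuming comments in the file start with # or ; and that the address appears literally as mypi02:1186 (as in this post); the function name no_second_mgmd is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) only if no uncommented line in the
# given file still mentions mypi02:1186.
no_second_mgmd() {
    awk '
        /^[ \t]*[#;]/ { next }            # skip commented-out lines
        /mypi02:1186/ { found = 1 }       # an active line still has the old entry
        END { exit (found ? 1 : 0) }
    ' "$1"
}

# Example call, using the my.cnf path from this post:
# no_second_mgmd /usr/local/mysql/conf/my.cnf && echo "connectstring looks clean"
```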

Now to start it all back up, but this time with --reload, so that ndb_mgmd rereads config.ini instead of relying on its cached configuration and the changes are actually picked up.

On mypi01:
  ndb_mgmd -f /usr/local/mysql/conf/config.ini --config-dir=/usr/local/mysql/conf --ndb-nodeid=1 --reload
  ndbd -c mypi01
On mypi02:
  ndbd -c mypi01
Check status, make sure the data nodes have started ok.
Then on mypi01:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
On mypi02:
  mysqld --defaults-file=/usr/local/mysql/conf/my.cnf --user=mysql &
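
The status check between starting the data nodes and starting mysqld can be scripted instead of eyeballed. The sketch below reads "ndb_mgm -e show" output on stdin and counts the data nodes still listed as "not connected". It assumes the output layout shown in this post, and the function name count_down_data_nodes is made up for illustration. Note that it only catches nodes that are not connected at all; a node that is connected but still starting would not be counted.

```shell
#!/bin/sh
# Hypothetical helper: read `ndb_mgm -e show` output on stdin and print how
# many data nodes in the [ndbd(NDB)] section are listed as "not connected".
count_down_data_nodes() {
    awk '
        /^\[ndbd\(NDB\)\]/ { in_ndbd = 1; next }   # entering the data node section
        /^\[/              { in_ndbd = 0 }         # any other section header ends it
        in_ndbd && /not connected/ { down++ }
        END { print down + 0 }
    '
}

# Example call against the single management node from this post:
# ndb_mgm -c mypi01 -e show | count_down_data_nodes
```

A result of 0 means both ndbd processes have at least connected, which is the point at which starting the mysqld nodes makes sense.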

And to make sure we're all connected up fine:
From mypi02 (remember, with only 1 mgmtnode on mypi01 now):
  ndb_mgm -e show -c mypi01
Connected to Management Server at: mypi01:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=3    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0, Master)
id=4    @10.0.0.7  (mysql-5.5.25 ndb-7.3.0, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)

[mysqld(API)]   4 node(s)
id=10   @10.0.0.6  (mysql-5.5.25 ndb-7.3.0)
id=11   @10.0.0.7  (mysql-5.5.25 ndb-7.3.0)
id=12 (not connected, accepting connect from any host)
id=13 (not connected, accepting connect from any host)

That's better.
