Entries tagged with “database”.


If you don’t know what memcache is, the official site says the following:

“memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.”

Basically, what this means is that memcached is a daemon running on a server which allows you to save and retrieve variables in the server memory. This is primarily used to ease database load on websites and applications. The logic for doing something like this is…

- Before running a query, check if the result is available in memcache.
- If the result is in memcache, return the cached result.
- If not, run the query on the database, and store the result in memcache.

To install memcached, simply run the following command:

sudo apt-get install memcached

Once it’s installed, edit /etc/memcached.conf and change the line beginning ‘-m’ which is the amount of memory (in megabytes) to allocate to the server. You can also change the IP address that the server listens on  in the line beginning ‘-l’.

Now restart the daemon by running

/etc/init.d/memcached restart

Now you have memcache set up and running on your server. A great feature of memcached is that you can easily cluster servers. If you want to do this, simply install memcache on your other servers before continuing.

At this point, I’d recommend downloading memcache.php which is a php script showing you a lot of useful information about your memcached servers. Once downloaded, put it in a web facing directory on your server, and modify the $MEMCACHE_SERVERS array with your server addresses.

Using memcache in php scripts is very easy. There are either procedural, or object oriented functions already available. Here is an example of a script which will store a simple variable and then retrieve it and display it.

<?php
$memcache = new Memcache;

$memcache->addServer(’10.2.0.245′, 11211) or die (“Could not connect”);
$memcache->addServer(’10.2.0.249′, 11211) or die (“Could not connect”);

$memcache->set(‘mytestvariable’, “this is the data in my test variable”, false, 60) or die (“Unable to save the data to the server”);
echo “Data has been stored in the cache<br />”;

$result = $memcache->get(‘mytestvariable’);
echo “Retrieved data from the server:<br/>”;

var_dump($result);

?>

Obviously if you only have one server then only use one addServer line.

The script above will store the test variable in the cache for 60 seconds. If you drop the timeout down to something like 1 second, and then sleep for 3 seconds before attempting to get the data, you will find that the data has expired and nothing will be returned.

It’s also useful to know that you can store anything which can be serialized in memcache. This means it’s safe to store things like arrays in the cache without having to ‘flatten’ the data beforehand.

Obviously how and where you implement a cache is entirely dependant on how your system works. If you have a lot of intense database usage then you will find that even caching with a short timeout will vastly reduce the amount of database queries. You should find that memcache is considerably faster than normal database access, and even faster than the mysql query cache. You also have total control over what data is cached, and how long it is cached for before your queries fall back to checking a database.

Some further reading:
Facebook engineering blog – How facebook use memcache
Fotolog case study

Also, it’s worth checking out the PHP manual page.

This guide explains how to turn standard debian or ubuntu mysql-server installs into a full ndb cluster. As we’re using the standard mysql-server package, you won’t need to download any non .deb binaries or do any compiling. If everything goes well, this should take under 10 minutes to get working. I don’t plan to cover much theory here, as it is just a guide to getting the server up and running.

Please note that clustering works differently in MySQL 5.1, so this guide may not be relevant for that.

I also recommend the book ‘MySQL Clustering’ by Alex Davies and Harrison Fisk (ISBN 0-672-32855-0) as it explains how the cluster works in a lot more detail than I plan to here.

I will explain how to set up 3 servers, with the following roles:

server-a = management node (IP 10.1.0.10)
server-b = storage and sql node (IP 10.1.0.11)
server-c = storage and sql node (IP 10.1.0.12)

It is easy to have storage and sql nodes on separate servers, but to make this guide easier to follow (and so we don’t need 5 machines), I’ll use 3 servers.

All servers will need to have mysql-server installed. If you need help doing this, have a look at this guide.

Setting up the management node (10.1.0.10)

Default debian/ubuntu installs look for /etc/mysql/ndb_mgmd.cnf for the management node, so we need to create this file, with the following contents:

[NDBD DEFAULT]
NoOfReplicas=2
DataDir= /var/lib/mysql-cluster

# Management Node
[NDB_MGMD]
HostName=10.1.0.10
DataDir=/var/lib/mysql-cluster

# Storage Nodes (one for each node)
[NDBD]
HostName=10.1.0.11
DataDir=/var/lib/mysql-cluster
[NDBD]
HostName=10.1.0.12
DataDir=/var/lib/mysql-cluster

# SQL Nodes (one for each node)
[MYSQLD]
HostName=10.1.0.11
[MYSQLD]
HostName=10.1.0.12

The line beginning NoOfReplicas tells the cluster how many copies of data should be kept.

Now we start the management node, and it will sit waiting for connections from the storage and SQL nodes:

/etc/init.d/mysql-ndb-mgm start

Setting up SQL and data nodes

All we need to do here is make a few changes to /etc/mysql/my.cnf (the mysql config file)

First, add the following 2 lines (using the IP of your management node) inside the [mysqld] section of the config file:

ndbcluster
ndb-connectstring=10.1.0.10

And near the bottom of the file there will be a section for [MYSQL_CLUSTER] which you will need to uncomment, and change the ndb-connectstring line to your management nodes IP.

If you are using separate data and sql nodes, the [mysqld] part is relevant to the sql nodes only, and the [MYSQL_CLUSTER] part is relevant to the data nodes only.

Before we start the services, we have to create the /var/lib/mysql-cluster directory and set it to be owned by the mysql user:

mkdir /var/lib/mysql-cluster
chown mysql:mysql /var/lib/mysql-cluster

Now we need to start the node services:

/etc/init.d/mysql restart
(which starts the sql node)
/etc/init.d/mysql-ndb restart
(which starts the data node)

By now, everything should be running, so we connect to the management node (by running ndb_mgm from the command line) and check that the other nodes have connected properly (using the show command):

ndb_mgm> show;
Cluster Configuration
———————
[ndbd(NDB)]     2 node(s)
id=2    @10.1.0.11  (Version: 5.0.51, Nodegroup: 0)
id=3    @10.1.0.12  (Version: 5.0.51, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.1.0.10  (Version: 5.0.51)

[mysqld(API)]   2 node(s)
id=4    @10.1.0.11  (Version: 5.0.51)
id=5    @10.1.0.12  (Version: 5.0.51)

This shows that everything has connected properly. If connections are missing, then it’s worth checking /var/log/syslog on the affected server to see if there are any error messages.

Using the cluster

It’s worth noting that any databases already on the servers will continue to work as before. Tables only become part of the cluster when their engine type is changed to ‘ndbcluster’ by issuing this command (from a mysql prompt):

alter table tablename engine=ndbcluster;

But for now we will create a new database and clustered table, and test that the data is clustered. The cluster setup applies to tables only, not databases, so we first need to create the database on both sql nodes:

create database cluster;

Now, when we create a table inside the cluster database, as long as the engine is ndbcluster, the data will be synced across data nodes, which we can test by doing the following (from a mysql prompt on either sql node):

create table clustertest (i int) engine=ndbcluster;
insert into clustertest () values (1);
select i from clustertest;

Which should return the single row with the value of 1. Now connect to the other SQL node and try…

insert into clustertest () values (2);
select i from clustertest;

Which should return both rows, which will happen whichever SQL node you connect to as the table is now stored in the cluster.

If you receive an error like:

ERROR 1015 (HY000): Can’t lock file (errno: 4009)

Then it is likely that some of your cluster nodes haven’t started correctly, so its worth checking the management interface again.

Shutting down and restarting the cluster

To shutdown the data nodes and management node, all you need to do is enter the command ‘shutdown’ in the management interface.

To restart, simply run

/etc/init.d/mysql-ndb-mgm start

On the management node, and

/etc/init.d/mysql-ndb start

On the data nodes. The SQL nodes continue running, and can be stopped/started using the standard mysql init script.

That’s it, you’ve now should have a working MySQL cluster. As you have NoOfReplicas=2 in the management config, you should be able to unplug either data node at any time and still have access to all of the clustered tables.

My main reason for posting this is completeness (so I’ve got guides for everything you need for a basic hosting server here), not because it’s difficult to do.

Getting mysql server installed and running is simple to do:

apt-get install mysql-server

Once this is done, you will probably want to set the root password, or add an admin user:

mysql
grant all privileges on *.* to ‘username’@'host’ identified by ‘password’ with grant option;

Obviously use your own username, host, and password. To allow connections from any host, you can use ‘%’

Next, I usually change the config so MySQL is listening on all available IP addresses (instead of just localhost). This is done by commenting out the ‘bind-address’ line in /etc/mysql/my.cnf and restarting the server:

/etc/init.d/mysql restart

That’s it, you’ve now got a working MySQL server.