Alter tables with MySQL replication

I couldn’t find anything particularly useful about how to alter MySQL table structure in a situation where replication was being performed so I pretty much had to try something out to see if it worked.

What I found was that MySQL was remarkably clever at replicating changes through from one database to another. I don’t know whether just running a straight ‘ALTER TABLE’ command would work or not but I took the added precaution of running a ‘STOP SLAVE’ command (on the slave) first. Then, once the new column had been added to the master, running ‘START SLAVE’ kicked it off and the database automatically picked up where it had left off by propagating the table alteration and then updating the data.

Replication in MySQL from the manual

TRUNCATE TABLE on MySQL InnoDB databases

Having come up against the extremely poor performance of using TRUNCATE instead of DELETE on MySQL InnoDB tables (see previous post MySQL Truncate slow performance problems) I thought I better come up with a solution that didn’t mean leaving a table to clear for an hour.

The solution is to use a combination of SHOW CREATE and DROP. DROPping a table is very quick indeed, so as long as you have the CREATE code to hand then it’s a simple matter to empty a table. The main thing to watch out for with InnoDB tables is foreign key constraints which are easily disabled.

Some sample code to use this from within PHP is shown below

function truncateTable($tableName)
{
   //Grab the code to create the table
   $sql = "show create table " . $tableName;
   $dataSet = DataHandler::loggedDbQuery($sql);
   $result = $dataSet->fetchRow();
   $createSQL = $result["Create Table"] . ";
        SET FOREIGN_KEY_CHECKS=1;";
   //Drop the table. We have to disable foreign key
   //checks, which means running the whole thing
   //from the command line
   $sql = "SET FOREIGN_KEY_CHECKS=0;
      drop table " . $tableName . ";".
      $createSQL;
   DataHandler::multipleDbQueries($sql);
}

This is used in conjunction with a static method I’ve created to run a standard (single) SQL query from within PHP called DataHandler::loggedDbQuery (which works with PearDB, which is where the fetchRow() method comes from) and a multi-line query function I have developed and wrote about in Multiple SQL queries using MySQL and PHP and referred to as DataHandler::multipleDbQueries($sql).

Multiple SQL queries using MySQL and PHP

Something that I’ve had problems with using MySQL/PHP is the limitation of only being able to run one line of SQL at a time. Using something such as Microsoft SQL Server it’s possible to write multiple lines of SQL and run it all in a single database call. Until stored procedures (in MySQL 5) are available (i.e. when it seems that the database engine is ready for a live environment) I’ve put together the following static method ‘hack’ using PHP’s exec() function (assuming you’re using PHP 5’s object syntax – otherwise just paste the code into a regular function):

class MySQLInterface
{
   public static function multipleDbQueries($sql)
   {
      $file = fopen(TEMP_CSV_LOCATION .
         "temp_query.sql","a+");
      fwrite($file,$sql);
      fclose($file);
      exec("mysql -u " . DB_USERNAME .
         " --password=='" . DB_PASSWORD .
         "' " . DB_NAME . " < " .
         TEMP_CSV_LOCATION .
         "temp_query.sql");
   }
}

What this allows you to do is to pass in a SQL string and have it executed as if it was being run from the command line. This is especially useful if you need to disable foreign keys for some reason. e.g

$sql = "SET FOREIGN_KEY_CHECKS=0;
   drop table oldTable;
   SET FOREIGN_KEY_CHECKS=1;"

MySQLInterface::multipleDbQueries($sql)

Downsides of this are

  • Requires command line access to mysql
  • Liable to SQL injection

Since i’m working within an internal system I have control over both of these and the code seems to work particularly well.

Optimising MySQL a query with packed keys

I’ve been learning more about MySQL lately and particularly optimising
SQL queries on large tables. Large, in this case, being at the moment
hundreds of thousands of rows but soon to be millions. One of the
problems I’ve had is that MySQL sometimes decides not to use an index
even when a handy one seems to have been created for it. The root of
this appears to be to that with B-tree indexes if there are a large number of records with similar looking values then the MySQL engine may decide that it’s just as much effort using the index as to search the whole table.

The answer appears to be adding PACK_KEYS = 1 to the end of a create
table, or running the SQL command ALTER TABLE MyTable PACK_KEYS = 1 once
the table has been created. In effect, this takes account of the
similarity of adjacent keys. In our case we have a large column of field
type bigint(21) where the starting digits of the index are timestamp
generated. So, at present, we end up with a few tens of thousand rows
all starting with 108xx. Enabling packed keys means not only that the
index is smaller as MySQL only needs to store the differences between
keys (plus an extra byte to keep track of where the similarity starts)
but also that the index is actually of some use i.e. doesn’t become a
large, flat structure.

One down side of using packed keys is that inserts are slower, but given
that the system we are building is inserting each row once and then (in
theory) never touching it again that’s a small price to pay. The other major drawback, however, is that packed keys only works on MyISAM tables at present and not InnoDB. This actually isn’t much use to me as the large inserts we occassionally have to do would end up with MyISAM locking the table for perhaps an hour or more.

Pack keys
reference in the MySQL manual

MySQL Truncate slow performance problems

I was having problems with a MySQL TRUNCATE taking a long time on a very
large table (with foreign keys). I had thought that TRUNCATE ran more
quickly than DELETE but, according to the MySQL manual, that isn’t the
case with InnoDB tables. In this case there is no difference between
TRUNCATE and DELETE and it’s recommended to drop the table and then
re-create it. That sounds like a very high risk operation to me, but
given that the TRUNCATE statement is taking an hour and a half to run
then it looks like I’ve got some code to write.

TRUNCATE in the MySQL documentation

MySQL load data infile and foreign key constraints

I’ve been up against a(nother) MySQL 4/PHP 5 ‘not doing all I want it
to’ type problem. This time, it involved trying to do a bulk update on
potentially tens of thousands of lines. Now, I don’t want to do an
INSERT for each line and I’ve already written some classes to handle
outputting data objects into csv files which handles all of the primary
keys. There’s an option with LOAD DATA INFILE to REPLACE values where
primary key conflicts would occur so I thought if I used that with
existing rows then I should be okay.

What I have found, though, is that LOAD DATA INFILE with REPLACE appears
to try to delete the row and then re-insert it. This breaks other
foreign constraints and so the database engine won’t let it. Fair
enough, but it would have been nice if the manual mentioned it.

So: next step is to disable the foreign keys temporarily with SET
FOREIGN_KEY_CHECKS=0. From the command line this works fine, but it
seems that within PHP the foreign key checks are re-enabled
automatically after each command is run, which means the net result of
running SET FOREIGN_KEY_CHECKS=0 is absolutely nothing. So when I try
and to the LOAD DATA INFILE the constraints are still there and it
doesn’t work.

If we trusted MySQL 5 fully yet (and wanted to go through the pain of
installing it) I could probably write a stored procedure, with whatever
headaches that would bring, but as it is I’m stuck with MySQL 4 for the
moment. The solution I’ve come up with is to write the SQL query to a
text file and then use an exec() command from within PHP to execute the
commands in the file. I don’t like it, and it doesn’t feel very
portable, but it seems to work. In fact, it seems so handy that I’ve
created a static method so I can do it with any batch of SQL queries.

The handy class follows (with constants in place of the database
names/connection values/temporary folder):

public static function multipleDbQueries($sql)
{
 $file = fopen(TEMP_CSV_LOCATION .
  "temp_query.sql","a+");
 fwrite($file,$sql);
 fclose($file);
 exec("mysql -u " . DB_USERNAME .
  " --password=='" . DB_PASSWORD .
  "' " . DB_NAME . " < "
  . TEMP_CSV_LOCATION . "temp_query.sql");
}