Thursday, July 26, 2012

Tips and Tricks to Optimize MySQL

http://www.openlogic.com/wazi/bid/195905/Tips-and-Tricks-to-Optimize-MySQL


Databases tend to grow over time as they store more and more information. To ensure your database information can be accessed in the shortest time possible, you need to learn a few tricks to speed up data retrieval. For MySQL databases, you can use indexes and partitions to limit the amount of data that MySQL has to traverse to fetch query results, and use some other optimization tricks to further improve performance.
For a specific type of frequently accessed data, creating an index is the best way to speed things up. For example, if you have a table with 20 columns, of which one column is frequently accessed, you can create an index for that column to speed up lookups on it. If there are no indexes, MySQL performs a full table scan to retrieve data. It examines all the records in the table, one after another, until either the last record is read or the query is satisfied. With an index, however, MySQL can look up data without having to read each record of the table. As soon as a match is found in the index, you're pointed to the data in the actual table. Think of it like an index in a book. When looking for a particular topic, you can either flip through the pages until you reach the end of the book or find what you're looking for, or you can visit the index to reveal the page number for the topic you're interested in.
When you create an index for a field, MySQL collects all the information in that particular column, sorts it, and then stores it in a unique object or file, separate from the table, along with references to the original table with the unsorted data.
Maintaining indexes does require additional resources. If you create an index for a table that is frequently updated, all the DELETE, INSERT, and UPDATE statements must also update the index, in addition to the data table itself. Frequent alterations on indexes place an additional load on the server and may slow down other important processes, so you shouldn't create indexes for all fields, but only for ones that are often queried for information, and where the table is large enough (with several thousands of rows or more) to necessitate faster retrieval.
The way the indexing works also depends on the type of storage engine you use to create the tables for your MySQL database. Up until version 5.5, MyISAM was the default storage engine, but InnoDB has taken its place as the default as of MySQL 5.5 – though you can define the default storage engine in the my.cnf file or manually specify the engine to use when creating tables.
On top of this, there are several different types of indexes that you can use, such as B-Tree or Hash. Your choice of index depends on the storage engine you use.

Using Indexes

You can create an index for a table while creating the table itself, or create an index for existing tables. The following code creates a new table with an index on one of the columns:
CREATE TABLE records (
     name VARCHAR(50), 
     age INT,
     id_num INT, INDEX (id)
        		
)
This creates an index named id for the id_num column on the table. For an existing table, the command would be CREATE INDEX id ON records(id_num).
To simultaneously create an index on multiple columns for an existing table, use a command like ALTER TABLE records ADD INDEX id(id_num), ADD INDEX name(name);.
As MySQL is unaware of the exact nature of data stored in a field, other than the data type (whether CHAR or INT), it will create an index using the full length of the data. So, if you have a table of songs, for instance, and you create an index on the 50-character name field, MySQL will store the entire length of the name for every song in the database. In a case like this, it might be wiser to store only the first 20 characters in the index, instead of the full length; that should give you enough characters to help you distinguish between songs. For large databases with very long columns, it would be a waste of space to store the entire name in the index.
You can restrict the length of the data to 20 characters by specifying it along with the name of the column, like so:
ALTER TABLE records ADD INDEX id(id_num), ADD INDEX name(name(20));
An improperly configured index can slow down a server by hoarding its resources, so you need to have a firm grasp of the underlying technology, particularly if you're using the InnoDB storage engine. Take some time to familiarize yourself with the intricacies of InnoDB and how it works with primary key and secondary key indexes.

Partition MySQL Tables

The second trick, using partitions, is ideal for very large tables with several hundred thousand rows or more. While you can also partition tables with only several thousand rows, you will notice a remarkable difference in lookup times for the larger tables.
You can divide either a table's rows (called horizontal partitioning) or columns (vertical partitioning) into partitions. When you do so, instead of traversing the entire table, MySQL queries only the relevant partition, which greatly reduces the lookup times.
A partition can have any number of rows or columns, but you should try to keep partitions relatively small compared to the size of the table if you want to see performance gains. Each table can be divided into 1,024 partitions at most.
Before you try to partition a table, first check whether your version of MySQL was built with partitioning support. When compiling MySQL from source, you need to use the -DWITH_PARTITION_STORAGE_ENGINE option to build partitioning support; the MySQL software package distributed in the repositories of most distributions comes with the partitions option built-in. To check, run SHOW PLUGINS; at the MySQL prompt, which will print a tabular list of all plugins and their status. Make sure that partition is listed as one of the plugins; if it isn't, you can't use the partitioning feature of MySQL.
You can create partitions when creating a table, or alter an existing table to partition its data set. You can create various types of partitions, such as RANGE, LIST, and HASH. The most commonly used is Range, where each partition is defined to accept a specific range of values for some column. What partition suits your table best will depend on the type of data stored in the table.
You must use the PARTITION BY keyword to define how the partitions are to be effected. So, to create a partition based on a range of data such as year, use the PARTITION BY RANGE(YEAR(order_date)) clause, like so:
CREATE TABLE employees ( 
order_date DATETIME NOT NULL, 
-- Other columns omitted 
) PARTITION BY RANGE(YEAR(order_date)) ( 
PARTITION p_2009 VALUES LESS THAN (2009), 
PARTITION p_2010 VALUES LESS THAN (2010), 
PARTITION p_all VALUES LESS THAN MAXVALUE ); 
This command will create three partitions on the table employees, one for each of the specified years. Here, the first partition holds data for all years up to 2009, while the last partition, p_all, holds all the records that are left over after the other records are partitioned. The data is partitioned as soon as it matches a partitioning criterion, so no two partitions can have the same data.
When querying data from a partitioned table, you don't need to specify the partition; the query remains the same whether or not you use partitions. MySQL will automatically deduce which partition to retrieve the information from.
As with indexes, there's a small performance penalty with partitioning. When you insert new rows into a partitioned table, MySQL has to determine which partition the new data should go into, and this requires resources. While this additional drain on resources isn't necessarily crippling, you have to take it into account when creating partitions. Depending on the size of the table, it might be wise to use dozens of partitions, but once you start reaching into the hundreds, take due care that using partitions itself doesn't affect performance.

Identify Slow Queries

One of the best thing about MySQL is that it can help you identify the queries that take up the most resources via its slow-query-log.
MySQL doesn't record slow queries by default; running mysqladmin var | grep log_slow_queries in the terminal will confirm this for you. To enable the option, edit the my.cnf configuration file and add these lines under the [mysqld] block:
long_query_time = 1
log-slow-queries = /var/log/mysql/mysql-slow-query.log
The default long_query_time is 0, but we've set it to 1 here to make MySQL log all queries that take longer than 1 second to execute. Feel free to change this value, as well as the location for the log file. When you're done, you need to restart the MySQL server with the command /etc/init.d/mysql restart for the changes in the my.cnf file to take effect.
You can then use the mysqldumpslow command-line tool, included with MySQL, to get a summary of the slow-query-log file. If you see a list of queries that are taking too much time to execute, you can try to find out why.
Explain is one of the most widely used MySQL query analysis tool available. When used with a SELECT statement, it reveals a wide array of information, such as the number of tables involved in the statement, how the data is looked up, if there are any subqueries, whether any indexes are used, and much more. You can run Explain on each of of the slow queries to determine why the statements are slow. Once you know the causes, you should be able to figure out how to fix the problems.

Fix That Configuration File

One final tip: MySQL Tuner is a Perl script that you can run to test your MySQL performance. Based on the results, and other observations, the script will then advise what changes you can make in the configuration file to improve performance.
At the terminal, type wget http://mysqltuner.com/mysqltuner.pl to download the script. Make the script executable with the command chmod +x mysqltuner.pl. Run it by typing ./mysqltuner.pl; you will be prompted for your MySQL root password.
The output from the script is divided into sections. The General and Storage Engine Statistics at the top tells you the number of tables in each of the different storage engines. Performance Metrics lists parameters the script checks the system for, such as slow queries, memory usage, and temporary tables. Finally, the Recommendations section lists all the variables that you need to adjust in the configuration files.
By using some of these performance and optimization tricks, you ought to be able to speed up MySQL. For best results, read the documentation carefully before implementing any of these techniques.

No comments:

Post a Comment