Showing posts with label MySQL DBA best practices. Show all posts
Showing posts with label MySQL DBA best practices. Show all posts

Thursday, May 28, 2009

The Big Misperception by new MySQL Customers

When meeting with MySQL customers I find they usually have some very strong misperceptions of using the free version (Community Version) of MySQL and that they have no need to purchase the Enterprise Version of MySQL.

First of all, most new MySQL users do not install, configure, tune or manage the MySQL database server properly. MySQL is the easiest database server in the world to install, but if used in a production environment it needs to be set up properly. When an organization purchases an Oracle license for six or seven figures they understand they had better get the appropriate expertise to manage their Oracle database server. When downloading MySQL for free, they think anybody can manage the MySQL server because it is free and anybody can install it. There is a big difference between installing and playing with a MySQL database server and running it in a production environment.

MySQL is extremely popular with web applications. Often traditional OLTP and DW DBAs are not familiar with architectures surrounding the MySQL database server or how web applications need to be configured differently than a traditional OLTP database or data warehouse application.

When using the MySQL server in a production environment it needs to be configured properly for it to run right. Problems new customers can have when using MySQL:
  • Database server running slow, hanging and possibly crashing because it was not configured properly for the production server environment it will be running in.
  • Batch operations taking 40 - 60 hours when they should be taking 20 - 60 minutes because the database server is not tuned properly.
  • High cost of using MySQL because customers do not understand all the features of MySQL.
  • Not using MySQL for the right type of application or using the wrong architecture for their MySQL solution.
New customers talk about how they do not need to purchase the Enterprise version of MySQL and how support is not necessary. MySQL support is arguably the best support in the IT industry and MySQL support people are well recognized experts in MySQL. MySQL support people are not call takers, they are experienced experts in MySQL. So often one support can justify the cost of purchasing a MySQL Enterprise License.

When customers ask if they should be purchasing the MySQL Enterprise edition, I ask them these following questions.
  1. If your database server crashes and it costs your company money or impacts your business you should be looking at support. Production database servers should be running in a supported environment.
  2. Do you have the expertise to configure and tune your database server?
  3. Do you understand the ramifications of how you are configuring MySQL, the architecture it will run under and what features you are using? If not, you could be setting yourself up for problems down the road?
I guess the question is, can your business afford to run the free version of MySQL or should you be purchasing the supported Enterprise release of MySQL.? It's so funny, because no one would every run an important Oracle database server without some type of support. So I always wonder why a customer would run the community version of MySQL and not the supported version of MySQL for critical applications?

Monday, April 27, 2009

Installing MySQL 5.1 on Solaris using MOCA

Introduction

The following instructions will lay out an installation of MySQL on Solaris using the MySQL Optimal Configuration Architecture (MOCA) for someone knowledgeable in MySQL/Solaris administration. MOCA is a set of best practices I put together to lay out a guidelines for installing and configuring a MySQL database server. MOCA is designed for someone with experience with MySQL, it is not for someone brand new to MySQL.

If you are new to MySQL or to Solaris, I recommend using the default package install for MySQL. The MySQL default install is recommended for someone new to MySQL or the operating system platform. If the default package install makes more sense for you, then you can stop reading. This install is for MySQL 5.1.33 but it would be the same steps for any 5.1.xx installation.

Why Perform a Manual Install

The default install with MySQL is great for users new to MySQL. It is simple, requires a few point and clicks and you are up and running. The problem with a default install is that it is designed to be a very simple install and take minimum resources. The default install also puts MySQL files in different locations on the filesystem dependent on the OS release and platform. The default install is not how an experienced DBA would want to set up a production database environment. It is much better to be able to control the layout and configuration of the database software for production database environments and for platforms where multiple MySQL servers may be installed in the future.

    This install assumes you have a fundamental understanding of Solaris and have an understanding of MySQL database administration fundamentals. Oracle DBAs will find this installation very similar to the concepts of the Optimal Flexible Architecture (OFA).
    For experienced MySQL DBAs a manual install is much better. For this purpose I created a best practices configuration and white paper called MOCA (MySQL Optimal Configuration Architecture). This is based on DBA best practices and should be very similar to Oracle, DB2 and SQL Server production DBAs. There are certain fundamental truths about how database servers should be installed, configured and managed. My MOCA whitepaper addresses these fundamental truths. This manual install will follow MOCA standards and conventions.

Why MOCA?

Visit mysql-dba-journey.blogspot.com to get the details of the reasons behind MOCA and why it is based on best practices. There is also an example of installing MySQL on Mac OS that is very similar to a Linux install. In summary, it focuses on:
  1. Separating database software from other software.
  2. Separating data and index files, log files for recovery, administration and backup files.
  3. Developing standard naming conventions.
  4. Defines a flexible configuration that can support multiple database servers on same platform.
  5. A consistent configuration for multiple servers and versions of MySQL database software.

Installation Summary

This installation looks more complex than it is. I use this configuration for all beginning MySQL DBA classes.

  1. Remove old versions of MySQL if they exist. Setup up operating system user mysql and mysql user environment.
  2. Set up directories and directory permissions for all MySQL data files.
  3. Setup MySQL software and install MySQL software as mysql operating system user (not as root). Configure the my.cnf configuration file.
  4. Create the mysql database (mysql_install_db) and setup the security environment (mysql_secure_installation). Start the mysql database server.
  5. Test the shutdown and startup of the database server.

Installation Environment

The environment for this installation is below: Mac OS 10(Leopard) running VM Fusion with Solaris 10 - Downloaded DVD iso image from www.sun.com website. I installed the Solaris 10 05/08 x86/x64 image for this demo (sol-10-u5-ga-x86-dvd.iso). I also used MySQL 5.1 - Downloaded from dev.mysql.com.

    Before installing MySQL on my platform, make sure there are no previous versions of MySQL preinstalled. Unless you want the older version of MySQL, your life will be much easier if you remove any previous releases.
    Read through this installation a few times before starting.

Look for existing MySQL software

This install uses 5.1.33, these installation procedures can be used for any 5.1.x installation. Dependent on the version of Solaris, different packages may need to be installed or removed (old MySQL installations).

Check to see if you see MySQL on your current system.
# grep mysql /etc/passwd #
# find /usr/local -name '*mysql*' - print # look here for MacOS, Unix/Linux
# find /var -name '*mysql*' - print # good place to start with Solaris
# find / -name "*mysql*' - print # look everywhere for MySQL installations

VM Fusion Choices for Installing Solaris 10

My choices for installing Solaris 10 in a VM Fusion environment. During the installation you will be asked to hit F2 to continue. On a MAC that will be EscapeKey-2 or FN-F2. Solaris Interactive US-English Networked - DHCP IPv6 - No You may need to specify the amount of disk space to use. I allocated 10228 MB.

    You should now be able to log in as root. With Solaris choose the Java Desktop Environment or the Common Desktop Environment (CDE), this is a personal preference.

Removing older versions of MySQL

Check for MySQL packages installed and remove them.
# pkginfo | grep mysql
The following packages SUNWmysqlr, SUNWmysqlt, SUNWmysqlu were found and removed.
# pkgrm SUNWmysqlr
# pkgrm SUNWmysqlt
# pkgrm SUNWmysqlu

Remove old MySQL files from common directories.
# sudo rm /usr/local/mysql
# sudo rm -rf /Library/StartupItems/MySQLCOM/

Set up the mysql user. Start by checking to see if there is an existing MySQL user.
# grep mysql /etc/passwd

Setup new mysql user if one does not exist. If a mysql user does exist, set up a password, default shell, default directory, etc.
No mysql user was found so I added one. Add the mysql group, mysql user, password and home directory.
# groupadd -g 300 mysql
# useradd -u 300 -g 300 -d /export/home/mysql -s /usr/bin/bash -c "MySQL DBA" mysql
# passwd mysql
# mkdir /export/home/mysql
# chown -R mysql:mysql /export/home/mysql

Login and verify the mysql user setup

# exec login mysql (or su - mysql)

Then define a default profile file using your favorite text editor.

--- .bash_profile file ------
PS1='$PWD: '
MYSQL_BASE=/opt/mysql
MYSQL_HOME=/opt/mysql/5.1.33
export MYSQL_BASE MYSQL_HOME
PATH=$PATH:$MYSQL_HOME/bin
--- end of .bash_profile file -------

Set your environment by sourcing your profile file.

$ cd $MYSQL_HOME
$ . ./.bash_profile

Downloading MySQL

Go to http://dev.mysql.com and go to downloads. Find the distributions and choose the install release you want. I chose 5.1.33. I prefer a manual install so I choose the Solaris Tar Packages the Solaris 10 64-bit install. Select a mirror. On the Select a Mirror page, I choose "No thanks, just take me to the downloads"!

MySQL Directory Organization

Organize how MySQL files and software will be located:
/opt/mysql/5.1.33 - Symbolic link to software directory location
/db01/mysql/mysql01/data - data directory
/db02/mysql/mysql01/binlogs - location of binary log files
/db03/mysql/mysql01/admin - main administration directory
/db04/mysql/mysql01/backups - location of backup files

I created the following directories to download the MySQL software in /opt/mysql/5.1.33.
# mkdir -p /opt/mysql/5.1.33
# export MYSQL_NAME=mysql01

Setup data directory structure
# mkdir -p /db01/mysql/$MYSQL_NAME/data

Setup mysql administration directory structure
# mkdir -p /db03/mysql/$MYSQL_NAME
# mkdir /db03/mysql/$MYSQL_NAME/logs
# mkdir /db03/mysql/$MYSQL_NAME/errors
# mkdir /db03/mysql/$MYSQL_NAME/sql

# mkdir /db03/mysql/$MYSQL_NAME/startup

# mkdir /db03/mysql/$MYSQL_NAME/run


Setup binary log structure
# mkdir -p /db02/mysql/$MYSQL_NAME/binlogs

Setup backup directory structure for backups and exports.
# mkdir -p /db04/mysql/$MYSQL_NAME
# mkdir /db04/mysql/$MYSQL_NAME/backups
# mkdir /db04/mysql/$MYSQL_NAME/exports

Set permissions and ownership for MySQL file directories.
# chmod -R 750 /db*/mysql/* /opt/mysql/*
# chown -R mysql:mysql /db*/mysql/* /opt/mysql/*

Before going further

Double (triple) check all directory paths and permissions. 99.99% of issues with manual installs are typos in the directory paths, typos in the file names or permission issues with directories. Double check all paths. When you try to bring up database server, if it defaults to the default areas its because it can;t find a directory or doesn't have permission for directories specified so it will then try the default locations.

Setup the MySQL software (as the MySQL OS user, not the root OS user)

All following commands are run as the mysql OS user. In the /opt/mysql directory unzip and untar the MySQL software as the mysql OS user.
$ cd /opt/mysql
$ gunzip mysql-5.1.33-solaris10-64bit.tar.gz
$ tar xvf mysql-5.1.33-solaris10-64bit.tar
$ ln -s mysql-5.1.33-solaris10-64bit 5.1.33

Be careful with the my.cnf configuration file.

At the operating system prompt you can type the following command. If you scroll down you will find the default search path. All commands like mysql_secure_installation, mysql, mysql_install_db, etc. all look in a search path for the configuration file. Make sure any program is finding the right configuration file in the search path. Here are commands that will show you the search path:

$ mysql --help | more

$ mysqld --help --verbose | more

Use one of the sample configuration files provided with the distribution to get started.

$ cp $MYSQL_HOME/support-files/my-small-cnf /dbadmin/mysql/mysql/startup/my.cnf

Add the following entries to the my.cnf file to the [mysqld] group. This separates all your dynamic administration files, data files, and binary log files to different locations. A separate port is defined away from the default.

[mysqld]

datadir=/db01/mysql/mysql01/data
basedir=/opt/mysql/5.1.33
log-error=/db03/mysql/mysql01/errors/mysql5.1.33.err
pid-file=/db03/mysql/mysql01/localhost.pid

log-bin=/db02/mysql/mysql01/mysql-bin
#port = 3426
#socket=/dbadmin/mysql/mysql01/run/mysql.sock

Add the following entries to the my.cnf file to the [client] group.
[client]
#port = 3426
#socket = /dbadmin/mysql/mysql01/run/mysql.sock

Build a symbolic link from the default location to the real startup file. The step of adding a symbolic to the actual startup file is something I avoid but some environments like this addition.

$ cd $MYSQL_HOME
$ ln -s /dbadmin/mysql/mysql01/startup/my.cnf my.cnf

Create the mysql database files for the MySQL instance. This will create the default database schemas and database files.

$ cd $MYSQL_HOME
$ scripts/mysql_install_db --datadir=/db01/mysql/mysql01/data --basedir=$MYSQL_HOME

Verify data files and directories have been created in the datadir directory.

$ cd /db01/mysql/mysql01/data
$ ls

mysql test

Start the MySQL database server pointing to the defined locations.
$ cd /opt/mysql/5.1.33
$ bin/mysqld_safe --defaults-file=/dbadmin/mysql/mysql01/startup/my.cnf &

If there are socket errors:

i.e. MySQL client cannot star twith the error "cannot connect to the MySQL server through socket

Solution:

MySQL needs to write to a socket. If you don't specify one, a default one is chosen which may not have the appropriate permissions. You can specify the socket file in the command line as below. Make sure the permissions are set properly (owned by mysql).

$ mysql -uroot -p -socket=/dbadmin/mysql/mysql01/run/mysql.sock

Verify the mysqld background process is running as well as the mysqld_safe monitoring process. The mysqld background process should be up and running.

$ ps -ef |grep mysql

Clean up the database server by adding passwords and getting rid of anonymous users. If there are problems with the mysql_secure_installation script, then set the password manually and get rid of the anonymous accounts and any accounts with no passwords.

$ cd $MYSQL_HOME
$ bin/mysql_secure_installation

Shutdown the MySQL server to verify you can shutdown and startup the MySQL instance. I like switching to the my.cnf in the startup directory versus in the default software directory.

$ mysqladmin --defaults-file=/dbadmin/mysql/mysql01/startup/my.cnf shutdown
$ cd $MYSQL_HOME

$ bin/mysqld_safe --defaults-file= /dbadmin/mysql/mysql01/startup/my.cnf

You're up and running have fun. Once you are confortable with this configuration layout, you can create a Unix shell script that will automate almost the entire process. With a shell script automation the install takes about ten minutes.

Sunday, April 26, 2009

Trends in Database Design and Optimized Application Development

I'm constantly going out to customers and looking at their database environments and the challenges they are facing. Of course, sometimes my perspective can be rather unique because customers never invite me in to show me how fast their databases are or how great things are. Usually by the time I get called in, the environment has gotten very dynamic.

What never ceases to amaze me is how on average database design seems to keep getting worse and applications are more poorly written than ever before. There are a number of industry and organization reasons for this but these reasons constantly stand out at customer sites:
  1. Developers Designing Databases: Developers are usually the ones meeting with the customers and doing a lot of the initial design on smaller projects. The problem with this is there is a big difference between understanding what a primary and foreign key are and designing a database. Or there is a data modeler who can design beautiful databases in theory but has no idea how that database design is going to work in a production environment.
  2. Most Developers Do Not Know How to Write Optimized Queries: Back in the old days there were database developers that had a strong understanding of SQL development. However the current generation of Java, PHP, .NET and Ruby developers are great with their languages and writing APIs but the major percentage of them have no idea how to write optimized queries for their applications.
  3. Take Out Databases: In this generation of fast food take out and web applications, its all about getting something out the door quick. Which usually means minimal time for proper database design and testing. This is definitely the Dilbert world of databases.
I find these problems to be much more severe in MySQL environments versus Oracle environments. If an organization pays a million dollars for an Oracle license they are going to make sure they get some people that know that they are doing to design and manage a database project. When someone downloads a MySQL database for free for a small web application they think they can use anybody and the project will turn out okay.

When teaching performance tuning classes in Oracle and MySQL I always ask the students what percentage of companies do they think do a good job of designing their databases and their index optimization paths. I always get single digits for the response to these questions. Oh well, I guess this is what makes the database world so fun. :)

Top things that I have found that greatly improves database application success includes:
  • Getting the right expertise to design your databases.
  • Training developers on how to write good queries for their database applications.
  • Design patterns.
  • Taking the time to do it write.
  • Use case scenarios.

Sunday, March 2, 2008

Choosing a MySQL Configuration Strategy

MySQL configurations cover the range from developers installing a simple MySQL database for a LAMP application to the largest Internet companies using MySQL to deliver web content.
There are two different profiles that cover this range for installing, configuring and managing MySQL database servers:
  1. Profile 1: Individuals or small organizations that wants to use MySQL to create a simple database for web applications and the LAMP stack.
  2. Profile 2: Individuals or organizations that are looking at creating larger MySQL databases or may be creating a large number of MySQL databases.
Profile 1:
For this profile, the person can be a non-DBA. This profile can use the default install using the Windows Installation Manager or the Unix or Linux RPMs. The GUI interfaces will take you through a very simple install. MySQL database servers do not take a lot of resources to run. The template files (my.small, my.medium, my.large) can show startup files that contain larger configurations. This is a simple environment to install and manage. I have seen non-DBAs and developers use this environment for years and it works great for them.

This type of DBA is typically using some simple GUI interfaces, doing backups with mysqldump and taking a simple approach for managing MySQL databases. MySQL's basic configuration can make it very popular for the small and medium sized organization.

Profile2:
This profile involves more complex and larger MySQL databases. This environment often benefits from installing and configuring using best practices. This environment is more successful with someone that has DBA experience or training. Implementing a MySQL environment with the MySQL Optimal Configuration Architecture (MOCA) where a DBA organizes a database for performance, high availability, ease of management becomes important for this environment being successful.

Dependent on the size and complexity of the MySQL environment, the DBA for this environment may be more concerned with one or more of these areas:
  • How the MySQL servers will be monitored and managed. Using the MySQL Enterprise Monitor, open source monitors or 3rd party monitors becomes important.
  • What backup strategy to use: Snapshots, InnoDB Hot Backup, Replication, etc.
  • High Availability (MySQL cluster, other clustering strategies).
  • How to balance I/O by spreading data across multiple disks, storage arrays, etc.
  • Performance Tuning.
  • The establishment of database standards, guidelines and best practices (MOCA).
  • Change control.
  • Having to much coffee.
  • Managing test, development and production databases. Developing rollout strategies.
  • Security, auditing, Sarbanes-Oxley, COBIT, ITIL.
  • Information Lifecycle Management
These two profiles show two ends of the spectrum for MySQL database administration.



Thursday, February 28, 2008

Learning MySQL DBA Best Practices

It is a challenge for new MySQL DBAs to learn best practices for managing database servers. There are no books that teach you everything you need to know. Key knowledge areas new MySQL DBAs need to learn:
  • Best practices for installation and configuration.
  • How to monitor and manage a database.
  • How to monitor for tuning.
  • Best practices for backup/recovery.
  • Best security practices.
  • What audit controls need to be put in place.
  • Disaster planning.
  • Growth projections and scalability planning.
  • Change management.
Sources I would recommend a new DBA start with:
  • Google - do searches on best DBA practices, etc.
  • Read information on the MySQL Optimal Configuration Architecture (MOCA) and the Optimal Flexible Architecture (OFA - Oracle related). Both give examples of things to consider when planning databases.
  • Go to conferences and download presentations from previous conferences.
  • Find a mentor.
  • Larger environments look at Sarbanes-Oxley, COBIT and ITIL in terms of best practices for managing IT infrastructures. Waters get deep here for new DBAs.
  • http://www.planetmysql.org

Monday, February 18, 2008

Trends for DBAs

I grew up in the old days when the primary tools used to manage databases was Unix shells scripts and my development tool was vi. This was back in the day when if you showed other DBAs a script for monitoring and tuning, they would respond to it the way cavemen would look at fire. Those days are long gone.

Whether it is open source, vendor or 3rd party tools DBAs need to use software to help them manage their environments. Whatever the tool: Nagios, Mon, Zenoss, Cacti, MySQL Enterprise Monitor, etc. it is important to use tools to manage a database environment. In the monitoring and management side this includes the entire infrastructure which includes the application server and applications environment. As a DBA it is very important to be in a proactive environment where the administration infrastructure is scalable as more database instances and database schemas are added to the environment. The negative ramifications of using scripts and not software to monitor and manage a database environment include:
  • The more manual scripts (Unix, Perl, etc.) the more you are building a high maintenance environment that has to be supported.
  • As your environment changes with different software, application servers, applications, etc the more the scripts need to be modified.
  • New DBAs usually will not want to bet their jobs on someone else's scripts. Working scripts often get thrown away by new DBAs which will replace existing scripts with their own scripts.
  • Manual scripts are not as scalable as using software.
  • Using manual scripts are not cost-effective compared to using software for monitoring and management.
  • Manual scripts are more error-prone than using software.
  • Manual script environments are usually more reactive versus proactive. Reactive environments are not conducive to high availability.
Companies are looking more for DBAs that can help make strategic decisions, understand the business, have strong communication and management skills and understand how to manage databases from an enterprise perspective. This is different that the traditional heads down, perform technical tasks and write scripts type of DBA.

Wednesday, January 30, 2008

Partitioning a Powerful Tool for DBAs

Partitioning tables and indexes in a database is one of the most powerful features that can help the performance and manageability of a database with large tables. With a large database, there are usually only a handful of very large tables. These large tables typically get most of the I/O activity and have the biggest impact on performance and management.

Partitioning tables and indexes breaks a big problem (large tables) into a number of smaller manageable pieces (table partitions). When a table is partitioned, it is broken into separate physical chunks of data (partitions). For example, there is a large sales table with 10 million records. When generating a monthly or quarterly report, enough records are accessed so an index scan is not efficient, so a full table scan has to be performed. So every month more records get added to the table and every month the monthly report takes longer to run because it is generating a full table scan to go through the data.

An example of how partitioning can have an impact: a sales table can be divided into partitions (i.e. monthly increments) based on the sales date (partition key). So one table partition will have January's data, another partition can have February's data and so on. When a monthly report is generated, an I/O scan will occur on one month of data instead of the entire table. The table partitioning is transparent to the applications. While processing the SQL statement, the optimizer understands the table is partitioned and checks to see if individual partitions can be scanned instead of the entire table.

Partition Pruning
Partition pruning allows a query to run on specific partitions instead of an entire table. Leveraging this benefit requires understanding the data and how the data is accessed to determine the best partitioning method and what column(s) should be the partition key. The partition key is the column(s) used to divide the data into separate partitions. Sufficient time needs to be spent determining the correct column(s) to define as the partition key and defining the correct partition type for the table.

Partitioning offers a number of advantages:
  • Partition pruning allows I/O to be performed on smaller data sets instead of the entire table. As tables grow larger, it can be a big advantage to perform partition scans.
  • Being able to perform administration operations at the partition level versus the table level. This can decrease maintenance times and improve availability.
  • Partitioning is transparent at the application level. Applications (SQL and stored procedure code) will not be impacted by going to a partitioning solution. The optimizer will consider partition pruning (selecting individual partitions) during the optimization phase of processing a SQL command if the partition key column(s) is used in the WHERE clause.
  • Supports information lifecycle management so as data gets older it can be truncated or deleted in older partitions and not impact the more current partitions.
  • Partitioning on indexes breaks one big index into smaller indexes which can significantly reduce I/O on the indexes.
Some examples of using partitioning:
Here are two examples where partitioning can have a positive impact.
  1. A key transaction table is expected to grow by 30% over the next year. It may not be acceptable for the I/O performance to decrease as more data is added to a table. Being able to internally break a table into separate chunks of data that contain data for an individual month or region can offer significant advantages. Monthly reports or regional reports can perform partition scans on specific months or regions and not have to perform full table scans.
  2. A table needs to store data for 13 months. Every month when the oldest partition expires it can be truncated instead of going through and deleting all the old records that have expired.
Different types of partitioning
Range or list partitioning is good to use when there are recognized patterns for how the data is accessed or organizing the data into defined data sets makes sense from an administration perspective. Hash and key partitioning is good to use when someone is just as likely to access one record as another or there is not a clear recognized way to divide the data but there are so many records in a table the benefits of partitioning can be leveraged.
* RANGE
* LIST
* HASH
* KEY
* Composite partitioning- RANGE AND LIST partitions can be sub-partitioned by HASH or KEY.
- RANGE-HASH
- RANGE-KEY
- LIST-HASH
- LIST-KEY

Code examples:
In this example, we have selected the sdate column as the partition key. The table will be broken into separate data chunks (partitions) based on the VALUES LESS THAN clause. Here we have stated that all data before January 1, 2008 will go in the first partition. Partition ranges are not inclusive. Which means Partition Jan2008 will store records where the sdate value is 2008-01-01 or greater and less than 2008-02-01. For 2008 the data is broken into monthly increments. The partition key is defined by the columns listed in the partition clause. The values in the partition key columns determine which partitions the records will be put in.

Range Example:
Range partition example has one partition for dates before 2008 and monthly partitions after that. The MAXVALUE keyword will accept any records with a sales date that is 2008-04-01 or greater. Once we get into May, a new partition for the new month can be created.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate)) (
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01')) ,
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01')) ,
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01')) ,
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01')) ,
PARTITION plast VALUES LESS THAN MAXVALUE );

List Example:
List partition example partitions based on the region_id column. The value for the region_id will determine which partition the data goes into.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( region_id) ) ENGINE=myisam
PARTITION BY LIST (region_id) (
PARTITION North VALUES IN (1, 2, 5, 6),
PARTITION South VALUES IN (7,8,15,16 ),
PARTITION West VALUES IN (20, 21, 22, 23, 24),
PARTITION East VALUES IN (30, 31, 32, 34, 35) );

Hash example:
Hash table example generates 16 partitions that will be used to try and evenly divide the data. Hash partitioning uses the modulus operator to distribute the data.
CREATE TABLE download(
id INT NOT NULL,
pname VARCHAR(60),
ddate DATE NOT NULL,
pcode INT,
region_id INT ) ENGINE=myisam
PARTITION BY HASH(region_id)
PARTITIONS 16;

Key partition example:
Key table table breaks the data into 16 partitions. With key partitioning, the MySQL server will use an internal algorithm to try to evenly distribute the data.
CREATE TABLE download(
id INT NOT NULL,
pname VARCHAR(60),
ddate DATE NOT NULL,
pcode INT,
region_id INT ) ENGINE=innodb
PARTITION BY KEY (id)
PARTITIONS 16;

Composite Partitioning
If the number of records that go into a range or list partition are still too too big, then a composite partitioning can be performed. Range or list partitioned tables can have a lower level of partitioning performed. This is called sub partitioning.

RANGE-HASH partitioning
This RANGE-HASH partitioning will have four has subpartitions for every range.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate))
SUBPARTITION BY HASH( region_id)
SUBPARTITIONS 4(
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01')) ,
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01')) ,
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01')) ,
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01')) ,
PARTITION plast VALUES LESS THAN MAXVALUE );

This RANGE-HASH partition table defines the number of subpartitions to be defined for each range. Different numbers of subpartitions can be defined when different months can have different volumes of sales. So it may be necesssary for some some ranges to have a larger or smaller number of partitions.

CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate))
SUBPARTITION BY HASH( region_id) (
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01'))
( SUBPARTITION s0,
SUBPARTITION s1 ),
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01'))
( SUBPARTITION s3,
SUBPARTITION s4 ),
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01'))
( SUBPARTITION s5,
SUBPARTITION s6 ),
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01'))
( SUBPARTITION s7,
SUBPARTITION s8,
SUBPARTITION s9,
SUBPARTITION s10),
PARTITION plast VALUES LESS THAN MAXVALUE
(SUBPARTITION s11,
SUBPARTITION s12)
);

To leverage partition pruning, the partition key column (sdate) needs to be in a WHERE clause, for the optimizer to consider performing partition scans. For example, the following query will only access data that is in the Jan2008 instead of the full table. So if we use the partition key in the WHERE clause then monthly and quarterly reports will perform partition scans and not full table scans.

SELECT * FROM sales
WHERE sdate BETWEEN '2008-01-01' AND '2008-01-01';

A few points about partitioning in MySQL 5.1
This is the first release of partitioning in MySQL so the focus is on key features of partitioning. It's a little tough due to some of the restrictions (listed below) but they are offset by the fact that MySQL offers partitioning for free. This can be an expensive license for other database vendors. This allows MySQL users the ability to leverage the great benefits of partitioning without any additional licensing costs.

People always ask me how big should a table be before we start considering partitioning. I then give my favorite answer, "it depends". Once a table gets to the size that the following become issues, it is time to consider partitioning:
  • Full tables scans are too expense.
  • Maintenance operations on an entire table take too long.
  • Availability is being impacted by the table being too large.
  • ILM can be leveraged by going to partitioning.

Some restrictions of using partitioning in MySQL 5.1
  • Partitioned tables must be partitioned on the partition key using an integer or must contain an expression that evaluates to an integer expression (key partitioning is an exception to this). Partition pruning can still be performed on date columns if you use the TO_DAYS() or YEAR() functions.
  • There can be a maximum of 1024 partitions.
  • Foreign keys are not supported.
  • If a table contains a primary key, the columns in the partition key must be part of the primary key.

Tuesday, December 18, 2007

Create aliases to make MySQL administration easier

When working in a Unix/Linux/Mac OS environment, aliases can make things a lot easier if you are running the same commands over and over again. Here are a few aliases you can define to simplify running MySQL commands. Then just type the alias name. Put these in one of your configuration files dependent on the Unix shell you are running so these aliases will be automatically set in your environment.

Alias examples ($ represents the command line prompt):
$ alias myalias='alias | grep mysql'
$ alias myenv='env | grep mysql '
$ alias mystat='mysqladmin -u root -p status'
$ alias myestat='mysqladmin -u root -p extended-status'
$ alias myping='mysqladmin -u root -p ping'
$ alias myrun='ps -aux | grep mysql | grep -v grep'
$ alias mysql='mysql -u root -p'
$ alias mystop='mysqladmin -u root -p shutdown'
$ alias mysafe='nohup /opt/mysql/5.1.22/bin/mysqld_safe -- defaults-file=/dbadmin/mysql/startup/my.cnf & '

To view your aliases type the following:
$ alias

There are a lot of additional aliases you can define to make your daily tasks easier. Normally aliases are put in a startup file (in the $HOME directory) that is executed each time a Unix shell is started.
  • In the Korn shell (ksh) put the aliases in the .kshrc file.
  • In the Bash shell (bash) put the aliases in the .bashrc file.

Wednesday, September 26, 2007

Top Ten Things to do before installing MySQL

I think it is very important that DBAs (especially new ones) do sufficient planning, physical design, risk analysis and backup/recovery planning before installing MySQL. MySQL is an easy database to install and configure on a wide range of platforms.

However if sufficient planning is not performed up front, a DBA is going to have problems as their databases grows, issues come up and more users start accessing the system. A top 10 list of things to do before installing a MySQL database include:
  1. Scalability: Understand the purpose and potential growth of your database.
  2. Physical Design: Determine the appropriate physical layout of your database. The defaults are not always the best long term layout.
  3. Storage Engines: Select the appropriate storage engines and how they impact performance and your backup/recovery strategies.
  4. Backup and Recovery: Layout your backup and recovery strategy. Pick the appropriate backup/recovery tools for your environment. Then test your backup and recovery strategies sufficiently. Test recovery as well as backups. Understand your recovery times.
  5. Database Environment: Understand the number of databases that will be created and the maintenance ramifications of the database layout.
  6. Disaster Recovery: What are the business costs and ramifications of this environment (database server, host server, site, network, ...) being down? Perform appropriate risk analysis.
  7. Security: Understand how security is being managed and maintained for your users and your databases. Make sure you secure your userids and passwords.
  8. SQL_MODE and Isolation Level: Set the appropiate SQL_MODE and Isolation Level and make sure you understand the ramifications of the these selections. If running an OLTP environment I highly recommend setting the SQL_MODE to a minimum of traditional.
  9. Storage: Understand your physical storage layout. Make sure you know the ramifications, limitations, scalability and performance of using separate disks, storage arrays, etc.
  10. Best Practices: Define a set of best practices that determine your physical layouts, naming conventions, management tools, infrastructure management and document everything.
A few areas that did not make my top ten but I would give honorable mention.
  • Make sure and secure the hardware platforms, networks and operating systems where your databases will be running. Along with that take out operating system features you should not be using on a database server. This includes features such as Telnet, web services, print services, etc. These unnecessary OS features have security issues to watch out for and add additional weight (impact performance) to the operating system.
  • Read the README files and research the release you are moving to.
  • Determine what software you are going to use to manage your MySQL database environment. If you have multiple MySQL databases you should consider the MySQL Network Monitoring and Advisory Service Advisors.
Dependent on your enterprise software licensing, the following advisers are available from MySQL.
  • Administration Advisor
  • Schema Advisor
  • Performance Advisor
  • Security Advisor
  • Memory Usage Advisor
  • Replication Advisor
  • Custom Advisor
The five P's apply to database configurations. Proper Planning Prevents Pitiful Performance. I had a higher pain threshold when I was younger versus today. I have found that proper planning can help you avoid a lot of pain, anguish and late evenings as a DBA.

Sunday, September 23, 2007

Optimal MySQL configurations

I read a book one time called the "Seven Habits of Highly Successful People". After I read it I thought what a waste of money I already knew everything that was in the book. There were no new ideas in it. I then realized a number of the seven habits I was not doing. And if I did them I would be more successful. I then realized maybe there is a reason the book stays in the New York Times best seller list year after year.

The following points are similar to the seven habits. They are a set of good practices and guidelines that if you follow, will make your database management a lot easier to support. If you follow them, you'll spend a lot less time working nights and weekends.

Any set of database management best practices need to include:
  • A standard way of organizing software and data on disks.
  • Separate database software from operating system and other third party software.
  • Separate data from log files. The files that store your data and the files need to recover your data should never reside on the same disk drives and controllers.
  • Data, index files and log files should be distributed to reduce I/O bottlenecks and to maximize recovery and reduce administration.
  • Having proven and tested (on a regular basis) backups.
  • Facilitate the management of installs, upgrades and backups. The routine tasks required of DBAs.
  • Prepare for and manage the growth of data and databases.
  • Minimize the fragmentation of tables and indexes.
  • Have appropriate safe guards so disk or hardware failures will have a minimal impact. DBAs have to work with the people making the purchases so they have the right hardware and disk configurations. Distributing data across multiple disk drives reduces the impact of individual disk failures.
  • There should be a consistent pattern for directory structures for software, backups, data files, log files, administration scripts, start up files and other other files associated with MySQL databases.
  • The change in software directories, data directories, login directories should have minimal if no impact on database scripts and applications.
  • Administrative information for each database such as logs, backup scripts, administration scripts should be separate for each database in a well organized directory structure pattern.
  • Files of different types should be organized in separate directory structures.
  • Separate file systems should exist for software, data, log files, backups, etc. as much as possible.
  • Protecting the hardware platforms and operating systems that are running MySQL databases.
  • Use a consistent naming pattern for all data files, index files, log files, scripts, database names and file systems supporting files related to MySQL databases.
  • If your organization is using storage drives, make sure you understand are are working closely with the storage management teams to optimize any striping and mirroring of database storage.
  • Software to manage and monitor a complex environment. The Enterprise Monitor tool from MySQL can be found at http://mysql.com/products/enterprise/benefits.html.
Database and software configurations need to address the constant tasks a DBA performs.
  • Adding new users.
  • Duplicating databases.
  • Managing changes for development, test and production database environments.
  • Upgrades of operating systems, databases, applications and hardware.
  • Adding storage for data increases.
  • Creating new databases.
  • Adding new hardware, disk drives, etc.
  • Constantly distributing I/O workloads.
I realize if you are a new DBA this can be a little overwhelming. Starting simple can be the best approach and as you get more experience following more of the above practices will become important. If you are an experienced DBA review the above guidelines and your best practices to make sure your management practices are honed to a fine edge.