Partition Pruning
Partition pruning allows a query to run on specific partitions instead of an entire table. Leveraging this benefit requires understanding the data and how the data is accessed to determine the best partitioning method and what column(s) should be the partition key. The partition key is the column(s) used to divide the data into separate partitions. Sufficient time needs to be spent determining the correct column(s) to define as the partition key and defining the correct partition type for the table.
Partitioning offers a number of advantages:
- Partition pruning allows I/O to be performed on smaller data sets instead of the entire table. As tables grow larger, it can be a big advantage to perform partition scans.
- Being able to perform administration operations at the partition level versus the table level. This can decrease maintenance times and improve availability.
- Partitioning is transparent at the application level. Applications (SQL and stored procedure code) will not be impacted by going to a partitioning solution. The optimizer will consider partition pruning (selecting individual partitions) during the optimization phase of processing a SQL command if the partition key column(s) is used in the WHERE clause.
- Supports information lifecycle management so as data gets older it can be truncated or deleted in older partitions and not impact the more current partitions.
- Partitioning on indexes breaks one big index into smaller indexes which can significantly reduce I/O on the indexes.
Here are two examples where partitioning can have a positive impact.
- A key transaction table is expected to grow by 30% over the next year. It may not be acceptable for the I/O performance to decrease as more data is added to a table. Being able to internally break a table into separate chunks of data that contain data for an individual month or region can offer significant advantages. Monthly reports or regional reports can perform partition scans on specific months or regions and not have to perform full table scans.
- A table needs to store data for 13 months. Every month when the oldest partition expires it can be truncated instead of going through and deleting all the old records that have expired.
* LIST
* HASH
* KEY
* Composite partitioning- RANGE AND LIST partitions can be sub-partitioned by HASH or KEY.
- RANGE-HASH
- RANGE-KEY
- LIST-HASH
- LIST-KEY
Code examples:
Range Example:
Range partition example has one partition for dates before 2008 and monthly partitions after that. The MAXVALUE keyword will accept any records with a sales date that is 2008-04-01 or greater. Once we get into May, a new partition for the new month can be created.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate)) (
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01')) ,
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01')) ,
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01')) ,
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01')) ,
PARTITION plast VALUES LESS THAN MAXVALUE );
List Example:
List partition example partitions based on the region_id column. The value for the region_id will determine which partition the data goes into.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( region_id) ) ENGINE=myisam
PARTITION BY LIST (region_id) (
PARTITION North VALUES IN (1, 2, 5, 6),
PARTITION South VALUES IN (7,8,15,16 ),
PARTITION West VALUES IN (20, 21, 22, 23, 24),
PARTITION East VALUES IN (30, 31, 32, 34, 35) );
Hash example:
id INT NOT NULL,
pname VARCHAR(60),
ddate DATE NOT NULL,
pcode INT,
region_id INT ) ENGINE=myisam
PARTITION BY HASH(region_id)
PARTITIONS 16;
Key partition example:
Key table table breaks the data into 16 partitions. With key partitioning, the MySQL server will use an internal algorithm to try to evenly distribute the data.
CREATE TABLE download(
id INT NOT NULL,
pname VARCHAR(60),
ddate DATE NOT NULL,
pcode INT,
region_id INT ) ENGINE=innodb
PARTITION BY KEY (id)
PARTITIONS 16;
Composite Partitioning
RANGE-HASH partitioning
This RANGE-HASH partitioning will have four has subpartitions for every range.
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate))
SUBPARTITION BY HASH( region_id)
SUBPARTITIONS 4(
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01')) ,
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01')) ,
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01')) ,
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01')) ,
PARTITION plast VALUES LESS THAN MAXVALUE );
CREATE TABLE sales (
invoice_id INT NOT NULL AUTO_INCREMENT,
sdate DATE NOT NULL,
samt DECIMAL (10,2) UNSIGNED NOT NULL,
region_id INT (3),
cust_id INT,
INDEX ( invoice_id),
INDEX ( sdate) ) ENGINE=innodb
PARTITION BY RANGE (to_days(sdate))
SUBPARTITION BY HASH( region_id) (
PARTITION before2008 VALUES LESS THAN (to_days('2008-01-01'))
( SUBPARTITION s0,
SUBPARTITION s1 ),
PARTITION Jan2008 VALUES LESS THAN (to_days('2008-02-01'))
( SUBPARTITION s3,
SUBPARTITION s4 ),
PARTITION Feb2008 VALUES LESS THAN (to_days('2008-03-01'))
( SUBPARTITION s5,
SUBPARTITION s6 ),
PARTITION Mar2008 VALUES LESS THAN (to_days('2008-04-01'))
( SUBPARTITION s7,
SUBPARTITION s8,
SUBPARTITION s9,
SUBPARTITION s10),
PARTITION plast VALUES LESS THAN MAXVALUE
(SUBPARTITION s11,
SUBPARTITION s12)
);
SELECT * FROM sales
WHERE sdate BETWEEN '2008-01-01' AND '2008-01-01';
A few points about partitioning in MySQL 5.1
- Full tables scans are too expense.
- Maintenance operations on an entire table take too long.
- Availability is being impacted by the table being too large.
- ILM can be leveraged by going to partitioning.
Some restrictions of using partitioning in MySQL 5.1
- Partitioned tables must be partitioned on the partition key using an integer or must contain an expression that evaluates to an integer expression (key partitioning is an exception to this). Partition pruning can still be performed on date columns if you use the TO_DAYS() or YEAR() functions.
- There can be a maximum of 1024 partitions.
- Foreign keys are not supported.
- If a table contains a primary key, the columns in the partition key must be part of the primary key.