Introduction to Table Partitioning

Divide and conquer is what Oracle was thinking with this one. Table partitioning is about optimizing “medium selectivity queries”. The Oracle database has optimization techniques for high selectivity queries, with the use of indexes. If we need to process all data in a big table, we have to live with the fact it will take a while, but the engine

Divide and conquer is what Oracle was thinking with this one. Table partitioning is about optimizing “medium selectivity queries”. The Oracle database has optimization techniques for high selectivity queries, with the use of indexes. If we need to process all data in a big table, we have to live with the fact it will take a while, but the engine will process the data as fast as possible. However, a medium selectivity query needs just a portion of the data, for instance a tenth. This is to much data for indexes, to little data for full table scans so the processing time might become rather long regarding the outcome.

Let me give you a brief explanation of how Oracle collects its data, and how table partitioning can help for these queries. Do note this is not for everyone; “Oracle Partitioning” is an extra cost option, for Enterprise Edition only.

Access methods in a nutshell

Oracle has two commonly used table access methods, “full table access” and “access by rowid”. For the first method, Oracle reads all the blocks in a table, and applies filters afterwards. This might look like a lot of overhead, but multiblock reads are used for reading large amounts of data in bulk. The access by rowid is mostly used in conjunction with indexes. The index scan returns a rowid, and a single block read is used to get the block we’re interested in. This looks like an interesting method, because we read a lot less data. But when we need a lot of records, this means a lot of small operations that need to be set up, which brings a lot of overhead. Depending on your data, you’re better off with multi block reads when you’re fetching more than 5% to 10% of the table.

Screenshot: Oracle multi block and single block reads

Table partitioning to the rescue

When we store our data in a non-partitioned table, without indexes, we will always have a full table scan. Consider following code:

With table partitioning, a table can be physically divided into multiple smaller tables, called partitions, while logically it stays one table. This means your code stays the same, but full partition scans will be executed instead of a full table scan.

These partitions are created based on a key. Depending on which value a certain column has, the record will be stored in a certain partition. It’s important to choose a column that is often used in queries as our key.

The big advantage of partitioning is the possibility for “partition pruning”. When we look for a value of “1” in column “N10”, we know we will only find this in partition “PART1”, so we don’t need to access the other partitions, and our query will execute ten times as fast. When we need all values, it will scan all partition, and there will only be a small overhead.

As you can see, the partition pruning makes an enormous difference, going from 2.3 seconds to 0.5 seconds.

Table partitioning methods

Table partitioning already exists since 8i, but a lot of new possibilities were added since 11g. The different types of partitioning are:

Range partitioning

The table is divided in ranges, typically used for date ranges. This is beneficial when the filters using inbetween, greater than or less than. It is able to skip all partitions not in the range.

When inserting a record that does not belong to any of the partitions, an ORA-14400 error will be raised. If you create a partition with “MAXVALUE”, it will be used as default partition. New partitions can be added or dropped manually when needed, or they can be created automatically with interval partitioning.

List partitioning

A limited set of possible values is given; rows containing the same value are grouped. This can be used for columns with few distinct values, when all values are known, like department or country. It can be beneficial to group values together that are combined in filters.

To avoid the ORA-14400 error, one can use the “DEFAULT”-keyword.

Hash partitioning

A value is hashed, and random distribution occurs. This is used with many distinct values, when there are no searches on ranges. The advantages can be a more evenly distribution than with range partitioning.

Composite partitioning

A partition can be subpartitioned by any of the previous methods, in any combination since 11g. This will allow more queries to benefit from table partitioning. Partition pruning will occur when filtering on both keys or only one of the keys.

Refpartitioning

Partitioning is also possible on parent and child relations. When the parent table is partitioned, the child table can be partitioned based on the foreign key. For instance when the “ORDERS” table is partitioned on “ORDER_DATE”, the “ORDER_LINES” table can be partitioned on the order date as well, without storing the actual value in the “ORDER_LINES” table. This is very beneficial when joining the two tables on the foreign key.

Partitioned indexes

Not only tables can be partitioned, but indexes can become very large objects as well. The indexes for partitioned tables can be stored on three different ways:

  • Non partitioned: No partitioning applied
  • Globally partitioned: The index is not partitioned on the same key as the table
  • Locally partitioned: The index is partitioned on the same key as the partitioned table

Which type of index partitioning type you chose is depended on the query it needs to support. If the filter always contains the key, local indexes can be used, which are easier to manage. Globally partitioned indexes offer higher flexibility.

Now let’s start partitioning!

If you can afford this option, it will speed up your queries dramatically, without changing anything to your code. Since 11g partitions have been greatly enhanced, but they can be used as from Oracle 8i.
If you cannot use this option, the concept might still be a life saver, dividing your big tables into smaller ones. This will be very handy for recent data and archived data, but your code will become very complex, very quickly.