Wednesday, October 21, 2009

Oracle Table Compression

Oracle Table Compression
Oracle employs a very nice compression algorithm on database block level which is quite efficient and can yield threefold compression ratio in most cases.

The table compression is highly recommended for Data Warehouse environments where the operations tend to be read heavy and batch. Table compression is not recommended for OLTP environments or on tables with high OLTP like operations because of the cost associated with DML operations on compressed tables. This is true with Oracle 10g. Oracle 11g has gone wild, and offers OLTP compression as well!

I have found this quite nice short explenation in Oracle Documentation about what Oracle Table compression is:

"...Oracle table compression feature compresses data by eliminating duplicate values in a database block. Compressed data stored in a database block (also known as disk page) is self-contained. That is, all the information needed to re-create the uncompressed data in a block is available within that block. Duplicate values in all the rows and columns in a block are stored once at the beginning of the block, in what is called a symbol table for that block. All occurrences of such values are replaced with a short reference to the symbol table..."

The benefits associated with Table Compression are:


Save space
Faster backups
Query Performance

1. Find out the candidate table to compress

Usually NUM_ROWS will reveal the table in need for compression. In Data Warehouses these are usually fact tables. The fact tables tend to have lots of redundant data in the columns such as keys, data and status flag data. The compression algorithm uses a symbolic table structure to record this redundancy numerically rather than physically and repeatedly a the table level. That is compression on the 'Relational Table' level rather than byte level.

My worked example:

As example we will use the X table, the largest table which I will partition and compress all its partitions except the current partition. The reason we don't compress the current partition is because in the current month partition there will be load of ETL activity (DML) and we do not want to introduce high CPU usage because of compression. Compression is no good when there is lots of DML.
The size of the table before compression is 6GB:

SQL> select segment_name, sum(bytes)/1024/1024 MB from user_Segments where segment_name like 'X'
group by segment_name
SQL> /

SEGMENT_NAME MB
------------------------------ ----------
X 6179


2. List the partitions of the candidate table and decide which to compress

You can list the partition names for the tables you are interested to compress like this:


SQL> set pages 49999
SQL> l

1* select table_name, partition_name, compression from user_tab_partitions where table_name ='X'

SQL> /TABLE_NAME PARTITION_NAME COMPRESS

------------------------------ ------------------------------ --------

...

X P0201 DISABLED
X P0202 DISABLED

...
Then you can write an SQL query to build the command which will move and compress the partitions. Just by altering the property of the table to COMPRESS => ENABLED doesn't compress existing rows in the table. It will compress only the new rows which will be added to the table. To save space we must compress what is there and leave only the current partition and future partitions uncompressed. You must remember to NOT compress the current partition and the future partitions, as the script below will generate commands to compress all partitions. To find out the current partition look at the PARTITION_POSITION in USER_TAB_PARTITIONS, the highest positioned partition is the current partition. You can also order by PARTITION_POSITION as below and don't do the first line.
Generate PARTITION COMPRESS commands with the following script.


SQL> select 'ALTER TABLE 'table_name' MOVE PARTITION 'partition_name ' TABLESPACE NEW_TABLESPACE COMPRESS NOLOGGING PARALLEL; ' from user_tab_partitions where table_name ='X'
order by partition_position desc;

'ALTERTABLE'TABLE_NAME'MOVEPARTITION'PARTITION_NAME'TABLESPACENEW_TABLESPACECOMPRESSNOLOGGINGPARALLEL;'

---------------------------------------------------------------------------------------------------
ALTER TABLE X MOVE PARTITION P0207 TABLESPACE NEW_TABLESPACE COMPRESS NOLOGGING PARALLEL;
ALTER TABLE X MOVE PARTITION P0107 TABLESPACE NEW_TABLESPACE COMPRESS NOLOGGING PARALLEL;

WARNING : It is very important to choose to compress the tables in NEW tablespace as compressing them alone doesn't reorganize the extents in the datafiles in a way for them to be able to be re-sized and shrunk later on.

3. If there are bitmap indexes on the candidate table you must first mark them unusable

WARNING : To be able to apply the commands below the table in question must not have bitmap indexes , if it does you must mark them as unusable otherwise you get ORA-14646 err.



ORA-14646: Specified alter table operation involving compression cannot be performed
in the presence of usable bitmap indexes

See here for more info :http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14223/parpart.htm#i1006787

You will have to rebuild the indexes after you have finished with compressing the partitions.

4. Run the compression commands

For each partition you wish to compress you are now ready to run the commands like :

ALTER TABLE X MOVE PARTITION P0207 TABLESPACE NEW_TABLESPACE COMPRESS NOLOGGING PARALLEL;
ALTER TABLE X MOVE PARTITION P0107 TABLESPACE NEW_TABLESPACE COMPRESS NOLOGGING PARALLEL;

You can choose to use NOLOGGING for this operation as it generates lots of redo and will affect your archive logging and PARALLEL to make it faster.

After compressing all partitions on the X table the savings realised are as follows:

A 41% saving!

-- Before Compression

SEGMENT_NAME MB
------------------------------ ----------
X 6179

-- After Compression

SEGMENT_NAME MB
------------------------------ ----------
X 3670

5. You must rebuild the BITMAP INDEXES marked UNUSABLE at the previous step 4.


You will have to rebuild all bitmap , and other if any, indexes marked unusable at the previous steps, to do it quickly you ca use NOLOGGING and PARALLEL. Similarly write an SQL statement querying the USER_IND_PARTITIONS table and generate statement like the following for all indexes.



ALTER INDEX my_bitmap_index REBUILD PARTITION P1001 NOLOGGING;

ALTER INDEX my_bitmap_index REBUILD PARTITION P1002 NOLOGGING;

...
This is a one-off operation you won't do it again for the table, ie when you add a new partition.

6. You must rebuild ALL LOCAL INDEXES in the partitioned tables as their state has now become UNUSABLE

After the compression of the table partitions all local indexes associate with the table become UNUSABLE. As the oracle documentation categorically indicates:

If you use the ALTER TABLE ... MOVE statement, the local indexes for a partition become unusable. You have to rebuild them afterwards. Otherwise you will get errors in partition maintenance operations like this:


*

ERROR at line 1:

ORA-01502: index 'my_schema.my_X_table_index' or partition of such index is in unusable

state

You can do that by writing something like this and change the table_name=X for each table.

SQL> select 'alter table X modify partition ' partition_name ' rebuild unusable local indexes;' from user_tab_partitions where table_name='X'

NOTE: You will also have to rebuild indexes whenever you COMPRESS/NOCOMPRESS any partitions. That is, when you use any of these two operations, data moves, and indexes break and have to be rebuild.

7. Check to see the space you have gained.

After moving the table partition segments to their new tablespace and rebuilding necessary indexes you ought to gain some space. You will realize that the space which the objects before compression has decreased. You will require less space to store them, when compressed, in their new tablespace. After you move everything from the old tablespace into the new and you are sure the old tablespace is empty you can then drop the old tablespace and its datafiles.

Well the only approach to 'claim' space back after a table compression operation seems to be the use of ALTER TABLE ... MOVE TABLESPACE 'YOURNEWTABLESPACE' on a 'new' tablespace. Although we can claim physical space back from the database only by resizing datafiles and we could have 'moved' the tables withing their 'own' tablespace, it is not recommended.

Moving the tables within the same tablespace is not a good idea, because you are never sure if the reorganization of the table segments will actually move the HWM of the datafile down so that you can resize the datafile. (see below asktom.oracle.com links for more info)

There might be 'that' extent still hanging at the end of the tablespace which will prevent you from resizing the datafiles. Although you can fish that 'hanging at the end' extent out and reorganize the segment you are better off by moving all to a new tablespace.

Use DBA_SEGMENTS before you decide to drop a tablespace to see if any segments are left in it. Alternativaly you can choose to resize the datafiles to very small files.

I am also reading lately that in Oracle 11g you can get Advanced Table Compression in OLTP environments as well! It is an option which you have to puchase separately in Enterprise Edition though.

References:

Forum Questions to Asktom :

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:7873209745771#240368600346159666

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:54178027703899

Oracle Docs :

http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14223/parpart.htm#sthref196