In part one of this series we saw that the mechanics of basic table compression came in to play only for direct path inserts, CTAS, and “alter table move”. We also saw that Oracle would, by default, set the free space percentage (pctfree) for a table to zero if the table were defined as compressed – giving us a hint that basic compression should be viewed as a read-only strategy.
When we examined a block dump we found that Oracle didn’t “compress” data, it “deduplicated” the data on a block by block basis, creating a list of the duplicated values within each block, replacing duplicated items with tokens representing those items. Moreover Oracle could rearrange the column order for each block to improve its chances of being able to use a single token to represent multiple adjacent column values. This told us that Oracle didn’t need to “decompress” blocks when reading them, it merely needed to reconstruct rows by following pointers which, of course, is a CPU intensive task.
In this article we look at what happens if you don’t stick to the “read-only” guideline, and from this we can move on in the third article to the (separately licensed) “OLTP” compression option. As before, all examples come from an instance of Oracle 11.2.0.3.
Deduplication and Deletes
You will recall that in the last article I picked a row from a data block dump that consisted entirely of a single token, then found that Oracle had recursively applied deduplication to reduce that token to a combination of two tokens and two extra column values. Here is the row we examined:
tab 1, row 0, @0x1b28 tl: 5 fb: --H-FL-- lb: 0x0 cc: 4 col 0: [ 4] 41 41 41 41 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [ 2] c1 02 col 3: [10] 20 20 20 20 20 20 20 20 20 31 bindmp: 2c 00 01 04 31
And here is what we found when we looked up the value of the single token referenced::
Tab 0, row 49, @0x1ed0 tl: 19 fb: --H-FL-- lb: 0x0 cc: 4 col 0: [ 4] 41 41 41 41 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [ 2] c1 02 col 3: [10] 20 20 20 20 20 20 20 20 20 31 bindmp: 00 08 04 36 40 ca c1 02 d2 20 20 20 20 20 20 20 20 20 31
The first five bytes of the bindmp tells use that the token is used 8 times in the block (00 08), is made up of 4 columns, and requires us to look up tokens 54 (0×36) and 64 (0×40), which are as follows:
tab 0, row 54, @0x1f74 tl: 7 fb: --H-FL-- lb: 0x0 cc: 1 col 0: [ 4] 41 41 41 41 bindmp: 00 0a cc 41 41 41 41 tab 0, row 64, @0x1f7b tl: 13 fb: --H-FL-- lb: 0x0 cc: 1 col 0: [10] 41 41 41 41 41 41 41 41 41 41 bindmp: 00 05 d2 41 41 41 41 41 41 41 41 41 41
With this picture in front of us, we can now recognize the extra work that has to be introduced if we delete the original row. Two things have to happen – the row has to be marked as deleted (in the normal way), but the “usage count” on token 49 also has to decrease by 1. Here’s a tiny extract from the block immediately after deleting our row, first the row entry itself:
tab 1, row 0, @0x1b28 tl: 2 fb: --HDFL-- lb: 0x2 bindmp: 3c 02
And here’s the binary dump from token 49 – note, particularly, the second byte:
bindmp: 00 07 04 36 40 ca c1 02 d2 20 20 20 20 20 20 20 20 20 31
So one thing we can see is that the work done to maintain the block increases – even on a simple delete. But this token is used in seven other rows in the block, so you what happens if I delete those rows as well? The answer depends on how many concurrent sessions do the deletes. If I use a single process to delete all eight rows Oracle deletes the token as I delete the eighth row – at which point tokens 63 and 64 have to be updated to show that they have one less dependency. If I repeat the test using multiple sessions to delete the rows and don’t commit after each delete, I can get to a situation where a token shows zero usage but doesn’t disappear. (It’s possible that some later block tidying operation that I haven’t yet observed will get rid of tokens in this state.)
Until I mentioned the concurrency test I hadn’t said anything about commits (or rollbacks); the changes to the tokens take place on the delete, and don’t wait for a commit – so what happens if I commit, or rollback? On a commit the usual commit cleanout may take place, updating the transaction’s ITL slot with the commit SCN (in other words, nothing new or interesting happens). On a rollback the data is restored – the row is recreated from the undo information, any tokens that have been deleted will also be recreated, and any relevant token usage counts are incremented.
An important point to note is that on the rollback, compression is preserved. There will be a difference between the original block and the block after the rollback, though – the rows (and tokens) that are restored will be written back into the block’s free space – which may require the block to go through a process that coalesces the free space. So if you repeat a block dump after the rollback you may see that the contents of the block have moved. In my case (after deleting the 8 rows that referenced token 49 and rolling back) I saw the following difference:
tab 0, row 49, @0x1ed0 -- original position of token 0 tab 0, row 49, @0x134a -- position of token 0 after rollback tab 1, row 0, @0x1b28 -- original position of row 0 tab 1, row 0, @0x1322 -- position of row 0 after rollback
Compression and Free space
The business of rows “moving” if you delete and roll back introduces an interesting point about free space. When you declare a table with basic compression, it defaults to pctfree 0 – no free space, and yet there was free space in my block for rows to move into when a rollback took place.
There is a minor complexity that can confuse the issue when using small rows (and “compressed” rows can be very small) but even allowing for this, I found that Oracle seemed to leave a little free space (a few tens of bytes, enough for roughly two full rows in my small test case). This may simply be an error in the estimates that Oracle used for space requirements while using the direct path mechanism to build blocks – it may have been deliberate.
This small amount of space allowed Oracle to restore the deleted rows, pushing down and tidying up the block to reclaim the little holes left from the prior deletes as it did so. In some cases you may find that this free space could even allow you to survive a tiny number of updates – so let’s look at a few numbers.
I’m going to change my starting data set slightly so that a typical row looks like this: (1000001, ‘AAAA’, ’AAAAAAAAAA’,’ 1’), the first column is a sequence number, the second column cycles from As to Es (5 distinct values), the third column from As to Js (10 distinct values), and the last column, space padded to 10 characters, cycles from 1 to 50; and I’ve generated 800 rows of data. Because of the way I’ve created the data, there happen to be eleven rows in the first block where the 2nd and 3rd columns are all A’s, so I’ve going to run the following SQL statement and dump the first block of the table to see what has happened.
update t1
set
vc_rep = 'BBBB'
where
vc_rep = 'AAAA'
and vc_cycle = 'AAAAAAAAAA'
and rownum <= 4
;
It turned out that there was enough space in the block for two rows to be updated and still fit in the block, but two of my rows had to be migrated. Here are the before and after images of one of the rows that stayed in the block; take a close look at the binary dumps:
tab 1, row 0, @0x1bb8 -- before tl: 11 fb: --H-FL-- lb: 0x0 cc: 4 col 0: [ 4] 41 41 41 41 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [10] 20 20 20 20 20 20 20 20 20 31 col 3: [ 5] c4 02 01 01 02 bindmp: 2c 00 02 03 1b cd c4 02 01 01 02 tab 1, row 0, @0x4f3 -- after tl: 37 fb: --H-FL-- lb: 0x2 cc: 4 col 0: [ 4] 42 42 42 42 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [10] 20 20 20 20 20 20 20 20 20 31 col 3: [ 5] c4 02 01 01 02 bindmp: 2c 02 04 00 cc 42 42 42 42 d2 41 41 41 41 41 41 41 41 41 41 d2 20 20 20 20 20 20 20 20 20 31 cd c4 02 01 01 02
On the update, Oracle has expanded the row to the full four columns before modifying it and writing it back to the block in the free space (note the change of length (tl: ) and row address). You’ll have to take my word for it that there were two tokens in the token table that could have been used to substitute for the first two columns of this updated row, but Oracle didn’t try to find them and use them. – so at first sight it looks as if updates to compressed data are likely to cause total chaos, a highly compressed row could end up being expanded to a huge row, that won’t fit in the tiny amount of free space and therefore has to be migrated.
In passing, although we now have a bit of a mess, with a couple of expanded rows and a couple of migrated rows, when I issued a rollback, Oracle cleaned up all the mess and (apart from the physical rearrangement of rows in the block) left all the rows in their original compressed, un-migrated state.
So how bad can updates get? Before getting too alarmed it’s worth looking more closely at what I did with my update. I modified a column that was part of a token which covered most of the row. What if my update had changed a column which hadn’t been tokenized – would Oracle still expand the row to deal with the update? The answer is no. The last column in the block dumps above are reporting the id column, here are the “before and after” dumps of a row when I repeated my test but updated the just the id column
tab 1, row 0, @0x1bb8 -- before tl: 11 fb: --H-FL-- lb: 0x0 cc: 4 col 0: [ 4] 41 41 41 41 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [10] 20 20 20 20 20 20 20 20 20 31 col 3: [ 5] c4 02 01 01 02 bindmp: 2c 00 02 03 1b cd c4 02 01 01 02 tab 1, row 0, @0x1bb8 -- after tl: 10 fb: --H-FL-- lb: 0x2 cc: 4 col 0: [ 4] 41 41 41 41 col 1: [10] 41 41 41 41 41 41 41 41 41 41 col 2: [10] 20 20 20 20 20 20 20 20 20 31 col 3: [ 4] c3 64 64 64 bindmp: 2c 02 02 03 1b cc c3 64 64 64
The update took place in situ (and the total row length dropped from 11 bytes to 10 bytes) because the new value was actually smaller than the old value – but notice how the binary dumps show us that the row consists, in both cases, of token 1b (fifth byte) which represents the first three columns (fourth byte) reported, and one actual stored value. Row expansion has NOT taken place on this update.
It gets even better; my initial test rows happened to be rows where virtually the whole row was covered by a single token – what happens if I have a row that is represented by several tokens and my update affects only one of the columns that is tokenized ? Whatever you may have heard about compressed data and expansion, Oracle doesn’t expand the whole row – it only expands the tokens affected by the update. Again, here’s a “before and after” pair of dumps to make the point:
tab 1, row 18, @0x1ac2 tl: 13 fb: --H-FL-- lb: 0x0 cc: 4 col 0: [ 4] 44 44 44 44 col 1: [10] 58 58 58 58 58 58 58 58 58 58 col 2: [10] 20 20 20 20 20 20 20 20 33 34 col 3: [ 5] c4 02 01 01 14 bindmp: 2c 00 04 03 32 37 45 cd c4 02 01 01 14 tab 1, row 18, @0x1ab8 tl: 23 fb: --H-FL-- lb: 0x2 cc: 4 col 0: [ 4] 44 44 44 44 col 1: [10] 59 59 59 59 59 59 59 59 59 59 col 2: [10] 20 20 20 20 20 20 20 20 33 34 col 3: [ 5] c4 02 01 01 14 bindmp: 2c 02 04 00 32 d2 59 59 59 59 59 59 59 59 59 59 45 cd c4 02 01 01 14
At the start of this test, the binary dump shows that this row consisted of three separate tokens (0×32, 0×37 and 0×45) followed by a stored value. I updated “col 1” which held the value ‘XXXXXXXXXX’ to ‘YYYYYYYYYY’ and, as you can see, the final binary dump shows that this row still includes tokens 0×32 and 0×45, but token 0×37 has been replaced by the actual value. You can also see that the total length of the row has increased by 10 bytes, from 13 to 23, which means Oracle had to move it into the (small amount of) free space and, during a subsequent repacking of the rows in the block, the final address of the row has changed.
So, when you update data that has been subject to basic compression, Oracle may have to do some expansion of tokens into column values, but it keeps this expansion to a minimum (and if the update is a “no-change” update, expansion simply doesn’t take place). Since there seems to be a tiny amount of free space left in a block after compression – even when pctfree has defaulted to 0 – you may be able to do a tiny number of updates to a compressed table without causing dramatic expansion and row migration, but you may find the effects impossible to predict.
If you do need to do a little data maintenance on compressed data, you really need to do some careful testing with real data to see if you can find a suitable setting for pctfree that will keep row migrations to an acceptable level.
Summary
When you delete rows from a table defined with basic compression, there may be a little extra CPU usage as Oracle maintains the token table to reduce the reference counts on related tokens, and deletes tokens when their count reaches zero; but apart from this, and apart from a little space wastage when tokens get to a zero usage but don’t get deleted, the overheads on deletion don’t seem to present a threat.
When you update rows in a table defined with basic compression, you have to remember that Oracle will have set the pctfree to zero by default when you first defined the table as compressed (and will keep resetting it every time you move the table) so there will be very little space for rows to grow unless you explicitly set pctfree to something larger.
To make things more difficult, if your update modifies columns that have been tokenized, Oracle will work on a copy of the row with those modified tokens expanded to their full value – and it will not attempt to recompress the modified row afterwards even if there are suitable tokens already in existence in the block. As a side effect, you are likely to find that updates to compressed tables result in significant increases in row length and excessive row migration. Ironically, the smarter Oracle has been in compressing your data (resulting, typically, in multiple consecutive columns turning into a single token) the worse the explosion is likely to be when you start to update the data.
As a basic guideline – unless you are extremely well-informed about your data and its usage then basic compression should be used only for read-only data. In the next article we’ll be looking at “OLTP” compression and see what Oracle does (if anything) to improve this situation.











12 Comments
DBA
24/01/2013
Could you do:
ALTER TABLE <table_name> MOVE COMPRESS PCTFREE 10;
And eliminate all the UPDATE/DELETE related issues with Basic Compression?
DBA
24/01/2013
I might have answered my own question. I think it doesn't matter what you set PCTFREE to when used with Compression. After an Update is run on a Compressed table, Oracle will increase the size of the table regardless of PCTFREE setting.
It re-affirms what you said to use Basic Compression for mostly read only tables/partitions.
Jonathan Lewis
25/01/2013
DBA,
With a "sufficiently large" setting for pctfree (and that will vary from table to table) you can get away with a small amount of update activity. But if you rebuild the table you have to remember to re-specify the pctfree.
DBA
28/01/2013
The unfortunate aspect of this Compression is that the actual table data could very well be uncompressed even though the COMPRESSION or COMPRESS_FOR columns of DBA_TABLES are set.
Is there a way to tell if the data is actually compressed?
Jonathan Lewis
28/01/2013
DBA
In effect, no.
You might be able to make a reasonable guess - check the average rows per block (num_rows / blocks), then cross-check with the avg_row_len (which reports the uncompressed row length, even when the rows are compressed). You could also use dbms_rowid to convert rowids to file and block numbers and check the number of rows per block individually - because you might have a table where half the data was compressed and half wasn't.
In principle there is a method, mentioned originally in the 9.2 manuals I think. Column spare1 of table sys.seg$ has two bits set to show whether or not a segment is compressed and whether it holds any compressed data, so you could query seg$ (joined to various other data dictionary tables) to check. The tests are: bitand(spare1, 2048) = 2048 to check if the segment is defined with compression, and bitand(spare1, 4096) = 4096 to check if any of the data is compressed - unfortunately when I did a quick check of bit 4096 in a recent version of Oracle I found that it got set as soon as bit 2048 got set. Even it it worked, of course, all you could learn from a single bit was that some of the data in the table was compressed, you'd still have no idea about how much was compressed.
Randolf Geist
29/01/2013
Hi Jonathan,
just a comment that I think to remember that this "optimization":
"...if the update is a “no-change” update, expansion simply doesn’t take place..."
was only added in the 11.2.0.3 release. I believe that previous versions (11.2.0.2 and below) expand/replace the affected tokens even in case of "no-change" updates, which is a pity.
Randolf
Jonathan Lewis
30/01/2013
Randolf,
Thanks for that very important point - which I'm about to edit into part 3.
I can't check earlier point releases of 11.2, but I can confirm that your observation is correct for 10.2.0.5 and 11.1.0.7
DBA
29/01/2013
Randolf,
My following test contradicts your assertion with version 11.2.0.3. Of course, the size of the table after such a 'no-change' update is still smaller than its uncompressed version.
SQL> SELECT * FROM v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE 11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL>
SQL> DROP TABLE T PURGE;
Table dropped.
SQL>
SQL> CREATE TABLE t
2 AS
3 SELECT * FROM all_objects;
Table created.
SQL>
SQL> exec dbms_stats.gather_table_stats (user,'T');
PL/SQL procedure successfully completed.
SQL>
SQL> SELECT table_name,
2 num_rows,
3 chain_cnt,
4 pct_free,
5 compression,
6 compress_for
7 FROM user_tables
8 WHERE table_name = 'T';
TABLE_NAME |NUM_ROWS| CHAIN_CNT| PCT_FREE|COMPRESSION |COMPRESS_FOR
------------|--------|----------|----------|---------------|------------
T | 144,268| 0| 10|DISABLED |{null}
SQL>
SQL> SELECT segment_name, bytes, blocks
2 FROM user_segments
3 WHERE segment_name = 'T';
SEGMENT_NAME| BYTES| BLOCKS
------------|----------|----------
T | 18874368| 2304
SQL>
SQL> ALTER TABLE t MOVE COMPRESS;
Table altered.
SQL>
SQL> SELECT table_name,
2 num_rows,
3 chain_cnt,
4 pct_free,
5 compression,
6 compress_for
7 FROM user_tables
8 WHERE table_name = 'T';
TABLE_NAME |NUM_ROWS| CHAIN_CNT| PCT_FREE|COMPRESSION |COMPRESS_FOR
------------|--------|----------|----------|---------------|------------
T | 144,268| 0| 0|ENABLED |BASIC
SQL>
SQL> SELECT segment_name, bytes, blocks
2 FROM user_segments
3 WHERE segment_name = 'T';
SEGMENT_NAME| BYTES| BLOCKS
------------|----------|----------
T | 5767168| 704
SQL>
SQL> UPDATE t
2 SET object_name = object_name;
144268 rows updated.
SQL>
SQL> COMMIT;
Commit complete.
SQL>
SQL> SELECT table_name,
2 num_rows,
3 chain_cnt,
4 pct_free,
5 compression,
6 compress_for
7 FROM user_tables
8 WHERE table_name = 'T';
TABLE_NAME |NUM_ROWS| CHAIN_CNT| PCT_FREE|COMPRESSION |COMPRESS_FOR
------------|--------|----------|----------|---------------|------------
T | 144,268| 0| 0|ENABLED |BASIC
SQL>
SQL> SELECT segment_name, bytes, blocks
2 FROM user_segments
3 WHERE segment_name = 'T';
SEGMENT_NAME| BYTES| BLOCKS
------------|----------|----------
T | 16252928| 1984
Jonathan Lewis
30/01/2013
DBA,
Interesting effect. I can't reproduce the extreme result that you've got, but my all_objects only holds 55,000 rows, so there may be something about the extra rows from your system that exaggerates the effect. It's also possible that you're running with ASSM and there's an extra side effect of migrating rows.
In my quick test on 55,000 my table grew from 220 blocks to 250 blocks, with an average of about 4 rows migrated per block. However, it was clear that only a small fraction of the rows suffered token expansion. I suspect there is a detail of the update mechanics that I've not noticed that becomes increasingly likely to occur if you update a large number of rows in a block.
DBA
30/01/2013
Yes, I am using Automatic segment space management and uniform extent size of 512K.
I ran a test with 10,000 rows and the table grew from 256 blocks to 384 blocks.
Jonathan Lewis
31/01/2013
DBA,
That's bad, but not as bad as your original example.
You have been looking at xxx_segments for allocation, though. Perhaps if you re-ran the stats collection and checked the used blocks the figures wouldn't look so extreme. (In fact, if you use the old analyze command, you will also get the chain_cnt populated to see how migrated rows it took to acquire the extra blocks.)
Jonathan Lewis
01/02/2013
DBA,
You emailed me your test script and results, showing the table growing (apparently) from 47 blocks to 128 blocks with 5,520 chained rows. On my system - with different contents in all_objects - my table grew (apparently) from 43 blocks to 64 blocks with only 223 chained rows. If we used the same data we might get closer results - for example if we selected "where object_id <= 12000" my system gives 10,017 rows - which are likely to be commonly installed objects.
One thing to note - when I used dbms_space to check the free-space figures (fs1 - fs4) for the extra blocks, and then checked in detail with block dumps, I found that my extra 21 blocks consisted of 3 full blocks, two blocks with 1 or two rows, and 14 blocks which had been formatted in anticipation. So part of the growth in small tests is due to the ASSM strategy of making batches of blocks available for free space. I recall doing one test (I forget which version of Oracle, possiblg 11.2.0.1) where I created a table and inserted one row - which resulted in 64 blocks being formatted and the very last block being used for the one row !
Compression in Oracle – Part 1: Basic Table Compression – All Things Oracle
24/01/2013
[...] Read Part 2: Read-Only Data Jonathan Lewis [...]
Compression « Oracle Scratchpad
24/01/2013
[...] Basic Compression – the implications [...]
Compression in Oracle – Part 3: OLTP Compression – All Things Oracle
31/01/2013
[...] part two of this series we examined the effect of updates on compressed data when using basic compression, [...]
Compression in Oracle – part 4: Basic Index Compression – All Things Oracle
13/02/2013
[...] the first three parts (Part 1: Basic Table Compression, Part 2: Read-Only Data, Part 3: OLTP Compression) of this series we examined table compression – both basic and the [...]
No trackbacks yet.