Data Blocks and Freelists (from ixora)

Questions and Answers

Data Blocks and Freelists

Transaction and process freelists

26 October 1998

---|---|---
You mentioned that there are different types of free lists. Could you please explain a bit more about all this?
| Each segment has at least a master free list. This free list is implemented as a linked list. The segment header block contains a pointer to the first block on the free list. In the block header for that block is a pointer to the next free list block and so on. The free list pointer in the block header of the last block on a free list is null. The free list header record in the segment header block also contains a pointer to the last block in the free list.

If a segment is created with multiple free lists, then the segment header block also contains a free list header record for each of these process free lists in addition to the master free list.

When a process performs DML that causes a block to fall below the PCTUSED value for the table/cluster, or entirely frees an index leaf block (however, the index case is more complex, because blocks on a free list are not unlinked from the B*-tree structure until an attempt is made to reuse them), that block is put onto a transaction free list. Transaction free lists are dynamically created as necessary, and also have a header record in the segment header block. The transaction free list header also records the transaction id.

When a transaction needs to insert/migrate a row, it uses the first block on its transaction free list, if it has previously freed some blocks (except, for an index, because block splits always occur in a recursive transaction). Otherwise a process free list is used, based on the process number (PID) mod the number of process free lists. If the process free list is empty, or if there are no process free lists, then the master free list is used.

If the master free list is empty, and if there are some transaction free lists for transactions that have committed, then the transaction free list is marked as unused and the blocks are merged into the master free list. Otherwise, the high water mark is raised, initially by 1 block at a time for the first 5 data blocks in the segment, and thereafter by the greater of 5 blocks or _bump_highwater_mark_count blocks (which defaults to 0) times the number of process free lists plus 1 (for the master free list), up to the number of blocks remaining in the extent. These blocks are newed, which means that a free buffer is allocated in the cache and the block header is formatted to make the blocks part of the segment. Raising the high water mark may involve dynamic extension. Once the master free list is not empty, up to 5 blocks are moved to the target process free list if any, from where they can be used.

The number of free list headers that can fit into the segment header block is limited by the database block size. At least half of the free list slots must be available for transaction free lists. You can see the exact number with the query
select kviival from x$kvii where kviitag = 'ktsmtf';
connected as SYS (or internal). On busy segments, more transaction free lists than that may be dynamically created, slots permitting.

If a segment is created with multiple free list groups, one block after the segment header is used for the free list header records for each free list group. There is still a master free list in the segment header, as well as a master free list in each group. Free list group selection is based on the instance number mod the number of free list groups.

As mentioned previously, free list contention occurs when multiple processes using the same free list attempt to modify the data block on the head of the free list concurrently. It is shown in V$WAITSTAT against the data block class. V$WAITSTAT can also show contention for the segment header and free list blocks. This occurs where multiple transaction in the same free list group need to update their free list header records simultaneously. There are various ways of addressing these problems such as rebuilding the table with more free list groups, or increasing _bump_highwater_mark_count , or the novel idea of fixing the application.

To drill down on which segments are causing data block contention, I suggested using event 10046, level 8. This creates a trace file much like to one produced by the sql_trace facility, except that for each event wait a line is printed to the trace file. In particular, each buffer busy wait is recorded together with the P1 and P2 values which are the data file and block number of the wait. So to find which blocks a process has been waiting on, you just grep the trace file for buffer busy waits lines and produce a histogram of the file and block numbers most commonly waited for. Once you have suspect file and block numbers, you can relate them to a segment by querying DBA_EXTENTS. In the case of free list contention on a table it is common to have several hot blocks just below the high water mark for the segment.

There is no CPU cost to having multiple process free lists, and a trivial cost to having multiple free list groups. If you have too many process free lists the worst you get is more free space below the HWM that is not accessible to any particular process. You also get a lower limit on the number of possible transaction free lists you can have.

---|---

Data block and free list waits

1 April 1999

---|---|---
In querying the V$WAITSTAT, the results show that 'data block' has a count of 797344 and 'free list' has a count of 0. I have read that if the 'data block' count is high, then I should increase the freelists on the tables that are subject to concurrent inserts. However, if there were freelist contention, would it not show up with a count for the class of freelist? What then does a count for 'data block' mean?
| Unless you create multiple freelist groups, you will never see waits for free list blocks in V$WAITSTAT. The freelist headers only go into separate free list block if there are multiple free list groups - otherwise, they go into the segment header block. V$WAITSTAT shows buffer busy waits by the data block class and if there are no free list blocks, then you cannot wait for them!

However, waits for 'data block' class blocks are commonly due to insufficient process freelists. If multiple insert processes use the same process free list concurrently, then they will all attempt to modify the data block on the head of that free list. The associated buffer busy waits are recorded as 'data block' class waits because that is the class of block on which they are trying to establish a buffer lock.

Discrete transactions hold buffer locks for the duration of the transaction and can also cause data block buffer busy waits.

---|---

V$WAITSTAT

5 April 1999

---|---|---
I'm looking for help on how to resolve "data block" waits that are found in the V$WAITSTAT table. What causes them and what's the best way to resolve this?
| The main cause of this is two processes trying to modify the block at the same time. However, it also occurs when one process is busy reading the block into the buffer cache, but it is not yet there, and another process needs to access the same block. You can tell the difference by looking at P3 in V$SESSION_WAIT if you are lucky enough to catch some of the buffer busy waits there. A 0 or 1014 value is the read case; all other values are the modification contention case.

Your next step should be to try to identify which database segments are involved. Once again, if you can catch some waits in V$SESSION_WAIT then you can use the P1 & P2 (file & block) values to lookup the segment name in DBA_EXTENTS. If it is a table, then 9 times out of 10 all you need to do is to recreate the table with more process freelists. One way of working out how many freelists to create is to dump some blocks from the segment not very far below the high-water mark and take a look at the interested transaction list size. The peak number of interested transactions plus one is the minimum number of process freelists that you need.

---|---

Transaction and process freelists

18 April 1999

---|---|---
How can I determine how many transaction freelists are available and how much space they consume? Does it depend on whether the FREELISTS parameter is specified in the STORAGE clause?
| Other than the freelist headers in the segment header block (or freelist block if multiple freelist groups have been specified), freelists do not consume any space at all. Blocks are linked into a freelist using a fixed pointer in the block header.

The number of freelists available depends on your database block size and Oracle version. For a 2K block size under 8.0.5 there can be up to 47 process and transaction freelists in total. At least 25 must be reserved for transaction freelists. You can see this number with the following query:

select kviival from x$kvii where kviitag = 'ktsmtf';

---|---

Row overhead

19 April 1999

---|---|---
I would like to know if the AVG_ROW_LEN column in USER_TABLES that is populated by the ANALYZE command includes the row directory overhead of 3 * UB1 for each row stored or not.

	The row directory overhead is 2 bytes per row, not 3. There is also a 3 byte overhead per row in the data section of the block: 1 for the total row length, 1 for flags, and 1 for the column count. AVG_ROW_LEN does not include the row directory overhead, but does include the other three bytes of row data overhead.

Blocks on freelists

25 May 1999

---|---|---
How can you calculate the number of blocks that are on a freelist, and how can you tell whether or not a block is on a freelist and which one it is on if there are more than one?

You can use DBMS_SPACE.FREE_BLOCKS to count the blocks on the freelists for a particular freelist group, but not down to the level of distinct freelists. It is not feasible to determine which freelist a particular block is on, because the freelists are singly linked. The code would have to start at the freelist header block (segment header) and follow all the freelists one block at a time to determine this, which makes it infeasible on performance grounds.

Freelist groups

24 June 1999

---|---|---
You suggested 2 freelist groups. Now every manual I know of relates this setting to multi-instance Oracle and I believe that the group is just on the instance number. So why 2?

You have buffer busy waits on the segment header blocks for your key tables. By having two freelist groups, you effectively move the freelists out of the segment header block into a separate freelist block. There will be two such freelist blocks, one of which will never be used. But the load that you presently have on the segment header block will be divided between the new segment header block and the first freelist block.

INITRANS and MAXTRANS

29 June 1999

---|---|---
I found an Oracle book that said that the value of INITRANS and MAXTRANS can be low if the index entries are large. Could you tell me why?
| When a transaction modifies a block, it needs exclusive use of a transaction slot, from that moment, until it commits (or rolls back). This is used to implement row level locking. If a table block only contains one row, then only one transaction slot can be needed. Similarly, if an index block only has 3 keys, then at most 4 transaction slots can be needed (the extra one is for a recursive transaction for an index block split).

Although the number of row/keys in a block sets the maximum for the number of transaction slots that might be needed, it is most unusual to require that many slots. Indeed, given that Oracle can dynamically allocate extra transaction slots from internal block free space as required, it is questionable whether you should use a non-default INITRANS setting at all, except in unusual circumstances (such as parallel DML). Another exception is where rows/keys are updated in sequence by distinct transactions.

So your author is right about INITRANS in general. However it has more to do with Oracle's ability to dynamically allocate transaction slots than the questionable assumption of random data access. I would also caution against tampering with MAXTRANS.

---|---

First INSERT slow

30 June 1999

---|---|---
The main account table is processed during batch by an array fetch and update. The first INSERT to the account table does 55000 disk reads and takes 15 minutes. Subsequent INSERTs work normally.

This sounds like a freelist issue. If your overnight process first drops block space usage below PCTUSED, and then raises it close to PCTFREE again, you will thereby create a very long transaction freelist. The first insert has to migrate that transaction free list to the master freelist, and unlink the leading blocks on the freelist that do not have enough free space for the row to be inserted. You can control the maximum number of blocks that will be unlinked from the freelist by the first insert with the _release_insert_threshold parameter. I think this is supposed to default to 5 blocks, but if my hunch is correct, it seems to be much higher in your case. Alternately, you can reduce PCTUSED to avoid the problem entirely.

Strange enqueue wait

1 October 1999

---|---|---
Why would Oracle incur an enqueue wait on update statements issued by two database sessions that touch two different records if both records are on the same block, but otherwise succeed if the statements update two records from different blocks?

SESSION_ID TYPE ID1        ID2        MODE_HELD      MODE_REQUESTED
---------- ---- ---------- ---------- -------------- --------------
        19 TX   262182     71967      None           Share
        18 TX   262182     71967      Exclusive      None

	You can tell from the fact that the waiting session wants a SHARE mode lock that it is waiting for a free transaction slot in the block. The problem is that there is not enough free space in the block to dynamically allocate another transaction slot.

Impact of multiple freelists

25 October 1999

---|---|---
I have found out that freelists on one table was set up to 10. Generally you set freelists to greater than 1 (maybe 2) for heavy inserting tables. My question is, What kind negative impact do high freelists have on a table aside from more storage requirements?
| Multiple process freelist have a minimal impact on table storage requirements. On average each process freelist will have either 2 or 3 blocks. That is, the difference between 2 freelists and 10 freelists in terms of storage space is likely to be just 20 more blocks below the highwater mark.

The freelist headers themselves have no storage requirements, as they go into the freelist group block or segment header block. Having a high number of process freelists does limit the space in that block for the dynamic creation of transaction freelists to which blocks that fall below PCTUSED may be returned. However, there is a database block size based minimum to prevent the creation of too many process freelists. That minimum can be seen with

select kviidsc, kviival from x$kvii where kviitag = 'ktsmtf';

---|---

Block header size

25 October 1999

---|---|---
I'd like to know the space actually reserved for row data in a block. I suppose it's (total block size - block header size). In this case the problem becomes knowing the block header size.

The common block header takes 20 bytes, and the tail takes 4 bytes. The transaction control header takes 24 bytes, and there are 24 bytes for each transaction slot. Then there is a 14-byte common header for the data layer. Cluster blocks have a table directory, followed by a row directory. Each slot in the table directory takes 4 bytes, and each slot in the row directory takes 2 bytes. Table blocks are the same as cluster block, except that there can only be one table in the table directory. Index blocks have a 16-byte common index block header following the common data block header, and then either a 24-byte branch block header, or a 32-byte leaf block header, depending on the block type. These index headers are followed by a row directory, as for clusters and tables. There is also a 3-byte row header on each row in the data area. So there you have it ... it depends.

Minimum row length

17 November 1999

---|---|---
In a recent article in Oracle Professional magazine, it says

> PCTFREE will always default to at least 10 percent, even if its set lower than 10 percent

and to confirm this the author suggests

> Create a new tablespace; create a single table with PCTFREE set to 1 or so, and load short rows of just a few characters. It should be easy to find the block and see. Use od on Unix. Anyway, the data appears from the back and moves forward in the block. I would say if there is a significant blank spot ('00') at the beginning of the data, then it is indeed leaving behind some space".

Can you throw any light on this?

The author is mistaken. This phenomenon that he has observed is due to the minimum row length requirement. If he were to use rows longer than 6 bytes of real data, he would be able to fill up a block completely. It is not necessary to resort to od to see the available space, it is much preferable to use an Oracle blockdump because Oracle only logically deletes rows in blocks by changing a flag until such time as the space is needed.

Freelists

8 December 1999

---|---|---
Will an insert statement lock the table being inserted into? If so, is there any way to avoid this? I have one table into which all transactions are inserting. I feel this table is acting as a bottleneck and slowing down all the users. What is the best way to approach this?

	Such a table should be recreated with multiple process FREELISTS. To get the appropriate number of freelists, take a blockdump of some of the blocks in the table, and use the maximum number of interested transaction list entries that you see in any one block.

Minimum row length

9 December 1999

---|---|---
I am unaware of a minimum row length. Can you enlighten me some more here?

	The minimum row length requirement is that at least 11 bytes of data area are required for each row, to allow for the possibility of all the rows in the block needing to be migrated. The 11 bytes are 2 for a row directory entry, 3 for the row header and 6 bytes for the rowid of the migrated row piece. Therefore no more than (data area size / 11) rows can be placed in any one block.

Freelists

11 December 1999

---|---|---
Is there any benefit in increasing the number of freelists on a single CPU machine? Also, is there any point in doing so if there are no deletes, because all inserts will go to the high-water mark, rather than using freelists?

	Yes, I have seen benefit in multiple freelists on a single CPU machine. While buffer lock duration is brief, it is still long enough relative to a time-slice to make a significant impact. The frequency of deletes is not a factor. All inserts use freelists, even when inserting at the high-water mark.

Freelists

27 December 1999

---|---|---
I am trying to understand transaction and process freelists. What is the procedure when Oracle needs to get a b