RAC object remastering ( Dynamic remastering )

In RAC, every data block is mastered by an instance. Mastering a block simply means that master instance keeps track of the state of the block until the next reconfiguration event (due to instance restart or otherwise).

Hash to the master

These data blocks are mastered in block ranges. For example, range of blocks starting from file 10, block 1 through block 128 may be mastered by instance 1, blocks from file 10, block 129 through 256 are mastered by instance 2 etc. Of course, there are differences between various versions 10g, 11g etc, but Idea here is that block ranges are uniformly mastered between various instances so that Global cache grants are evenly distributed among the instances. Interestingly, length of the block range is 128 from 10g onwards (Julian Dyke mentioned that is 1089 in 9i, but I have not personally tested it). Of course, Support recommends you to unset db_file_multiblock_read_count which will be auto adjusted to 128 which means that Full block range can be read with fewer GC messages, I suppose. I digress.

Further, Michael Möller pointed out that this hash-algorithm is further optimized: The hash-algorithm used when initially computing the master node from the DBA, results in a “virtual master”, which is then translated to a real (online&open) master by a lookup table (the length of which is the maximum number of possible nodes (128 ?). This means that when one node goes off/on-line, RAC does NOT have to recalculate the hash for all blocks, but only distribute the new Hash-to-node table. (One can later visualize dynamic remastering as an additional lookup table between the hash value and node. This table also needs redistributing on node changes.)

Following SQL is helpful in showing masters and owners of the block. This SQL joins, x$kjbl with x$le to retrieve resource name. If you are familiar with Oracle locking strategy, you would probably recognize the format of these cache fusion (aka old PCM) locks. Lock type in this case is BL, id1 is block# and id2 is file_id in this case. Column kjblname2 provides a decimal format lock resource.

Please observe the output below:

  1. Block range: File 1, block 6360-6376 is mastered by node 3 and also owned by node 3.
  2. Block range: File 1, blocks upto 10709 is mastered by instance 0 and owned by instance 3.
  3. Next block range from 10770 is mastered by instance 2 and owned by 3.

Also, note that this output is generated from a database with no remastering done yet.

REM In kjblname2 first entry before ',' is block and seond entry file_id*65536 for BL locks select kj.*, le.le_Addr from ( select kjblname, kjblname2, kjblowner, kjblmaster, kjbllockp, substr ( kjblname2, instr(kjblname2,',')+1, instr(kjblname2,',',1,2)-instr(kjblname2,',',1,1)-1)/65536 fl, substr ( kjblname2, 1, instr(kjblname2,',')-1) blk from x$kjbl ) kj, x$le le where le.le_kjbl = kj.kjbllockp order by le.le_addr / KJBLNAME KJBLNAME2 KJBLOWNER KJBLMASTER FL BLK LE_ADDR ---------------------- --------------- ---------- ----------- --- ----- ---------------- [0x18d8][0x10000],[BL] 6360,65536,BL 3 3 1 6360 000000038DF9AB08 ... [0x18e7][0x10000],[BL] 6375,65536,BL 3 3 1 6375 000000038DFBF818 => case #2 [0x18e8][0x10000],[BL] 6376,65536,BL 3 3 1 6376 000000038DFD3BA0 ... [0x29d1][0x10000],[BL] 10705,65536,BL 3 0 1 10705 00000005D6FE9230 ... [0x29d5][0x10000],[BL] 10709,65536,BL 3 0 1 10709 000000038EFBB990 [0x2a12][0x10000],[BL] 10770,65536,BL 3 2 1 10770 000000075FFE3C18 ... [0x2a18][0x10000],[BL] 10776,65536,BL 2 2 1 10776 000000038EFA8488 => case #1 [0x2a19][0x10000],[BL] 10777,65536,BL 3 2 1 10777 000000038EFB7F90 [0x2a1a][0x10000],[BL] 10778,65536,BL 1 2 1 10778 000000038EFCC318