Cache Fusion Recovery

This article walks through the cases that arise during instance crash recovery in a RAC
environment.

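The lock mode abbreviations used in this article are defined in the following figure:

[Figure: lock_le.jpg, meanings of the lock element mode abbreviations]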

After an instance dies and the failure is detected, the SMON process of a surviving
instance will start the first pass log read of the failed instance’s redo thread. 
SMON will merge the redo thread ordered by SCN to ensure that changes are written in
an orderly fashion.  SMON will also find BWR (block written records) in the redo stream
and remove entries that are no longer needed for recovery because they were past
images of blocks already written to disk.  The final product of the first pass log
read is a recovery set that only contains blocks modified by the failed instance
with no subsequent BWR to indicate that the blocks were later written.  The entries
in the recovery set are ordered by first-dirty SCN, which fixes the order in which
instance recovery locks are acquired.  For each block in the recovery set, the
recovering SMON process then informs the master node of the block’s lock element that
it is taking ownership of the block and lock for recovery.  How this is handled depends
on the current ownership of the lock element, as described below:
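
The first pass described above can be modeled with a short Python sketch. This is only
an illustration, not Oracle's implementation: the `RedoRecord` layout, the "change" and
"bwr" record kinds, and the function name `build_recovery_set` are hypothetical, and a
BWR is assumed to make every earlier change to the same block unnecessary for recovery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    """Hypothetical, simplified redo record."""
    scn: int
    block: int
    kind: str  # "change" for a block change, "bwr" for a block written record


def build_recovery_set(redo):
    """Return [(block, first_dirty_scn), ...] ordered by first-dirty SCN."""
    first_dirty = {}
    # Read the failed instance's redo in SCN order.
    for rec in sorted(redo, key=lambda r: r.scn):
        if rec.kind == "change":
            # Remember only the first SCN at which the block became dirty.
            first_dirty.setdefault(rec.block, rec.scn)
        elif rec.kind == "bwr":
            # The block reached disk, so earlier changes need no recovery.
            first_dirty.pop(rec.block, None)
    # Locks are acquired in first-dirty SCN order.
    return sorted(first_dirty.items(), key=lambda item: item[1])
```

A block that is changed again after its BWR re-enters the set with the SCN of that
later change as its first-dirty SCN.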


Case 1: LE not open (or in NL0 mode) on recovering instance, no other instance owns
the lock element.  That is, the recovering instance holds the lock in NL0 mode and no
other instance holds the LE:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |               |     |               |    |                |
    -----------------     -----------------    ------------------

Action: Acquire the lock element in XL0 mode, read the block from disk, and apply the
redo changes.  DBWR will write out the recovery buffer when complete.

With the LE held in XL0 mode (a local exclusive lock) and no past image of the block
available, the block to be recovered must be read from disk and then recovered from
the redo log:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      XL0      |     |               |    |                |
    -----------------     -----------------    ------------------
            |
keep block in recovery list

Case 2: LE not open (or in NL0 mode) on recovering instance, other instance has LE
in SL0 or XL0 mode: 

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |               |     |      XL0      |    |                |
    -----------------     -----------------    ------------------

Action: No recovery needed because a current copy of the buffer already exists on
another instance, remove block entry from recovery set.  

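With the recovering instance in NL0 mode and another instance holding the LE in XL0
or SL0, this block does not need to be recovered:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |               |     |      XL0      |    |                |
    -----------------     -----------------    ------------------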
            |
remove block from recovery list

Case 3: LE not open (or in NL0 mode) on recovering instance, other instance has LE
in SG# or XG#.  The recovering instance is in NL0 mode while another instance holds a
global lock (SG# or XG#) and therefore has the current image of the block to be
recovered:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |               |     |      XG0      |    |                |
    -----------------     -----------------    ------------------

Action: Initiate write of current block, no recovery needed because a current copy of
the buffer already exists on another instance, remove block entry from recovery set.  
Write completion will release recovery buffer and lock as usual:

No recovery is needed, because another instance has the current image of the block:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      NG1      |     |               |    |                |
    -----------------     -----------------    ------------------
            |                        |
            |                     write block to disk
remove block from recovery list   

Case 4: LE not open (or in NL0 mode) on recovering instance, other instance has LE
in NG1. 

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |               |     |      NG1      |    |                |
    -----------------     -----------------    ------------------

Action: Get consistent read image of latest past image based on SCN, apply redo
changes and write out recovery buffer when complete.

Read the earlier consistent image of the block based on SCN, then apply the redo to
recover it:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    | acquires XG0  |     |      NG1      |    |                |
    -----------------     -----------------    ------------------
            |                         |
            |                  send CR block to recovering instance    
keep block in recovery list   

Case 5: LE open in recovering instance in SL0 or XL0, other instance has no lock.

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      XL0      |     |               |    |                |
    -----------------     -----------------    ------------------

Action: No recovery needed because a current copy of the buffer already exists on
the recovering instance, remove block entry from recovery set.

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      XL0      |     |               |    |                |
    -----------------     -----------------    ------------------
            |
remove block from recovery list

Case 6: LE open in recovering instance in SG# or XG#, other instance doesn’t matter:

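The recovering instance holds a consistent image of the current block, so no instance
recovery is needed; DBWR will write the contents of the recovery buffer to disk:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      XG0      |     |      NG1      |    |                |
    -----------------     -----------------    ------------------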

Action: Initiate write of current block, no recovery needed on recovering instance.
Release recovery buffer and decrement past image count when block write completes.

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      XG0      |     |      NG1      |    |                |
    -----------------     -----------------    ------------------
            |
write block to disk
remove block from recovery list

Case 7: LE open in recovering instance in NG1 mode, other instance has LE in SG# or
XG# mode.

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      NG1      |     |      XG0      |    |                |
    -----------------     -----------------    ------------------

Action: Initiate write of current block on remote instance, no recovery needed on
recovering instance.  Release recovery buffer and decrement past image count when
block write completes:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      NG1      |     |      XG0      |    |                |
    -----------------     -----------------    ------------------
            |                        |
            |                     write block to disk
remove block from recovery list   

Case 8: LE open in recovering instance in NG1 mode, other instance has LE in NG#
mode:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    |      NG1      |     |      NG0      |    |                |
    -----------------     -----------------    ------------------

Action: Get consistent read copy of block from highest past image based on SCN. 
Apply redo changes and write out recovery buffer when complete:

    -----------------     -----------------    ------------------
    |  Recovering   |     |  Other Open   |    |     Failed     |
    |   Instance    |     |   Instance    |    |    Instance    |
    |               |     |               |    |                |
    |   Lock Held   |     |   Lock Held   |    |   Lock Held    |
    | on LENUM 123: |     | on LENUM 123: |    | on LENUM 123:  |
    | acquires XG1  |     |      NG0      |    |                |
    -----------------     -----------------    ------------------
            |                         |
            |                  send CR block to recovering instance    
keep block in recovery list   
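
The eight cases above amount to a lookup on two inputs: the mode in which the
recovering instance holds the LE and the mode in which another open instance holds it.
A minimal Python sketch of that lookup follows. The names `Action`, `classify` and
`recovery_action` are hypothetical, the mode strings are taken from the diagrams, and
any combination the cases do not describe is rejected rather than guessed.

```python
from typing import NamedTuple, Optional


class Action(NamedTuple):
    case: int
    steps: str
    keep_in_recovery_set: bool


def classify(mode: Optional[str]) -> str:
    """Group a mode string such as 'XL0' or 'NG1' into the categories the cases use."""
    if mode is None or mode == "NL0":
        return "none"      # LE not open, or held in NL0
    if len(mode) < 3:
        raise ValueError(f"unrecognized lock mode: {mode!r}")
    lock, scope = mode[0], mode[1]
    if lock in "SX" and scope == "L":
        return "local"     # SL0 / XL0
    if lock in "SX" and scope == "G":
        return "global"    # SG# / XG#
    if lock == "N" and scope == "G":
        return "pi"        # NG#: only a past image is held
    raise ValueError(f"unrecognized lock mode: {mode!r}")


# Keyed by (recovering instance category, other instance category).
_TABLE = {
    ("none", "none"): Action(1, "acquire XL0, read block from disk, apply redo", True),
    ("none", "local"): Action(2, "current copy exists on other instance", False),
    ("none", "global"): Action(3, "other instance writes current block", False),
    ("none", "pi"): Action(4, "get CR copy of latest past image, apply redo", True),
    ("local", "none"): Action(5, "current copy already cached, no recovery", False),
    ("pi", "global"): Action(7, "remote instance writes current block", False),
    ("pi", "pi"): Action(8, "get CR copy of highest past image, apply redo", True),
}


def recovery_action(recovering: Optional[str], other: Optional[str]) -> Action:
    r, o = classify(recovering), classify(other)
    if r == "global":
        # Case 6: the other instance's mode does not matter.
        return Action(6, "recovering instance writes current block", False)
    try:
        return _TABLE[(r, o)]
    except KeyError:
        raise ValueError(
            f"combination not covered by cases 1-8: {recovering!r}, {other!r}"
        ) from None
```

For example, `recovery_action("NG1", "NG0")` selects Case 8 and keeps the block in the
recovery set, while `recovery_action("NL0", "XL0")` selects Case 2 and drops it.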

After the above operation the recovering instance should have locks on every block
in the recovery set.  Other instances will not be able to acquire these locks until
the recovery operation is completed.  When blocks are cached for recovery, instance
recovery buffers cannot be replaced or aged out except by another recovery buffer
request.  At this point the second pass log read and redo application can begin.
When the second pass log read begins, the redo threads of the failed instances are
again merged by SCN and the redo is applied to the datafiles.
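
The SCN-ordered merge of several redo threads in the second pass can be sketched with
Python's `heapq.merge`. The tuple layout `(scn, block, change)`, the function name
`apply_second_pass`, and the in-memory `buffers` dictionary standing in for recovery
buffers are assumptions made for illustration; each input thread is assumed to be
already ordered by SCN.

```python
import heapq


def apply_second_pass(threads, recovery_set, buffers=None):
    """Merge redo threads by SCN and apply changes to blocks in the recovery set.

    threads      -- iterable of lists of (scn, block, change), each sorted by SCN
    recovery_set -- set of block ids produced by the first pass
    buffers      -- dict of block id -> list of applied changes (stand-in for buffers)
    """
    if buffers is None:
        buffers = {}
    # heapq.merge lazily yields records from all threads in ascending SCN order.
    merged = heapq.merge(*threads, key=lambda rec: rec[0])
    for scn, block, change in merged:
        # Redo for blocks outside the recovery set is skipped.
        if block in recovery_set:
            buffers.setdefault(block, []).append(change)
    return buffers
```

Because the merge is lazy, the threads do not have to be concatenated and re-sorted
before redo application starts.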

Instance Recovery Failure Scenarios:

        o If recovery fails without the death of the recovering instance, instance
          recovery will be restarted.

        o If the recovering instance dies, a surviving instance (if one exists) will
          acquire the instance recovery enqueue and start recovery.  Crash recovery
          will be necessary if all instances are down.

        o If a non-recovering instance fails, SMON will abort recovery, release the
          IR enqueue, and the next live instance will re-attempt instance recovery.

        o If there are I/O errors, the file is taken offline and instance recovery
          is restarted.  If the file is the system datafile the recovering instance
          will crash; eventually all instances in the cluster will go down and
          media recovery will be required.

        o If block corruption is encountered during redo application, online block
          recovery will attempt to clean up the block so that instance recovery
          can proceed.

Online Block Recovery for Cache Fusion
--------------------------------------

When a data buffer becomes corrupt in an instance’s cache, the instance will
initiate online block recovery.  Block recovery will occur if either a foreground
process dies while applying changes or an error is generated during redo application. 
In the first case, PMON initiates block recovery and in the second case the
foreground process initiates block recovery.  Online block recovery consists of
finding the block’s predecessor and applying redo changes from the online logs of the
thread in which corruption occurred.  The predecessor of a fusion block is its most
recent past image.  If there is no past image then the block on disk is the
predecessor.  For non-fusion blocks, the disk copy is always the predecessor.

If the LE of the block needing recovery is held in XL0 status then the predecessor
will be located on disk.

If the LE of the block needing recovery is held in XG# status then the predecessor
will exist in another instance’s buffer cache.  The instance with the highest SCN PI
image of the block will send a consistent read copy of the block to the recovering
instance.
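
The predecessor rules above can be condensed into a small Python sketch. The function
name `find_predecessor`, the `past_images` mapping (instance name to the SCN of the
past image it holds), and the `fusion` flag are hypothetical names introduced for
illustration only.

```python
def find_predecessor(le_mode, past_images, fusion=True):
    """Return (source, scn) for the block image that redo is applied to.

    source is "disk" or the name of the instance holding the chosen past image;
    scn is None when the predecessor is the disk copy.
    """
    # Non-fusion blocks, XL0 locks, and blocks without any past image
    # all fall back to the copy on disk.
    if not fusion or le_mode == "XL0" or not past_images:
        return ("disk", None)
    # Otherwise the instance holding the highest-SCN past image supplies it.
    instance = max(past_images, key=past_images.get)
    return (instance, past_images[instance])
```

With `past_images = {"inst2": 100, "inst3": 140}` and an LE held in XG1, the sketch
picks the image held by `inst3`.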

Media Recovery for Cache Fusion
-------------------------------

Cache fusion does not impact the existing mechanism for media recovery.


