DBA工作备忘录之三:rman备份,未使用catalog,控制文件丢失的解决办法

** [Oracle] DBA工作备忘录之三:rman备份,未使用catalog,控制文件丢失的解决办法 **

作者: Fenng
日期:July 19 2004

情况描述

客户报告数据库故障,新来的系统管理员误操作。删掉了一些文件。
询问:删掉了那些文件?
答曰:所有重要数据文件,所有控制文件。数据库原来是归档模式,用rman备份数据,rman 使用控制文件。
幸运的是,最后一次rman full 备份是包括了控制文件在内。系统没有设定自动备份控制文件.现在状况是数据库无法启动.

不用说,客户的备份方案不够完善,但是这时候再去说这些话责备用户有事后诸葛亮之嫌,用户是上帝,不要去得罪他。还有,客户有Full备份(虽然不是自动备份控制文件,这样无法用常规的恢复步骤来进行恢复)。这对我们来说是个绝对的好消息。

下面我们通过一次模拟操作来演示这个问题的解决办法。

解决过程

首先,用控制文件作数据库系统的全备份:

代码:------------------------蓝色部分是输入内容,黑色部分是敏感信息,须加以注意----------------------------------------------------
C:WUTemp> rman target /

Recovery Manager: Release 9.2.0.1.0 - Production.

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: DEMO ( DBID=3272375326 )

RMAN> run {
2> allocate channel C1 type disk;
3> backup full tag 'FullBackup' format 'd:\KDE%d_%u_%s_%p.dbf' database include current controlfile;
4> sql ' alter system archive log current';
5> release channel C1;
6> }

using target database controlfile instead of recovery catalog
allocated channel: C1
channel C1: sid=15 devtype=DISK

Starting backup at 18-JUL-04
channel C1: starting full datafile backupset
channel C1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
input datafile fno= 00001 name=D:\ORACLE\ORADATA\DEMO\SYSTEM01.DBF
input datafile fno= 00002 name=D:\ORACLE\ORADATA\DEMO\UNDOTBS01.DBF
input datafile fno= 00004 name=D:\ORACLE\ORADATA\DEMO\EXAMPLE01.DBF
input datafile fno= 00009 name=D:\ORACLE\ORADATA\DEMO\XDB01.DBF
input datafile fno= 00005 name=D:\ORACLE\ORADATA\DEMO\INDX01.DBF
input datafile fno= 00008 name=D:\ORACLE\ORADATA\DEMO\USERS01.DBF
input datafile fno= 00003 name=D:\ORACLE\ORADATA\DEMO\DRSYS01.DBF
input datafile fno= 00006 name=D:\ORACLE\ORADATA\DEMO\ODM01.DBF
input datafile fno= 00007 name=D:\ORACLE\ORADATA\DEMO\TOOLS01.DBF
channel C1: starting piece 1 at 18-JUL-04
channel C1: finished piece 1 at 18-JUL-04
piece handle= D:\KDE\DEMO_01FR79OT_1_1.DBF comment=NONE
channel C1: backup set complete, elapsed time: 00:01:17
Finished backup at 18-JUL-04

sql statement: alter system archive log current

released channel: C1

--如上所示,我们做了一次数据库的Full备份.备份片中包括控制文件.注意上面输出内容的黑体部分.我们在后面的恢复操作中会用到.

模拟错误,关掉实例,删掉所有的控制文件和所有的.DBF文件。然后starup会看到如下的出错信息:

SQL> startup
ORACLE instance started.

Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
ORA-00205: error in identifying controlfile, check alert log for more info

查看alert Log,应该是系统找不到控制文件.现在情形和客户问题一致.不过在继续讲述之前,我们还需要介绍一点背景知识.

背景知识:

在Oracle 816 以后的版本中,Oracle提供了一个包:DBMS_BACKUP_RESTORE.DBMS_BACKUP_RESTORE 包是由dbmsbkrs.sql 和 prvtbkrs.plb 这两个脚本创建的.catproc.sql 脚本运行后会调用这两个包.所以是每个数据库都有的这个包是Oracle服务器和操作系统之间IO操作的接口.由恢复管理器直接调用。而且据说这两个脚本的功能是内建到Oracle的一些库文件中的.

由此可见,我们可以在数据库 nomount 情况下调用这些package ,来达到我们的恢复目的。在dbmsbkrs.sql 和prvtbkrs.plb 这两个脚本中有详细的说明文档,出于篇幅问题,就不一一加以翻译了,但在下面会直接引用一些原文说明。

关键的内容有:

** FUNCTION deviceAllocate(
type IN varchar2 default NULL
,name IN varchar2 default NULL
,ident IN varchar2 default NULL
,noio IN boolean default FALSE
,params IN varchar2 default NULL )
RETURN varchar2; **

-- Describe the device to be used for sequential I/O. For device types where
-- only one process at a time can use a device, this call allocates a device
-- for exclusive use by this session. The device remains allocated until
-- deviceDeallocate is called or session termination. The device can be used
-- both for creating and restoring backups.
--
-- Specifying a device allocates a context that exists until the session
-- terminates or deviceDeallocate is called. Only one device can be specified
-- at a time for a particular session. Thus deviceDeallocate must be called
-- before a different device can be specified. This is not a limitation since
-- a session can only read or write one backup at a time.
--
-- The other major effect of allocating a device is to specify the name space
-- for the backup handles (file names). The handle for a sequential file does
-- not necessarily define the type of device used to write the file. Thus it
-- is necessary to specify the device type in order to interpret the file
-- handle. The NULL device type is defined for all systems. It is the file
-- system supplied by the operating system. The sequential file handles are
-- thus normal file names.
--
-- A device can be specified either by name or by type.
-- If the type is specified but not the name, the system picks an
-- available device of that type.
-- If the name is specified but not the type, the type is determined
-- from the device.
-- If neither the type or the name is given, the backups are files in
-- the operating system file system.

-- Note that some types of devices, optical disks for example, can be shared
-- by many processes, and thus do not really require allocation of the device
-- itself. However we do need to allocate the context for accessing the
-- device, and we do need to know the device type for proper interpretation
-- of the file handle. Thus it is always necessary to make the device
-- allocation call before making most other calls in this package.
--
-- Input parameters:
-- type
-- If specified, this gives the type of device to use for sequential
-- I/O. The allowed types are port specific. For example a port may
-- support the type "TAPE" which is implemented via the Oracle tape
-- API. If no type is specified, it may be implied by specifying a
-- particular device name to allocate. The type should be allowed to
-- default to NULL if operating system files are to be used.
--
-- name
-- If specified, this names a particular piece of hardware to use for
-- accessing sequential files. If not specified, any available
-- device of the correct type will be allocated. If the device cannot
-- be shared, it is allocated to this session for exclusive use.
-- The name should be allowed to default to NULL if operating system
-- files are to be used.
--
-- ident
-- This is the users identifier that he uses to name this device. It
-- is only used to report the status of this session via
-- dbms_application_info. This value will be placed in the CLIENT_INFO
-- column of the V$SESSION table, in the row corresponding to the
-- session in which the device was allocated. This value can also
-- be queried with the dbms_application_info.read_client_info procedure.
--
-- noio
-- If TRUE, the device will not be used for doing any I/O. This allows
-- the specification of a device type for deleting sequential files
-- without actually allocating a piece of hardware. An allocation for
-- noio can also be used for issuing device commands. Note that some
-- commands may actually require a physical device and thus will get
-- an error if the allocate was done with noio set to TRUE.
--
-- params
-- This string is simply passed to the device allocate OSD. It is
-- completely port and device specific.
--
-- Returns:
-- It returns a valid device type. This is the type that should be
-- allocated to access the same sequential files at a later date. Note
-- that this might not be exactly the same value as the input string.
-- The allocate OSD may do some translation of the type passed in. The
-- return value is NULL when using operating system files.

** PROCEDURE restoreControlfileTo(cfname IN varchar2); **

-- This copies the controlfile from the backup set to an operating system
-- file. If the database is mounted, the name must NOT match any of the
-- current controlfiles.
--
-- Input parameters:
-- cfname
-- Name of file to create or overwrite with the controlfile from the
-- backup set.

**PROCEDURE restoreDataFileTo( dfnumber IN binary_integer
,toname IN varchar2 default NULL); **
--
-- restoreDataFileTo creates the output file from a complete backup in the
-- backup set.

如果您有兴趣可以去阅读一下这两个文件的注释说明.

我们首先尝试恢复控制文件:

SQL> startup force nomount;

SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'',ident=>'T1');
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreControlfileTo(cfname=>'d:\oracle\Control01.ctl');
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>'D:\KDE\DEMO_01FR79OT_1_1.DBF', params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /

PL/SQL procedure successfully completed.

OK,控制文件恢复完成.对以上内容的解释:

第五行 分配一个device channel,因为使用的操作系统文件,所以这里为空,如果是从磁带上恢复要用
"sbt_tape";
第六行 指明开始restore ;
第七行 指出待恢复文件目标存储位置;
第八行 从哪个备份片中恢复;
第九行 释放设备通道.

不妨对以上操作的结果验证一下:

SQL> host dir d:\oracle
Volume in drive D is DATA
Volume Serial Number is DC79-57F8
Directory of d:\oracle

07/18/2004 09:08 PM

  1<dir> .   
  207/18/2004 09:08 PM <dir> ..   
  306/08/2004 03:21 PM <dir> admin   
  4**07/18/2004 09:08 PM 1,871,872 CONTROL01.CTL**   
  507/16/2004 11:27 AM <dir> ORA92   
  607/18/2004 09:02 PM <dir> oradata   
  7  
  8这样,我们成功的restore了控制文件 .如果控制文件在Full备份之后单独做的,接下来关掉实例,拷贝控制文件到具体位置,然后rman 执行restore database;即可。 
  9
 10可是,我们这里的情况有些不同. 
 11
 12视丢失文件的情况而定,继续进行如下的恢复操作:   
 13  
 14代码:--------------------------------------------------------------------------------   
 15SQL&gt; DECLARE    
 162  devtype varchar2(256);    
 173  done boolean;    
 184  BEGIN    
 195  devtype:=sys.dbms_backup_restore.deviceAllocate (type=&gt;'',ident=&gt;'t1');    
 206  sys.dbms_backup_restore.restoreSetDatafile;    
 217  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;01,toname=&gt;'   
 22d:\oracle\oradata\demo\SYSTEM01.DBF');    
 238  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;02,toname=&gt;'   
 24d:\oracle\oradata\demo\UNDOTBS01.DBF');    
 259  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;03,toname=&gt;'   
 26d:\oracle\oradata\demo\DRSYS01.DBF');    
 2710  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;04,toname=&gt;'   
 28d:\oracle\oradata\demo\EXAMPLE01.DBF');    
 2911  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;05,toname=&gt;'   
 30d:\oracle\oradata\demo\INDX01.DBF');    
 3112  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;06,toname=&gt;'   
 32d:\oracle\oradata\demo\ODM01.DBF');    
 3313  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;07,toname=&gt;'   
 34d:\oracle\oradata\demo\TOOLS01.DBF');    
 3514  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;08,toname=&gt;'   
 36d:\oracle\oradata\demo\USERS01.DBF');    
 3715  sys.dbms_backup_restore.restoreDatafileTo(dfnumber=&gt;09,toname=&gt;'   
 38d:\oracle\oradata\demo\XDB01.DBF');    
 3916  sys.dbms_backup_restore.restoreBackupPiece(done=&gt;done,handle=&gt;'   
 40D:\KDE\DEMO_01FR79OT_1_1.DBF', params=&gt;null);    
 4117  sys.dbms_backup_restore.deviceDeallocate;    
 4218  END;    
 4319  / 
 44
 45PL/SQL procedure successfully completed.   
 46  
 47\--我们的情形是所有的数据文件都丢失了,那就如法炮制 ...........   
 48\--文件对应编号来自前面全备份时候的屏幕输出内容.所以,在备份的时候保留操作Log是个很好的习惯. 
 49
 50SQL&gt; startup force mount;    
 51ORACLE instance started. 
 52
 53Total System Global Area 152115804 bytes   
 54Fixed Size 453212 bytes   
 55Variable Size 100663296 bytes   
 56Database Buffers 50331648 bytes   
 57Redo Buffers 667648 bytes   
 58Database mounted.   
 59SQL&gt; Recover database using backup controlfile until cancel ;    
 60ORA-00279: change 243854 generated at 07/18/2004 20:57:03 needed for thread 1   
 61ORA-00289: suggestion : D:\KDE\ARC00002.001   
 62ORA-00280: change 243854 for thread 1 is in sequence #2   
 63Specify log: {<ret>=suggested | filename | AUTO | CANCEL}   
 64  
 65D:\KDE\ARC00002.001    
 66ORA-00279: change 244089 generated at 07/18/2004 20:58:18 needed for thread 1   
 67ORA-00289: suggestion : D:\KDE\ARC00003.001   
 68ORA-00280: change 244089 for thread 1 is in sequence #3   
 69ORA-00278: log file 'D:\KDE\ARC00002.001' no longer needed for this recovery   
 70  
 71Specify log: {<ret>=suggested | filename | AUTO | CANCEL}   
 72cancel    
 73Media recovery cancelled.   
 74  
 75SQL&gt; alter database open resetlogs; 
 76
 77Database altered.   
 78  
 79最后,不得不resetlogs . 
 80
 81然后,打扫战场,马上进行数据库的全备份。如果您是DBA的话,应该进一步制定并完善备份计划.亡羊补牢,为时未晚。   
 82
 83
 84** 总结一下  **
 85
 861 控制文件在备份中意义重大,建议每次对其单独备份,如果数据库版本允许的话,应该设置为控制文件自动备 份。同时应该尽可能地增大CONTROL_FILE_RECORD_KEEP_TIME这个初始化参数的值。以便备份信息能更长时间的保留 
 87
 882 应该制定比较完善的备份计划,否则备份计划一旦出现缺口,将可能给系统带来灾难.记住, "可能出错的地方一定会出错". 
 89
 903 熟悉RMAN内部备份机制,对DBMS_BACKUP_RESTORE的用法有一定的掌握在关键时侯很有帮助. 
 91
 924 备份脚本应该对Log重定向并保存.以便在出错的查找有用信息.   
 93
 94
 95**参考资料** : 
 96
 97**RMAN Recovery Without Recovery Catalog or Controlfiles by Bonnie Bizzaro** **dbmsbkrs.sql 和 prvtbkrs.plb 文件说明注释** (可在你的系统 $ORACLE_HOME/rdbms/admin/中找到.)   
 98
 99
100**相关链接** :   
101
102
103本文的更多讨论,请参考这里:   
104http://www.itpub.net/244345.html    
105DBA工作备忘录之二: Exp出错的一个案例   
106http://www.itpub.net/showthread.php?s=&amp;threadid=238819    
107DBA工作备忘录之一:用events 跟踪解决不能创建物化试图一例   
108http://www.dbanotes.net/Oracle/Oracle-Case-of-10046_I.htm    
109  
110  
111  
112  
113**原文出处**   
114<a href="http://www.dbanotes.net/Oracle/Rman_nocatalog_lost_controlfile_howto.htm">   
115http://www.dbanotes.net/Rman_nocatalog_lost_controlfile_howto.htm</a>   
116  
117  
118
119
120* * *
121
122  
123本文作者:   
124Fenng,现任某美资公司DBA,业余时间混迹于各数据库相关的技术论坛。目前关注如何利用ORACLE数据库有效的构建企业应用。对Oracle tuning、troubleshooting有一点研究。个人技术站点:  http://www.dbanotes.net/  。可以通过电子邮件 [email protected] 联系到他。   
125  
126
127
128* * *  
129  
130---  
131  
132
133
134本文为DBAnotes.net版权所有,转载请注明出处、作者并尽量保留本文所有超链接。 
135
136* * *</ret></ret></dir></dir></dir></dir></dir>
Published At
Categories with 数据库类
Tagged with
comments powered by Disqus