standby ORA-00367 ORA-19527

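2009.11.01 10:08 AM »Author: bosonmaster »
I had only played with Data Guard once or twice before, so I ran into quite a few problems during this implementation.
When starting recovery on the standby database, the following errors appeared in alert.log: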
Errors in file /u01/app/oracle/admin/dczh/bdump/dczh1_mrp0_1092.trc:
ORA-00367: checksum error in log file header
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/dev/vgora/rlvol_redo1_11_512'
Clearing online redo logfile 1 /dev/vgora/rlvol_redo1_11_512
Clearing online log 1 of thread 1 sequence number 1347
Sat Oct 31 00:16:25 2009
Errors in file /u01/app/oracle/admin/dczh/bdump/dczh1_mrp0_1092.trc:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/dev/vgora/rlvol_redo1_11_512'
Clearing online redo logfile 1 complete
 
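After looking up Metalink document 352879.1, I set the log_file_name_convert parameter. It has to be set whether or not the file names on the primary and the standby are identical. Once I set it, the problem was resolved.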
 
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.1.0.7
This problem can occur on any platform.
This issue is seen starting in release 10gR2
Symptoms
Upon starting the Managed Recovery Process in a Standby Database the following Errors may be seen
 
Thu Oct 27 09:41:47 2005
Attempt to start background Managed Standby Recovery process (ora)
MRP0 started with pid=47, OS id=32094
Thu Oct 27 09:41:47 2005
MRP0: Background Managed Standby Recovery process started (ora)
Managed Standby Recovery not using Real Time Apply
Thu Oct 27 09:41:52 2005
Errors in file /app/oracle/admin/ora/bdump/ora_mrp0_32094.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u03/oradata/ora/ora_rdo01c.log'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
ORA-00312: online log 1 thread 1: '/u02/oradata/ora/ora_rdo01b.log'
ORA-27037: unable to obtain file status
Linux Error: 2: No such file or directory
Additional information: 3
 
If the files are created then you may then receive the following errors
 
 
Thu Oct 27 09:41:52 2005
Errors in file /app/oracle/admin/ora/bdump/ora_mrp0_32094.trc:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/ora01/oradata/ora/ora_rdo01a.log'
Clearing online redo logfile 1 complete
Media Recovery Waiting for thread 1 sequence 55
Thu Oct 27 09:41:53 2005
Completed: alter database recover managed standby database disconnect from session.
 
 
You may also see following messages on MRP startup even with log_file_name_convert parameter set
 
 
ORA-00312: online log 11 thread 2: '+ARCH_1/p2brp_dr/onlinelog/group_11.285.609666683'
ORA-17503: ksfdopn:2 Failed to open file +ARCH_1/p2brp_dr/onlinelog/group_11.285.609666683
ORA-15012: ASM file '+arch_1.285.609666683' does not exist
ORA-00312: online log 11 thread 2: '+DATA_1/p2brp_dr/onlinelog/group_11.299.609666681'
ORA-17503: ksfdopn:2 Failed to open file +DATA_1/p2brp_dr/onlinelog/group_11.299.609666681
ORA-15012: ASM file '+data_1.299.609666681' does not exist
 
 
 
 
 
 
 
Cause
This is in fact an Enhancement to the Data Guard Technology introduced in 10.2.0.
 
The goal here is to improve the speed of Switchover and Failover. In previous versions a Role Transition required the Online Redo Logfiles to be cleared before the standby could become a Primary Database. Now we attempt to clear the Online Redo Logfiles when starting Managed Recovery.

If the files exist then they will be cleared. If they do not exist, MRP reports the error, attempts to create the Online Redo Logfiles, and starts Recovery. Even if this is not possible because of a different structure and log_file_name_convert is not set, MRP does not fail; it only raises these errors.

As an extra enhancement, if the Online Redologs do exist you must specify the log_file_name_convert parameter even if there is no difference in the name. This has been implemented to reduce the chances that the Primary Online Redologs are cleared when MRP starts. It is the equivalent of asking: are you sure you want the logs to be called this?

If the log_file_name_convert parameter is not set then ORA-19527 is reported and the log file is not cleared at this time.
 
Solution
To stop both of these errors, ensure that log_file_name_convert is set correctly.
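As a rough sketch of that fix, the shell helper below only prints the ALTER SYSTEM statement for one pair of path prefixes; it does not connect to the database. The function name build_convert_sql is made up for illustration, and the example prefix is taken from the alert log at the top of this post. The sketch assumes the standby uses an spfile: log_file_name_convert is a static parameter, so the statement uses SCOPE=SPFILE and takes effect after the standby instance is restarted.

```shell
# Hypothetical helper: print (not execute) the statement that sets
# log_file_name_convert for one primary/standby prefix pair.
build_convert_sql() {
    primary_prefix=$1
    standby_prefix=$2
    if [ -z "$primary_prefix" ] || [ -z "$standby_prefix" ]; then
        echo "usage: build_convert_sql <primary_prefix> <standby_prefix>" >&2
        return 1
    fi
    # Static parameter: written to the spfile, active after a restart.
    printf "ALTER SYSTEM SET log_file_name_convert='%s','%s' SCOPE=SPFILE;\n" \
        "$primary_prefix" "$standby_prefix"
}

# Example: identical names on both sides still need the parameter.
# build_convert_sql /dev/vgora/ /dev/vgora/
```

The printed statement can be reviewed and then pasted into a SYSDBA session on the standby.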

hacmp 5.4.1 10gr2

2009.09.24 3:04 PM »Author: bosonmaster »

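When installing Oracle 10gR2 RAC on AIX, if the HACMP version is 5.4.1, the node information cannot be seen during the CRS installation, as in the screenshot below:
[Screenshot: aix]
This is an Oracle bug. The related patch is Patch 6718715: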
Oracle 10gR2 patchset 10.2.0.3, CRS bundled Patch 6160398 is required. In addition, rootpre.sh Patch 6718715 is required when using HACMP 5.4.1 with a fresh install of Oracle RAC clusterware or when upgrading from Oracle 10gR1 to Oracle 10gR2. This patch should be installed on all nodes before installing Oracle 10.2.0.1 software. Be sure to download the 10gR2 version of Patch 6718715.

When using HACMP on AIX, I recommend following document 404474.1. It can save you some detours.

RAC VIP ORA-12545

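2009.09.20 3:19 PM »Author: bosonmaster »
During a recent RAC implementation, I found that the customer's application kept getting ORA-12545 (connect failed because target host or object does not exist) when connecting to the RAC. Our earlier applications rarely hit this, because we always used preconnect together with the real addresses. This time we used the VIPs and the error kept appearing. I finally found the solution in Metalink 364855.1: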
Symptoms
When we try to connect to a RAC service name we sometimes get redirected by the first node's listener to the public address/hostname of the second node instead of its VIP address. An ORA-12545 error may be generated if that public hostname is not configured in DNS.
 
We were expecting the connection to eventually be redirected to the VIP of the other node.
 
 
Cause
The Database on one RAC node remote registers with the wrong local IP address to the listener on the other RAC node (e.g. the public IP address instead of the wanted VIP address).
 
The PMON process handles database registration to the local and remote listeners. For remote listeners registration PMON will have to find out what is the IP address of the local system in order to present it to the remote listener as database contact address.
 
In the default Oracle configuration, for hosts which have more than one IP address configured on the network interfaces,  it is undefined which IP address will be selected for remote registration.
 
 
Solution
Modify the local_listener database parameter to point to the local VIP address. For the parameter value, use either an alias name whose DESCRIPTION field contains only the VIP address, or an explicit connection statement like the following:
alter system set LOCAL_LISTENER="(ADDRESS=(PROTOCOL=TCP)(HOST=<VIP_address>)(PORT=1521))" scope=both sid='instance_name';
Remember that these must be double quotes.
 
Where "instance_name" is the unique instance name. Issue this statement for all instances in the cluster. The LOCAL_LISTENER database parameter gives PMON a hint about which IP address it should use for remote registration with the other nodes' listener(s).
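Since the statement has to be issued once per instance, a small generator can help avoid typos. The sketch below is only an illustration: the function name build_local_listener_sql, the "instance vip" input format, and the example host names are all assumptions, and the helper prints the statements without running them.

```shell
# Hypothetical helper: print one ALTER SYSTEM statement per instance.
# Input on stdin, one line per instance: "<instance_name> <vip_host>"
# Optional first argument: listener port (defaults to 1521).
build_local_listener_sql() {
    port=${1:-1521}
    while read -r instance vip; do
        # Skip blank or incomplete lines.
        if [ -z "$instance" ] || [ -z "$vip" ]; then
            continue
        fi
        printf "alter system set LOCAL_LISTENER=\"(ADDRESS=(PROTOCOL=TCP)(HOST=%s)(PORT=%s))\" scope=both sid='%s';\n" \
            "$vip" "$port" "$instance"
    done
}

# Example with made-up names:
# printf 'rac1 rac1-vip\nrac2 rac2-vip\n' | build_local_listener_sql
```

The output keeps the double quotes around the address and the single quotes around the instance name, as in the statement above.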

Repairing packages, procedures, and functions

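2009.08.15 11:47 PM »Author: bosonmaster »
When, for whatever reason, all packages, procedures, and functions in the database become unusable, the following can be used as a temporary fix. Run it as the SYS user.
 
SQL> alter package standard compile;
 
Package altered.
 
SQL> alter package dbms_standard compile;
 
Package altered.
 
SQL> @?/rdbms/admin/utlrp.sql
 
PL/SQL procedure successfully completed.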
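
To rerun these steps and see how many objects are still invalid afterwards, something like the sketch below may be convenient. The function name print_recompile_sql and the final count query against dba_objects are additions for illustration, not part of the original fix; the helper only prints the commands.

```shell
# Hypothetical helper: print the SQL*Plus commands for the repair steps,
# followed by a count of objects that are still invalid.
print_recompile_sql() {
    cat <<'EOF'
alter package standard compile;
alter package dbms_standard compile;
@?/rdbms/admin/utlrp.sql
select count(*) from dba_objects where status = 'INVALID';
exit
EOF
}

# Assumed usage, as the oracle OS user on the database host:
# print_recompile_sql | sqlplus -S "/ as sysdba"
```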

Battling EM again

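2009.08.15 11:34 PM »Author: bosonmaster »
On Friday I installed Oracle for a customer on site. Platform: AIX 5309 64-bit, DB 10.2.0.4. At around 88% of the DBCA run it reported a problem with EM. I had seen this many times before, and fixing it after DBCA finishes usually causes no real trouble. After DBCA, I used the following method to reinstall EM: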
 
 
drop user sysman cascade;
drop role MGMT_USER;
drop user MGMT_VIEW cascade;
drop public synonym MGMT_TARGET_BLACKOUTS;
drop public synonym SETEMVIEWUSERCONTEXT;
Then install EM again:
 
emca -config dbcontrol db -repos create
But it was still not OK. The following errors were shown at startup:
ps: 0509-048 Flag -o was used with invalid list.
ps: Not a recognized flag: -
Usage: ps [-ANPaedfklmMZ] [-n namelist] [-F Format] [-o specifier[=header],...]
                [-p proclist][-G|-g grouplist] [-t termlist] [-U|-u userlist] [-c classlist] [ -T pid] [ -L pidlist]
Usage: ps [aceglnsuvwxU] [t tty] [processnumber]
 
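In the end the startup still failed. At first I did not pay much attention to the errors above, but after redoing everything several times without success, it was clear that the problem above had to be solved first.
 
On Metalink I found the following document: Doc ID 758568.1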
Applies to:
Enterprise Manager Grid Control - Version: 10.2.0.3 to 10.2.0.4
IBM AIX Based Systems (64-bit)
 
Symptoms
 
'emctl start dbconsole' command shows ps command error as below:
 
$ORACLE_HOME/bin/emctl start dbconsole
Oracle Enterprise Manager 10g Database Control Release 10.2.0.3.0
Copyright (c) 1996, 2006 Oracle Corporation. All rights reserved.
http://<host>:<port>/em/console/aboutApplication
ps: 0509-048 Flag -o was used with invalid list.
ps: Not a recognized flag: -
Usage: ps [-ANPaedfklmMZ] [-n namelist] [-F Format] [-o specifier[=header],...]
[-p proclist][-G|-g grouplist] [-t termlist] [-U|-u userlist] [-c classlist] [ -T pid] [ -L pidlist]
Usage: ps [aceglnsuvwxU] [t tty] [processnumber]
Starting Oracle Enterprise Manager 10g Database Control .............. started.
------------------------------------------------------------------
Cause
In emctl.pl we have a command as below:
ps -p $PID -o cmd --cols 1000 |grep DEMDROOT
 
In AIX platforms for some OS kernels, this command doesn't work. The correct command is:
ps -p $PID -o args | grep DEMDROOT
 
Solution
If you get these errors while starting DBConsole, follow the action plan below:
 
a) Stop DBConsole - 'emctl stop dbconsole'
b) Take a backup of 'emctl.pl' from $ORACLE_HOME/bin
c) Edit emctl.pl and go to line number 1249, which is:
my $ps=`ps -p $PID -o cmd --cols 1000 |grep DEMDROOT`;
Modify the above line as below:
my $ps=`ps -p $PID -o args | grep DEMDROOT`;
d) Save the file.
e) Start DBConsole - 'emctl start dbconsole' from $ORACLE_HOME/bin
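
Steps b) to d) can be sketched as a small shell function. This is only an illustration under assumptions: the function name patch_emctl_ps is invented, the substitution matches the text of the line rather than line number 1249, and it writes through a backup copy because in-place sed editing is not assumed to be available on AIX. Stopping and starting DBConsole (steps a and e) are left to be done by hand.

```shell
# Hypothetical helper: back up emctl.pl, then rewrite the ps call.
patch_emctl_ps() {
    target=$1
    if [ ! -f "$target" ]; then
        echo "not found: $target" >&2
        return 1
    fi
    if [ -e "$target.bak" ]; then
        # Do not overwrite an earlier backup of the original file.
        echo "backup already exists: $target.bak" >&2
        return 1
    fi
    cp -p "$target" "$target.bak" || return 1
    # Read from the backup and overwrite the original in place, so the
    # file keeps its owner and permissions.
    sed 's/-o cmd --cols 1000 *| *grep DEMDROOT/-o args | grep DEMDROOT/' \
        "$target.bak" > "$target"
}

# Assumed usage:
# patch_emctl_ps "$ORACLE_HOME/bin/emctl.pl"
```

Comparing the result with the backup (diff emctl.pl.bak emctl.pl) should show exactly one changed line.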
 
After applying the fix above, stopping EM produced the following error:
$ emctl stop dbconsole
 
Oracle Enterprise Manager 10g Database Control Release 10.2.0.4.0 
 
Copyright (c) 1996, 2007 Oracle Corporation.  All rights reserved.
 
https://host:1158/em/console/aboutApplication
 
Stopping Oracle Enterprise Manager 10g Database Control ...
 
--- Failed to shutdown DBConsole Gracefully ---
 
 failed.
 
Fix:
 
Kill the oc4j and dbconsole related processes, then stop the EM agent and dbconsole, and it is fine.
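
One way to find those processes is to filter the ps -ef listing. The sketch below is a guess at a reasonable filter, not an Oracle-documented procedure: the function name dbconsole_pids and the match patterns oc4j and dbconsole are assumptions, so review the listed PIDs before killing anything.

```shell
# Hypothetical filter: read `ps -ef` output on stdin and print the PID
# (second column) of every line mentioning oc4j or dbconsole.
# Lines containing "awk" are skipped so the filter does not list itself.
dbconsole_pids() {
    awk '/oc4j|dbconsole/ && !/awk/ { print $2 }'
}

# Assumed usage:
# ps -ef | dbconsole_pids                # review the PIDs first
# ps -ef | dbconsole_pids | xargs kill   # then kill them
```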
 
On restart it still did not come up. emctl.trc contained the following errors:
 
2009-08-14 19:20:15 Thread-1958 ERROR http: 11: Unable to initialize ssl connection with server, aborting connection attempt
2009-08-14 19:20:15 Thread-1958 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://host:1158/em/upload/: retStatus=-1
2009-08-14 19:20:15 Thread-1958 ERROR ssl: Open wallet failed, ret = 28750
2009-08-14 19:20:15 Thread-1958 ERROR ssl: nmehlenv_openWallet failed
2009-08-14 19:20:15 Thread-1958 ERROR http: 11: Unable to initialize ssl connection with server, aborting connection attempt
2009-08-14 19:20:15 Thread-1958 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://host:1158/em/upload/: retStatus=-1
2009-08-14 19:20:23 Thread-1960 ERROR upload: Error in uploadXMLFiles.  Trying again in 300.00 seconds.
2009-08-14 19:20:45 Thread-1966 ERROR ssl: Open wallet failed, ret = 28750
2009-08-14 19:20:45 Thread-1966 ERROR ssl: nmehlenv_openWallet failed
2009-08-14 19:20:45 Thread-1966 ERROR http: 12: Unable to initialize ssl connection with server, aborting connection attempt
2009-08-14 19:20:45 Thread-1966 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://host:1158/em/upload/: retStatus=-1
2009-08-14 19:20:45 Thread-1966 ERROR ssl: Open wallet failed, ret = 28750
2009-08-14 19:20:45 Thread-1966 ERROR ssl: nmehlenv_openWallet failed
2009-08-14 19:20:45 Thread-1966 ERROR http: 12: Unable to initialize ssl connection with server, aborting connection attempt
2009-08-14 19:20:45 Thread-1966 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://host:1158/em/upload/: retStatus=-1
2009-08-14 19:20:45 Thread-1967 ERROR ssl: Open wallet failed, ret = 28750
2009-08-14 19:20:45 Thread-1967 ERROR ssl: nmehlenv_openWallet failed
2009-08-14 19:20:45 Thread-1967 ERROR http: 12: Error initializing SSL connection for incoming request, aborting request. ret=-1
2009-08-14 19:20:52 Thread-1969 ERROR upload: Error in uploadXMLFiles.  Trying again in 300.00 seconds.
2009-08-14 19:21:14 Thread-1029 ERROR ssl: Open wallet failed, ret = 28750
2009-08-14 19:21:14 Thread-1029 ERROR ssl: nmehlenv_openWallet failed
 
Document 749243.1 gives the following solution:
Applies to:
Enterprise Manager Grid Control - Version: 10.2.0.1
This problem can occur on any platform.
 
Symptoms
Database Console fails to start with:
 
emctl start dbconsole
 
TZ set to Europe/Madrid
Oracle Enterprise Manager 10g Database Control Release 10.2.0.3.0
Copyright (c) 1996, 2006 Oracle Corporation. All rights reserved.
https://myserver.mydomain:5503/em/console/aboutApplication
Starting Oracle Enterprise Manager 10g Database Control
.............................................................................................
failed.
 
emdctl.trc
-----------
2008-09-15 10:58:20 Thread-4136126688 ERROR http: 8: Unable to initialize ssl connection with server, aborting connection attempt
2008-09-15 10:59:52 Thread-4136126688 ERROR ssl: nzos_Handshake failed, ret=29024.
 
Cause
The Dbconsole certificate life time has expired.
Solution
Run the commands:
 
1. Unsecure the Dbconsole
- Unsecure database control using
$ORACLE_HOME/bin>emctl unsecure dbconsole
 
2. Force an upload:
 
$ORACLE_HOME/bin> emctl upload
 
3. Also consider Resecuring the Dbconsole
- Secure database control using
$ORACLE_HOME/bin>emctl secure dbconsole
 
 Starting with 10.2.0.4, HTTPS is used by default.
 
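After the steps above everything was OK, although it took quite a few rounds of back and forth.
 
To sum up: when EM has a problem, be sure to go to the sysman/log directory under the hostname-like directory in $ORACLE_HOME and read the errors in the agent logs, emctl.trc, and emdb.nohup. Resolve those errors and I think your EM will have no further problems.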
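
When a trace file repeats the same few errors hundreds of times, as in the excerpt above, a quick summary makes the distinct problems easier to spot. The sketch below is one possible way to do it; the function name summarize_trc_errors is invented, and it assumes trace lines have the form "date time Thread-N ERROR message" as in the log shown earlier.

```shell
# Hypothetical helper: count distinct ERROR messages in an EM trace file,
# most frequent first.
summarize_trc_errors() {
    trc_file=$1
    if [ ! -r "$trc_file" ]; then
        echo "cannot read: $trc_file" >&2
        return 1
    fi
    awk '/ ERROR / {
             msg = $0
             sub(/^.* ERROR /, "", msg)   # drop timestamp and thread id
             count[msg]++
         }
         END { for (m in count) print count[m], m }' "$trc_file" | sort -rn
}

# Assumed usage, with EM_LOG_DIR set by hand to the sysman/log directory:
# summarize_trc_errors "$EM_LOG_DIR/emctl.trc"
```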