2008年12月31日星期三

再--见oracle bug

oracle的bug果然是无休无止, 今天有幸再次碰到了一个:4519934
Doc ID: 4519934.8
现象是alert log中出现如下的错误信息:
*** ACTION NAME:(AUTO_SPACE_ADVISOR_JOB) 2008-12-30 22:00:12.402
*** MODULE NAME:(DBMS_SCHEDULER) 2008-12-30 22:00:12.402
*** SERVICE NAME:(SYS$USERS) 2008-12-30 22:00:12.402
*** SESSION ID:(116.3582) 2008-12-30 22:00:12.402
*** 2008-12-30 22:00:12.402
ORA-12012: 自动执行作业 39537 出错
ORA-44003: SQL 名称无效

2008年12月30日星期二

Oracle中常用的to_Char用法详解

The following are number examples for the to_char function.
to_char(1210.73, '9999.9') would return '1210.7'
to_char(1210.73, '9,999.99') would return '1,210.73'
to_char(1210.73, '$9,999.00') would return '$1,210.73'
to_char(21, '000099') would return '000021'

The following is a list of valid parameters when the to_char function is used to convert a date to a string. These parameters can be used in many combinations.

Parameter Explanation
YEAR Year, spelled out
YYYY 4-digit year
YYY
YY
Y
Last 3, 2, or 1 digit(s) of year.
IYY
IY
I
Last 3, 2, or 1 digit(s) of ISO year.
IYYY 4-digit year based on the ISO standard
Q Quarter of year (1, 2, 3, 4; JAN-MAR = 1).
MM Month (01-12; JAN = 01).
MON Abbreviated name of month.
MONTH Name of month, padded with blanks to length of 9 characters.
RM Roman numeral month (I-XII; JAN = I).
WW Week of year (1-53) where week 1 starts on the first day of the year and continues to the seventh day of the year.
W Week of month (1-5) where week 1 starts on the first day of the month and ends on the seventh.
IW Week of year (1-52 or 1-53) based on the ISO standard.
D Day of week (1-7).
DAY Name of day.
DD Day of month (1-31).
DDD Day of year (1-366).
DY Abbreviated name of day.
J Julian day; the number of days since January 1, 4712 BC.
HH Hour of day (1-12).
HH12 Hour of day (1-12).
HH24 Hour of day (0-23).
MI Minute (0-59).
SS Second (0-59).
SSSSS Seconds past midnight (0-86399).
FF Fractional seconds.


The following are date examples for the to_char function.

to_char(sysdate, 'yyyy/mm/dd'); would return '2003/07/09'
to_char(sysdate, 'Month DD, YYYY'); would return 'July 09, 2003'
to_char(sysdate, 'FMMonth DD, YYYY'); would return 'July 9, 2003'
to_char(sysdate, 'MON DDth, YYYY'); would return 'JUL 09TH, 2003'
to_char(sysdate, 'FMMON DDth, YYYY'); would return 'JUL 9TH, 2003'
to_char(sysdate, 'FMMon ddth, YYYY'); would return 'Jul 9th, 2003'


You will notice that in some examples, the format_mask parameter begins with "FM". This means that zeros and blanks are suppressed. This can be seen in the examples below.

to_char(sysdate, 'FMMonth DD, YYYY'); would return 'July 9, 2003'
to_char(sysdate, 'FMMON DDth, YYYY'); would return 'JUL 9TH, 2003'
to_char(sysdate, 'FMMon ddth, YYYY'); would return 'Jul 9th, 2003'

The zeros have been suppressed so that the day component shows as "9" as opposed to "09".

2008年12月25日星期四

Rman的format格式中的%

%c 备份片的拷贝数
%d 数据库名称
%D 位于该月中的第几天 (DD)
%M 位于该年中的第几月 (MM)
%F 一个基于DBID唯一的名称,这个格式的形式为c-IIIIIIIIII-YYYYMMDD-QQ,其中IIIIIIIIII为该数据库的DBID,YYYYMMDD为
日期,QQ是一个1-256的序列
%n 数据库名称,向右填补到最大八个字符
%u 一个八个字符的名称代表备份集与创建时间
%p 该备份集中的备份片号,从1开始到创建的文件数
%U 一个唯一的文件名,代表%u_%p_%c
%s 备份集的号
%t 备份集时间戳
%T 年月日格式(YYYYMMDD)

2008年12月22日星期一

Standby export出现ORA-00600

进来在重新整改公司DB的备份容灾方案,有一个DB跑在HP UNIX上,
并且在另外一台HP UNIX上做了standby, 但是因为两台server采用的是
共享存储,不能用来做容灾,于是希望能将其在公司现有的其它windows server上
做一个镜像,定期import production资料, 做容灾用. 因为担心export会对线上产生影响,
且在hot状态export也会影响export出文件的数据一致性,之前就经常出现sequence不一致的情况.
于是采用了从standby export.
JOB很快写好,但是运行的时候老是出现问题, 其中一个schema的数据在export的时候老是出现错误:
EXP-00008: 遇到 ORACLE 错误 600
而查看standby alert log:
-----------------
Errors in file /disk/oracle/admin/HX/udump/hishx_ora_5435.trc:
ORA-07445: exception encountered: core dump [joxuinlocal_internal()+1652] [SIGSEGV] [unknown code] [0x1000000000C] [] []
-----------------
且出现很多这样的错误消息:
-----------------
ksvcreate: Process(m000) creation failed
-----------------

查看对应的trc文件:
-----------------
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [joxuinlocal_internal()+1652] [SIGSEGV] [unknown code] [0x1000000000C] [] []
Current SQL statement for this session:
SELECT SYNNAM, DBMS_JAVA.LONGNAME(SYNNAM), DBMS_JAVA.LONGNAME(SYNTAB), TABOWN, TABNODE, PUBLIC$, SYNOWN, SYNOWNID, TABOWNID, SYNOBJNO FROM SYS.EXU9SYN WHERE SYNOWNID = :1 ORDER BY SYNTIME
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------

多方寻找解决方案,最后在metalink上证实这是oracle的一个bug, 在standby数据库多次open read only后,会出现这样的现象:
Bug 5583049

不得以, 我只能改写job, 将其每次export改为重启以后再export, 而不是直接在
recover managed standby database disconnect from session

open read only
之间切换.

2008年12月4日星期四

通过RMAN备份克隆9i数据库

今天要将一个已经升级到10g的数据库克隆一个升级前的版本到另外一个server做一些测试, 因为升级, 剩下的只有一个升级前的RMAN全备份以及全备份后的一个自动备份(controlfile+spfile).
首先将备份拷贝到目标server, 因为我需要恢复到不同路径, 9i的RMAN据我所知是没有这个功能的, 好在我的目标server上还装有10g.
1.创建一个新的windows服务 ORADIM -NEW -SID HIS
2. 将oracle home注册表下新建一个ORACLE_SID的键, 并修改为HIS
3. 将默认的init.ora稍作修改,copy一份为initHIS.ora
4. rman target /登录
5. startup nomount
6. restore spfile from '***********"
7. create pfile from spfile, shutdown database
8. 创建相关路径,修改pfile中的路径(主要是dump file路径, controlfile路径)
9. 使用新的pfile启动到mount状态
10. crosscheck backup, delete expired backup
11. catalog start with '***********'
12. run {
set newname for datafile 1 to '***********';
......................
......................
......................
restore database;
switch datafile all;
}
13. recover database;
14. alter database rename file '******.redo01.dbf' to '*********.redo01.log';
15. alter database rename file '******.redo01.dbf' to '*********.redo02.log';
16. alter database rename file '******.redo01.dbf' to '*********.redo03.log';
17. alter database open resetlogs;
18. alter tablespace temp add datafile '**********.temp01.dbf' size 100M;

"sqlplus / as sysdba"登录ORACLE

平时, 我们经常会使用SQLPLUS / AS SYSDBA登录数据库, 但在有些环境下, 会发现不能正常登录,报出各种错误.
其实要成功使用操作系统认证, 需要有以下条件:
1. 要让oracle知道登录哪个数据库, 在LINUX/UNIX环境下, 必须设置环境变量ORACLE_SID, 而在windows下, 则必须设置ORACLE_HOME下的注册表项ORACLE_SID
2. 在windows下面需要使用oradim -new -sid **** 创建一个oracle服务
3. 对应ORACLE_HOME的sqlnet.ora需要添加参数设置:
SQLNET.AUTHENTICATION_SERVICES = (NTS)
4. 相应用户必须在ORA_DBA组中

2008年12月2日星期二

数据库升级后莫名崩溃

星期六晚上成功将一个生产环境数据库从9.2.0.6升级到10.2.0.1.
操作系统为windows 2003 enterprise edition sp1.
今天早上就出现数据库连接不上的状况, 登到server一看, listener挂掉, 起不来, 也不能以DBA登录到db, 报tns无监听程序,协议适配器错误.
查看sqlnet.log:
***********************************************************************
Fatal NI connect error 12505, connecting to: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=HIS01)(PORT=1521))(CONNECT_DATA=(SID=HIS)(CID=(PROGRAM=D:\oracle\product\10.2.0\HIS\bin\emagent.exe)(HOST=HIS)(USER=SYSTEM))))

VERSION INFORMATION:
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
Windows NT TCP/IP NT Protocol Adapter for 32-bit Windows: Version 10.2.0.1.0 - Production
Time: 01-12月-2008 11:38:42
Tracing not turned on.
Tns error struct:
ns main err code: 12564
TNS-12564: TNS: 拒绝连接
ns secondary err code: 0
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
***********************************************************************

查看db alert log:
Mon Dec 01 11:36:55 2008
Errors in file d:\oracle\admin\his\udump\his_ora_3868.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [] [] [] [] [] []
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x7C8224B2] [ADDR:0x2D484D4E] [UNABLE_TO_READ] []
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x7C8224B2] [ADDR:0x2D484D4E] [UNABLE_TO_READ] []

查看对应trace文件:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x7C8224B2] [ADDR:0x2D484D4E] [UNABLE_TO_READ] []
OR
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------

重启server以后,数据库以及listener均正常工作, 但过一段时间,大概是30分钟左右, 问题依旧.

为了production环境AP能够正常使用, 只能再次重启, 并且等待下一次crash.

最终查到Metalink Note:405904.1, note中描述状况与我们遇到的基本一致.

Solution:
修改SQLNET.ORA, 添加:
SQLNET.INBOUND_CONNECT_TIMEOUT = 0
修改LISTENER.ORA, 添加:
INBOUND_CONNECT_TIMEOUT_LISTENER = 0

另, 发现该instance PGA设的很小:24M, 改为128M.
再次重启server之后, 恢复正常.

但是这次正常也只不过持续了4个小时, 又挂掉了, 问题依旧.

于是开始怀疑是内存问题, 因为将PGA修改之后, 能够持续得更长时间.
到晚上,对比该环境与另外一生产环境(同样也是跑在10g上)的pfile, 终于发现了问题所在.

虽然我们升级了DB, 但是相关parameter依旧采用9i的模式, 例如SGA为采用自动管理(未设定sga_target), 而是手动设定了里面相关模块的固定大小.
pga也存在同样问题.
不光如此, sga中设置的个内存模块大小比例也感觉很不恰当, 且还有一些其它的奇怪设定.
对照修改pfile之后, 第二天上班, 一切正常.

看来升级数据库的时候还是有不少陷阱的, 一定要注意参数的修改.