■ 江蘇 繆振龍
編者按:本文是筆者親歷的一次Oracle TNS連接超時引起的故障,雖然看似疑難,但最終發(fā)現(xiàn)是由于一個簡單的配置文件錯誤導(dǎo)致。
筆者最近準(zhǔn)備對一臺Oracle 10g數(shù)據(jù)庫進(jìn)行數(shù)據(jù)備份,采用exp方式全庫導(dǎo)出。但嘗試多次,一直提示如下錯誤:
EXP-00056: ORACLE error 12170 encountered
ORA-12170: TNS:Connect timeout occurred
EXP-00000: Export terminated unsuccessfully
經(jīng)查閱相關(guān)資料得知,“EXP-00056”產(chǎn)生錯誤的原因是“ORA-12170”。具體來說是由于TNS連接超時導(dǎo)致。
TNS是Oracle Net的一部分,用來管理和配置Oracle數(shù)據(jù)庫和客戶端的連接工具,客戶端只有正確配置TNS相關(guān)文件才能訪問Oracle服務(wù)器。
在日常維護(hù)中通常使用tnsping命令檢查Oracle監(jiān)聽服務(wù)和網(wǎng)絡(luò)連通性。筆者首先在服務(wù)器上tnsping實例名orcl,發(fā)現(xiàn)等了很久,返回操作超時錯誤。信息如下:
-bash-3.2$ tnsping orcl
TNS Ping Utility for Solaris: Version 10.2.0.1.0 - Production on 18-JAN-2019 19:58:22
Copyright (c) 1997,2005, Oracle.All rights reserved.
Used parameter files:
/u01/app/oracle/product/10.2.0/db_1/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact(DESCRIPTION = (ADDRESS= (PROTOCOL = TCP)(HOST= SUN)(PORT = 1521))(CONNECT_DATA = (SERVER =DEDICATED) (SERVICE_NAME= orcl)))
TNS-12535: TNS:operation timed out
接下來,筆者tnsping本機(jī)IP,返回結(jié)果正常。如下:
-bash-3.2$ tnsping 172.28.28.36
TNS Ping Utility for Solaris: Version 10.2.0.1.0 - Production on 18-JAN-2019 20:08:50
Copyright (c) 1997,2005, Oracle.All rights reserved.
Used parameter files:
/u01/app/oracle/product/10.2.0/db_1/network/admin/sqlnet.ora
Used EZCONNECT adapter to resolve the alias
Attempting to contact
(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=172.28.28.36))(ADDRESS=(PROTOCOL=TCP)(HOST=172.28.28.36)(PORT=1521)))
OK (0 msec)
筆者懷疑監(jiān)聽服務(wù)存在問題,執(zhí)行“l(fā)snrctl stop” 和“l(fā)snrctl start”命令來重啟監(jiān)聽服務(wù),發(fā)現(xiàn)故障提示依舊。接著執(zhí)行“l(fā)snrctl status”查看監(jiān)聽服務(wù)狀態(tài)。返回結(jié)果如圖1所示。
通過最后兩行,我們可以看出實例orcl監(jiān)聽處于未知狀態(tài)。orcl實例沒有注冊到監(jiān)聽服務(wù)中。于是,通過SQLPlus工具登錄數(shù)據(jù)庫進(jìn)行實例注冊。具體方法如下:
-bash-3.2$ sqlplus /nolog
SQL*Plus: Release 10.2.0.1.0 - Production on Fri Jan 18 20:10:51 2019
Copyright (c) 1982,2005, Oracle.All rights reserved.
SQL> conn system/rs_789 as sysdba;
Connected.
SQL> alter system register;
System altered.
圖1 執(zhí)行l(wèi)snrctl status查看監(jiān)聽服務(wù)狀態(tài)
圖2 查看Oracle監(jiān)聽服務(wù)相關(guān)日志
再次使用“l(fā)snrctl status”命令查看監(jiān)聽服務(wù)狀態(tài),發(fā)現(xiàn)故障提示依舊。那看看配置文件是不是有問題呢?查看tnsnames.ora文件。配置如下:
ORCL =
(DESCRIPTION =
(ADDRESS =(PROTOCOL = TCP)(HOST =SUN)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME =orcl)
)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION=
(ADDRESS_LIST =
(ADDRESS =(PROTOCOL = IPC)(KEY = EXTPROC1))
)
(CONNECT_DATA =
(SID =PLSExtProc)(PRESENTATION =RO)
)
該配置文件并未發(fā)現(xiàn)什么異常。接下來,筆者又查看了Oracle監(jiān)聽服務(wù)相關(guān)日志。最后幾行日志如圖2所示。
上面一直有警告提示,“Subscription for node down event still pending”。
經(jīng)上網(wǎng)查詢相關(guān)資料,此錯誤引起的原因可能由于監(jiān)聽無法在指定端口啟動,前一次監(jiān)聽服務(wù)未正常退出或有其他程序占用端口。提示查看/etc/hosts相關(guān)配置。于是,筆者打開/etc/hosts文件,配置如下:
-bash-3.2$ cat /etc/hosts
#
# Internet host table
#
::1 localhost
127.0.0.1 localhost
222.*.*.* SUN loghost
172.28.28.36 WWW
果然發(fā)現(xiàn)一些問題,由于當(dāng)時服務(wù)器要對外網(wǎng)開放使用,服務(wù)器同時接入內(nèi)網(wǎng)和外網(wǎng)。但后來處于網(wǎng)絡(luò)安全考慮,所有服務(wù)器的外網(wǎng)都統(tǒng)一轉(zhuǎn)移到上層硬件防火墻進(jìn)行訪問控制。服務(wù)器只接入內(nèi)網(wǎng)。但/etc/hosts的配置信息并沒有做相應(yīng)修改。既然問題已找到,于是修改/etc/hosts文件,注意要先切換到root用戶,配置完成后要使用“wq!”,強(qiáng)制保存退出。更改后配置如下:
-bash-3.2# cat /etc/hosts
#
# Internet host table
#
::1 localhost
127.0.0.1 localhost
172.28.28.36 SU
最后再切換到Oracle用戶下,執(zhí)行l(wèi)snrctl stop命令和lsnrctl start命令重啟監(jiān)聽服務(wù)。剛開始監(jiān)聽服務(wù)還處于未知狀態(tài),稍等約1分鐘,再次執(zhí)行l(wèi)snrctl status命令,可以看到實例監(jiān)聽處于Ready狀態(tài)。返回如下信息:
-bash-3.2$ lsnrctl status
LSNRCTL for Solaris:Version 10.2.0.1.0 -Production on 18-JAN-2019 20:11:35
Copyright (c) 1991,2005, Oracle.All rights reserved.
Connecting to (DESCRI PTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1)))
STATUS of the LISTENER
……
(DESCRIPTION=(AD DRESS=(PROTOCOL=tcp)(HOST=172.28.28.36)(PORT=1521)))
Services Summary...
Service "PLSExtProc"has 1 instance(s).
Instance"PLSExtProc", status UNKNOWN, has 1 handler(s)for this service...
Service "orcl" has 2 instance(s).
Instance "orcl",status UNKNOWN, has 1 handler(s) for this service...
Instance "orcl",status READY, has 1 handler(s) for this service...
Service "orclXDB" has 1 instance(s).
Instance "orcl",status READY, has 1 handler(s) for this service...
Service "orcl_XPT"has 1 instance(s).
Instance "orcl",status READY, has 1 handler(s) for this service...
The command completed successfully
通過本次故障可以發(fā)現(xiàn),一個看似疑難的故障問題往往是由一個簡單的配置文件錯誤導(dǎo)致。所以,在我們今后的工作中,一定要扎扎實實學(xué)好理論基礎(chǔ),面對故障問題才能迎刃而解。