A programmer had deployed Oracle 11g on Alibaba Cloud. The boss told me they had been at it for several days and still could not get the listener to start, and asked me to take a look. When I logged in, it turned out to be a host name problem: the Alibaba Cloud elastic IP had been written straight into the /etc/hosts file, whereas the network interface of a cloud host normally carries a private address. I wrote a separate article about this, at http://blog.51cto.com/sery/2084706. After fixing the problem, I felt it was inappropriate to expose the database directly on the public network. I also found other problems, such as poorly planned disk space and Java applications sharing the same machine. So I recommended keeping this machine as a test environment, buying several new cloud hosts, putting them into a VPC network, redeploying the application and the database, and separating the database from the other applications.
I brought this on myself, so I had to do the job and redeploy Oracle on the cloud. I never recommend deploying Oracle in a public cloud. The main reasons are as follows:
1. Cloud hosts come without a swap partition. This is the case with every provider I know!
2. The operating system of a cloud server cannot be customized freely. In my experience, Oracle 11g installs on CentOS 5 without any dependency problems, whereas on CentOS 6 and later the installer's prerequisite check fails for some dependency packages.
3. Cloud host performance. Under normal circumstances we would choose a high-spec physical server for the deployment.
Deploying Oracle in a public cloud is awkward, but it still has to work. During the installation several packages failed the version check (in fact the installed versions were newer than required). I ignored those checks, finished the deployment, and the database has now been running for about a month. The alert log shows no exceptions, so the installation can be considered successful. I no longer have a cloud server at hand, so here I use a virtual environment to reproduce the whole deployment process, hoping it helps those who need it.
Buy a CentOS 7 cloud host plus a 250 GB cloud disk. Split the cloud disk into two partitions: one for swap, and the rest for the Oracle installation directory and data storage. Mount the data partition on the system. To follow Oracle convention, the mount point is /u01.
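The partitioning just described can be sketched as a dry-run plan. This is only an illustration under assumptions: the device name /dev/vdb, the 16 GiB swap size and the ext4 filesystem are not from the original setup, and the function plan_disk is a hypothetical helper that only prints the commands so they can be reviewed before anything is run as root.

```shell
#!/bin/bash
# Print (do not execute) the commands that split a data disk into a swap
# partition and a /u01 data partition.
# Assumptions: device name, swap size and ext4 are illustrative choices.
plan_disk() {
    local dev="$1" swap_gib="$2"
    cat <<EOF
parted -s ${dev} mklabel gpt
parted -s ${dev} mkpart swap linux-swap 1MiB ${swap_gib}GiB
parted -s ${dev} mkpart data ext4 ${swap_gib}GiB 100%
mkswap ${dev}1
swapon ${dev}1
mkfs.ext4 ${dev}2
mkdir -p /u01
mount ${dev}2 /u01
echo '${dev}1 swap swap defaults 0 0' >> /etc/fstab
echo '${dev}2 /u01 ext4 defaults 0 0' >> /etc/fstab
EOF
}
```

Calling plan_disk /dev/vdb 16 prints the plan. Check the device name with lsblk first, because mklabel wipes whatever partition table is already on the disk.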
Before running the Oracle installer itself, a series of preparatory steps is needed. Since I do deployments often, I wrote a script to save trouble. Its content is:
    [root@oradb190 ~]# more oracle_rep.bash
    #!/bin/bash
    # written by sery 2012-05-16
    #########################################
    # install dependency packages           #
    #########################################
    yum install gcc* gcc-* gcc-c++-* glibc-devel-* glibc-headers-* compat-libstdc* libstdc* elfutils-libelf-devel* libaio-devel* sysstat* unixODBC-* pdksh-*
    ########################################
    # add groups, user and create dirs     #
    ########################################
    /usr/sbin/groupadd -g 501 oinstall
    /usr/sbin/groupadd -g 502 dba
    useradd -u 1000 -g oinstall -G dba oracle
    mkdir /u01/app/
    mkdir -p /u01/app/oraInventory
    mkdir -p /u01/app/oracle
    chown -R oracle:oinstall /u01/app
    chmod -R 775 /u01/app
    ##############################################
    # modify sysctl.conf                         #
    ##############################################
    cat >> /etc/sysctl.conf <<done
    fs.file-max = 6815744
    kernel.shmall = 2097152
    #kernel.shmmax = 536870912
    kernel.shmmni = 4096
    kernel.sem = 250 32000 100 128
    net.ipv4.ip_local_port_range = 9000 65500
    net.core.rmem_default = 262144
    net.core.rmem_max = 4194304
    net.core.wmem_default = 262144
    net.core.wmem_max = 1048576
    fs.aio-max-nr = 1048576
    done
    sysctl -p
    ###############################################
    # modify /etc/security/limits.conf            #
    ###############################################
    cat >> /etc/security/limits.conf << done
    oracle soft nproc 2047
    oracle hard nproc 16384
    oracle soft nofile 1024
    oracle hard nofile 65536
    done
    ################################################
    # modify /etc/pam.d/login                      #
    ################################################
    echo "session required pam_limits.so" >> /etc/pam.d/login
    ################################################
    # set the environment of user oracle           #
    ################################################
    # The delimiter is quoted so that $ORACLE_HOME and $PATH are written
    # literally and expanded when oracle logs in, not while root runs this script.
    cat >> /home/oracle/.bash_profile <<'done'
    export ORACLE_SID=zyzf1
    export ORACLE_UNQNAME=zyzf1
    export ORACLE_BASE=/u01/app/oracle
    export ORACLE_HOME=/u01/app/oracle/product/11.2.0
    export PATH=$ORACLE_HOME/bin:$PATH
    done
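Give the script execute permission, then run ./oracle_rep.bash. When it finishes, check each item in turn: whether the directories were created, whether the user was created, whether the relevant configuration files were modified....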
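Part of that checking can be scripted. The sketch below is an assumption about how one might do it, not part of the original script: missing_sysctl_keys and check_preinstall are hypothetical helper names, and the key list simply mirrors the parameters that oracle_rep.bash appends to /etc/sysctl.conf.

```shell
#!/bin/bash
# Print every kernel parameter from oracle_rep.bash that is absent
# (or only present as a comment) in the given sysctl-style file.
missing_sysctl_keys() {
    local file="$1" key
    for key in fs.file-max kernel.shmall kernel.shmmni kernel.sem \
               net.ipv4.ip_local_port_range net.core.rmem_default \
               net.core.rmem_max net.core.wmem_default net.core.wmem_max \
               fs.aio-max-nr; do
        awk -F= -v k="$key" '
            /^[ \t]*#/ { next }
            { name = $1; gsub(/[ \t]/, "", name); if (name == k) found = 1 }
            END { exit found ? 0 : 1 }' "$file" || echo "$key"
    done
}

# Report anything the preparation script should have created but did not.
check_preinstall() {
    local d
    id oracle >/dev/null 2>&1 || echo "user oracle is missing"
    for d in /u01/app/oraInventory /u01/app/oracle; do
        [ -d "$d" ] || echo "directory $d is missing"
    done
    missing_sysctl_keys /etc/sysctl.conf | sed 's/^/sysctl key missing: /'
}
```

Running check_preinstall as root prints nothing when everything is in place. It only looks at the file contents; sysctl -a still has to be consulted to confirm that the kernel actually picked the values up.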
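Next, configure and start VNC so that Oracle can be installed remotely through a graphical interface. In the days of physical servers we could still go to the machine room, sit in front of the server and plug in a monitor. With cloud hosts that road is closed. Then again, it is a liberation of sorts: no more trips to the machine room to put up with the noise and the radiation!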
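The CentOS image on a cloud host may not have the VNC service installed by default, so the GNOME desktop and vncserver have to be added. CentOS 7 starts the firewall by default and sshd has some restrictions as well. To reduce interference, I turn them all off. The steps are as follows (never mind the extra words, nobody is paying me for them anyway):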
1. Turn off the firewall
    systemctl stop firewalld.service
    systemctl disable firewalld.service
2. Edit the sshd configuration file /etc/ssh/sshd_config
    ..........(some lines omitted).................................
    PermitRootLogin yes
    ..............(omitted)...........................
    ClientAliveInterval 900
    ClientAliveCountMax 30
    ...............(omitted).................
3. Install vncserver
    yum groupinstall "X Window System"
    yum install tigervnc-server -y
    yum groupinstall "GNOME Desktop" "Graphical Administration Tools"
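If the packages for the desktop environment are not fully installed, the VNC client may connect to nothing but a blue screen, with no icons and no way to do anything. Once vncserver is installed, run the command vncserver and enter a password. I usually start it as root and set the same password as the system root account. I have seen documents in which people set vncserver up as a regular service. That is unnecessary, because once Oracle is installed the desktop environment is never needed again (with a silent installation, even VNC can be dispensed with).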
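From here on, most of the work is done inside VNC. The place where things tend to go wrong is export DISPLAY: be sure to check the number after the colon that vncserver printed when it started (some people only know how to copy, and paste examples straight from the web).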
In fact, the VNC client also connects using ip: followed by this same number.
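To avoid copying the wrong number, the display can be read back from the running server. The sketch below assumes TigerVNC, whose vncserver -list prints one line per session starting with the display (for example :1). The helper names first_vnc_display and vnc_port are hypothetical, and the exact listing layout is an assumption about the installed version.

```shell
#!/bin/bash
# Read the output of `vncserver -list` on stdin and print the first
# display it reports, e.g. ":1".
first_vnc_display() {
    awk '/^:[0-9]+/ { print $1; exit }'
}

# A VNC display :N listens on TCP port 5900 + N.
vnc_port() {
    local display="$1"
    echo $(( 5900 + ${display#:} ))
}
```

As the user who started the server (root in this article), vncserver -list | first_vnc_display prints the value to use in export DISPLAY for the oracle account, and vnc_port gives the TCP port that must be reachable from the VNC client.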
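As for uploading the Oracle installation packages, unpacking them and setting directory ownership, I skip those here. I usually copy the unpacked Oracle directory into /home/oracle and then run chown -R oracle:oinstall /home/oracle on it, so that after switching to the oracle account the installation runs without permission worries. Unpacking the two Oracle 11g installation archives yields one large directory, database, which contains the installer script runInstaller and all the other files required.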
    [root@localhost ~]# su - oracle
    Last login: Thu Apr 26 22:13:21 CST 2018 on pts/2
    [oracle@localhost ~]$ ls -al database/
    total 16
    drwxr-xr-x.  8 oracle oinstall  128 Aug 21  2009 .
    drwx------.  6 oracle oinstall  123 Apr 26 22:14 ..
    drwxr-xr-x. 12 oracle oinstall  203 Aug 17  2009 doc
    drwxr-xr-x.  4 oracle oinstall  223 Aug 15  2009 install
    drwxrwxr-x.  2 oracle oinstall   61 Aug 15  2009 response
    drwxr-xr-x.  2 oracle oinstall   34 Aug 15  2009 rpm
    -rwxr-xr-x.  1 oracle oinstall 3226 Aug 15  2009 runInstaller
    drwxrwxr-x.  2 oracle oinstall   29 Aug 15  2009 sshsetup
    drwxr-xr-x. 14 oracle oinstall 4096 Aug 15  2009 stage
    -rw-r--r--.  1 oracle oinstall 5402 Aug 18  2009 welcome.html
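Without any preparation, running su - oracle ; cd database ; ./runInstaller in the VNC client produces this message:
    >>> Could not execute auto check for display colors using command /usr/bin/xdpyinfo. Check if the DISPLAY variable is set. Failed <<<<
To get past it, run xhost + under the root account and export DISPLAY=:1 under the oracle account (note that this is an easy place to go wrong: many documents on the web write DISPLAY=:0.0; export DISPLAY). As stressed earlier, the number must match the one after the host name and colon that vncserver printed at startup, which is also the number after the IP in the VNC client!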
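I fell into this pit myself: I copied the setting from the web and could not find the cause for the life of me! Once everything is set correctly, the installer window is sure to pop up. In step 1 of 9, select nothing and fill in nothing, just click "Next". In step 2 of 9, choose the second option, "Install database software only" (only the database software is installed; the database itself is created by hand later).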
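In the next step choose a single instance database, ..... and after that the "Enterprise Edition". At step 9 of 12, the prerequisite check reports several items that do not meet Oracle's installation requirements. I once tried to fix them as prompted (editing the system parameter file /etc/sysctl.conf and installing the required dependency packages with yum; in fact most of the packages were already present, only in a newer version than required, or a 32-bit version was wanted). It had little effect, so I simply ignore them.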
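Practice has shown this decision to be right (on CentOS 5.x the installation has no problems at all): since the deployment, this Oracle has run stably for more than a month, the developers have not complained, and the alert log shows nothing unusual! Documents written by others on the web ignore these checks too, heh! Click Next all the way through. The file copy takes a while, and on a SATA disk it is painfully slow. At 84% an error pops up whose content cannot be seen. Only a sliver of the dialog frame is shown, which is really miserable.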
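However I click or drag with the mouse, the error message will not show its face. You think I cannot find you if you hide? How about I read the log, then? I do know a few words of English.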
    tail -100 /u01/app/oraInventory/logs/installActions2018-04-26_11-30-49PM.log
    ..........................(omitted).....................
    INFO: /usr/lib64/libstdc++.so.5: undefined reference to `memcpy@GLIBC_2.14'
    INFO: collect2: error: ld returned 1 exit status
    INFO: make: *** [ctxhx] Error 1
    INFO: End output from spawned process.
    INFO: ----------------------------------
    INFO: Exception thrown from action: make
    Exception Name: MakefileException
    .........................(omitted)........................
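I looked it up, and the cause is said to be that "the glibc version is too high"; the glibc on this system is glibc-2.17-196.el7_4.2.x86_64. Fixes for this error can be found online, but I have gone through the installation several times without applying them and had no problems. If you do want to fix it, search for the keyword "INFO: /usr/lib64/libstdc++.so.5: undefined reference to" to find a solution. Click the × in the upper right corner and the installation continues. Shortly afterwards an identical error box pops up, just as useless as the first (it cannot be opened and no message is visible). The cause, the way to inspect it and the fix are the same as above (I simply clicked × to dismiss it). The file copy goes on, and at about 94% yet another box pops up, again just a thin line (a real test of eyesight).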
When it comes to running the two scripts, that dialog can at last be dragged open with the mouse, which was a relief.
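Since the dialogs cannot be read, the log is the only source of truth, and picking the relevant lines out of it is easy to script. The following is a minimal sketch: installer_errors and latest_install_log are hypothetical helper names, and the patterns are simply the ones that appear in the log excerpt shown earlier, not an exhaustive list.

```shell
#!/bin/bash
# Print the lines of an Oracle installer log that usually explain a hidden
# error dialog: linker errors, failed make targets and exception names.
installer_errors() {
    grep -E 'undefined reference|error: |make: \*\*\*|Exception Name' "$1"
}

# Newest installActions log in the inventory directory used in this article.
latest_install_log() {
    ls -t /u01/app/oraInventory/logs/installActions*.log 2>/dev/null | head -n 1
}
```

A typical call while the installer is still open would be installer_errors "$(latest_install_log)".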
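Run the two scripts named in the prompt as root, in order, to finish the installation. The indispensable next steps are configuring the listener and creating the database. The order matters: the listener must be created first, and only then the database.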
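1. Create the listener (it is best to have the host name set): provided the oracle environment variables are set correctly (the .bash_profile file contains the line "export PATH=/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:$ORACLE_HOME/bin"), run the netca command as the oracle account, still in the VNC graphical session, and a few mouse clicks complete the operation. To verify that the listener is installed correctly, run the listener command lsnrctl start to start it. Since no Oracle database has been created yet, nothing can register with the listener, so the output contains "The listener supports no services". You can also go to the Oracle installation directory /u01/app/oracle/product/11.2.0/network/admin and check whether the file listener.ora exists and whether its content matches the output printed when the listener starts.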
    [oracle@localhost ~]$ lsnrctl start
    LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 27-APR-2018 01:34:01
    Copyright (c) 1991, 2009, Oracle. All rights reserved.
    Starting /u01/app/oracle/product/11.2.0/bin/tnslsnr: please wait...
    TNSLSNR for Linux: Version 11.2.0.1.0 - Production
    System parameter file is /u01/app/oracle/product/11.2.0/network/admin/listener.ora
    Log messages written to /u01/app/oracle/diag/tnslsnr/db217/listener/alert/log.xml
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=db217)(PORT=1521)))
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=db217)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     LISTENER
    Version                   TNSLSNR for Linux: Version 11.2.0.1.0 - Production
    Start Date                27-APR-2018 01:34:01
    Uptime                    0 days 0 hr. 0 min. 0 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/11.2.0/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/db217/listener/alert/log.xml
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=db217)(PORT=1521)))
    The listener supports no services
    The command completed successfully
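2. Create the database: still in the VNC graphical session, run the dbca command as the oracle account. When the window appears, click through to step 2 of 12 and choose "Custom Database". A few steps later you are asked for the database name. Normally dbca reads the value set in the oracle environment variables automatically.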
A few steps further on, at step 7 of 12, enable archiving (it can also be switched on from sqlplus after the database has been created).
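Click through a few more times. On clicking "Finish", the actual creation of the database (data files, redo logs and so on) begins. This is slower than the file copy during the Oracle installation, so be patient. When it is done, go to the data directory and look at the generated files: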
    [oracle@localhost zyzf1]$ pwd
    /u01/app/oracle/oradata/zyzf1
    [oracle@localhost zyzf1]$ ll
    total 1705316
    -rw-r----- 1 oracle oinstall   9748480 Apr 27 02:00 control01.ctl
    -rw-r----- 1 oracle oinstall  52429312 Apr 27 02:00 redo01.log
    -rw-r----- 1 oracle oinstall  52429312 Apr 27 02:00 redo02.log
    -rw-r----- 1 oracle oinstall  52429312 Apr 27 01:59 redo03.log
    -rw-r----- 1 oracle oinstall 629153792 Apr 27 02:00 sysaux01.dbf
    -rw-r----- 1 oracle oinstall 734011392 Apr 27 02:00 system01.dbf
    -rw-r----- 1 oracle oinstall  20979712 Apr 27 01:59 temp01.dbf
    -rw-r----- 1 oracle oinstall 209723392 Apr 27 02:00 undotbs01.dbf
    -rw-r----- 1 oracle oinstall   5251072 Apr 27 02:00 users01.dbf
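The Oracle instance is also started automatically, which can be verified by looking at the system processes. Now let us go back and check the listener status, to see how it differs from before the database was created: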
    [oracle@localhost zyzf1]$ lsnrctl status
    LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 27-APR-2018 02:36:15
    Copyright (c) 1991, 2009, Oracle. All rights reserved.
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=db217)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     LISTENER
    Version                   TNSLSNR for Linux: Version 11.2.0.1.0 - Production
    Start Date                27-APR-2018 01:34:01
    Uptime                    0 days 1 hr. 2 min. 13 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/11.2.0/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/db217/listener/alert/log.xml
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=db217)(PORT=1521)))
    Services Summary...
    Service "zyzf1" has 1 instance(s).
      Instance "zyzf1", status READY, has 1 handler(s) for this service...
    Service "zyzf1XDB" has 1 instance(s).
      Instance "zyzf1", status READY, has 1 handler(s) for this service...
    The command completed successfully
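The instance has registered itself automatically (is it the listener that registers with the instance, or the instance that registers with the listener? I get a bit muddled here; could some guru help answer this). Looking at the processes and the listener is not enough. Also log in through sqlplus and run "select count(*) from v$session;" to see the output, then shut the instance down and start it again to verify that it works correctly.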
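On the registration question: in Oracle 11g it is the instance that registers its services with the listener (the PMON background process does this), which is why the service list is empty until a database is running. The difference between the two listener outputs can be checked mechanically. The sketch below is only an illustration: listener_services is a hypothetical helper that parses the text format shown in the listings of this article.

```shell
#!/bin/bash
# Read `lsnrctl status` output on stdin and print one registered service
# name per line. Prints nothing when the listener supports no services.
listener_services() {
    awk -F'"' '/^[ \t]*Service "/ { print $2 }'
}
```

For example, lsnrctl status | listener_services | grep -qx "$ORACLE_SID" succeeds only once the instance named in the oracle environment has registered.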
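After all that time, Oracle has merely been deployed. Before it can be handed over for use, some work remains: changing the password expiry time, enlarging the undersized default tablespaces (system, sysaux, temp and so on), creating large redo log files, multiplexing the control files, creating separate user tablespaces, and so on. If these standards are not put in place, the developers will very likely start creating tables straight in system, and applications will connect to the database with the system account.....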
The default password expiry time is 180 days. Changing it to never expire does no harm.
    SQL> ALTER PROFILE DEFAULT LIMIT PASSWORD_LIFE_TIME UNLIMITED;
    Profile altered.
    SQL> SELECT * FROM dba_profiles s WHERE s.profile='DEFAULT' AND resource_name='PASSWORD_LIFE_TIME';
    PROFILE                        RESOURCE_NAME                    RESOURCE LIMIT
    ------------------------------ -------------------------------- -------- ----------------------------------------
    DEFAULT                        PASSWORD_LIFE_TIME               PASSWORD UNLIMITED
Enlarging the system tablespaces (taking system as the example): first look at the current state of the tablespace files.
    [oracle@db217 zyzf1]$ pwd
    /u01/app/oracle/oradata/zyzf1
    [oracle@db217 zyzf1]$ du -hs *
    9.3M control01.ctl
    51M  redo01.log
    51M  redo02.log
    51M  redo03.log
    601M sysaux01.dbf
    701M system01.dbf
    9.2M temp01.dbf
    851M undotbs01.dbf
    5.1M users01.dbf
Disk capacity is measured in TB these days, so do not be stingy about making the size a bit bigger.
    SQL> alter database datafile '/u01/app/oracle/oradata/zyzf1/system01.dbf' resize 10G;
    [oracle@db217 zyzf1]$ du -hs *
    9.3M control01.ctl
    21G  qhwy201801.dbf
    513M redo004.log
    513M redo005.log
    513M redo006.log
    51M  redo01.log
    51M  redo02.log
    51M  redo03.log
    621M sysaux01.dbf
    11G  system01.dbf
    16M  temp01.dbf
    1.1G undotbs01.dbf
    42M  users01.dbf
Create large redo log files and delete the small default ones. I was a bit lazy here and created only one member per log group.
    SQL> alter database add logfile group 4 ('/u01/app/oracle/oradata/orcl/redo04.log') size 512M;
    SQL> alter database add logfile group 5 ('/u01/app/oracle/oradata/orcl/redo05.log') size 512M;
    SQL> alter database add logfile group 6 ('/u01/app/oracle/oradata/orcl/redo06.log') size 512M;
    SQL> alter system switch logfile;
    SQL> select GROUP#,THREAD#,BYTES,MEMBERS,STATUS from v$log;
        GROUP#    THREAD#      BYTES    MEMBERS STATUS
    ---------- ---------- ---------- ---------- --------------------------------
             1          1   52428800          1 INACTIVE
             2          1   52428800          1 INACTIVE
             3          1   52428800          1 CURRENT
             4          1  536870912          1 INACTIVE
             5          1  536870912          1 INACTIVE
             6          1  536870912          1 CURRENT
    SQL> alter system switch logfile;
    SQL> alter database drop logfile group 1;
Keep switching log files so that the old, small online redo log files become inactive and can be dropped.
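Oracle refuses to drop a redo log group whose status is CURRENT or ACTIVE, so it helps to generate the drop statements only for groups that are really INACTIVE. A minimal sketch, assuming the five-column v$log query used above has been spooled to text; drop_small_inactive_groups is a hypothetical helper name.

```shell
#!/bin/bash
# Read rows of "GROUP# THREAD# BYTES MEMBERS STATUS" on stdin and print a
# drop statement for every group that is both smaller than min_bytes ($1)
# and INACTIVE. Header and separator rows are skipped.
drop_small_inactive_groups() {
    awk -v min="$1" '
        $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $3 + 0 < min + 0 && $5 == "INACTIVE" {
            printf "alter database drop logfile group %d;\n", $1
        }'
}
```

With a threshold of 536870912 (512M), only the old 50 MB groups that have gone INACTIVE are selected. If a group stays ACTIVE after a switch, alter system checkpoint; usually moves it to INACTIVE. The files themselves remain on disk after the drop and have to be removed by hand.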
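Multiplexing the control files: by default they may all sit on the same partition, so create at least one more copy elsewhere, just in case.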
    SQL> select inst_id,name from gv$controlfile;
       INST_ID NAME
    ---------- ------------------------------------------------------------
             1 /u01/app/oracle/oradata/zyzf1/control01.ctl
             1 /u01/app/oracle/flash_recovery_area/zyzf1/control02.ctl
    SQL> alter system set control_files='/u01/app/oracle/oradata/zyzf1/control01.ctl','/u01/app/oracle/flash_recovery_area/zyzf1/control02.ctl','/home/oracle/control03.ctl' scope=spfile;
    SQL> shutdown immediate;
    [oracle@db217 ~]$ cp /u01/app/oracle/oradata/zyzf1/control01.ctl /home/oracle/control03.ctl
    SQL> startup nomount;
    SQL> alter database mount;
    Database altered.
    SQL> alter database open;
    Database altered.
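The order of operations matters here: the new list goes into the spfile first, the instance is shut down cleanly, the control file is copied while the database is closed, and only then is the instance started again. The sketch below is one way to script that under assumptions: control_files_sql and add_control_file are hypothetical helpers, and OS authentication as sysdba for the oracle user is taken for granted.

```shell
#!/bin/bash
# Build the ALTER SYSTEM statement that lists every control file copy.
# Arguments: one or more control file paths.
control_files_sql() {
    local list="" path
    for path in "$@"; do
        list="${list:+$list,}'${path}'"
    done
    echo "alter system set control_files=${list} scope=spfile;"
}

# Sketch of the whole procedure, run as the oracle OS user.
# $1 = existing control file, $2 = new copy, remaining args = full new list.
add_control_file() {
    local src="$1" dst="$2"
    shift 2
    sqlplus -S "/ as sysdba" <<EOF
$(control_files_sql "$@")
shutdown immediate
EOF
    cp -p "$src" "$dst"
    echo "startup" | sqlplus -S "/ as sysdba"
}
```

Keeping the statement builder separate makes it easy to print and review the SQL before add_control_file touches a running instance.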
Create users and their default tablespaces. Give each tablespace data file a fixed size, and add another 20 GB file when the tablespace fills up. This is convenient to manage and relatively safe. The specific steps are not described again here.
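For readers who want a starting point anyway, the statements can be generated rather than typed. Everything in this sketch is hypothetical: the helper names create_user_sql and add_datafile_sql, the CHANGE_ME password placeholder and the minimal grants are illustrative, not what was used on the system described here.

```shell
#!/bin/bash
# Print the SQL for a user with its own fixed-size default tablespace.
# Arguments: user name, tablespace name, data file path, size (e.g. 20G).
create_user_sql() {
    local user="$1" ts="$2" datafile="$3" size="$4"
    cat <<EOF
create tablespace ${ts} datafile '${datafile}' size ${size} autoextend off;
create user ${user} identified by "CHANGE_ME" default tablespace ${ts} quota unlimited on ${ts};
grant create session, create table to ${user};
EOF
}

# Print the SQL that adds one more fixed-size data file when the
# tablespace fills up. Arguments: tablespace name, data file path, size.
add_datafile_sql() {
    echo "alter tablespace $1 add datafile '$2' size $3 autoextend off;"
}
```

The output can be reviewed and then piped into sqlplus as a DBA account. Fixed-size files with autoextend off mean growth happens only when someone deliberately adds a file, which is the behaviour recommended above.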