
Please note that the problems I faced may not be the same ones that exist in your environment, so please test before applying the same steps I followed to solve them.

Wednesday 12 February 2014

Prerequisites for installing 11g RAC/Grid Infrastructure Release 2 on the Solaris 64-bit operating system

Creating Job Role Separation Operating System Privileges Groups and Users:
Run the following commands:

/usr/sbin/groupadd -g 1020 asmadmin

/usr/sbin/groupadd -g 1022 asmoper

/usr/sbin/groupadd -g 1021 asmdba

/usr/sbin/groupadd -g 1032 oper

/usr/sbin/useradd -u 1100 -g oinstall -G dba grid

/usr/sbin/useradd -u 1101 -g oinstall -G dba,asmdba oracle
/usr/sbin/usermod -g oinstall -G dba,asmdba oracle
mkdir -p  /app/11.2.0/grid
mkdir -p /app/grid
mkdir -p /app/oracle 
chown grid:oinstall /app/11.2.0
chown grid:oinstall /app/11.2.0/grid
chown grid:oinstall /app/grid
chown oracle:oinstall /app/oracle
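To confirm that the users, groups, and directories were created as intended, a quick check such as the following can be used (a minimal sketch; it assumes the oinstall and dba groups already exist on the system):

id -a grid        # should show primary group oinstall and secondary group dba
id -a oracle      # should show primary group oinstall and secondary groups dba and asmdba
ls -ld /app/11.2.0 /app/11.2.0/grid /app/grid /app/oracle   # ownership should match the chown commands above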

Configuring Shell Limits:

Shell Limit        Recommended Value
STACK              32768 (minimum)
NOFILES            4096 (minimum)
DATA               1048576 (minimum)
VMEMORY            16777216 (can be increased)
TIME               -1 (unlimited)
FILE               -1 (unlimited)
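As a rough check, the current limits can be displayed with ulimit from the grid and oracle users' login shells. This is only a sketch; the usual flag mapping is stack -s, open files -n, data -d, virtual memory -v, CPU time -t, and file size -f, and some flags may not be available in every shell:

ulimit -s    # STACK
ulimit -n    # NOFILES
ulimit -d    # DATA
ulimit -v    # VMEMORY
ulimit -t    # TIME
ulimit -f    # FILE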










Software Requirements List for Solaris Operating System (SPARC 64-Bit) Platforms:

Item
Requirements
Operating System, Packages and patches for Oracle Solaris 11
Oracle Solaris 11 (11/2011 SPARC) or later, for Oracle Grid
Infrastructure release 11.2.0.3 or later.
pkg://solaris/developer/build/make
pkg://solaris/developer/assembler
No special kernel parameters or patches are required at this time.
Operating System, Packages and Patches for Oracle Solaris 10
Oracle Solaris 10 U6 (5.10-2008.10)
SUNWarc
SUNWbtool
SUNWcsl
SUNWhea
SUNWi1cs (ISO8859-1)
SUNWi15cs (ISO8859-15)
SUNWi1of
SUNWlibC
SUNWlibm
SUNWlibms
SUNWsprot
SUNWtoo
SUNWxwfnt
119963-14: SunOS 5.10: Shared Library Patch for C++
120753-06: SunOS 5.10: Microtasking libraries (libmtsk) patch
139574-03: SunOS 5.10
141414-02
141414-09 (11.2.0.2 or later)
146808-01 (for Solaris 10 U9 or earlier)
Database Smart Flash Cache (An Enterprise Edition only feature.)
The following patches are required for Oracle Solaris (SPARC
64-Bit) if you are using the flash cache feature:
125555-03
139555-08
140796-01
140899-01
141016-01
141414-10
141736-05
IPMI
The following patches are required only if you plan to configure Failure Isolation using IPMI on SPARC systems:
137585-05 or later (IPMItool patch)
137594-02 or later (BMC driver patch)
Oracle RAC
Oracle Clusterware is required; Oracle Solaris Cluster is supported
for use with Oracle RAC on SPARC. If you use Oracle Solaris Cluster 3.2, then you must install the following additional kernel
packages and patches:
SUNWscucm 3.2.0: 126106-40
VERSION=3.2.0,REV=2006.12.05.22.58 or later
125508-08
125514-05
125992-04
126047-11
126095-05
126106-33
Note: You do not require the additional packages if you are using
Oracle Clusterware only, without Oracle Solaris Cluster.
If you use a volume manager, then you may need to install additional kernel packages.
Packages and patches for Oracle Solaris Cluster
Note: You do not require Oracle Solaris Cluster to install Oracle
Clusterware.
For Oracle Solaris 11, Oracle Solaris Cluster 4.0 is the minimum supported Oracle Solaris Cluster version.
For Oracle Solaris 10, Oracle Solaris Cluster 3.3 or later
UDLM (optional):
ORCLudlm 64-Bit reentrant 3.3.4.10
CAUTION: If you install the ORCLudlm package, then it is detected
automatically and used. Install ORCLudlm only if you want to use the
UDLM interface for your Oracle RAC cluster. Oracle recommends
with Oracle Solaris Cluster 3.3 and later that you use the native
cluster membership functionality provided with Oracle Solaris
Cluster.
For more information, refer to Section 2.8, "Oracle Solaris Cluster
Configuration on SPARC Guidelines."
For Oracle Solaris Cluster on SPARC, install UDLM onto each node
in the cluster using the patch Oracle provides in the Grid_
home/clusterware/udlm directory before installing and configuring
Oracle RAC. Although you may have a functional version of the
UDLM from a previous Oracle Database release, you must install
the Oracle 11g release 2 (11.2) 3.3.4.10 UDLM.
Oracle Messaging Gateway
Oracle Messaging Gateway supports the integration of Oracle Streams Advanced Queuing (AQ) with the following software:
IBM MQSeries V6 (6.6.0), client and server
Tibco Rendezvous 7.2
Pro*C/C++,
Oracle Call Interface,
Oracle C++ Call Interface,
Oracle XML Developer's Kit
(XDK)
Oracle Solaris Studio 12 (formerly Sun Studio) (C and C++ 5.9)
119963-14: SunOS 5.10: Shared library patch for C++
124863-12 C++ SunOS 5.10 Compiler Common patch for Sun C C++ (optional)

Oracle ODBC Driver
gcc 3.4.2
Open Database Connectivity (ODBC) packages are only needed if
you plan on using ODBC. If you do not plan to use ODBC, then you
do not need to install the ODBC packages for Oracle Clusterware,
Oracle ASM, or Oracle RAC.

Programming languages for Oracle RAC database
 Pro*COBOL
Micro Focus Server Express 5.1
 Pro*FORTRAN
Oracle Solaris Studio 12 (Fortran 95)
Download at the following URL:
http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html
Oracle JDBC/OCI Drivers
You can use the following optional JDK versions with the Oracle
JDBC/OCI drivers, however they are not required for the
installation:
 JDK 6 Update 20 (JDK6 - 1.6.20) or later
 JDK 5 (1.5.0_24) or later
Note: JDK 6 is the minimum level of JDK supported on Oracle
Solaris 11.
SSH
Oracle Clusterware requires SSH. The required SSH software is the default SSH shipped with your operating system.
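Before running the installer, the presence of the packages and patches listed above can be verified with the native tools. The following is only a sketch; the package and patch IDs are the ones from the table above:

# Oracle Solaris 10: check packages and installed patches
pkginfo -i SUNWarc SUNWbtool SUNWcsl SUNWhea SUNWi1cs SUNWi15cs SUNWi1of SUNWlibC SUNWlibm SUNWlibms SUNWsprot SUNWtoo SUNWxwfnt
showrev -p | grep 119963    # repeat for each required patch number

# Oracle Solaris 11: check the build packages
pkg list developer/build/make developer/assembler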


Operating System Kernel Requirements:

Verifying UDP and TCP Kernel Parameters
/usr/sbin/ndd -set /dev/tcp tcp_smallest_anon_port 9000
/usr/sbin/ndd -set /dev/tcp tcp_largest_anon_port 65500
/usr/sbin/ndd -set /dev/udp udp_smallest_anon_port 9000
/usr/sbin/ndd -set /dev/udp udp_largest_anon_port 65500
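The current values can be checked before and after the change by running ndd without -set (a sketch; note that on Oracle Solaris 11 these properties are normally managed with ipadm rather than ndd):

/usr/sbin/ndd /dev/tcp tcp_smallest_anon_port
/usr/sbin/ndd /dev/tcp tcp_largest_anon_port
/usr/sbin/ndd /dev/udp udp_smallest_anon_port
/usr/sbin/ndd /dev/udp udp_largest_anon_port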



Storage Requirements:
Item
Requirement
/tmp
To be increased to 3 GB.
Swap
If RAM is between 2.5 GB and 16 GB, then swap space equal to the size of RAM; if RAM is more than 16 GB, then 16 GB of swap.
3 volume groups
5 GB each. Run the following for each disk, where dsk1ocr is an example disk name:
chown grid:asmadmin dsk1ocr
chmod 660 dsk1ocr
Alternatively, a cluster file system can be used instead.
/app
To be increased to 150 GB.
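A quick way to confirm the figures above before starting the installation (a sketch; output and device names will differ per environment):

df -h /tmp /app                     # free space in /tmp and /app
/usr/sbin/swap -s                   # configured swap summary
/usr/sbin/prtconf | grep Memory     # physical RAM size, used to size swap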


Checking Resource Limits for Solaris
On Solaris platforms, the /etc/pam.conf file controls and limits resources for users on the system. On login, resources should be controlled and limited for users so that they cannot perform denial of service attacks.
By default, PAM resource limits are not set for Solaris operating systems. To ensure that resource limits are honored, add the following line to the login service section of /etc/pam.conf: login auth required pam_dial_auth.so.1
For example:
# login service (explicit because of pam_dial_auth)
#
login auth requisite pam_authtok_get.so.1
login auth required pam_dhkeys.so.1
login auth required pam_unix_cred.so.1
login auth required pam_unix_auth.so.1
login auth required pam_dial_auth.so.1



Network Time Protocol Setting
Oracle Clusterware requires the same time zone setting on all cluster nodes. During installation, the installation process picks up the time zone setting of the Grid installation owner on the node where OUI runs, and uses that on all nodes as the default TZ setting for all processes managed by Oracle Clusterware. This default is used for databases, Oracle ASM, and any other managed processes.

You have two options for time synchronization: an operating system configured network time protocol (NTP), or Oracle Cluster Time Synchronization Service. Oracle Cluster Time Synchronization Service is designed for organizations whose cluster servers are unable to access NTP services. If you use NTP, then the Oracle Cluster Time Synchronization daemon (ctssd) starts up in observer mode. If you do not have NTP daemons, then ctssd starts up in active mode and synchronizes time among cluster members without contacting an external time server.

On Oracle Solaris Cluster systems, Oracle Solaris Cluster software supplies a template file called ntp.cluster (see /etc/inet/ntp.cluster on an installed cluster host) that establishes a peer relationship between all cluster hosts. One host is designated as the preferred host. Hosts are identified by their private host names. Time synchronization occurs across the cluster interconnect. If Oracle Clusterware detects that either the Oracle Solaris Cluster NTP or an outside NTP server is set as the default NTP server in the system, in the /etc/inet/ntp.conf or the /etc/inet/ntp.conf.cluster file, then CTSS is set to observer mode.

Note: Before starting the installation of the Oracle Grid Infrastructure, Oracle recommends that you ensure the clocks on all nodes are set to the same time. If you have NTP daemons on your server but you cannot configure them to synchronize time with a time server, and you want to use Cluster Time Synchronization Service to provide synchronization service in the cluster, then deactivate and deinstall the NTP.
To disable the NTP service, run the following command as the root user:
# /usr/sbin/svcadm disable ntp
When the installer finds that the NTP protocol is not active, the Cluster Time Synchronization Service is installed in active mode and synchronizes the time across the nodes.
To confirm that ctssd is active after installation, enter the following command as the
Grid installation owner:
$ crsctl check ctss
If you are using NTP, and you prefer to continue using it instead of Cluster Time
Synchronization Service, then you need to modify the NTP initialization file to enable
slewing, which prevents time from being adjusted backward. Restart the network time
protocol daemon after you complete this task. To do this on Oracle Solaris without Oracle Solaris Cluster, edit the /etc/inet/ntp.conf file to add "slewalways yes" and "disable pll" to the file. After you make these changes, restart ntpd (on Oracle Solaris 11) or xntpd (on Oracle Solaris 10) using the command /usr/sbin/svcadm restart ntp.
To do this on Oracle Solaris 11 with Oracle Solaris Cluster 4.0, edit the /etc/inet/ntp.conf.sc file to add "slewalways yes" and "disable pll" to the file. After you make these changes, restart ntpd or xntpd using the command /usr/sbin/svcadm restart ntp.
To do this on Oracle Solaris 10 with Oracle Solaris Cluster 3.2, edit the /etc/inet/ntp.conf.cluster file in the same way.
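For example, on a node without Oracle Solaris Cluster, the additions described above would look like the following (a sketch of the two lines to append to /etc/inet/ntp.conf, followed by the service restart):

slewalways yes
disable pll

# after editing the file, restart the NTP service
/usr/sbin/svcadm restart ntp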
To enable NTP after it has been disabled, enter the following command:
# /usr/sbin/svcadm enable ntp


Automatic SSH Configuration During Installation
To install Oracle software, Secure Shell (SSH) connectivity should be set up between all cluster member nodes. OUI uses the ssh and scp commands during installation to run remote commands on and copy files to the other cluster nodes. You must configure SSH so that these commands do not prompt for a password.

Note: SSH is used by Oracle configuration assistants for configuration operations from local to remote nodes. It is also used by Oracle Enterprise Manager. You can configure SSH from the OUI interface during installation for the user account running the installation. The automatic configuration creates passwordless SSH connectivity between all cluster member nodes. Oracle recommends that you use the automatic procedure if possible.
To enable the script to run, you must remove stty commands from the profiles of any Oracle software installation owners, and remove other security measures that are triggered during a login, and that generate messages to the terminal. These messages, mail checks, and other displays prevent Oracle software installation owners from using the SSH configuration script that is built into the Oracle Universal Installer. If they are not disabled, then SSH must be configured manually before an installation
can be run.
By default, OUI searches for SSH public keys in the directory /usr/local/etc/, and ssh-keygen binaries in /usr/local/bin. However, on Oracle Solaris, SSH public keys typically are located in the path /etc/ssh, and ssh-keygen binaries are located in the path /usr/bin. To ensure that OUI can set up SSH, use the following command to
create soft links:
# ln -s /etc/ssh /usr/local/etc
# ln -s /usr/bin /usr/local/bin
In rare cases, Oracle Clusterware installation may fail during the "AttachHome" operation when the remote node closes the SSH connection. To avoid this problem, set the following parameter in the SSH daemon configuration file /etc/ssh/sshd_config on all cluster nodes to set the timeout wait to unlimited:
LoginGraceTime 0
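Note that after changing /etc/ssh/sshd_config, the SSH service typically has to be restarted for the new LoginGraceTime value to take effect; on Solaris this can be done through SMF (a sketch):

# /usr/sbin/svcadm restart ssh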

Network Requirements:
1. SCAN VIP IPs:
   a. Three IP addresses are needed for the SCAN VIPs, with a single name covering all of them.
   b. The SCAN is a domain name registered to three IP addresses in the domain name service (DNS). The SCAN name must be set up to round robin across the three IP addresses, which requires SCAN name resolution via DNS.
   c. Cluster Name: portalcluster
   d. SCAN Name: Portalcrsscan, which will resolve to the three IP addresses.
2. One IP address for the single database.
3. Private interconnect:
   a. Each interface on a single machine has to be on a different VLAN.
   b. The interface name should be identical on all nodes and within the same VLAN.
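To check that the SCAN name actually resolves to the three addresses in a round-robin fashion, resolve it a few times from any node; the addresses shown below are placeholders, not values from this environment:

$ nslookup Portalcrsscan
# expected (hypothetical) output:
#   Name:    Portalcrsscan.yourdomain.com
#   Address: 10.0.0.21
#   Address: 10.0.0.22
#   Address: 10.0.0.23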

Issues that you may face during 11.2.0.2  installation:
·       Having both search and domain entries present in the file "/etc/resolv.conf" with the same value will cause the cluster verify utility to fail while checking the prerequisites. You can confirm that everything is working properly by running the cluster verify utility before the installation.
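One way to avoid this failure is to keep only one of the two entries in /etc/resolv.conf; a hypothetical example (the domain and name server values are placeholders):

search      example.com
nameserver  10.0.0.5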
·       Swap space should be handled carefully and as per the Oracle documentation requirements; for example, if you assign 16 GB of memory, then swap space should be at least 16 GB. The cluster verify utility calculates the needed swap and, when run in verbose mode, tells you whether the assigned swap space must be changed.

·       Please note that "project.max-shm-memory" represents the maximum shared memory available for a project, so the value of this parameter should be greater than the sum of all SGA sizes for all databases in your environment.
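The current value can be checked and, if necessary, raised with the Solaris resource-control commands; the project name user.oracle and the 32 GB figure below are illustrative assumptions only:

# check the limit in effect for the oracle user's project
prctl -n project.max-shm-memory -i project user.oracle

# raise it persistently (choose a value larger than the sum of all SGAs)
projmod -sK "project.max-shm-memory=(privileged,32G,deny)" user.oracle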

·       The last step of the cluster verify utility failed with the error below:

Errors & issues that appeared during the last step of installing Grid Infrastructure:

INFO: Liveness check failed for "xntpd"
INFO: Check failed on nodes:
INFO: svprtldb02,svprtldb01
INFO: PRVF-5494 : The NTP Daemon or Service was not alive on all nodes
INFO: PRVF-5415 : Check to see if NTP daemon or service is running failed
INFO: Clock synchronization check using Network Time Protocol(NTP) failed
INFO: PRVF-9652 : Cluster Time Synchronization Services check failed
INFO: Checking VIP configuration.
INFO: Checking VIP Subnet configuration.
INFO: Check for VIP Subnet configuration passed.
INFO: Checking VIP reachability
INFO: Check for VIP reachability passed.
INFO: Starting check for The SSH LoginGraceTime setting ...
INFO: WARNING:
INFO: PRVE-0038 : The SSH LoginGraceTime setting on node "svprtldb02" may result in users being disconnected before login is completed
INFO: PRVE-0038 : The SSH LoginGraceTime setting on node "svprtldb01" may result in users being disconnected before login is completed

The solution was to disable NTP during the installation and then enable it again after the installation, after which the CTSS process will run in observer mode. Also remove or rename /etc/ntp.conf or /etc/xntp.conf on all nodes and check that no NTP daemon is running.
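On each node this amounts to roughly the following (a sketch; on Solaris the NTP configuration file is typically /etc/inet/ntp.conf, and the backup file name is arbitrary):

# stop the NTP service and set the configuration file aside
/usr/sbin/svcadm disable ntp
mv /etc/inet/ntp.conf /etc/inet/ntp.conf.original

# confirm that no NTP daemon (ntpd/xntpd) is still running
pgrep -l ntp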

After disabling the NTP process, run the below:

grid@svprtldb01:/apps/grid/clusterverify/bin$ ./cluvfy comp clocksync -n svprtldb01,svprtldb02

Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
CTSS resource check passed


Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
CTSS is in Observer state. Switching over to clock synchronization checks using NTP


Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP configuration file "/etc/inet/ntp.conf" existence check passed
Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes
Check of common NTP Time Server passed
Clock time offset check passed

Clock synchronization check using Network Time Protocol(NTP) passed


Oracle Cluster Time Synchronization Services check passed

Verification of Clock Synchronization across the cluster nodes was successful.



·       I got the errors below: the disks mounted on one node but refused to mount on the other node. The first question that came to my mind was whether the disk permissions were set correctly on all nodes, but the permissions were fine.

Errors & issues:
======================================================
Sat Feb 08 23:23:52 2014
Starting background process RSMN
Sat Feb 08 23:23:52 2014
RSMN started with pid=34, OS id=12977
Sat Feb 08 23:23:52 2014
Sweep [inc][217]: completed
Sweep [inc][177]: completed
Sweep [inc2][217]: completed
Sweep [inc2][177]: completed
ORACLE_BASE not set in environment. It is recommended
that ORACLE_BASE be set in the environment
Sat Feb 08 23:23:54 2014
ALTER DATABASE MOUNT
NOTE: Loaded library: System
ORA-15025: could not open disk "/dev/rdsk/c3d17s0"
ORA-27041: unable to open file
SVR4 Error: 13: Permission denied
Additional information: 9
SUCCESS: diskgroup SYSTEM_WEB was mounted
Errors in file /app/oracle/diag/rdbms/webprd/WEBPRD2/trace/WEBPRD2_ckpt_12943.trc (incident=
4177):
ORA-00600: internal error code, arguments: [kfioTranslateIO03], [], [], [], [], [], [], [], [
], [], [], []
Incident details in: /app/oracle/diag/rdbms/webprd/WEBPRD2/incident/incdir_4177/WEBPRD2_ckpt_12943_i4177.trc

To solve it, implement the solution provided in the Oracle note ORA-00600 [kfioTranslateIO03] [17090] (Doc ID 1336846.1).

·       After the installation completed successfully, I noticed that the ora.crf process was not started on both nodes. It seems this was a bug in 11.2.0.2, and applying patch set update 12 solved the issue.

Errors & Issues:
================================================
root@svprtldb02:/app/grid/bin# ./crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE svprtldb02 Started
ora.crf
1 ONLINE OFFLINE
ora.crsd
1 ONLINE ONLINE svprtldb02
ora.cssd
1 ONLINE ONLINE svprtldb02
ora.cssdmonitor
1 ONLINE ONLINE svprtldb02
ora.ctssd
1 ONLINE ONLINE svprtldb02 OBSERVER
ora.diskmon
1 ONLINE ONLINE svprtldb02
ora.evmd
1 ONLINE ONLINE svprtldb02
ora.gipcd
1 ONLINE ONLINE svprtldb02
ora.gpnpd
1 ONLINE ONLINE svprtldb02
ora.mdnsd
1 ONLINE ONLINE svprtldb02

After the patch set update was implemented successfully, the issue was fixed.

·       After the successful installation of Grid Infrastructure and the databases, a node was rebooted for maintenance by the system admin team. After startup, the private interconnect configuration was gone and the admins had to set it up again.
Even after the interconnect was set up again and we started the node, it did not rejoin the cluster cleanly; this turned out to be a bug in 11.2.0.2 that was fixed in the 11.2.0.3 patch set updates.

Symptoms:

GIPCD
-----
Line 16: 2014-02-10 05:02:04.504: [GIPCDMON][7] gipcdMonitorSaveInfMetrics: inf[ 0] ipmp0 - rank -1, avgms 30000000000.000000 [ 0 / 0 / 0 ]
...
Line 63636: 2014-02-10 12:42:25.549: [GIPCDMON][7] gipcdMonitorSaveInfMetrics: inf[ 0] ipmp0 - rank -1, avgms 30000000000.000000 [ 0 / 0 / 0 ]

Oracle Support note regarding the issue:
Doc ID 1479380.1: 11gR2 GI Node May not Join the Cluster After Private Network is Functional After Eviction due to Private Network Problem

For the patches available in 11.2.0.3 and above, refer to Doc ID 1479380.1.



To work around this bug, follow the steps below (a sketch of step a) is shown after the list):
a)   On the surviving node, during non-peak time if possible, kill the gipcd.bin process (kill -15 <gipcd.bin ospid>).
NOTE: In 11.2 this will also lead to the death of the evmd.bin, crsd.bin and ctssd.bin processes. None of these processes are fatal; Clusterware will respawn all of them automatically.

b)  Once the gipcd.bin, evmd.bin, crsd.bin and ctssd.bin processes have been re-spawned on the surviving node, verify whether the other nodes join the cluster.

c)   Most of the time, GI will start, but in case it does not, restart GI on the other nodes with the crsctl command.

d)  Finally, if GI is still not starting, as a last resort, restart GI on the surviving node.
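A sketch of step a) on the surviving node (run as root; <gipcd.bin ospid> is the process ID reported by ps):

# find the gipcd.bin OS process ID
ps -ef | grep gipcd.bin | grep -v grep

# terminate it gracefully; Clusterware respawns gipcd.bin (and, in 11.2, evmd/crsd/ctssd) automatically
kill -15 <gipcd.bin ospid>

# afterwards, check the lower stack and whether the other nodes have joined
$GRID_HOME/bin/crsctl stat res -t -init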







General Notes:
To check RAC network issues, collect and review the following:

1) The results of:

$GRID_HOME/bin/srvctl config network

$GRID_HOME/bin/srvctl config nodeapps -a

$GRID_HOME/bin/srvctl config scan

2) The cluvfy utility output:

cluvfy stage -pre crsinst -n [nodelist] -verbose

References:
CTSSD Runs in Observer Mode Even Though No Time Sync Software is Running (Doc ID 1054006.1)
ORA-00600 [kfioTranslateIO03] [17090] (Doc ID 1336846.1)
How To Gather/Backup ASM Metadata In A Formatted Manner? (Doc ID 470211.1)
How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation (Doc ID 942166.1)


