Sunday, March 18, 2007

Sun Studio tools and DB2 V8.2 on Solaris 10 Zones


Every DBA or System Administrations or Software Developers find some need of "drilling-down" to the call-flow of how an application is interacting with the Operating System beneath it.


Various tools and features including DTrace exists to solve the problem of finding "What's going on" on a particular setup. Whilte DTrace is more dynamic, it will be overkill for DTrace to capture all events and then a huge post-processing job to understand the what all those events really mean and how they are interacting with each other.


This is where Sun Studio Performance Analyzer tools still have an edge over DTrace (which is a personal view, many may not agree). Specially when the engineer trying to figure out the problem does not have access to the system that has the problem.


We now look at how Sun Studio Performance Analyzer tools should be used to collect such information about DB2 V8.2 running on Solaris 10 (Zones or otherwise).


Typically, information is collected using the "collect" tool and the information is analyzed using the GUI "analyzer" tool. There is also a text analyzer tool also available called "er_print". The text analyzer tool is primarily useful when the information (which is reffered to as "experiments" in this context) is on a remote server somewhere in west coast and the engineer is located in east coast. (Java remote display event handling still has a long way to go.) Generally the experiment is collected by starting the application as "collect $applicationname" or by invoking dbx on the process and setting "set collector enable true" and "cont" the process.


However things are not that simple with DB2 V8.2. DB2 V8.2 is multi-processed database engine which means if you want to collect on each processes, you have to manually enable it on each process. Also since DB2 V8.2 (that includes DB2 V9.1 and all prior versions) uses setuid flags to use root priviledges to start the DB2 database engine. The collect tool works by means of preloading certain library files which Solaris security mechanism does not allow by default for unsecured libraries to happen between "userid" boundaries (user -> root -> user). Also other problem is default file permissions, so if root creates any file, then the user will not have write permissions to write to it.


The good news is that there is a work around to solve all of the above problems and thats the real content of this blog entry.


For the blog entry sake, lets assume that "db2inst1" is the DB2 Instance owner that hosts the database and /export/home/db2inst1 is the home directory path of the user "db2inst1". You will need "root" access (or atleast get the system administrator to do it for you) in order to setup all the workarounds. (This is where Solaris 10 Zones are ideal, since SysAdmins can create a zone and give developers root access to the zone without really compromising the entire system. This is how ground work for this blog entry was being executed.) In this case, DB2 Instance directory will then be /export/home/db2inst1/sqllib.

For a single partitioned DB2 instance (where the content of sqllib/db2nodes.cfg is exactly one line), if you try to execute the collect statement on db2start, you will see the following message:

-bash-3.00$ /opt/SUNWspro/bin/collect  -F all -A copy -d /export/home/db2inst1/experiments $HOME/sqllib/adm/db2start
Creating experiment database /export/home/db2inst1/experiments/test.1.er ...
ld.so.1: db2start: warning:
/opt/SUNWspro/bin/../prod/bin/../lib/v9/dbxruntime/libcollector.so
03/14/2007 14:30:58 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.


DB2 has started and an experiment is created however, the experiment will be empty because the libcollector.so library is not allowed to cross setuid flags since it has an insecured path name. Also when you cross such setuid environments, few files created for profile information while the process has effective userid of root will not be available for the process when the effective userid is "db2inst1" again. Hence umask also has to be modified.


To work around this problem one has to use the crle command using root priviledges as follows:


First find out the default settings:

# crle -64

Default configuration file (/var/ld/64/ld.config) not found
Default Library Path (ELF): /lib/64:/usr/lib/64 (system default)
Trusted Directories (ELF): /lib/secure/64:/usr/lib/secure/64 (system default)



Modify trusted directories path by using the old list and appending the directory that contains the dbxruntime libraries

# crle -64 -s  /lib/secure/64:/usr/lib/secure/64:/opt/SUNWspro/bin/../prod/bin/../lib/v9/dbxruntime


NOTE: There should be no trailing forward slash after the directory name otherwise it will not work for that directory.


Verify the setting again. Note we are changing only for 64-bit binaries.

# crle -64

Configuration file [version 4]: /var/ld/64/ld.config
Default Library Path (ELF): /lib/64:/usr/lib/64 (system default)
Trusted Directories (ELF):
/lib/secure/64:/usr/lib/secure/64:/opt/SUNWspro/bin/../prod/bin/../lib/v9/dbxruntime

Command line:
crle -64 -c /var/ld/64/ld.config -s \
/lib/secure/64:/usr/lib/secure/64:/opt/SUNWspro/bin/../prod/bin/../lib/v9/dbxruntime





Now stop the DB2 engine if it is still running and then modify umask and restart DB2 engine as follows

-bash-3.00$ umask 000
-bash-3.00$ /opt/SUNWspro/bin/collect -F all -A copy -d \
/export/home/db2inst1/experiments $HOME/sqllib/adm/db2start
Creating experiment database /export/home/db2inst1/experiments/test.1.er ...
03/19/2007 10:46:53 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
-bash-3.00$ db2sampl
-bash-3.00$ db2 connect to sample

Database Connection Information

Database server = DB2/SUN64 8.2.6
SQL authorization ID = DB2INST1
Local database alias = SAMPLE

-bash-3.00$ db2 "select count(*) from employee"

1
-----------
32

1 record(s) selected.

-bash-3.00$ db2 connect reset
DB20000I The SQL command completed successfully.
-bash-3.00$ db2stop
03/19/2007 10:52:37 0 0 SQL1064N DB2STOP processing was successful.
SQL1064N DB2STOP processing was successful.
-bash-3.00$




When db2stop is executed, it also marks the end of the collection of the DB2 engine instance. Now you can use "er_print" or the "analyzer" GUI as follows:

-bas-3.00$ cd /export/home/db2inst1/experiments
-bash-3.00$ /opt/SUNWspro/bin/er_print test.1.er


In this case since we know it has lots of sub processes we load it easily as follows:

-bash-3.00$ /opt/SUNWspro/bin/er_print test.1.er/_*.er
(/opt/SUNWspro/bin/er_print) limit 10
(/opt/SUNWspro/bin/er_print) functions
Functions sorted by metric: Exclusive User CPU Time

Excl. Incl. Name
User CPU User CPU
sec. sec.
30.111 30.111
1.651 1.651 take_deferred_signal
1.231 1.231 memset
0.741 1.081 sqlno_qnc_bound_by_func(sqlno_globals*,sqlno_set,sqlno_set,sqlnq_qnc*,sqlno_qtb*,int*,
SQLNO_BND*,int*,sqlnq_qnc**)
0.700 6.545 sqlnp_parser(sqlnp_cb*)
0.550 0.550 memcpy
0.550 1.281 sqlogmblkEx
0.500 0.771 sqlno_bit2intD(sqlno_globals*,sqlno_sdb*,sqlno_apcb*,sqlno_set,sqlno_iset**)
0.470 2.051 sqlno_prop_ff(sqlno_globals*,sqlno_apcb*,sqlno_qtb*,sqlno_join_context*,sqlno_set,sqlno_set,
sqlno_set,float*,sqlno_part*,SQLNN_BOOLEAN,float*,float*,sqlno_set*,sqlno_table*,sqlno_table*)
0.350 0.921 sqlno_fd_normalize_order(sqlno_globals*,sqlno_order_class*,sqlno_order_class*,
sqlno_plan_properties*,sqlno_apcb*,sqlno_order_class**,sqlno_set*)

(/opt/SUNWspro/bin/er_print)



If you use the analyzer GUI then you have to select all processes before you start looking at the data.



Also if this is associated with a technical problem, you can tar up the directory /export/home/experiments and see it to the technical support team which can also look at the data from it using the same tools.


In another entry we will see how to collect data using the Sun Studio tools when DB2 engine is used with DPF - Database Partition Feature. This is easily identified when there are more than 1 lines in the SHOME/sqllib/db2nodes.cfg file.





No comments: