Tuesday, April 08, 2008

Glassfish v2 and PostgreSQL 8.3.1 on Sun SPARC Enterprise T5140/T5240

The Sun SPARC Enterprise T5140 is a two-socket system in a 1U chassis. The Sun SPARC Enterprise T5240 is also a two-socket system, but in a 2U chassis with room for 16 internal disks. Both are rack servers. But if you have no idea what they look like in the first place and just run psrinfo on one of them, the result is awe and amazement: the output is 128 lines of CPU information (129 counting the header line). It's hard to imagine a single 1U box with the equivalent of 128 CPUs running inside it, or a 2U box with 128 threads and more than 2TB of internal storage (more if you go for high-density disks). Check out Allan Packer's Weblog for more information.

When I first heard about it, my first reaction was 'How would people in the Open Source world use it?' Dedicating one of these systems to a single purpose such as a "DB Tier", "Java Tier" or "Web Tier" undersells the capabilities of these pizza boxes, unless you are already as popular as Google or Yahoo. If you are in the initial stages, it looks more like a "Rack in a Server", so you will probably end up putting more than one label on it, or a single big label: "DB/Java/Web Tiers".

LDoms and Solaris Containers certainly come to the rescue when each of these so-called tiers has to be installed on one T5140. Say you have an application that uses Glassfish as the application server for the Java applications and PostgreSQL as the backend database. How would one go about the deployment on this type of system?

If I am the only one who is going to maintain the system, as a side job, the simplest option is to create zones on it and use one zone per tier. The low network latency between zones is a boon for these "network joined" tiers, and at the same time maintenance is reduced due to the nature of zones.

The T5140 and T5240 have four built-in network ports, which can be used either as 4x 1 Gbps ports or as 2x 10 Gbps ports. I am assuming that many people have an infrastructure similar to ours (no 10 Gbps network), which means using multiple GigE (1 Gbps) ports becomes important so that the network is not a bottleneck. In one of my tests with Glassfish and PostgreSQL on the T5140, a single 1 Gbps link saturated quickly while ramping up the load on the system. Fortunately the application being tested was scalable: creating another zone with an additional instance of Glassfish, using an additional port in the same subnet, and load balancing over the two instances of Glassfish allowed me to move forward. (Of course link aggregation is another way, but it requires the links to be aggregated at the other end as well, which sometimes cannot be controlled easily.)

Another problem that will arise is how to monitor this big system, especially when there is a mixture of applications, some running in the global zone and some in local zones. The mpstat output is so long that it will strain your eyes after a while (plus no screen is big enough to accommodate all the lines in one view :-( ).

One approach that I used in my test was to create pools. Setting up pools for zones is quite easy: use the "add dedicated-cpu" resource in zonecfg and "set ncpus=24", or whatever value maps to the number of CPU threads you want from the system.
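As a minimal sketch of that zonecfg step, assuming the zone already exists: the helper name dedicate_cpus is made up for illustration, while appzone1 and the value 24 match the zones that show up in the test further down.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): add a dedicated-cpu
# resource with a fixed number of CPU threads to an existing zone.
#   $1 = zone name
#   $2 = number of CPU threads to dedicate
dedicate_cpus() {
    zonecfg -z "$1" "add dedicated-cpu; set ncpus=$2; end"
}

# Example (run as root in the global zone, then boot or reboot the zone):
#   dedicate_cpus appzone1 24
```

A zone configured this way gets its own temporary pool at boot, named SUNWtmp_ followed by the zone name, which is why such pools appear in poolstat.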

For example, in my test setup I decided to dedicate AND limit PostgreSQL to 56 CPUs. I did it as follows:


# pooladm -s
# pooladm -c
# poolcfg -c 'create pset pset_pg (uint pset.min=56; uint pset.max=56)'
# poolcfg -c 'create pool pool_pg'
# poolcfg -c 'associate pool pool_pg (pset pset_pg)'
# poolcfg -c 'modify pool pool_pg (string pool.scheduler="FX")'
# pooladm -c  
# poolstat
 id pool                 size used load
  5 pool_pg                56 0.00 0.03
  0 pool_default           72 0.00 0.05
# zoneadm -z appzone1 boot
# zoneadm -z appzone2 boot
# poolstat 10
 id pool                 size used load
  5 pool_pg                56 0.00 44.3
  0 pool_default           24 0.00 19.6
  9 SUNWtmp_appzone2       24 0.00 13.5
  8 SUNWtmp_appzone1       24 0.00 11.3
 id pool                 size used load
  5 pool_pg                56 53.9 46.5
  0 pool_default           24 13.0 18.7
  9 SUNWtmp_appzone2       24 12.0 13.3
  8 SUNWtmp_appzone1       24 11.6 11.4

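Monitoring this way makes it easy to figure out which group of applications or zones is saturating its CPUs (or not really using them, so you can limit it further). Note how pool_default shrinks from 72 to 24 CPUs once the two zones boot, since each zone takes 24 threads for its own temporary pool.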
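To make the "who is saturated" question even quicker to answer, the used and size columns can be turned into a percentage per pool. This is only a rough sketch: pool_util is a made-up helper name, and it assumes the five-column layout (id, pool, size, used, load) seen in the poolstat output above.

```shell
#!/bin/sh
# Hypothetical helper: read poolstat-style lines on stdin and print
# "<pool> <used/size as a percentage>" for every data row.
# Header lines are skipped because their first field is not a number.
pool_util() {
    awk '$1 ~ /^[0-9]+$/ && $3 > 0 { printf "%s %.0f%%\n", $2, 100 * $4 / $3 }'
}

# Sample input: the second interval from the poolstat run above.
pool_util <<'EOF'
id pool size used load
5 pool_pg 56 53.9 46.5
0 pool_default 24 13.0 18.7
9 SUNWtmp_appzone2 24 12.0 13.3
8 SUNWtmp_appzone1 24 11.6 11.4
EOF
```

On that sample, pool_pg comes out at about 96% busy while the other three pools sit at roughly half, so PostgreSQL is the tier closest to its limit. On a live system something like "poolstat 10 | pool_util" should give the same view, though awk may buffer its output when writing to a pipe.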

All in all, the combination of PostgreSQL and Glassfish on these CMT machines will probably need every workload you have thrown at it in order to saturate the system.

Interestingly, I also saw a Solution Brief for Glassfish and PostgreSQL on Sun SPARC Enterprise T5140/T5240.

The Open Application Services solution is also available through Try-n-buy systems.

