Sunday, July 16, 2006

10 Sun Fire X4500 bundle => 20+ TB DSS or BIDW Config?


I noticed today that there is a 24TB, 10-server bundle of the Sun Fire X4500 (previously known as Project Thumper) that discounts the server by a whopping 32%. That price advantage alone could lead enterprise customers to buy this bundle rather than other configurations.


So if an enterprise customer buys that 10-X4500 bundle, how will they use it?


Again, I am narrowing my focus based on the little I understand about this field (which I agree is not enough) and making the relevant assumptions.


  • Targeted for Enterprise Customers

  • Have an Enterprise License for a database (e.g. DB2)

  • Are interested in setting up a performant big data warehouse



These assumptions change many of the implementation details mentioned in the previous blog entry. For example, in this scenario I would probably select RAID-10 instead of RAID-Z, and I would take advantage of DB2's distributed capability, the Database Partitioning Feature (DPF), since the setup involves multiple physical servers connected via Ethernet or InfiniBand.


How did I come to the 20+ TB mentioned in the header? Read on and it should become clear how I arrived at that figure.


Now, I know the bundle comes with 10 servers, but my design is based on a single X4500 configuration which can then be replicated across the other 9 servers. (After all, that's the horizontal scaling philosophy.)


Each X4500 has two sockets of dual-core AMD Opteron 285 (four cores in total) with 16GB RAM and four 10/100/1000 BaseT Ethernet ports per server. I am of the opinion that one should always keep one core free for quick access to the system and for management tasks. With this philosophy, I would use only three of the cores for DB2. Since I have already decided to use distributed DB2, and keeping in mind the NUMA-ness of Opteron memory, I would use three DB2 nodes (partitions), one for each core that I plan to give to the database. (Thanks to the integration of DB2 and Solaris Resource Pools in DB2 V8.1 FixPak 4, this is easy to do: just create projects, enter the project in db2nodes.cfg, and each partition then starts in its own project containing a resource pool of one core.)
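
To make that concrete, here is a rough sketch of what the Solaris side could look like. The pool, project and host names (db2_pool0, db2part0, thumper01) are my own illustrative choices, and the exact column used in db2nodes.cfg for the project/resource binding (the last field below, after partition number, hostname, logical port and netname) should be checked against the DB2 documentation for the release in use.

pooladm -e    # enable the resource pools facility

# one single-core processor set and pool per DB2 partition (repeat for 1 and 2)
poolcfg -c 'create pset db2_pset0 (uint pset.min = 1; uint pset.max = 1)'
poolcfg -c 'create pool db2_pool0'
poolcfg -c 'associate pool db2_pool0 (pset db2_pset0)'
pooladm -c    # commit the configuration

# one project per partition, bound to its pool
projadd -K 'project.pool=db2_pool0' db2part0

A db2nodes.cfg for three logical partitions on one host would then look roughly like:

0 thumper01 0 thumper01 db2part0
1 thumper01 1 thumper01 db2part1
2 thumper01 2 thumper01 db2part2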


Now comes the hard part: disk configuration. There are numerous design patterns at your disposal given the immense number of spindles available. However, remember that the same design will have to be replicated on the other 9 servers too; if it is too cumbersome to manage one server, it will be way too cumbersome to manage 10. So I am going to try a simple configuration with some reasoning behind it, rather than a very fine-grained strategy of separating out spindles for individual tablespaces.


That said, I would still use two mirrored disks for the root file system. This leaves 46 disks. Since I plan to use three partitions with mirrored (RAID-1) storage, I want the largest multiple of 3 that fits into 46, which is 45, giving 15 spindles per node. 15 doesn't seem optimal to me for a RAID-10 configuration, but 14 does, and it also leaves a "spare" disk for each of the mirrored nodes, with one disk left over unallocated. That sounds interesting, so I end up with the disk layout below (P0/P1/P2 = data disks for DB2 partitions 0/1/2, S = spare, B = root mirror, X = unallocated):


DISKS   0   1   2   3   4   5   6   7
c0:    P0  P0  P0  P0  P2  P2  P2  P2
c1:    P1  P1  P1  P1  P0  P0  P0  P0
c6:    P2  P2  P2  P2  P1  P1  P1  P1
c7:    S0  P0  P0  P0  P2  P2  P2  S2
c5:    B0  B1  P2  P2  P2  P1  P1  P1
c4:    S1  P1  P1  P1  P0  P0  P0   X
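
Taking partition 0's fourteen P0 spindles from that layout, the RAID-10 equivalent on ZFS is a pool of seven two-way mirrors, with each mirror split across two controllers. The commands below are only a sketch: the pool name dbp0 and the exact c#t#d# targets are my assumptions and depend on how Solaris enumerates the six controllers on the X4500.

zpool create dbp0 \
    mirror c0t0d0 c1t4d0 \
    mirror c0t1d0 c1t5d0 \
    mirror c0t2d0 c1t6d0 \
    mirror c0t3d0 c1t7d0 \
    mirror c7t1d0 c4t4d0 \
    mirror c7t2d0 c4t5d0 \
    mirror c7t3d0 c4t6d0

# add S0 as a hot spare, if the ZFS release in use supports spares
zpool add dbp0 spare c7t0d0

The same pattern repeats with the P1 and P2 spindles for pools dbp1 and dbp2.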


So now I have a pool of 7 x 250GB = 1750GB for each DB2 partition.
If I divide the space as
Tables: 750 GB, Index: 250 GB, Temp: 500 GB, LOBs: 200 GB, Logs: 25 GB


Then I have a setup that handles about 3 x 750GB = 2250 GB of table data per X4500, or 22500 GB, roughly 22TB, for the 10-server bundle of X4500s.
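
One way to carve out those per-partition budgets would be a ZFS filesystem per category with a quota, along these lines (the dbp0 pool and the filesystem names are the same illustrative ones used in the sketch above):

# space budget for DB2 partition 0, enforced with quotas (~1725GB of the 1750GB pool)
zfs create -o quota=750G dbp0/tables
zfs create -o quota=250G dbp0/index
zfs create -o quota=500G dbp0/temp
zfs create -o quota=200G dbp0/lobs
zfs create -o quota=25G  dbp0/logs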


Let's estimate the feeds and speeds of such a setup.


Each DB2 node then has access to about 350MB/sec of read bandwidth (7 spindles at an assumed 50MB/sec per platter), which probably won't be saturated by a single Opteron core that is also manipulating the data as it reads it, so the setup probably won't be I/O bound, which is important in a data warehousing environment. Fortunately, we also have one free core in the server, so all interrupts and administrative operations can be diverted to that core, and the cores assigned to DB2 can spend all their cycles processing DB2 data. Since the two cores on a socket share the same memory controller, the best strategy is to run DB2 node 0, the coordinating node, on the socket with the free core, and let the other socket handle DB2 node 1 and DB2 node 2 on the Sun Fire X4500.
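
On Solaris, a rough way to push interrupts onto the free core is to disable interrupt handling on the cores running DB2 so that the remaining core fields them all; the CPU ids below are illustrative and should be checked with psrinfo first.

psrinfo    # list the cores and their state

# keep interrupts off the three cores running DB2 partitions;
# they will then be serviced by the remaining core (here CPU 0)
psradm -i 1 2 3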


Now, since the DB2 nodes will be communicating with each other, an ideal setup for each server is to use one network adapter for the external interface and aggregate the other three adapters onto a private network among the 10 servers, giving about 3Gbps of communication bandwidth between the servers.
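
On Solaris 10 that private pipe could be built with link aggregation, roughly as below; the e1000g interface names and the addressing are assumptions on my part and will differ per site.

# aggregate three of the four on-board ports under key 1 (~3Gbps)
dladm create-aggr -d e1000g1 -d e1000g2 -d e1000g3 1

# plumb the aggregation on the private inter-server network
ifconfig aggr1 plumb 192.168.10.11 netmask 255.255.255.0 up

# e1000g0 stays as the external/management interface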


Most of the pieces (except maybe an additional network switch and Ethernet cables) are already part of the 10-server bundle, which really makes it easier to put together and manage. Also, apart from the network cables for inter-server communication and the power cables, there are no messy storage cables in this setup, which makes it elegant to look at in the data center.


Now I need someone to read this entry and try it out. :-)