Wednesday, January 09, 2008

Multi-cores, Operating Systems, Database, Applications and IT Projects

As days go by, more and more multi-core systems are popping up everywhere. In fact, with the advent of the new 4-socket quad-core Sun Fire X4450, 16-core systems are fast becoming common out there.


However, personally, I think these systems are still underutilized relative to their true potential. Of course virtualization is a way of increasing utilization, but that is like working around the symptoms and not fixing the real problem. Software applications have fundamentally lagged behind microprocessor innovations. Operating systems have lagged behind too. Solaris, however advanced an operating system it is, still has a lot to achieve in this area. For example, yes, the kernel is multi-threaded, and yes, it can easily scale to hundreds of cores, but that scaling is generally achieved by creating copies of the process (or multiple connections, or multiple threads, however you look at it) at the APPLICATION level. One area that generally falls behind is its own utility commands, for example tar, cp, compress (or pick your favorite /usr/bin command). These utility programs will generally end up using only one core or virtual CPU. Now Solaris does provide a framework API for multi-threaded programming, but it is still surprising to me that not many people are asking for versions of the basic utilities that can use the available resources, or treating that as a big priority. I think there is a significant loss of productivity waiting for utilities to complete while the system is practically running idle loops on the rest of the cores.
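
To make the point about utilities concrete, here is a minimal sketch in Python of what a multi-core-aware "compress" could look like: cut the input into independent chunks and compress each chunk on its own worker. This is only an illustration under my own assumptions, not how the Solaris compress utility works. The names split_chunks, parallel_compress and parallel_decompress and the 1 MiB chunk size are made up for the example, and the output is a list of independently compressed chunks rather than a file format any existing tool understands.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per work item (arbitrary choice for this sketch)


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the input into independent fixed-size pieces."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_compress(data, workers=None):
    """Compress every chunk on its own worker thread.

    CPython's zlib releases the GIL while compressing, so the threads
    can actually run on different cores.
    """
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the input chunks.
        return list(pool.map(zlib.compress, split_chunks(data)))


def parallel_decompress(chunks):
    """Inflate every chunk in parallel and join the pieces in order."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(zlib.decompress, chunks))
```

The trade-off is that each chunk is compressed without knowledge of the others, so the compression ratio may be slightly worse than a single serial pass, in exchange for keeping more than one core busy.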


A database is an application of an operating system. A few commercial databases have handled the challenge of adapting to multi-core systems well. Even open source databases are not far behind, as seen by the multi-threaded nature of MySQL and the scaling of PostgreSQL on multi-core systems. However, an area where even PostgreSQL lacks support for multi-core systems is "utilities". Load, index creation, backup and restore are to me all utilities of a database system, and when they run they generally have a lot of eyes watching and waiting for them to complete. Now a lot can be done by having multiple connections: break up the work and run the bits that can be done in parallel over different connections. But again, to me that looks like the buck is conveniently passed to the APPLICATION to take advantage of these features.
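
The "multiple connections" workaround looks roughly like the sketch below: the application splits a backup by table and exports each table over its own connection. I am using Python's built-in sqlite3 module here purely as a stand-in so the example runs anywhere; it is not PostgreSQL or MySQL, and how much real parallelism you get depends on the database engine. The functions build_demo_db, dump_table and parallel_dump and the table names are hypothetical.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def build_demo_db(db_path):
    """Create a tiny two-table database to have something to back up."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
        conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)",
                         [(1, 10.0), (2, 20.5)])
        conn.execute("INSERT INTO customers VALUES (1, 'ann')")
    conn.close()


def dump_table(db_path, table):
    """Export one table over its own connection (one connection per worker)."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names come from our own list, never from user input.
        rows = conn.execute(
            f'SELECT * FROM "{table}" ORDER BY rowid').fetchall()
    finally:
        conn.close()
    return table, rows


def parallel_dump(db_path, tables, workers=4):
    """Split the backup by table and run the pieces concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda t: dump_table(db_path, t), tables))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    build_demo_db(path)
    print(parallel_dump(path, ["orders", "customers"]))
```

Notice who is doing the planning here: the application has to know which tables exist and how to split the work, which is exactly the buck-passing I am complaining about.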


Similarly, as the buck passes from one SERVICE PROVIDER to its APPLICATION, it picks up baggage along the way, which eventually leaves the "eyes" of the end users waiting a long time to see their "simple" tasks finish on systems boasting 16x the compute potential. What is the result? Loss of productivity and efficiency, wasted power running idle threads, etc. Users then generally "Curse" and "Yell" just to kill time while waiting for mundane tasks to finish on these multi-core systems.


Now if you look at each core as a resource (like an IT software programmer), you have 16 programmers available to crunch code (or numbers, in the case of cores). To an IT Project Manager, budget is what generally limits the number of programmers. But when you have already paid for and got 16 programmers, a good IT Project Manager will try to utilize them in the best possible way. After all, the buck stops at the IT Project Manager's desk.


The question is what the best possible way is to utilize the 16 programmers to do a task. I think a good project manager will not call in all 16 programmers and tell them to do all 'n' programming tasks, asking every one of them to assign 1/n th of their time to each task. That just creates chaos. Why? Well, the assumptions behind it are that the "skills" of all the programmers are the same, that all of them will take the same time to finish a particular task, and that having all of them work on all the tasks achieves the shortest time to complete everything. Now any good IT Project Manager will tell you that all of these assumptions are wrong. Well then, why in the world do we assume that in multi-core systems?


So what are we missing in "Software"? Yep, you got it: we need an equivalent of an IT Project Manager to solve the chaos of multi-core systems. An IT Project Manager which understands the "Experience" and potential of each of its compute resources (programmers). But wait, Solaris engineers are already saying we have that Project Manager; it is called the Solaris Scheduler. Duh.. then what is wrong? Think about it again from an IT Project Manager's viewpoint. An IT Project Manager takes input from an IT Director or Sr Manager regarding the priorities of tasks and then works with a Senior IT Architect to break up the project into a series of "Serial" and "Parallel" tasks, assuming it has 16 skilled programmers, with the goal of finishing the "Highest" priority projects first. The way most operating system schedulers currently work is to just find an available compute resource on which to continue computing the "Serial" task, with no intelligence about how the task can be broken into sets of "serial" and "parallel" components. So there is a difference between a Scheduler and a Project Manager.
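
Here is a small Python sketch of the difference. A "project manager" is handed a plan that says which tasks depend on which, runs everything that is ready in parallel, and keeps the truly serial steps in order. This is my own toy illustration, not how the Solaris scheduler or any real scheduler works; run_plan and its tasks and deps arguments are hypothetical names.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_plan(tasks, deps, workers=4):
    """Run the callables in `tasks`, honouring `deps`.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of task names that must finish first
    Returns the task names in the order they completed.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    running = {}   # future -> task name
    finished = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while remaining or running:
            # Everything with no unfinished prerequisite can start now,
            # in parallel with whatever is already running.
            ready = [n for n, pending in remaining.items() if not pending]
            for name in ready:
                del remaining[name]
                running[pool.submit(tasks[name])] = name
            if not running:
                raise ValueError("dependency cycle or unknown prerequisite")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                future.result()  # re-raise any exception from the task
                finished.append(name)
                for pending in remaining.values():
                    pending.discard(name)
    return finished
```

For example, with an "unpack" step, two index builds that each depend only on "unpack", and a "report" step that depends on both indexes, the two index builds can run side by side while "unpack" and "report" stay serial. A plain scheduler never sees that plan; it only sees whichever threads the application already chose to create.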


But then the question comes: can one Project Manager be intelligent about everything in the IT department? Maybe not. Maybe that's why there is a hierarchy where our Project Manager banks on Project Manager Assistants who have knowledge of particular tasks, so that they can help our Project Manager create a chart for specific tasks not known to it. Which means that every application now needs to provide a "Project Manager Assistant" to give guidance to the Operating System Project Manager on how best to run the tasks of the application. Currently, except for the "nice" priority bribes to the operating system scheduler, an application does not provide much feedback on how it thinks it can be executed optimally on multi-core systems.


Come to think of it, these IT Project Managers do have a role to play in solving engineering problems.


Now the onus is on the various operating system, database and application architects to provide a framework through which applications can supply their own application project manager input to the Operating System Project Manager. Maybe we should call it Project "Project Manager". :-)


In the meantime, what we can all do is start thinking about ways to convert at least the "utilities" that we own to utilize the multi-core resources on the system. Change software to adapt to multi-core, one utility at a time.


 

