Previously I talked about tuning Solaris 10 and improving PostgreSQL performance on Solaris 10.
This time it is about "changing the game" with PostgreSQL Code itself.
In a particular workload that I was running against PostgreSQL 8.0.x snapshot, I found that when I am loading a table using the "COPY" function it is very slow. Now slow is a relative term. The way I define it... I know my hardware can write faster than this... there's gotta be bottleneck somewhere.
Using my favorite syscall.d script I figured I am spending too much time in my writes. Using sys1call.d I found that the code path of that long write calls typically happens more from the WAL Logs of PostgreSQL. Hence I finally did the thing that I try to avoid... Browsing throught the source code I finally started changing the default values and rebuilding PostgreSQL and re-ran my workload against it. The top improvement came from the following
#define XLOG_SEG_SIZE (16*1024*1024)
in $SNAPSHOT/src/include/pg_config_manual.h to
#define XLOG_SEG_SIZE (256*1024*1024)
This seem to improve 3X my load performance. (Unfortunately it also forces to re- "initdb" the database.) But hey it is a step forward in the right direction.
I don't have the right explaination for the improvement but atleast I can continue to my next bottleneck.
I reran my favorite syscall.d and found while write is still one of the top, the real time spent is in semsys. Welcome to the world of bottlenecks.. :-)
I re-ran sys1call.d to dump the stack during semsys and it all came from semaphore lockings, unlockings and trying-to-lock functions in PostgreSQL.
Aargg.... more source code to browse..