Friday, May 20, 2005

Improving PostgreSQL performance on Solaris 10

Previously I talked about tuning Solaris 10 and improving PostgreSQL performance on Solaris 10.
This time it is about "changing the game" with PostgreSQL Code itself.

In a particular workload that I was running against PostgreSQL 8.0.x snapshot, I found that when I am loading a table using the "COPY" function it is very slow. Now slow is a relative term. The way I define it... I know my hardware can write faster than this... there's gotta be bottleneck somewhere.
Using my favorite syscall.d script I figured I am spending too much time in my writes. Using sys1call.d I found that the code path of that long write calls typically happens more from the WAL Logs of PostgreSQL. Hence I finally did the thing that I try to avoid... Browsing throught the source code I finally started changing the default values and rebuilding PostgreSQL and re-ran my workload against it. The top improvement came from the following

I changed

#define XLOG_SEG_SIZE (16*1024*1024)

in $SNAPSHOT/src/include/pg_config_manual.h to

#define XLOG_SEG_SIZE (256*1024*1024)

This seem to improve 3X my load performance. (Unfortunately it also forces to re- "initdb" the database.) But hey it is a step forward in the right direction.

I don't have the right explaination for the improvement but atleast I can continue to my next bottleneck.

I reran my favorite syscall.d and found while write is still one of the top, the real time spent is in semsys. Welcome to the world of bottlenecks.. :-)

I re-ran sys1call.d to dump the stack during semsys and it all came from semaphore lockings, unlockings and trying-to-lock functions in PostgreSQL.

Aargg.... more source code to browse..


Post a Comment