Monday, October 04, 2010

Postgres 9 and sysbench 0.4.12

Ever since the release of Postgres 9 I had been hoping for a chance to do some performance tests on it, and I finally got around to trying it out. This time I ran Postgres in a virtual machine on top of VMware's vSphere Infrastructure 4.1.


Since there were plenty of sysbench numbers available for MySQL, I thought it was time to try it with sysbench. The version I tried was sysbench 0.4.12. Compiling sysbench 0.4.12 for PostgreSQL was itself such a challenge that it deserves a separate blog post, but for now let's start at the point where I had a working sysbench binary, which I ran from a separate virtual machine against my Postgres database server.

The first step of a sysbench OLTP run is the "prepare" step, where the benchmark tool creates the table and loads rows into it. I had seen many MySQL tests running with 1 million rows, and some with 10 million.

So anyway, I started with 1 million rows and similar options. The load started and went on and on for a while (I should have taken a lunch break instead of waiting for it to finish). It finally finished after some 20-odd minutes. That struck me as odd, so I downloaded the MySQL 5.5 RC binaries, created a MySQL database, compiled sysbench to run against MySQL, and started the same "prepare" step with 1 million rows.

The "prepare" step finished for MySQL in an amazing 30 seconds or so. My first reaction was that MySQL is much faster than Postgres. As a Postgres lover I found that hard to believe; there had to be some other problem. I added the --debug flag to sysbench and saw the INSERT statements it issued to MySQL. When I used the same debug flag with the Postgres version, it did not show me any of the statements it used, so that was no help.

Anyway, Postgres was running on a build of the late OpenSolaris, so I used the query_time_breakdown.d script from the pgfoundry DTrace scripts to see the queries being used to insert rows. I noticed that sysbench was doing multi-row inserts against MySQL but single-row inserts against Postgres.

Then came the hard part: reading through the sysbench code, which took me a while, both to understand its layout and how it uses the database driver. To cut a long story short, I figured out that sysbench uses a "drv_caps_t" structure to record the capabilities of each database driver. The MySQL driver enables multi-row inserts, while the Postgres driver had multi-row support disabled.

Checking the Postgres documentation, I saw that multi-row inserts in a single INSERT statement are allowed.

So I modified sysbench/drivers/pgsql/drv_pgsql.c and changed

/* PgSQL driver capabilities */
static drv_caps_t pgsql_drv_caps =
{
  0,      /* multi-row INSERT support (disabled) */
  1,
  1,
  0,
  0,
  1,
  0,
  NULL
};

to the following

/* PgSQL driver capabilities */
static drv_caps_t pgsql_drv_caps =
{
  1,      /* multi-row INSERT support (enabled) */
  1,
  1,
  0,
  0,
  1,
  0,
  NULL
};


The result is that it now loads 1,000 rows with each INSERT statement (as it already did for MySQL).

As for the difference it makes to the "prepare" operation, before the change:
$ time ./sysbench --pgsql-user=pguser --pgsql-password=xxxxxx --pgsql-db=sbtest --pgsql-host=pgdb --db-driver=pgsql --oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off --oltp-test-mode=complex --max-requests=0 --init-rng --max-time=300 --num-threads=96 --test=oltp --debug prepare

sysbench 0.4.12: multi-threaded system evaluation benchmark
Creating table 'sbtest'...
NOTICE: CREATE TABLE will create implicit sequence "sbtest_id_seq" for serial column "sbtest.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "sbtest_pkey" for table "sbtest"
Creating 1000000 records in table 'sbtest'...


real 17m47.695s
user 0m0.328s
sys 0m0.700s

and after the change:

$ time ./sysbench --pgsql-user=pguser --pgsql-password=xxxxxx --pgsql-db=sbtest --pgsql-host=pgdb --db-driver=pgsql --oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off --oltp-test-mode=complex --max-requests=0 --init-rng --max-time=300 --num-threads=96 --test=oltp --debug prepare

sysbench 0.4.12: multi-threaded system evaluation benchmark
Creating table 'sbtest'...
NOTICE: CREATE TABLE will create implicit sequence "sbtest_id_seq" for serial column "sbtest.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "sbtest_pkey" for table "sbtest"
Creating 1000000 records in table 'sbtest'...

real 0m36.237s
user 0m0.180s
sys 0m0.064s


This looks much better and is quite acceptable. Now I can proceed to the real sysbench testing.
If somebody knows the maintainer of sysbench, please point him to this blog article so that drv_pgsql.c can be patched to match the MySQL driver.

5 comments:

Christopher Hotchkiss said...

I opened the following bug against sysbench (https://bugs.launchpad.net/sysbench/+bug/654832).

David Fetter said...

Filed another bug. We don't all use rollback segments, and anything that assumes we do will hose PostgreSQL. https://bugs.launchpad.net/sysbench/+bug/654868

Greg Smith said...

sysbench has never worked right for me with PostgreSQL running anything other than read-only tests. And I stopped trying to use 0.4.12 a while ago; something went backwards from 0.4.10 that causes it not to build right for me in a lot of places.

I think we're going to have to volunteer a PostgreSQL maintainer to the sysbench project to get this stuff fixed though. Probably going to be me I think.

Mark Callaghan said...

AFAIK, people who maintain sysbench (including me) are not PG experts so someone from PG-land needs to give us feedback.

With respect to the commit frequency during benchmark prepare, see https://answers.launchpad.net/sysbench/+question/128209

Anonymous said...

This was fixed by commit #71 in the source code repo at https://code.launchpad.net/~sysbench-developers/sysbench/0.4; I just confirmed that prepare time is in the ~30 second range now when using that version.