Thursday, November 18, 2010

sysbench, PostgreSQL 9.0 and OLTP complex read-write test

Continuing with my sysbench saga with PostgreSQL 9.0, I was generally not encountering any errors except recently. I found two differences on how I did execute this test in order to hit this problem. One is I used a relatively small number of rows (1 million) which forces the special distribution of sysbench clients to be smaller and a reasonably high number of threads (80) as follows

$ sysbench --db-driver=pgsql --oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off --oltp-test-mode=complex --max-requests=0 --max-time=300 --num-threads=64 --test=oltp run

sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 80

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
FATAL: query execution failed: 9490352
FATAL: database error, exiting...

And the test failed with error reported in pg_log:

ERROR:  duplicate key value violates unique constraint "sbtest_pkey"
DETAIL:  Key (id)=(500815) already exists.
STATEMENT:  INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')

The way sysbench works for the complex read-write test transaction, after doing some bunch of selects and updates, it deletes a row and inserts the same row back. So in a transaction logic it should not hit this error since it just deleted the row. This took me some time to recreate it in a way that I could understand what is happening. 

What's happening is the timing of the start of a query in transaction with respect to another transaction in flight
working with the same key-value row such that the other transaction just deleted the row and also inserted the same row back in the same transaction.

Lets consider two transactions A and B

Transaction A                      Transaction B
DELETE FROM sbtest WHERE id=500815;
(returns DELETE 1)
INSERT INTO sbtest values(500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
(returns INSERT 0 1)
                                   DELETE FROM sbtest WHERE id=500815;< ------- hangs/waits
                                   (returns DELETE 0 – returns success but doesn’t delete any rows . It doesn't roll back the transaction)
                                    INSERT INTO sbtest values(500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
                                    ERROR: duplicate key value violates unique constraint "sbtest_pkey"

The way MySQL-InnoDB handles it is slightly different. It actually deletes the new row inserted and hence always can do the INSERT successfully and thats why sysbench with MySQL - InnoDB never showed that problem.

If you read the documentation on PostgreSQL for READ COMMITTED Isolation level, it says: "In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run". So had the DELETE actually started after the END of the first transaction it would delete the new INSERTed row.I am not sure how other databases (ORACLE, DB2, etc) behave in this scenario. 

As one of my co-worker mentioned "Correctness is in the Implementation of Beholder", it is hard to say who is right or wrong..

Just another one of the minor differences between MySQL and PostgreSQL.
Post a Comment