Thursday, November 18, 2010

sysbench, PostgreSQL 9.0 and OLTP complex read-write test

Continuing with my sysbench saga with PostgreSQL 9.0, I was generally not encountering any errors except recently. I found two differences on how I did execute this test in order to hit this problem. One is I used a relatively small number of rows (1 million) which forces the special distribution of sysbench clients to be smaller and a reasonably high number of threads (80) as follows

$ sysbench --db-driver=pgsql --oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off --oltp-test-mode=complex --max-requests=0 --max-time=300 --num-threads=64 --test=oltp run

sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 80

Doing OLTP test.
Running mixed OLTP test
Using Special distribution (12 iterations,  1 pct of values are returned in 75 pct cases)
Using "BEGIN" for starting transactions
Using auto_inc on the id column
Threads started!
FATAL: query execution failed: 9490352
FATAL: database error, exiting...

And the test failed with error reported in pg_log:

ERROR:  duplicate key value violates unique constraint "sbtest_pkey"
DETAIL:  Key (id)=(500815) already exists.
STATEMENT:  INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')

The way sysbench works for the complex read-write test transaction, after doing some bunch of selects and updates, it deletes a row and inserts the same row back. So in a transaction logic it should not hit this error since it just deleted the row. This took me some time to recreate it in a way that I could understand what is happening. 

What's happening is the timing of the start of a query in transaction with respect to another transaction in flight
working with the same key-value row such that the other transaction just deleted the row and also inserted the same row back in the same transaction.

Lets consider two transactions A and B

Transaction A                      Transaction B
DELETE FROM sbtest WHERE id=500815;
(returns DELETE 1)
INSERT INTO sbtest values(500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
(returns INSERT 0 1)
                                   DELETE FROM sbtest WHERE id=500815;< ------- hangs/waits
                                   (returns DELETE 0 – returns success but doesn’t delete any rows . It doesn't roll back the transaction)
                                    INSERT INTO sbtest values(500815,0,'','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
                                    ERROR: duplicate key value violates unique constraint "sbtest_pkey"

The way MySQL-InnoDB handles it is slightly different. It actually deletes the new row inserted and hence always can do the INSERT successfully and thats why sysbench with MySQL - InnoDB never showed that problem.

If you read the documentation on PostgreSQL for READ COMMITTED Isolation level, it says: "In effect, a SELECT query sees a snapshot of the database as of the instant the query begins to run". So had the DELETE actually started after the END of the first transaction it would delete the new INSERTed row.I am not sure how other databases (ORACLE, DB2, etc) behave in this scenario. 

As one of my co-worker mentioned "Correctness is in the Implementation of Beholder", it is hard to say who is right or wrong..

Just another one of the minor differences between MySQL and PostgreSQL.


Anonymous said...

InnoDB's default isolation level is REPEATABLE READ. Changing it to READ COMMITTED will yield the same behavior as in PostgreSQL and vice-versa.

thomas said...

It is the differenz between locking (mysql variant) and transaction (postgres variant). postgres make transaction amd mysql make a lock handling.

Greg Smith said...

PostgreSQL doesn't support REPEATABLE READ. It accepts the request, but you get READ COMMITTED anyway. So it's not that easy to make this behavior go away.

This particular problem, by the way, is what made me give up on sysbench for PostgreSQL testing when I first came across it. I don't think the read/write OLTP tests ever worked correctly with it.

Denish Patel said...

Even Oracle doesn't support repeatable read :

Oracle and PostgreSQL SQL transaction isolation levels are same.

Jignesh Shah said...

Actually it uses LOCK/UNLOCK table only for MyISAM. For InnoDB I believe it still uses BEGIN; COMMIT/END statement equivalent.

Terence said...

How do I add Apache Ofbiz to Solaris 10 services? I appreciate if you can contribute the service manifest file.

Anonymous said...

Described transactions flow falls under definition of "phantom read", assuming that conflicting new row does not exist at the moment of delete, but exists at the moment of uniqueness check.

For only 2 conflicting transactions the conflict can be prevented using SELECT FOR UPDATE, but it seems it cannot be preventedfor more transactions.

Insert anomaly could be detected with transaction level serializable, but it makes one of conflicting transactions fail. And it seems sysbench is not ready for this.