|Author||Topic: pgdbe and high contention|
|Gian Vitale Da Rin||pgdbe and high contention|
on Thu, 09 Mar 2023 11:07:20 +0100
Hi all, I've been tasked with the creation of a few tests of PGDBE with the ISAM emulation as a proof-of-concept of the porting of my employer's 30-plus years old Clipper/Xbase++ software to Postgres. So, as one of the first tests, I wrote a simple program that decrements a field on a table (which I've upsized from the original .DBF/.CDX file) from two different threads using two different dac sessions. The test is really simple, after opening a dacSession() to a postgres 10 server it uses dbUseArea( , oSession, ... ) to open the table on the server dbUseArea( .T., oSession, "maga", "ma", .T. ) dbSetIndex( "maga" ) ma->( dbSetOrder( 1 ) ) and then executes this loop for a thousand times. for n := 1 to 1000 if ma->( dbSeek( "0RQ1WB" ) ) while ! ma->( dbRLock() ) enddo ma->C141 := ma->C141 - 1 ma->( dbUnlock() ) else Alert( "error in seek " + oSession:getLastMessage() ) EXIT endif next Before starting the two concurrent threads C141 is given the value 2000 so, at the end of the test, I should find that C141 has a value of zero, but I find different (greater) values instead. In several tests I ended up with values ranging from 990 to 1000. This same test, run on the original .DBF file, ends with C141 having the correct value. I've used several different builds of Xbase++, up to the lastest 1741, and all of them show the same behaviour. After reading all of the messages here and a lot of the ones on www.xbaseforum.de related to postgres I decided to add logging to the dac session via a oSession:SetProperty( PGDIC_TRACE_COMMUNICATION, ... ) Analyzing the log it seems to me that PGDBE is doing something strange, in particular: 1) it makes a SELECT for the dbSeek() To backend> Msg Q To backend> "SELECT * FROM maga AS "ma" WHERE __order_maga_c134 LIKE '0RQ1WB%' and (__deleted=false) ORDER BY __order_maga_c134 LIMIT 1" To backend> Msg complete, length 130 2) it then acquires an advisory lock for the record 818 (the correct one) trying several times before getting one, which should be the "while ! RLock() enddo" loop in my test code To backend> Msg Q To backend> "SELECT pg_try_advisory_lock(36,818)" To backend> Msg complete, length 41 3) it then updates the __lock_owner field on the table To backend> Msg Q To backend> "UPDATE "maga" SET __lock_owner=664, __keyversion=__keyversion+1 WHERE __record=818 and __lock_owner=0" To backend> Msg complete, length 107 4) finally it executes a plan to update field C141 To backend> Msg P To backend> "plan9-t5988-n795" To backend> "UPDATE "maga" SET __DELETED = $1, __KEYVERSION = __KEYVERSION+1 , c141=$2, __order_maga_c134=$3, __order_maga_c111=$4, __order_maga_c113=$5, __order_maga_c122=$6, __order_maga_atc=$7, __order_maga_c121=$8, __order_maga_c115=$9 WHERE __record=$10" This behaviour would explain the wrong result I'm getting, given that PGDBE first reads the content of the record in 1), then locks it in 2) and 3) and then, finally, updates it using the value that it got in 1). Obviously the value has, most of the times, just been changed by the other thread before it gets to point 4). In fact, adding a second ma->( dbSeek( "0RQ1WB" ) ) right in between the RLock() loop and the decrement instruction is enough to fix the problem and the test completes with the correct value inside C141. By the way, a dbSkip( 0 ) instead of the second dbSeek() is not enough to refresh the value of C141. Either I'm missing something relative to PGDBE or this behaviour differs from the one of the other ISAM DBEs and PGDBE cannot be used to write on shared tables, but in this case, how are some of you being able to use it in production? Best regards. Maurilio Longo.