2025 年 2 月 14 日 | 留言

High Concurrent Transaction 與 PostgreSQL Notification

最近工作上遇到高流量系統在進行壓力測試時資料庫會卡住，pg_stat_activity出現很多transaction卡在commit操作，顯示的wait_event_type是Lock，而wait_event是object

據手冊的說法，這樣的情況是：「Waiting to acquire a lock on a non-relation database object.」

app又沒有去操作資料庫本體，上網查也查不到相關的資訊，手冊只有很籠統地說跟資料庫內部有關，到底是怎麼回事？

查了pg_locks沒有看到任何和object lock有關的東西，只看得出是transaction正在commit時卡住，沒辦法我只好把postgres的原始碼載下來用人工的方式反查

先確定定義object lock的地方，找到他的enum名稱，接著依照caller stack整理出所有會產生object lock的code

最後篩選出一個和前面提到的「transaction正在commit」有關，就是PreCommit_Notify()

有在用PostgreSQL的朋友可能知道他有Notify/Listen的功能，可以很方便地利用資料庫滿足pub/sub的需求，還可以直接設定trigger在資料庫有特定操作時自動發送訊息到正在listen的client去

而PreCommit_Notify()的功能就是在commit前將要傳送的訊息放進notification queue裡

剛好我們的資料庫有使用到notify，而且更剛好卡住的transaction會發送notify，那問題已經很明顯了，就是發送了notify導致卡住

至於詳細的原因在程式碼裡面有註解(把商業決策寫下是很棒的好習慣)

/*
 * Serialize writers by acquiring a special lock that we hold till
 * after commit.  This ensures that queue entries appear in commit
 * order, and in particular that there are never uncommitted queue
 * entries ahead of committed ones, so an uncommitted transaction
 * can't block delivery of deliverable notifications.
 *
 * We use a heavyweight lock so that it'll automatically be released
 * after either commit or abort.  This also allows deadlocks to be
 * detected, though really a deadlock shouldn't be possible here.
 *
 * The lock is on "database 0", which is pretty ugly but it doesn't
 * seem worth inventing a special locktag category just for this.
 * (Historical note: before PG 9.0, a similar lock on "database 0" was
 * used by the flatfiles mechanism.)
 */
LockSharedObject(DatabaseRelationId, InvalidOid, 0, AccessExclusiveLock);

大意是為了確保進到queue的notification有成功commit且遵循commit完成的順序，如果transaction有發送notify就會對一個不存在的database上AccessExclusiveLock直到該commit完成

因為這個「不存在的database」是寫死id的，所以不論你正在對哪一個資料庫操作，只要transaction有發送notify通通都會受影響

如果你有大量transaction同時操作的使用場景，最好避免在裡面送notify，避免整個資料庫卡住，嚴重影響效能

ref: https://github.com/postgres/postgres/blob/REL_15_7/src/backend/commands/async.c#L956

ATI的軟體&網管技術誌

奇怪的網站

High Concurrent Transaction 與 PostgreSQL Notification

留言

取消回覆