-
Notifications
You must be signed in to change notification settings - Fork 5k
Description
Summary
When syncing streams configured as syncMode=full_refresh with an empty cursorField=[], the Postgres source connector's resumable full refresh code path generates invalid SQL that causes the error: ERROR: column "null" does not exist.
Connection Details
- Connection URL: https://cloud.airbyte.com/workspaces/4835be10-184d-4b06-8cc1-730e4d6b76a0/connections/1cd35c07-9e42-4490-9c77-0f9bcff9db4b/timeline
- Affected streams:
airtravel.user_searches,airtravel.tracked_flight_routes,airtravel.flight_search_request
Error Message
java.lang.RuntimeException: org.postgresql.util.PSQLException: ERROR: column "null" does not exist
Position: 10
at io.airbyte.integrations.source.postgres.PostgresQueryUtils.lambda$getCursorBasedSyncStatusForStreams$4(PostgresQueryUtils.java:192)
Root Cause Analysis
Show/Hide Detailed Report
Configuration
The failing streams are configured as:
syncMode=full_refresh, cursorField=[], destinationSyncMode=overwrite
The logs show these streams are being processed through the resumable full refresh path:
will sync in resumeable full refresh mode
Code Path
- In
CursorManager.kt(lines 183-189), when the catalog has acursorFieldlist that is present but empty, the code extractsnull:
cursorField =
streamOptional
.map { obj: ConfiguredAirbyteStream -> obj.cursorField }
.flatMap { f: List<String> ->
if (f.size > 0) Optional.of(f[0]) else Optional.empty()
}
.orElse(null)- This results in the log message:
No cursor field set in catalog but not present in state. Stream: airtravel_user_searches, New Cursor Field: null. Resetting cursor value
-
In
PostgresSource.getIncrementalIterators(line 636-637),getCursorBasedSyncStatusForStreamsis called with these streams. -
In
PostgresQueryUtils.getCursorBasedSyncStatusForStreams(lines 166-175), the code builds a SQL query using the null cursor field:
final String cursorField = cursorInfoOptional.get().getCursorField();
final String cursorBasedSyncStatusQuery = String.format(MAX_CURSOR_VALUE_QUERY,
cursorField, // This is null!
fullTableName,
cursorField,
cursorField,
fullTableName);- With
MAX_CURSOR_VALUE_QUERY = SELECT "%s" FROM %s WHERE "%s" = (SELECT MAX("%s") FROM %s);, whencursorFieldisnull, the query becomes:
SELECT "null" FROM "airtravel"."user_searches" WHERE "null" = (SELECT MAX("null") FROM "airtravel"."user_searches");- PostgreSQL interprets
"null"as a column identifier (not the NULL keyword), hence the error.
Related Issues
- Empty table causing a failure in cursor based ctid #28262 - Empty table causing a failure in cursor based ctid (closed)
- ctid with cursor: Sync containing both incremental and full refresh streams will fail to sync #28363 - ctid with cursor: Sync containing both incremental and full refresh streams will fail to sync (closed)
This appears to be a similar class of bug where streams without valid cursor fields are incorrectly processed through the cursor-based code path.
Suggested Fix
The code should either:
- Filter out streams without valid cursor fields from being processed by
getCursorBasedSyncStatusForStreams - Add a null check in
getCursorBasedSyncStatusForStreamsto skip streams with null cursor fields - Throw a more descriptive error message instead of generating invalid SQL
Workaround
Users can work around this by explicitly setting a cursor column (e.g., created_at or updated_at) for the affected streams and switching them to incremental sync mode.
Reported by [email protected] via Devin session: https://app.devin.ai/sessions/099414955b86437ea8cc19c38d0522b2