Skip to content

bug(source-postgres): Resumable full refresh generates invalid SQL with null cursor field #70356

@devin-ai-integration

Description

@devin-ai-integration

Summary

When syncing streams configured as syncMode=full_refresh with an empty cursorField=[], the Postgres source connector's resumable full refresh code path generates invalid SQL that causes the error: ERROR: column "null" does not exist.

Connection Details

Error Message

java.lang.RuntimeException: org.postgresql.util.PSQLException: ERROR: column "null" does not exist
  Position: 10
    at io.airbyte.integrations.source.postgres.PostgresQueryUtils.lambda$getCursorBasedSyncStatusForStreams$4(PostgresQueryUtils.java:192)

Root Cause Analysis

Show/Hide Detailed Report

Configuration

The failing streams are configured as:

syncMode=full_refresh, cursorField=[], destinationSyncMode=overwrite

The logs show these streams are being processed through the resumable full refresh path:

will sync in resumeable full refresh mode

Code Path

  1. In CursorManager.kt (lines 183-189), when the catalog has a cursorField list that is present but empty, the code extracts null:
cursorField =
    streamOptional
        .map { obj: ConfiguredAirbyteStream -> obj.cursorField }
        .flatMap { f: List<String> ->
            if (f.size > 0) Optional.of(f[0]) else Optional.empty()
        }
        .orElse(null)
  1. This results in the log message:
No cursor field set in catalog but not present in state. Stream: airtravel_user_searches, New Cursor Field: null. Resetting cursor value
  1. In PostgresSource.getIncrementalIterators (line 636-637), getCursorBasedSyncStatusForStreams is called with these streams.

  2. In PostgresQueryUtils.getCursorBasedSyncStatusForStreams (lines 166-175), the code builds a SQL query using the null cursor field:

final String cursorField = cursorInfoOptional.get().getCursorField();
final String cursorBasedSyncStatusQuery = String.format(MAX_CURSOR_VALUE_QUERY,
    cursorField,  // This is null!
    fullTableName,
    cursorField,
    cursorField,
    fullTableName);
  1. With MAX_CURSOR_VALUE_QUERY = SELECT "%s" FROM %s WHERE "%s" = (SELECT MAX("%s") FROM %s);, when cursorField is null, the query becomes:
SELECT "null" FROM "airtravel"."user_searches" WHERE "null" = (SELECT MAX("null") FROM "airtravel"."user_searches");
  1. PostgreSQL interprets "null" as a column identifier (not the NULL keyword), hence the error.

Related Issues

This appears to be a similar class of bug where streams without valid cursor fields are incorrectly processed through the cursor-based code path.

Suggested Fix

The code should either:

  1. Filter out streams without valid cursor fields from being processed by getCursorBasedSyncStatusForStreams
  2. Add a null check in getCursorBasedSyncStatusForStreams to skip streams with null cursor fields
  3. Throw a more descriptive error message instead of generating invalid SQL

Workaround

Users can work around this by explicitly setting a cursor column (e.g., created_at or updated_at) for the affected streams and switching them to incremental sync mode.


Reported by [email protected] via Devin session: https://app.devin.ai/sessions/099414955b86437ea8cc19c38d0522b2

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions