Skip to content

Conversation

@wildpea
Copy link
Contributor

@wildpea wildpea commented Jul 10, 2025

Purpose of this pull request

In mergeCatalogTable, if not schemaIncludeAllColumns, column comment information will be lost.
This modification aims to preserve comments in such scenarios

Does this PR introduce any user-facing change?

How was this patch tested?

JdbcCatalogUtilsTest::testColumnNotIncludeMerge
tableOfQuery add comment to each column, and expect reserved in mergeTable

Check list

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR ensures that column comments are retained when merging catalog tables if the schema doesn’t include all columns.

  • Introduces a transformation in mergeCatalogTable to wrap query columns with PhysicalColumn and carry over existing comments from the path schema when types match.
  • Updates unit tests to assert that comments are preserved for each merged column.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/utils/JdbcCatalogUtils.java Added streaming logic to map and preserve column comments during schema merge
seatunnel-connectors-v2/connector-jdbc/src/test/java/org/apache/seatunnel/connectors/seatunnel/jdbc/utils/JdbcCatalogUtilsTest.java Updated test cases to include expected comment values in merged columns
Comments suppressed due to low confidence (1)

seatunnel-connectors-v2/connector-jdbc/src/main/java/org/apache/seatunnel/connectors/seatunnel/jdbc/utils/JdbcCatalogUtils.java:364

  • [nitpick] The variable name columnsWithComment is descriptive but could be clearer—consider renaming to columnsWithPreservedComments or mergedColumns to better convey its role.
                                .columns(columnsWithComment)

tableOfPath.getTableId().getDatabaseName(),
tableOfPath.getTableId().getSchemaName(),
tableOfPath.getTableId().getTableName());
List<Column> columnsWithComment =

This comment was marked as off-topic.

Comment on lines +333 to +335
columnsOfQuery
.get(column.getName())
.getDataType()
Copy link

Copilot AI Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comparison looks up columnsOfQuery.get(column.getName()), but column is already from that source. You can simplify by calling column.getDataType().getSqlType() directly to avoid the redundant map lookup.

Suggested change
columnsOfQuery
.get(column.getName())
.getDataType()
column.getDataType()

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@davidzollo davidzollo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 if CI passed

Copy link

@mengnankkkk mengnankkkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

@mengnankkkk mengnankkkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

代码审查意见(PR #9559: [improve][Connector-jdbc] add comments when schema not include all columns)

优点:

  • 该 PR 有效解决了 schema 未包含所有列时,字段注释丢失的问题,提升了 JDBC connector 的易用性和一致性。
  • 代码实现思路清晰,逻辑简洁,易于维护。
  • 单元测试已同步完善,覆盖了注释保留的场景。

建议与潜在问题:

  1. 代码简化建议:
    • 在类型判断和注释拷贝时,columnsOfQuery.get(column.getName()).getDataType() 实际上可以直接用 column.getDataType(),避免重复 map 查询,提升可读性。
  2. 空指针安全:
    • 请确保所有 map 查询(如 columnsOfPath.get(column.getName()))都做了 null 检查,避免因 schema 差异导致 NPE。
  3. 注释说明:
    • 建议在代码中补充注释,说明“仅当类型一致时才拷贝注释”的设计原因,便于后续维护者理解。
  4. 测试覆盖:
    • 建议补充负向测试,如类型不一致或 path schema 缺失时注释不应被拷贝,提升健壮性。

结论:
本次改动对 JDBC connector 体验有明显提升。若能进一步优化代码细节和测试覆盖,将更加健壮。感谢贡献!

Copy link
Member

@Hisoka-X Hisoka-X left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wildpea

@Hisoka-X Hisoka-X changed the title [improve][Connector-jdbc] add comments when schema not include all columns [Improve][Connector-jdbc] add comments when schema not include all columns Jul 14, 2025
@Hisoka-X Hisoka-X merged commit 02d2b69 into apache:dev Jul 14, 2025
13 checks passed
chncaesar pushed a commit to chncaesar/seatunnel that referenced this pull request Jul 19, 2025
dybyte pushed a commit to dybyte/seatunnel that referenced this pull request Jul 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants