hailin0 (Member) commented Apr 29, 2025

Purpose of this pull request

Does this PR introduce any user-facing change?

How was this patch tested?

Check list

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR fixes an issue where StarRocks batch data could exceed the maximum allowed limit. The fix adds validation to StarRocksStreamLoadVisitor. Key changes include:

  • Adding tests to verify that exceptions are thrown when batch values exceed limits for both CSV and JSON formats.
  • Introducing a dedicated checkBatchMaxBytes method in StarRocksStreamLoadVisitor to validate batch sizes.
  • Adjusting the joinRows method to utilize the updated batch byte validations.
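The kind of batch-size validation described above can be sketched as follows. This is a minimal illustration, not the connector's actual code: the class `BatchSizeGuard`, its fields, the exception type, and the exact rule (reject a row when the accumulated size would pass the limit) are all assumptions. Only the method name `checkBatchMaxBytes` and the `batch_max_bytes` option name come from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: BatchSizeGuard is an illustrative name, not a class in the connector.
final class BatchSizeGuard {
    // Assumed to mirror the batch_max_bytes option.
    private final long batchMaxBytes;
    private final List<byte[]> rows = new ArrayList<>();
    // Tracked as long so that summing many row sizes cannot overflow an int.
    private long bufferedBytes = 0L;

    BatchSizeGuard(long batchMaxBytes) {
        this.batchMaxBytes = batchMaxBytes;
    }

    // Buffers one serialized row, validating the batch size first.
    void add(String row) {
        byte[] bytes = row.getBytes(StandardCharsets.UTF_8);
        checkBatchMaxBytes(bufferedBytes + bytes.length);
        rows.add(bytes);
        bufferedBytes += bytes.length;
    }

    // Assumed rule: fail fast when the batch would exceed the configured limit.
    void checkBatchMaxBytes(long candidateBytes) {
        if (candidateBytes > batchMaxBytes) {
            throw new IllegalStateException(
                    "Batch size " + candidateBytes + " bytes exceeds the limit of "
                            + batchMaxBytes + " bytes, please reset the batch_max_bytes.");
        }
    }

    long bufferedBytes() {
        return bufferedBytes;
    }

    int rowCount() {
        return rows.size();
    }
}
```

Because the check runs before the row is stored, a rejected row leaves the buffered state unchanged, so a caller could flush the batch and retry.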

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| seatunnel-connectors-v2/connector-starrocks/src/test/java/org/apache/seatunnel/connectors/seatunnel/starrocks/client/StarRocksStreamLoadVisitorTest.java | Added tests to confirm proper exception behavior based on batch limits and formats. |
| seatunnel-connectors-v2/connector-starrocks/src/main/java/org/apache/seatunnel/connectors/seatunnel/starrocks/client/StarRocksStreamLoadVisitor.java | Updated batch validation logic and modified joinRows to use long for total bytes with proper checks. |

Comments suppressed due to low confidence (2)

seatunnel-connectors-v2/connector-starrocks/src/main/java/org/apache/seatunnel/connectors/seatunnel/starrocks/client/StarRocksStreamLoadVisitor.java:329

  • [nitpick] Consider merging the string concatenation into a single string literal to improve the clarity of the error message.
            + "please reset the batch_max_bytes.",

seatunnel-connectors-v2/connector-starrocks/src/main/java/org/apache/seatunnel/connectors/seatunnel/starrocks/client/StarRocksStreamLoadVisitor.java:154

  • [nitpick] Consider renaming the parameter 'totalBytes' to a more descriptive name like 'cumulativeBytes' to clearly indicate that it represents the total size of all rows.
private byte[] joinRows(List<byte[]> rows, Long totalBytes) {
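
To show why a `long` total matters in a method shaped like the one above, here is a rough sketch of a delimiter-based join. It is not the connector's implementation: the class `RowJoiner`, the `delimiter` parameter, the primitive `long` parameter, and the exception type are assumptions, and the parameter uses the `cumulativeBytes` name suggested in the nitpick.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical sketch: RowJoiner and the delimiter parameter are illustrative only.
final class RowJoiner {
    // Joins rows into one buffer, appending the delimiter after each row.
    // cumulativeBytes must equal the summed length of all rows.
    static byte[] joinRows(List<byte[]> rows, long cumulativeBytes, byte[] delimiter) {
        // Compute the size in long arithmetic so the sum cannot wrap around.
        long required = cumulativeBytes + (long) rows.size() * delimiter.length;
        if (required > Integer.MAX_VALUE) {
            // A Java byte[] cannot hold more than Integer.MAX_VALUE elements.
            throw new IllegalArgumentException(
                    "Joined batch would need " + required + " bytes, which exceeds the array limit");
        }
        ByteBuffer buffer = ByteBuffer.allocate((int) required);
        for (byte[] row : rows) {
            buffer.put(row);
            buffer.put(delimiter);
        }
        return buffer.array();
    }
}
```

With an `int` total, the same size computation could silently wrap to a negative or too-small value for very large batches. Doing the arithmetic in `long` and checking before the cast turns that case into an explicit error.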

corgy-w merged commit 84634a4 into apache:dev on Apr 30, 2025
6 checks passed
3 participants