Skip to content

Conversation

@tjbanghart
Copy link
Contributor

@tjbanghart tjbanghart commented Dec 2, 2025

CALCITE-7254

This is a first pass, but I’m not yet convinced it’s the right approach. We may need to gather potential shared sub-expressions in a dedicated context and defer the rewrite until later. At the moment, the suggester identifies candidate spools and performs the replacement directly within the rule, which may unintentionally prevent the optimizer from exploring cheaper plans.

Likewise, certain rules may prevent shareable components from being recognized. For example, filter pushdown can eliminate a join tree that would otherwise be reusable.

@tjbanghart tjbanghart force-pushed the 7254 branch 2 times, most recently from b60609a to 25e3ae0 Compare December 2, 2025 20:02
@tjbanghart tjbanghart requested a review from xiedeyantu December 2, 2025 20:38
// Combine both queries
return builder.combine(2).build();
})
.returnsUnordered(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that AssertQuery.returnsUnordered(String... lines) can return multiple lines. In JdbcTest, some examples use the format {column_name}={value}, such as returnsUnordered("DID2=1", "DID2=2"). I'm not sure if the current QUERY_0=xxx format is well-suited for future Quidem tests. Assuming that in the future I could use a setting like set enable_combine = true to allow multiple SQL statements to execute simultaneously, would we still want to obtain two separate result sets—just like when the two SQL statements are executed separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good questions, I've updated the PR so Combine now returns one row per input query. Each row holds an array of maps, where each map represents a row from that query with column names as keys.

We are constrained by JDBC's limitations of returning a single result set per query but this would allow iterating over individual query results with resultSet.next() and then iterating over each query's results via the array.

For example, combining:
SELECT name FROM depts;
and
SELECT empid, name, deptno FROM emps WHERE deptno = 10;

Yields:

-- Row 1
QUERY=[
   {name=Sales}, 
   {name=Marketing}, 
   {name=HR}
],
-- Row 2
QUERY=[
   {empid=100, name=Bill, deptno=10}, 
   {empid=150, name=Sebastian, deptno=10} , 
   {empid=110, name=Theodore, deptno=10}
]

Later, we should probably allow for aliases for queries so they don't have the generic QUERY column label but that may be more challenging (see the syntax for MULTI described in CALCITE-6188).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the very detailed explanation. I think the current format of the result presentation looks quite good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To later accommodate named queries I've changed the format once more. From the comment in EnumerableCombine:

The output format is a wide table where each column corresponds to a query (named QUERY_0, QUERY_1, etc.) and each row contains a struct with that query's column values for that row index. The number of output rows equals the maximum row count across all input queries. Queries with fewer rows have null values for the additional rows.

Example output for two queries:

QUERY_0                  | QUERY_1
------------------------ | ------------------------
{empno=100, name=Bill}   | {deptno=10, name=Sales}
{empno=110, name=Eric}   | {deptno=20, name=HR}
{empno=120, name=Ted}    | null

@tjbanghart tjbanghart force-pushed the 7254 branch 2 times, most recently from 04a59dd to 95e2419 Compare December 3, 2025 06:21
@tjbanghart tjbanghart requested a review from xiedeyantu December 3, 2025 20:17
Copy link
Member

@xiedeyantu xiedeyantu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks good overall, just left a few minor comments.

* @param queryLists array of lists, one per query
* @return list of Object arrays representing combined rows
*/
public static List<Object[]> combineQueryResults(List[] queryLists) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a small test for this method?

// Verify both columns have values (no cross-contamination)
Object query0 = resultSet.getObject("QUERY_0");
Object query1 = resultSet.getObject("QUERY_1");
if (query0 == null || query1 == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use AssertThat for positive validation? I don't understand why the throw new AssertionError approach is used here.

@xiedeyantu
Copy link
Member

There are still some minor issues in the CI that need to be fixed.

for (int rowIdx = 0; rowIdx < maxRows; rowIdx++) {
Object[] row = new Object[queryLists.length];
for (int queryIdx = 0; queryIdx < queryLists.length; queryIdx++) {
List queryList = queryLists[queryIdx];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @Nullable annotation is required here; otherwise CI will fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you - that was it.

Copy link
Member

@xiedeyantu xiedeyantu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we can squash all the commits, and I'll leave the rest of the merging to you.

@tjbanghart
Copy link
Contributor Author

Thank you so much for the reviews @xiedeyantu, much appreciated!

@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 8, 2025

@tjbanghart tjbanghart merged commit 3269244 into apache:main Dec 8, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants