@BohuTANG BohuTANG commented Nov 15, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Improve scan IO profile metrics by adding dedicated counters for different data sources:

| Metric | Display Name | Description |
|---|---|---|
| ScanBytesFromRemote | read from remote | Bytes read from remote storage |
| ScanBytesFromLocalDisk | read from local disk | Bytes read from local disk cache (compressed) |
| ScanBytesFromDataCache | read from data cache | Bytes read from data cache memory (compressed) |
| ScanBytesFromMemory | read from memory cache | Bytes read from memory cache (decompressed data blocks) |

Cache Architecture

Query → column_array_cache (memory, decompressed) → ScanBytesFromMemory
      → column_data_cache (HybridCache)
          → memory layer (compressed) → ScanBytesFromDataCache
          → disk layer (compressed) → ScanBytesFromLocalDisk
      → Remote Storage → ScanBytesFromRemote
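
The lookup order in the diagram can be sketched roughly as follows. This is a toy model, not Databend's implementation: `ScanIoStats`, `Tiers`, and `read` are hypothetical names, each tier is a plain `HashMap`, and a read is charged in full to the first tier that holds the key. Only the tier order and the metric each tier feeds are taken from the diagram.

```rust
use std::collections::HashMap;

/// Hypothetical counters, one per profile metric in the diagram.
#[derive(Debug, Default, PartialEq)]
struct ScanIoStats {
    from_memory: usize,     // ScanBytesFromMemory (decompressed blocks)
    from_data_cache: usize, // ScanBytesFromDataCache (compressed, in memory)
    from_local_disk: usize, // ScanBytesFromLocalDisk (compressed, on disk)
    from_remote: usize,     // ScanBytesFromRemote
}

/// Toy stand-in for the storage tiers; every map goes from a column key to bytes.
#[derive(Default)]
struct Tiers {
    column_array_cache: HashMap<String, Vec<u8>>,
    data_cache_memory: HashMap<String, Vec<u8>>,
    data_cache_disk: HashMap<String, Vec<u8>>,
    remote: HashMap<String, Vec<u8>>,
}

impl Tiers {
    /// Probe the tiers in diagram order and charge the bytes to the first hit.
    fn read(&self, key: &str, stats: &mut ScanIoStats) -> Option<&[u8]> {
        if let Some(bytes) = self.column_array_cache.get(key) {
            stats.from_memory += bytes.len();
            return Some(bytes.as_slice());
        }
        if let Some(bytes) = self.data_cache_memory.get(key) {
            stats.from_data_cache += bytes.len();
            return Some(bytes.as_slice());
        }
        if let Some(bytes) = self.data_cache_disk.get(key) {
            stats.from_local_disk += bytes.len();
            return Some(bytes.as_slice());
        }
        // Nothing cached: fall through to remote storage.
        let bytes = self.remote.get(key)?;
        stats.from_remote += bytes.len();
        Some(bytes.as_slice())
    }
}
```

The sketch deliberately skips cache population on a miss, since the description does not say how the real caches are filled.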

Example Output

├── bytes scanned: 120.00 B
├── read from remote: 864.00 B
├── read from data cache: 512.00 B
├── read from memory cache: 1.69 KiB
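
The values above look like two-decimal figures with binary (1024-based) units. A minimal sketch of such a formatter, assuming that convention, is shown here; `human_bytes` is a hypothetical helper, not the function Databend uses, and the raw byte counts behind the example output are not given in this PR.

```rust
/// Hypothetical helper: render a byte count with two decimals and 1024-based units.
fn human_bytes(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide by 1024 until the value fits the current unit or units run out.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    format!("{:.2} {}", value, UNITS[unit])
}
```

Under this assumption, `human_bytes(864)` yields `864.00 B`, matching the "read from remote" line.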

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@BohuTANG BohuTANG changed the title Improve scan IO metrics and add replace-into auto increment helper Improve scan IO profile metrics Nov 15, 2025
@BohuTANG BohuTANG changed the title Improve scan IO profile metrics feat: improve scan IO profile metrics Nov 15, 2025
@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Nov 15, 2025
@BohuTANG BohuTANG added the ci-cloud Build docker image for cloud test label Nov 15, 2025
@github-actions

Docker Image for PR

  • tag: pr-18975-31804d8-1763177210

note: this image tag is only available for internal use.

… output

- Add bytes scanned from local cache to test expectations
- Make scan IO metrics (remote/local/memory) always display even when zero
  to ensure consistent output format for sqllogic tests
This ensures:
- Fuse table scans show all three scan IO metrics together (remote/local/memory)
- System tables (like numbers) don't show these metrics since they don't use Fuse storage
Use enum comparison instead of string comparison for more reliable matching
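
A rough illustration of why matching on an enum is sturdier than matching on display strings: the check below keeps working even if a display name is reworded. `MetricName`, `display_name`, `is_scan_io`, and the `OutputRows` variant are hypothetical stand-ins, not the identifiers used in the Databend codebase; the four scan metric names and display strings are taken from the PR summary.

```rust
/// Hypothetical metric identifier; the real enum in the codebase may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MetricName {
    ScanBytesFromRemote,
    ScanBytesFromLocalDisk,
    ScanBytesFromDataCache,
    ScanBytesFromMemory,
    OutputRows, // assumed example of a metric that is not scan IO
}

impl MetricName {
    /// Human-readable label shown in the profile output.
    fn display_name(self) -> &'static str {
        match self {
            MetricName::ScanBytesFromRemote => "read from remote",
            MetricName::ScanBytesFromLocalDisk => "read from local disk",
            MetricName::ScanBytesFromDataCache => "read from data cache",
            MetricName::ScanBytesFromMemory => "read from memory cache",
            MetricName::OutputRows => "output rows", // assumed label
        }
    }

    /// Decide by variant, never by comparing against `display_name()` text.
    fn is_scan_io(self) -> bool {
        matches!(
            self,
            MetricName::ScanBytesFromRemote
                | MetricName::ScanBytesFromLocalDisk
                | MetricName::ScanBytesFromDataCache
                | MetricName::ScanBytesFromMemory
        )
    }
}
```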
@BohuTANG BohuTANG force-pushed the pr18935 branch 5 times, most recently from 57dadaf to 22ab02c Compare December 4, 2025 01:59
@BohuTANG BohuTANG marked this pull request as ready for review December 4, 2025 03:11
@BohuTANG BohuTANG requested a review from zhang2014 December 4, 2025 03:11
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@BohuTANG BohuTANG force-pushed the pr18935 branch 3 times, most recently from dad93da to a509e43 Compare December 4, 2025 04:46

BohuTANG commented Dec 4, 2025

@codex review

chatgpt-codex-connector[bot]

This comment was marked as outdated.

@BohuTANG BohuTANG marked this pull request as draft December 4, 2025 07:19
The fix was applied to the correct file: physical_format.rs in databend-query.
The previous fix in sql/executor/format.rs was ineffective because that file
is not compiled (not declared in mod.rs).

When any of the three scan IO metrics (remote/local cache/memory cache) is
non-zero, all three will be displayed. This ensures consistent output in
EXPLAIN ANALYZE for Fuse table scans.
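
The display rule described in this commit message can be sketched as below. `scan_io_lines` and its plain `label: value` output are hypothetical; the labels are placeholders taken from the parenthetical above, and the real formatter in physical_format.rs may render names and values differently.

```rust
/// Hypothetical sketch: emit all three scan IO lines if any counter is non-zero,
/// and emit nothing when all three are zero (e.g. a scan that never touches Fuse storage).
fn scan_io_lines(remote: u64, local_cache: u64, memory_cache: u64) -> Vec<String> {
    if remote == 0 && local_cache == 0 && memory_cache == 0 {
        return Vec::new();
    }
    vec![
        format!("remote: {}", remote),
        format!("local cache: {}", local_cache),
        format!("memory cache: {}", memory_cache),
    ]
}
```

Printing the zero-valued lines alongside the non-zero one keeps the number and order of lines stable, which is what makes test expectations deterministic.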

BohuTANG commented Dec 4, 2025

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

BohuTANG commented Dec 4, 2025

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep it up!

@BohuTANG BohuTANG marked this pull request as ready for review December 4, 2025 08:21
@dantengsky dantengsky merged commit f081da6 into databendlabs:main Dec 4, 2025
99 of 102 checks passed

3 participants