@SkyFan2002 SkyFan2002 commented Dec 1, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

The old runtime filter loop marched slot by slot with data-dependent branches, so the compiler couldn’t auto-vectorize the block mask checks. The rewrite explicitly packs mask bits into lanes and processes whole batches through dedicated SIMD helpers, removing the branchy scalar path so the hot loop now maps cleanly onto wide bitwise ops.
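As a rough illustration of the idea (not the code from this PR), the sketch below contrasts a branchy per-lane check with a branchless one on a hypothetical blocked Bloom filter. The `BlockBloom` type, the 8 x `u32` block layout, the `SALT` multipliers, and all method names are assumptions made for this example. It relies on plain fixed-size arrays and a fixed trip count rather than explicit SIMD helpers, which is a loop shape compilers can generally turn into wide bitwise operations.

```rust
// Illustrative sketch only: names and layout are assumptions, not Databend's code.

const LANES: usize = 8; // 8 x u32 lanes = one 256-bit block

// Odd multipliers, one per lane (hypothetical choice for this example).
const SALT: [u32; LANES] = [
    0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d,
    0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31,
];

pub struct BlockBloom {
    blocks: Vec<[u32; LANES]>,
}

impl BlockBloom {
    pub fn new(num_blocks: usize) -> Self {
        assert!(num_blocks > 0 && num_blocks <= u32::MAX as usize);
        Self { blocks: vec![[0u32; LANES]; num_blocks] }
    }

    /// Upper 32 bits of the hash select the block (multiply-shift range reduction).
    #[inline]
    fn block_index(&self, hash: u64) -> usize {
        (((hash >> 32) * self.blocks.len() as u64) >> 32) as usize
    }

    /// Lower 32 bits of the hash produce one set bit per lane.
    #[inline]
    fn mask(hash: u64) -> [u32; LANES] {
        let low = hash as u32;
        let mut m = [0u32; LANES];
        for i in 0..LANES {
            m[i] = 1u32 << (low.wrapping_mul(SALT[i]) >> 27);
        }
        m
    }

    pub fn insert(&mut self, hash: u64) {
        let idx = self.block_index(hash);
        let m = Self::mask(hash);
        for i in 0..LANES {
            self.blocks[idx][i] |= m[i];
        }
    }

    /// Branchy version: the early return is a data-dependent branch per lane.
    pub fn contains_scalar(&self, hash: u64) -> bool {
        let block = &self.blocks[self.block_index(hash)];
        let m = Self::mask(hash);
        for i in 0..LANES {
            if block[i] & m[i] == 0 {
                return false;
            }
        }
        true
    }

    /// Branchless version: accumulate missing bits across all lanes, test once.
    pub fn contains_branchless(&self, hash: u64) -> bool {
        let block = &self.blocks[self.block_index(hash)];
        let m = Self::mask(hash);
        let mut missing = 0u32;
        for i in 0..LANES {
            missing |= !block[i] & m[i];
        }
        missing == 0
    }

    /// Probe a whole batch of precomputed hashes with the branchless check.
    pub fn probe_batch(&self, hashes: &[u64]) -> Vec<bool> {
        hashes.iter().map(|&h| self.contains_branchless(h)).collect()
    }
}
```

Both checks return the same answer for every hash; the branchless one simply always touches all eight lanes, trading a little extra work on misses for a loop body with no data-dependent control flow.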

This PR reduces Bloom filter check overhead by 50%. Performance on several TPC-H queries improved by 10–30%.

╭────────────────────────────────────────────────────────────────────────────────────────────╮
│ query_id │   v12850_time_s    │     pr_time_s      │      diff_s       │   ratio_percent   │
│  String  │  Nullable(Float64) │  Nullable(Float64) │ Nullable(Float64) │ Nullable(Float64) │
├──────────┼────────────────────┼────────────────────┼───────────────────┼───────────────────┤
│ q21      │   66.0530014038086 │  57.39500045776367 │             -8.66 │            -13.11 │
│ q9       │  74.24500274658203 │  68.04199981689453 │              -6.2 │             -8.35 │
│ q20      │ 10.093000411987305 │  7.318999767303467 │             -2.77 │            -27.48 │
│ q10      │   33.9900016784668 │ 31.881999969482422 │             -2.11 │              -6.2 │
│ q17      │ 14.031999588012695 │ 12.557999610900879 │             -1.47 │             -10.5 │
│ q14      │  15.61299991607666 │  14.23900032043457 │             -1.37 │              -8.8 │
│ q5       │ 25.106000900268555 │ 23.993000030517578 │             -1.11 │             -4.43 │
│ q18      │ 30.014999389648438 │ 29.072999954223633 │             -0.94 │             -3.14 │
│ q15      │ 11.309000015258789 │ 10.425000190734863 │             -0.88 │             -7.82 │
│ q12      │ 14.243000030517578 │ 13.366999626159668 │             -0.88 │             -6.15 │
│ q8       │ 23.511999130249023 │ 23.042999267578125 │             -0.47 │             -1.99 │
│ q7       │ 22.861000061035156 │ 22.417999267578125 │             -0.44 │             -1.94 │
│ q3       │  42.24300003051758 │ 41.810001373291016 │             -0.43 │             -1.03 │
│ q22      │ 12.088000297546387 │ 11.833000183105469 │             -0.26 │             -2.11 │
│ q6       │  8.654999732971191 │  8.503000259399414 │             -0.15 │             -1.76 │
│ q4       │ 31.788999557495117 │ 31.746000289916992 │             -0.04 │             -0.14 │
│ q16      │ 3.0869998931884766 │ 3.0950000286102295 │              0.01 │              0.26 │
│ q2       │  5.936999797821045 │  5.949999809265137 │              0.01 │              0.22 │
│ q19      │ 15.329000473022461 │ 15.390000343322754 │              0.06 │               0.4 │
│ q11      │  2.625999927520752 │ 2.7219998836517334 │               0.1 │              3.66 │
│ q1       │  12.38599967956543 │ 12.484000205993652 │               0.1 │              0.79 │
│ q13      │ 21.889999389648438 │  22.79800033569336 │              0.91 │              4.15 │
╰────────────────────────────────────────────────────────────────────────────────────────────╯
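
The derived columns appear to follow from the two timing columns: `diff_s` as the PR time minus the v12850 time, and `ratio_percent` as that difference relative to the v12850 time, both rounded to two decimals. This is inferred from the numbers in the table rather than stated in the PR, and the helper names below are made up for the example.

```rust
// Sketch of how diff_s and ratio_percent seem to be derived (an inference
// from the table values, not a formula given in the PR).

/// Round to two decimal places.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}

/// Returns (diff_s, ratio_percent) for a baseline time and a PR time in seconds.
/// Negative values mean the PR build was faster than the baseline.
pub fn compare(base_s: f64, pr_s: f64) -> (f64, f64) {
    let diff = pr_s - base_s;
    let ratio = diff / base_s * 100.0;
    (round2(diff), round2(ratio))
}
```

Calling `compare(10.093000411987305, 7.318999767303467)` with the q20 timings gives `(-2.77, -27.48)`, matching that row.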

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-chore label (this PR only has small changes that no need to record, like coding styles.) Dec 1, 2025
@SkyFan2002 SkyFan2002 added the ci-cloud label (Build docker image for cloud test) Dec 1, 2025
github-actions bot commented Dec 1, 2025

Docker Image for PR

  • tag: pr-19039-4a86ca6-1764555590

note: this image tag is only available for internal use.

@SkyFan2002 SkyFan2002 added and removed the ci-cloud label (Build docker image for cloud test) Dec 1, 2025
github-actions bot commented Dec 1, 2025

Docker Image for PR

  • tag: pr-19039-692e980-1764575062

note: this image tag is only available for internal use.

@SkyFan2002 SkyFan2002 changed the title from "chore: test performance" to "feat: improve runtime filter check via SIMD" Dec 1, 2025
@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Dec 1, 2025
@SkyFan2002 SkyFan2002 marked this pull request as ready for review December 1, 2025 11:33
@zhang2014 zhang2014 merged commit 0b79a14 into databendlabs:main Dec 1, 2025
108 checks passed