feat: improve runtime filter check via SIMD #19039

SkyFan2002 · 2025-12-01T01:20:27Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

The old runtime filter loop marched slot by slot with data-dependent branches, so the compiler couldn’t auto-vectorize the block mask checks. The rewrite explicitly packs mask bits into lanes and processes whole batches through dedicated SIMD helpers, removing the branchy scalar path so the hot loop now maps cleanly onto wide bitwise ops.
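As a rough illustration of the idea (not the code in this PR), the sketch below checks a batch of hashes against a split-block style Bloom filter without any data-dependent branch in the inner loop. The block layout (`[u32; 8]`), the `SALT` multipliers, and the names `make_mask`, `block_contains`, `block_index`, `check_batch`, and `mix` are all assumptions made for the example. It relies on fixed-size array loops that a compiler can turn into wide bitwise ops rather than on explicit SIMD intrinsics.

```rust
// Hypothetical layout: one Bloom block is 8 lanes of 32 bits.
type Block = [u32; 8];

// Arbitrary odd multipliers, one per lane (illustrative values only).
const SALT: [u32; 8] = [
    0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d,
    0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31,
];

// Build an 8-lane mask with exactly one bit set per lane.
#[inline]
fn make_mask(hash: u32) -> Block {
    let mut mask = [0u32; 8];
    for i in 0..8 {
        // The top 5 bits of the product pick a bit position in 0..32.
        mask[i] = 1u32 << (hash.wrapping_mul(SALT[i]) >> 27);
    }
    mask
}

// Branch-free membership test: collect every required bit that the
// block lacks, then compare once at the end instead of exiting early.
#[inline]
fn block_contains(block: &Block, hash: u32) -> bool {
    let mask = make_mask(hash);
    let mut missing = 0u32;
    for i in 0..8 {
        missing |= !block[i] & mask[i];
    }
    missing == 0
}

// Map the upper 32 bits of the hash onto a block index in 0..n.
#[inline]
fn block_index(hash: u64, n: usize) -> usize {
    (((hash >> 32) * n as u64) >> 32) as usize
}

// Set the mask bits for one hash. Assumes `blocks` is non-empty.
fn insert(blocks: &mut [Block], hash: u64) {
    let idx = block_index(hash, blocks.len());
    let mask = make_mask(hash as u32);
    for i in 0..8 {
        blocks[idx][i] |= mask[i];
    }
}

// Check a whole batch; each result is written unconditionally.
// Assumes `blocks` is non-empty and `out` is at least as long as `hashes`.
fn check_batch(blocks: &[Block], hashes: &[u64], out: &mut [bool]) {
    for (h, o) in hashes.iter().zip(out.iter_mut()) {
        let block = &blocks[block_index(*h, blocks.len())];
        *o = block_contains(block, *h as u32);
    }
}

// splitmix64-style mixer, used here only to produce test hashes.
fn mix(mut x: u64) -> u64 {
    x = x.wrapping_add(0x9E3779B97F4A7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
    x ^ (x >> 31)
}

fn main() {
    let mut blocks = vec![[0u32; 8]; 64];
    let hashes: Vec<u64> = (0..16u64).map(mix).collect();
    for &h in &hashes {
        insert(&mut blocks, h);
    }
    let mut out = vec![false; hashes.len()];
    check_batch(&blocks, &hashes, &mut out);
    println!("all inserted keys found: {}", out.iter().all(|&hit| hit));
}
```

The scalar alternative would return `false` as soon as one lane fails, which introduces a branch whose outcome depends on the data. Folding all eight lanes into `missing` does a fixed amount of work per hash, which is the property that lets the loop body become straight-line bitwise operations.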

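This PR reduces Bloom filter check overhead by 50%. Several TPC-H queries run 10–30% faster (q20, q21, and q17 in the table below, where a negative diff_s means the PR build is faster than the v12850 baseline).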

╭────────────────────────────────────────────────────────────────────────────────────────────╮
│ query_id │   v12850_time_s    │     pr_time_s      │       diff_s      │   ratio_percent   │
│  String  │  Nullable(Float64) │  Nullable(Float64) │ Nullable(Float64) │ Nullable(Float64) │
├──────────┼────────────────────┼────────────────────┼───────────────────┼───────────────────┤
│ q21      │   66.0530014038086 │  57.39500045776367 │             -8.66 │            -13.11 │
│ q9       │  74.24500274658203 │  68.04199981689453 │              -6.2 │             -8.35 │
│ q20      │ 10.093000411987305 │  7.318999767303467 │             -2.77 │            -27.48 │
│ q10      │   33.9900016784668 │ 31.881999969482422 │             -2.11 │              -6.2 │
│ q17      │ 14.031999588012695 │ 12.557999610900879 │             -1.47 │             -10.5 │
│ q14      │  15.61299991607666 │  14.23900032043457 │             -1.37 │              -8.8 │
│ q5       │ 25.106000900268555 │ 23.993000030517578 │             -1.11 │             -4.43 │
│ q18      │ 30.014999389648438 │ 29.072999954223633 │             -0.94 │             -3.14 │
│ q15      │ 11.309000015258789 │ 10.425000190734863 │             -0.88 │             -7.82 │
│ q12      │ 14.243000030517578 │ 13.366999626159668 │             -0.88 │             -6.15 │
│ q8       │ 23.511999130249023 │ 23.042999267578125 │             -0.47 │             -1.99 │
│ q7       │ 22.861000061035156 │ 22.417999267578125 │             -0.44 │             -1.94 │
│ q3       │  42.24300003051758 │ 41.810001373291016 │             -0.43 │             -1.03 │
│ q22      │ 12.088000297546387 │ 11.833000183105469 │             -0.26 │             -2.11 │
│ q6       │  8.654999732971191 │  8.503000259399414 │             -0.15 │             -1.76 │
│ q4       │ 31.788999557495117 │ 31.746000289916992 │             -0.04 │             -0.14 │
│ q16      │ 3.0869998931884766 │ 3.0950000286102295 │              0.01 │              0.26 │
│ q2       │  5.936999797821045 │  5.949999809265137 │              0.01 │              0.22 │
│ q19      │ 15.329000473022461 │ 15.390000343322754 │              0.06 │               0.4 │
│ q11      │  2.625999927520752 │ 2.7219998836517334 │               0.1 │              3.66 │
│ q1       │  12.38599967956543 │ 12.484000205993652 │               0.1 │              0.79 │
│ q13      │ 21.889999389648438 │  22.79800033569336 │              0.91 │              4.15 │
╰────────────────────────────────────────────────────────────────────────────────────────────╯

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):

github-actions · 2025-12-01T02:21:17Z

Docker Image for PR

tag: pr-19039-4a86ca6-1764555590

note: this image tag is only available for internal use.

github-actions · 2025-12-01T07:46:02Z

Docker Image for PR

tag: pr-19039-692e980-1764575062

note: this image tag is only available for internal use.

check batch

e946235

github-actions bot added the pr-chore this PR only has small changes that no need to record, like coding styles. label Dec 1, 2025

SkyFan2002 added the ci-cloud Build docker image for cloud test label Dec 1, 2025

update

929569d

SkyFan2002 force-pushed the rf_perf branch from 31198aa to 929569d Compare December 1, 2025 06:56

Merge branch 'main' into rf_perf

9253d9b

SkyFan2002 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Dec 1, 2025

SkyFan2002 changed the title ~~chore: test performance~~ feat: improve runtime filter check via SIMD Dec 1, 2025

github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Dec 1, 2025

SkyFan2002 marked this pull request as ready for review December 1, 2025 11:33

SkyFan2002 requested review from BohuTANG and zhang2014 December 1, 2025 12:00

zhang2014 approved these changes Dec 1, 2025

zhang2014 merged commit 0b79a14 into databendlabs:main Dec 1, 2025
108 checks passed
