Merged
@@ -0,0 +1,7 @@
match ruff:
    case {"lint": {"select": x} | {"extend-select": x}} | {"select": x}:
        ...
match 42:
    case [[x] | [x]] | x: ...
match 42:
    case [[x | x] | [x]] | x: ...
10 changes: 10 additions & 0 deletions crates/ruff_python_parser/src/semantic_errors.rs
@@ -1841,13 +1841,23 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
// case (x, (y | y)): ...
// case [a, _] | [a, _]: ...
// case [a] | [C(a)]: ...

// test_ok nested_alternative_patterns
// match ruff:
//     case {"lint": {"select": x} | {"extend-select": x}} | {"select": x}:
//         ...
// match 42:
//     case [[x] | [x]] | x: ...
// match 42:
//     case [[x | x] | [x]] | x: ...
SemanticSyntaxChecker::add_error(
self.ctx,
SemanticSyntaxErrorKind::DifferentMatchPatternBindings,
*range,
);
break;
}
self.names = visitor.names;
Member

I don't have a good understanding of this part. So please disregard if not relevant or I'm just wrong ;)

Would it be better to return the union of all visitor.names here instead of just the last one in case we encountered a DifferentMatchPatternBindings error? Or is there a risk that this introduces other false positives? If so, should we return the intersection of all names instead?

(I suspect that either approach can lead to false positives depending on how the patterns are nested, so there might not be a "better way" of doing this, just trade-offs.)

Contributor Author

Ah I think that's a good question. (This part was not intuitive to me at all, I was working on a more complicated fix when I realized this solved the issue)

I guess this could come into play with cases like this:

match 42:
    case [{1: x} | [x] | [y]] | [y]: ...

match 42:
    case [{1: x} | [x] | [y]] | [x]: ...

We currently emit two diagnostics for the first and only one for the second:

invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:11:10
   |
10 | match 42:
11 |     case [{1: x} | [x] | [y]] | [y]: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^
12 |
13 | match 42:
   |

invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:11:11
   |
10 | match 42:
11 |     case [{1: x} | [x] | [y]] | [y]: ...
   |           ^^^^^^^^^^^^^^^^^^
12 |
13 | match 42:
   |

invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:14:11
   |
13 | match 42:
14 |     case [{1: x} | [x] | [y]] | [x]: ...
   |           ^^^^^^^^^^^^^^^^^^
   |

Hmm, we actually get the same answer here with the union, but we get two errors in both cases with the intersection.

Maybe that makes the intersection the best option for consistency's sake?
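
To make the comparison concrete, here is a small Python model of the or-pattern check. It is only a sketch under stated assumptions: patterns are encoded as hypothetical `(kind, payload)` tuples, `"seq"` stands in for any container pattern (including the `{1: x}` mapping), the `MultipleCaseAssignment` check is left out, and `visit`, `count_errors` and the `merge` modes are illustrative names, not Ruff APIs. The `"none"` mode models an or-pattern that propagates nothing to the enclosing visitor.

```python
def visit(pattern, names, errors, merge):
    """Collect names bound by `pattern` into `names`.

    Or-patterns whose alternatives bind different names are appended to `errors`.
    """
    kind, payload = pattern
    if kind == "name":
        names.add(payload)
    elif kind == "seq":  # assumption: models any container pattern
        for child in payload:
            visit(child, names, errors, merge)
    elif kind == "or":
        previous = None
        for alternative in payload:
            found = set()
            visit(alternative, found, errors, merge)
            if previous is None:
                previous = found
                continue
            if previous ^ found:  # non-empty symmetric difference
                errors.append(pattern)
                break
            if merge == "replace":  # models `self.names = visitor.names`
                names.clear()
                names.update(found)
            elif merge == "union":
                names.update(found)
            elif merge == "intersection":
                names.intersection_update(found)
            # merge == "none": nothing reaches the enclosing pattern


def count_errors(pattern, merge):
    errors = []
    visit(pattern, set(), errors, merge)
    return len(errors)


# Hypothetical helpers for building the tuple encoding.
def name(ident):
    return ("name", ident)


def seq(*children):
    return ("seq", list(children))


def alt(*alternatives):
    return ("or", list(alternatives))


inner = seq(alt(seq(name("x")), seq(name("x")), seq(name("y"))))
ends_with_y = alt(inner, seq(name("y")))  # [{1: x} | [x] | [y]] | [y]
ends_with_x = alt(inner, seq(name("x")))  # [{1: x} | [x] | [y]] | [x]
nested_ok = alt(seq(alt(seq(name("x")), seq(name("x")))), name("x"))  # [[x] | [x]] | x

for merge in ("none", "replace", "union", "intersection"):
    print(
        merge,
        count_errors(ends_with_y, merge),
        count_errors(ends_with_x, merge),
        count_errors(nested_ok, merge),
    )
```

Under this model, replace and union both report two diagnostics for the first case and one for the second, while intersection reports two for both, which agrees with the counts described above. Only replace and union leave `[[x] | [x]] | x` free of diagnostics.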

I think I'd consider these all true positives at least, but they may be a bit redundant. CPython only emits the innermost diagnostic, so we could also consider that.

This is also pretty hard to hit. Because of the break after emitting a DifferentMatchPatternBindings diagnostic, you need at least two inner patterns with the same name, followed by one with a different name.

Contributor Author

Oops, the intersection breaks the tests, I guess because self.names is initially empty. So I guess we should either stick with replacing or the union.

Member

Can't we just not intersect until we visited at least one pattern? Like use an Option<FxHashSet> and only intersect when it's some?
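
One possible reading of this suggestion, sketched with Python sets rather than `FxHashSet`: intersecting with an initially empty set always yields the empty set, so a `None` sentinel (standing in for Rust's `Option`) marks "nothing visited yet". `intersect_names` is a hypothetical helper, not code from the PR.

```python
def intersect_names(accumulated, names):
    """Intersect `names` into `accumulated`, treating None as "nothing visited yet"."""
    if accumulated is None:
        return set(names)
    return accumulated & names


# Starting from an empty set, the intersection can never grow.
print(set() & {"x"})

# Starting from the sentinel, the first visited pattern seeds the set.
seeded = intersect_names(None, {"x", "y"})
print(intersect_names(seeded, {"x"}))
```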

Contributor Author

I may be misunderstanding the suggestion, but I tried a patch like the one below, and it caused some other tests to fail (without changing the new diagnostics from this PR). It's not just when self.names is initially empty, as I said before; this also affects cases like:

match x:
    case [] | [a]: ...

where one of the branches is genuinely empty.

I initially included the .snap.new files in this patch, but they were way too big. We lost the diagnostics in the cases above.

I'm leaning toward sticking with the simple existing code unless I'm just missing something.

Patch

diff --git a/crates/ruff_python_parser/src/semantic_errors.rs b/crates/ruff_python_parser/src/semantic_errors.rs
index f35029d4b9..3312c72544 100644
--- a/crates/ruff_python_parser/src/semantic_errors.rs
+++ b/crates/ruff_python_parser/src/semantic_errors.rs
@@ -11,7 +11,7 @@ use ruff_python_ast::{
     visitor::{Visitor, walk_expr},
 };
 use ruff_text_size::{Ranged, TextRange, TextSize};
-use rustc_hash::{FxBuildHasher, FxHashSet};
+use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
 use std::fmt::Display;
 
 #[derive(Debug, Default)]
@@ -117,7 +117,7 @@ impl SemanticSyntaxChecker {
                 Self::irrefutable_match_case(match_stmt, ctx);
                 for case in &match_stmt.cases {
                     let mut visitor = MatchPatternVisitor {
-                        names: FxHashSet::default(),
+                        names: None,
                         ctx,
                     };
                     visitor.visit_pattern(&case.pattern);
@@ -1671,7 +1671,7 @@ impl Visitor<'_> for ReboundComprehensionVisitor<'_> {
 }
 
 struct MatchPatternVisitor<'a, Ctx> {
-    names: FxHashSet<&'a ast::name::Name>,
+    names: Option<FxHashSet<&'a ast::name::Name>>,
     ctx: &'a Ctx,
 }
 
@@ -1810,15 +1810,19 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
                 let mut previous_names: Option<FxHashSet<&ast::name::Name>> = None;
                 for pattern in patterns {
                     let mut visitor = Self {
-                        names: FxHashSet::default(),
+                        names: None,
                         ctx: self.ctx,
                     };
                     visitor.visit_pattern(pattern);
                     let Some(prev) = &previous_names else {
-                        previous_names = Some(visitor.names);
+                        previous_names = visitor.names;
                         continue;
                     };
-                    if prev.symmetric_difference(&visitor.names).next().is_some() {
+                    if visitor
+                        .names
+                        .as_ref()
+                        .is_some_and(|names| prev.symmetric_difference(&names).next().is_some())
+                    {
                         // test_err different_match_pattern_bindings
                         // match x:
                         //     case [a] | [b]: ...
@@ -1857,7 +1861,13 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
                         );
                         break;
                     }
-                    self.names = visitor.names;
+                    if let Some(names) = &self.names
+                        && let Some(other) = visitor.names
+                    {
+                        self.names = Some(names.intersection(&other).cloned().collect());
+                    } else {
+                        self.names = visitor.names;
+                    }
                 }
             }
         }
@@ -1866,7 +1876,12 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
     /// Add an identifier to the set of visited names in `self` and emit a [`SemanticSyntaxError`]
     /// if `ident` has already been seen.
     fn insert(&mut self, ident: &'a ast::Identifier) {
-        if !self.names.insert(&ident.id) {
+        if self.names.is_none() {
+            self.names = Some(FxHashSet::default());
+        }
+        let names = self.names.as_mut().unwrap();
+
+        if !names.insert(&ident.id) {
             SemanticSyntaxChecker::add_error(
                 self.ctx,
                 SemanticSyntaxErrorKind::MultipleCaseAssignment(ident.id.clone()),
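
As a rough illustration of why the patch above loses the `[] | [a]` diagnostic, the comparison loop can be modelled in Python. This is a sketch of one reading of the patch, not the Rust code itself: each alternative is reduced to the set of names it binds, `None` stands for a visitor whose `insert` was never called, and `check_existing` and `check_patched` are hypothetical names.

```python
def check_existing(alternatives):
    """Model of the existing loop: an empty alternative is an empty set."""
    previous = None
    for names in alternatives:
        if previous is None:
            previous = names
            continue
        if previous ^ names:
            return True  # would emit DifferentMatchPatternBindings
    return False


def check_patched(alternatives):
    """Model of the patched loop: an alternative binding nothing is None."""
    previous = None
    for names in alternatives:
        if previous is None:
            previous = names  # stays None when the alternative bound nothing
            continue
        if names is not None and previous ^ names:
            return True
    return False


# `case [] | [a]: ...`
print(check_existing([set(), {"a"}]))
print(check_patched([None, {"a"}]))
# `case [a] | []: ...`
print(check_patched([{"a"}, None]))
```

In this model the patched loop cannot tell "binds nothing" apart from "not visited yet", so the mismatch goes unreported in both orders.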

Member

That's not what I meant, I think?

I'd create an Option<FxHashSet> right outside the for pattern in patterns { loop rather than globally in the visitor.

But I also think that what we have is fine.

Contributor Author

We have one of those too:

let mut previous_names: Option<FxHashSet<&ast::name::Name>> = None;
for pattern in patterns {

We need an outer one in the visitor (or at least somewhere else) for recursive cases.

Contributor Author

I'll go ahead and land this for now then since it should resolve the bug!

}
}
}