Fix syntax error false positive on nested alternative patterns #21104
MichaReiser left a comment
Thank you
        );
        break;
    }
    self.names = visitor.names;
I don't have a good understanding of this part. So please disregard if not relevant or I'm just wrong ;)
Would it be better to return the union of all visitor.names here instead of just the last one in case we encountered a DifferentMatchPatternBindings error? Or is there a risk that this introduces other false positives? If so, should we return the intersection of all names instead?
(I suspect that either approach can lead to false positives depending on how the patterns are nested, so there may be no "better way" of doing this, just trade-offs.)
Ah I think that's a good question. (This part was not intuitive to me at all, I was working on a more complicated fix when I realized this solved the issue)
I guess this could come into play with cases like this:
match 42:
    case [{1: x} | [x] | [y]] | [y]: ...

match 42:
    case [{1: x} | [x] | [y]] | [x]: ...

We currently emit two diagnostics for the first and only one for the second:
invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:11:10
   |
10 | match 42:
11 |     case [{1: x} | [x] | [y]] | [y]: ...
   |          ^^^^^^^^^^^^^^^^^^^^^^^^^^
12 |
13 | match 42:
   |

invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:11:11
   |
10 | match 42:
11 |     case [{1: x} | [x] | [y]] | [y]: ...
   |           ^^^^^^^^^^^^^^^^^^
12 |
13 | match 42:
   |

invalid-syntax: alternative patterns bind different names
  --> /tmp/s.py:14:11
   |
13 | match 42:
14 |     case [{1: x} | [x] | [y]] | [x]: ...
   |           ^^^^^^^^^^^^^^^^^^
   |
Hmm, we actually get the same answer here with the union, but we get two errors in both cases with the intersection.
Maybe that makes the intersection the best option for consistency's sake?
I think I'd consider these all true positives at least, but they may be a bit redundant. CPython only emits the innermost diagnostic, so we could also consider that.
This is also pretty hard to hit. Because of the break after emitting a DifferentMatchPatternBindings diagnostic, you need at least two inner patterns with the same name, followed by one with a different name.
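The behaviour described above can be reproduced with a toy model. This is only a sketch under stated assumptions: `Pat`, `Merge`, `Visitor`, and `count_errors` are hypothetical names, `std::collections::HashSet` stands in for `FxHashSet`, the mapping pattern `{1: x}` is modelled as a one-element sequence, and `self.names` starts out empty. It is not ruff's actual visitor.

```rust
use std::collections::HashSet;

/// A toy pattern tree, far simpler than ruff's AST and purely illustrative.
enum Pat {
    /// Binds a name, like `x`.
    Capture(&'static str),
    /// A sequence (or mapping) whose sub-patterns bind into the same scope.
    Seq(Vec<Pat>),
    /// Alternatives separated by `|`.
    Or(Vec<Pat>),
}

/// How a parent combines its names with those of a matching alternative.
#[derive(Clone, Copy)]
enum Merge {
    Replace,
    Union,
    Intersection,
}

struct Visitor {
    names: HashSet<&'static str>,
    merge: Merge,
    /// Count of "alternative patterns bind different names" diagnostics.
    errors: usize,
}

impl Visitor {
    fn new(merge: Merge) -> Self {
        Self { names: HashSet::new(), merge, errors: 0 }
    }

    fn visit(&mut self, pat: &Pat) {
        match pat {
            Pat::Capture(name) => {
                self.names.insert(*name);
            }
            Pat::Seq(items) => {
                for item in items {
                    self.visit(item);
                }
            }
            Pat::Or(alternatives) => {
                let mut previous: Option<HashSet<&'static str>> = None;
                for alt in alternatives {
                    // Each alternative is visited with a fresh set of names.
                    let mut child = Visitor::new(self.merge);
                    child.visit(alt);
                    self.errors += child.errors;
                    let Some(prev) = &previous else {
                        previous = Some(child.names);
                        continue;
                    };
                    if prev.symmetric_difference(&child.names).next().is_some() {
                        self.errors += 1;
                        break; // stop at the first mismatching alternative
                    }
                    self.names = match self.merge {
                        Merge::Replace => child.names,
                        Merge::Union => self.names.union(&child.names).copied().collect(),
                        Merge::Intersection => {
                            self.names.intersection(&child.names).copied().collect()
                        }
                    };
                }
            }
        }
    }
}

/// Builds `[{1: x} | [x] | [y]] | [<last>]` and counts the diagnostics.
fn count_errors(last: &'static str, merge: Merge) -> usize {
    let inner = Pat::Or(vec![
        Pat::Seq(vec![Pat::Capture("x")]),
        Pat::Seq(vec![Pat::Capture("x")]),
        Pat::Seq(vec![Pat::Capture("y")]),
    ]);
    let outer = Pat::Or(vec![
        Pat::Seq(vec![inner]),
        Pat::Seq(vec![Pat::Capture(last)]),
    ]);
    let mut visitor = Visitor::new(merge);
    visitor.visit(&outer);
    visitor.errors
}
```

Under this toy model, `count_errors("y", Merge::Replace)` and `count_errors("x", Merge::Replace)` give 2 and 1, matching the diagnostics above. `Merge::Union` gives the same counts, and `Merge::Intersection` gives 2 in both cases.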
Oops, the intersection breaks the tests, I guess because self.names is initially empty. So I guess we should either stick with replacing or the union.
Can't we just not intersect until we visited at least one pattern? Like use an Option<FxHashSet> and only intersect when it's some?
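A minimal sketch of that idea, assuming the accumulator lives outside the visitor and using `std::collections::HashSet` in place of `FxHashSet`. The function name `intersect_all` is hypothetical and not part of ruff.

```rust
use std::collections::HashSet;

/// Intersects the name sets of all alternatives, starting from "nothing seen
/// yet" (`None`) rather than from an empty set.
fn intersect_all<'a>(alternatives: &[HashSet<&'a str>]) -> Option<HashSet<&'a str>> {
    let mut acc: Option<HashSet<&'a str>> = None;
    for names in alternatives {
        acc = Some(match acc {
            // First alternative: adopt its names as-is.
            None => names.clone(),
            // Later alternatives: keep only the names common to both.
            Some(seen) => seen.intersection(names).copied().collect(),
        });
    }
    acc
}
```

With this shape, `None` means no alternative has been visited, while `Some` of an empty set means the alternatives share no names. An alternative that binds nothing therefore still participates in the intersection.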
I may be misunderstanding the suggestion, but I tried a patch like the one below, and it caused some other tests to fail (and without changing the new diagnostics from this PR). It's not just when self.names is initially empty as I said before, this also affects cases like:
match x:
    case [] | [a]: ...

where one of the branches is genuinely empty.
I initially included the .snap.new files in this patch, but they were way too big. We lost the diagnostics in the cases above.
I'm leaning toward sticking with the simple existing code unless I'm just missing something.
Patch
diff --git a/crates/ruff_python_parser/src/semantic_errors.rs b/crates/ruff_python_parser/src/semantic_errors.rs
index f35029d4b9..3312c72544 100644
--- a/crates/ruff_python_parser/src/semantic_errors.rs
+++ b/crates/ruff_python_parser/src/semantic_errors.rs
@@ -11,7 +11,7 @@ use ruff_python_ast::{
     visitor::{Visitor, walk_expr},
 };
 use ruff_text_size::{Ranged, TextRange, TextSize};
-use rustc_hash::{FxBuildHasher, FxHashSet};
+use rustc_hash::{FxBuildHasher, FxHashMap, FxHashSet};
 use std::fmt::Display;

 #[derive(Debug, Default)]
@@ -117,7 +117,7 @@ impl SemanticSyntaxChecker {
                 Self::irrefutable_match_case(match_stmt, ctx);
                 for case in &match_stmt.cases {
                     let mut visitor = MatchPatternVisitor {
-                        names: FxHashSet::default(),
+                        names: None,
                         ctx,
                     };
                     visitor.visit_pattern(&case.pattern);
@@ -1671,7 +1671,7 @@ impl Visitor<'_> for ReboundComprehensionVisitor<'_> {
 }

 struct MatchPatternVisitor<'a, Ctx> {
-    names: FxHashSet<&'a ast::name::Name>,
+    names: Option<FxHashSet<&'a ast::name::Name>>,
     ctx: &'a Ctx,
 }

@@ -1810,15 +1810,19 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
                 let mut previous_names: Option<FxHashSet<&ast::name::Name>> = None;
                 for pattern in patterns {
                     let mut visitor = Self {
-                        names: FxHashSet::default(),
+                        names: None,
                         ctx: self.ctx,
                     };
                     visitor.visit_pattern(pattern);
                     let Some(prev) = &previous_names else {
-                        previous_names = Some(visitor.names);
+                        previous_names = visitor.names;
                         continue;
                     };
-                    if prev.symmetric_difference(&visitor.names).next().is_some() {
+                    if visitor
+                        .names
+                        .as_ref()
+                        .is_some_and(|names| prev.symmetric_difference(&names).next().is_some())
+                    {
                         // test_err different_match_pattern_bindings
                         // match x:
                         //     case [a] | [b]: ...
@@ -1857,7 +1861,13 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
                         );
                         break;
                     }
-                    self.names = visitor.names;
+                    if let Some(names) = &self.names
+                        && let Some(other) = visitor.names
+                    {
+                        self.names = Some(names.intersection(&other).cloned().collect());
+                    } else {
+                        self.names = visitor.names;
+                    }
                 }
             }
         }
@@ -1866,7 +1876,12 @@ impl<'a, Ctx: SemanticSyntaxContext> MatchPatternVisitor<'a, Ctx> {
     /// Add an identifier to the set of visited names in `self` and emit a [`SemanticSyntaxError`]
     /// if `ident` has already been seen.
     fn insert(&mut self, ident: &'a ast::Identifier) {
-        if !self.names.insert(&ident.id) {
+        if self.names.is_none() {
+            self.names = Some(FxHashSet::default());
+        }
+        let names = self.names.as_mut().unwrap();
+
+        if !names.insert(&ident.id) {
             SemanticSyntaxChecker::add_error(
                 self.ctx,
                 SemanticSyntaxErrorKind::MultipleCaseAssignment(ident.id.clone()),
That's not what I meant, I think?
I'd create an Option<FxHashSet> right outside the for pattern in patterns loop rather than globally in the visitor.
But I also think that what we have is fine.
We have one of those too:
ruff/crates/ruff_python_parser/src/semantic_errors.rs
Lines 1809 to 1811 in d38a529
    let mut previous_names: Option<FxHashSet<&ast::name::Name>> = None;
    for pattern in patterns {
We need an outer one in the visitor (or at least somewhere else) for recursive cases.
I'll go ahead and land this for now then since it should resolve the bug!
Summary
Fixes #21360 by using the union of names instead of overwriting them, as Micha suggested originally on #21104. This avoids overwriting the `n` name in the `Subscript` by the empty set of names visited in the nested OR pattern before visiting the other arm of the outer OR pattern.
Test Plan
A new inline test case taken from the issue
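The difference between overwriting and taking the union can be sketched with plain sets. This assumes an arm that has already bound `n` and then contains a nested OR pattern binding nothing. `std::collections::HashSet` stands in for `FxHashSet`, and `Names`, `bindings_differ`, `overwrite`, and `union` are hypothetical names for illustration, not ruff's code.

```rust
use std::collections::HashSet;

type Names = HashSet<&'static str>;

/// The check used to compare two arms of an OR pattern.
fn bindings_differ(a: &Names, b: &Names) -> bool {
    a.symmetric_difference(b).next().is_some()
}

/// Names recorded for an arm after visiting a nested OR pattern inside it,
/// when the nested names overwrite whatever the arm had already bound.
fn overwrite(_arm: &Names, nested: &Names) -> Names {
    nested.clone()
}

/// The same, but merging the nested names into the arm's existing names.
fn union(arm: &Names, nested: &Names) -> Names {
    arm.union(nested).copied().collect()
}
```

With an arm binding `n`, an empty nested set, and a second arm that also binds `n`, `overwrite` leaves the first arm looking empty, so `bindings_differ` reports a mismatch. `union` keeps `n`, and the two arms compare equal.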
Summary
Fixes #21101 by storing the child visitor's names in the parent visitor. This makes sure that visitor.names on line 1818 isn't empty after we visit a nested OR pattern.
Test Plan
New inline test cases derived from the issue, playground