Skip to content

Conversation

@danparizher
Copy link
Contributor

Summary

Fixes #18959

@github-actions
Copy link
Contributor

github-actions bot commented Sep 23, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+4 -8 violations, +0 -0 fixes in 2 projects; 53 projects unchanged)

apache/superset (+0 -4 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview --select ALL

- superset/utils/date_parser.py:173:5: DOC502 Raised exceptions are not explicitly raised: `Relation to `time_range_lookup``, `- Example`
- superset/utils/date_parser.py:203:5: DOC502 Raised exceptions are not explicitly raised: `Relation to `time_range_lookup``, `- Example`
- superset/utils/date_parser.py:234:5: DOC502 Raised exceptions are not explicitly raised: `Relation to `time_range_lookup``, `- Example`
- superset/utils/date_parser.py:279:5: DOC502 Raised exceptions are not explicitly raised: `Relation to `time_range_lookup``, `- Example`

langchain-ai/langchain (+4 -4 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --no-fix --output-format concise --preview

+ libs/langchain/langchain_classic/chat_models/base.py:79:5: DOC502 Raised exceptions are not explicitly raised: `ValueError`, `ImportError`
- libs/langchain/langchain_classic/chat_models/base.py:79:5: DOC502 Raised exceptions are not explicitly raised: `ValueError`, `ImportError`, `o3_mini = init_chat_model("openai`, `claude_sonnet = init_chat_model("anthropic`, `"google_vertexai`, `"what's your name", config={"configurable"`, `config={"configurable"`, `"openai`, `"configurable"`, `"foo_model"`, `"foo_temperature"`, `same way that you would with a normal model`, `class GetWeather(BaseModel)`, `location`, `class GetPopulation(BaseModel)`, `location`, `"Which city is hotter today and which is bigger`, `"Which city is hotter today and which is bigger`, `config={"configurable"`
+ libs/langchain/langchain_classic/embeddings/base.py:133:5: DOC502 Raised exception is not explicitly raised: `ImportError`
- libs/langchain/langchain_classic/embeddings/base.py:133:5: DOC502 Raised exceptions are not explicitly raised: `ImportError`, `model = init_embeddings("openai`, `model = init_embeddings("openai`
+ libs/langchain_v1/langchain/chat_models/base.py:67:5: DOC502 Raised exceptions are not explicitly raised: `ValueError`, `ImportError`
- libs/langchain_v1/langchain/chat_models/base.py:67:5: DOC502 Raised exceptions are not explicitly raised: `ValueError`, `ImportError`, `o3_mini = init_chat_model("openai`, `claude_sonnet = init_chat_model("anthropic`, `gemini_2-5_flash = init_chat_model("google_vertexai`, `configurable_model.invoke("what's your name", config={"configurable"`, `config={"configurable"`, `"openai`, `"configurable"`, `"foo_model"`, `"foo_temperature"`, `same way that you would with a normal model`, `class GetWeather(BaseModel)`, `location`, `class GetPopulation(BaseModel)`, `location`, `"Which city is hotter today and which is bigger`, `"Which city is hotter today and which is bigger`, `config={"configurable"`
+ libs/langchain_v1/langchain/embeddings/base.py:129:5: DOC502 Raised exception is not explicitly raised: `ImportError`
- libs/langchain_v1/langchain/embeddings/base.py:129:5: DOC502 Raised exceptions are not explicitly raised: `ImportError`, `model = init_embeddings("openai`, `model = init_embeddings("openai`

Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
DOC502 12 4 8 0 0

@danparizher
Copy link
Contributor Author

Note on ruff-ecosystem results: a net decrease in violations overall.

  • Instances where a “removal/addition” pair appears at the same location.
    • Those paired changes indicate message normalization rather than behavioral regression—the false positives are being eliminated where Sphinx directives follow a Raises section
  • Remaining diagnostics have cleaner messages without stray directive/markup fragments.

Net effect is reduced noise and improved precision

@MichaReiser MichaReiser requested a review from ntBre October 12, 2025 06:31
@MichaReiser MichaReiser added the rule Implementing or modifying a lint rule label Oct 12, 2025
Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I had an idea for a possibly simpler implementation more similar to the numpy version. Let me know what you think. We should keep a close eye on the ecosystem results to make sure they are still improved.

This new result actually looks incorrect to me:

Maybe my suggestion will help with that too.

};
let entry = potential[..colon_idx].trim();
entries.push(QualifiedName::user_defined(entry));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about something more like this for this function?

fn parse_entries_google(content: &str) -> Vec<QualifiedName<'_>> {
    let mut entries: Vec<QualifiedName> = Vec::new();
    let mut lines = content.lines().peekable();
    let Some(first) = lines.peek() else {
        return entries;
    };
    let indentation = &first[..first.len() - first.trim_start().len()];
    for potential in lines {
        if let Some(entry) = potential.strip_prefix(indentation) {
            if let Some(first_char) = entry.chars().next() {
                if !first_char.is_whitespace() {
                    if let Some(colon_idx) = entry.find(':') {
                        let entry = entry[..colon_idx].trim();
                        if !entry.is_empty() {
                            entries.push(QualifiedName::user_defined(entry));
                        }
                    }
                }
            }
        }
    }
    entries
}

This is closer to the parse_entries_numpy down below and seems to work on the new test case when I tried it locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is significantly better, thank you! We will have to see what the ecosystem results will look like.

@ntBre ntBre added docstring Related to docstring linting or formatting preview Related to preview mode features labels Oct 13, 2025
@danparizher danparizher requested a review from ntBre October 15, 2025 01:23
@ntBre
Copy link
Contributor

ntBre commented Oct 15, 2025

Could you take a look at the ecosystem results? It looks like there are still some issues. I think we may need a combination of the two approaches we've tried, especially breaking early if we find a dedented line like you had initially. We should extract some test cases from the erroneous ecosystem results too.

@danparizher
Copy link
Contributor Author

Just to clarify, for the "dedented line" breaking logic, should it:

  1. Break immediately when ANY line has less indentation than the first line
  2. Break only when a line has NO indentation at all
  3. Break when a line is dedented to the same level as the section header

@ntBre
Copy link
Contributor

ntBre commented Oct 15, 2025

I think I was picturing option 1, possibly excluding completely blank lines, assuming those are allowed. I'm looking at

for example. The - Example point under a different heading is being flagged, so I assumed breaking when the indentation decreases would help:

    Returns:
        str: The resulting expression for the start of the specified unit.

    Raises:
        ValueError: If the unit is not one of the valid options.

    Relation to `time_range_lookup`:
        - Handles the "start of" or "beginning of" modifiers in the first regex pattern.
        - Example: "start of this month"`DATETRUNC(DATETIME('today'), month)`.
    """

But that's just an idea.

Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This looks great and shows a very clear improvement in the ecosystem check.

I'm tempted to go ahead and land this, but I think this could also be a problem for numpy-style docstrings (example). These seem a little trickier since the exceptions are already dedented to the same level as the sphinx directive, assuming I followed the docs correctly. We may just have to check for a leading .. in that case.

Do you want to try fixing that here too, or would it be better as a separate PR?

@danparizher
Copy link
Contributor Author

I'll try to fix that here too, thanks!

Updates the docstring parser to correctly terminate the numpy-style Raises section when encountering Sphinx directives (lines starting with '..'). Adds a regression test for this behavior to address issue astral-sh#18959.
Copy link
Contributor

@ntBre ntBre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again!

I just pushed one commit moving the .. check into the indentation-stripped branch. I don't think this is likely to be common, but I could at least imagine a case like the new test I added where a Sphinx directive was used in the description of an exception, so the outer break would have been too eager.

@ntBre ntBre enabled auto-merge (squash) November 12, 2025 21:36
@ntBre ntBre merged commit a6abd65 into astral-sh:main Nov 12, 2025
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docstring Related to docstring linting or formatting preview Related to preview mode features rule Implementing or modifying a lint rule

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DOC502: False positive if following text contains a colon

3 participants