Merged
133 changes: 109 additions & 24 deletions README.md
@@ -17,10 +17,10 @@ design and specifications of [Black][black].
> `--diff` or `--check` options. See [Usage](#usage) for more details.

> [!IMPORTANT]
> **Recent Changes:**
> 1. **Rule and module directives are now sorted by default:** `snakefmt` will automatically sort the order of directives inside rules (e.g. `input`, `output`, `shell`) and modules into a consistent order. You can opt out of this by using the `--no-sort` CLI flag.
> 2. **Black upgraded to v26:** The underlying `black` formatter has been upgraded to v26. You will see changes in how implicitly concatenated strings are wrapped (they are now collapsed onto a single line if they fit within the line limit) and other minor adjustments compared to previous versions.
>
> **Example of expected differences:**
> ```python
> # Before (Snakefmt older versions)
@@ -33,7 +33,7 @@ design and specifications of [Black][black].
> "b.txt",
> input:
> "a.txt",
>
> # After (Directives sorted, strings collapsed by Black 26)
> rule example:
> input:
@@ -56,13 +56,16 @@ design and specifications of [Black][black].
- [Usage](#usage)
- [Basic Usage](#basic-usage)
- [Full Usage](#full-usage)
- [Configuration](#configuration)
- [Directive Sorting](#directive-sorting)
- [Format Directives](#format-directives)
- [Configuration](#configuration)
- [Integration](#integration)
- [Editor Integration](#editor-integration)
- [Version Control Integration](#version-control-integration)
- [Github Actions](#github-actions)
- [Editor Integration](#editor-integration)
- [Version Control Integration](#version-control-integration)
- [GitHub Actions](#github-actions)
- [Plug Us](#plug-us)
- [Markdown](#markdown)
- [ReStructuredText](#restructuredtext)
- [Changes](#changes)
- [Contributing](#contributing)
- [Cite](#cite)
@@ -280,20 +283,6 @@ Options:
-v, --verbose Turns on debug-level logger.
```

## Configuration

`snakefmt` is able to read project-specific default values for its command line options
from a `pyproject.toml` file. In addition, it will also load any [`black`
configurations][black-config] you have in the same file.

By default, `snakefmt` will search in the parent directories of the formatted file(s)
for a file called `pyproject.toml` and use any configuration there.
If your configuration file is located somewhere else or called something different,
specify it using `--config`.

Any options you pass on the command line will take precedence over default values in the
configuration file.

### Directive Sorting

By default, `snakefmt` sorts rule and module directives (like `input`, `output`, `shell`, etc.) into a consistent order. This makes rules easier to read and allows for quicker cross-referencing between inputs, outputs, and the resources used by the execution command.
@@ -313,9 +302,104 @@ This ordering ensures that the directives most frequently used in execution bloc

You can disable this feature using the `--no-sort` flag.
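
As a rough illustration of what sorting by a fixed specification means, here is a minimal sketch. It is not `snakefmt`'s implementation: the function name `sort_directives` and the three-entry `SPEC` list are assumptions made for the example, and the real ordering is the one documented above.

```python
def sort_directives(blocks: dict[str, str], spec: list[str]) -> str:
    """Emit already-formatted directive blocks in the order given by `spec`.

    Hypothetical helper for illustration only.
    """
    pending = dict(blocks)  # copy so the caller's mapping is left untouched
    # Directives missing from the rule are skipped (pop returns "").
    ordered = "".join(pending.pop(name, "") for name in spec)
    if pending:
        # Anything left over is a directive the spec does not know about.
        raise ValueError("directives not in spec: " + ", ".join(pending))
    return ordered


# Illustrative order only; not snakefmt's full directive list.
SPEC = ["input", "output", "shell"]

blocks = {
    "output": '    output:\n        "b.txt",\n',
    "input": '    input:\n        "a.txt",\n',
}
print(sort_directives(blocks, SPEC), end="")
```

With `--no-sort`, the equivalent of this step is skipped and the directives stay in the order they were written.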

### Format Directives

`snakefmt` supports comment directives to control formatting behaviour for specific regions of code.
Directives must appear as standalone comment lines; an inline occurrence (e.g. `input: # fmt: off`) is treated as a plain comment and has no effect.
All directives are scope-local: only the region they select is affected, while code before and after follows normal `snakefmt` formatting and spacing rules (equivalent to replacing the directive with a plain comment line).
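
To make the standalone-versus-inline distinction concrete, the sketch below recognises a directive only when it occupies a whole line. The regular expression and the `parse_directive` helper are assumptions made for this example, not the pattern `snakefmt` itself uses.

```python
import re

# Hypothetical pattern: a whole-line comment such as "# fmt: off",
# "# fmt: on", "# fmt: off[sort]" or "# fmt: off[next]".
DIRECTIVE = re.compile(r"^\s*#\s*fmt:\s*(on|off)(?:\[(\w+)\])?\s*$")


def parse_directive(line: str):
    """Return (switch, modifier) for a standalone directive line, else None."""
    match = DIRECTIVE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_directive("# fmt: off"))            # ('off', None)
print(parse_directive("    # fmt: off[sort]"))  # ('off', 'sort')
print(parse_directive("input:  # fmt: off"))    # None: inline, so ignored
```

Because the pattern is anchored at the start of the line, a directive that trails other code never matches, which mirrors the "plain comment" behaviour described above.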

#### `# fmt: off` / `# fmt: on`

Disables all formatting for the region between the two directives.
Both directives *must* appear at the same indentation level; a `# fmt: on` at a deeper indent than the matching `# fmt: off` has no effect.

```python
rule a:
    input:
        "a.txt",


# fmt: off
rule b:
    input: "b.txt"
    output:
        "c.txt"
# fmt: on


rule c:
    input:
        "d.txt",
```

> **Note:** inside `run:` blocks and other Python contexts, `# fmt: off` / `# fmt: on` is passed through to [Black][black], which handles it natively.

#### `# fmt: off[sort]`

Disables directive sorting for the enclosed region while still applying all other formatting.
Directives between `# fmt: off[sort]` and `# fmt: on[sort]` are kept in their original order.
A plain `# fmt: on` also closes a `# fmt: off[sort]` region.

```python
# fmt: off[sort]
rule keep_my_order:
    output:
        "result.txt",
    input:
        "source.txt",
    shell:
        "cp {input} {output}"
# fmt: on[sort]
```

#### `# fmt: off[next]`

Disables formatting for the single next Snakemake keyword block (e.g. `rule`, `checkpoint`, `use rule`).
Only that block is left unformatted; all subsequent blocks are formatted normally.

```python
rule formatted:
    input:
        "a.txt",
    output:
        "b.txt",


# fmt: off[next]
rule unformatted:
    input: "a.txt"
    output: "b.txt"


rule also_formatted:
    input:
        "a.txt",
```

#### `# fmt: skip`

`# fmt: skip` preserves a single line exactly as written, without any formatting (see [Black's documentation][black-skip] for details).

> **Note:** `# fmt: skip` is not yet supported within Snakemake rule blocks.
> It currently applies only to plain Python lines outside of rules, checkpoints, and similar Snakemake constructs.

### Configuration

`snakefmt` is able to read project-specific default values for its command line options
from a `pyproject.toml` file. In addition, it will also load any [`black`
configurations][black-config] you have in the same file.

By default, `snakefmt` will search in the parent directories of the formatted file(s)
for a file called `pyproject.toml` and use any configuration there.
If your configuration file is located somewhere else or called something different,
specify it using `--config`.

Any options you pass on the command line will take precedence over default values in the
configuration file.

#### Example

`pyproject.toml`
[`pyproject.toml`][pyproject]

```toml
[tool.snakefmt]
@@ -415,13 +499,13 @@ in your project.

[![Code style: snakefmt](https://img.shields.io/badge/code%20style-snakefmt-000000.svg)](https://github.com/snakemake/snakefmt)

#### Markdown
### Markdown

```md
[![Code style: snakefmt](https://img.shields.io/badge/code%20style-snakefmt-000000.svg)](https://github.com/snakemake/snakefmt)
```

#### ReStructuredText
### ReStructuredText

```rst
.. image:: https://img.shields.io/badge/code%20style-snakefmt-000000.svg
@@ -459,6 +543,7 @@ See [CONTRIBUTING.md][contributing].
[snakemake]: https://snakemake.readthedocs.io/
[black]: https://black.readthedocs.io/en/stable/
[black-config]: https://github.com/psf/black#pyprojecttoml
[black-skip]: https://black.readthedocs.io/en/stable/usage_and_configuration/the_basics.html#ignoring-sections
[pyproject]: https://github.com/snakemake/snakefmt/blob/master/pyproject.toml
[contributing]: CONTRIBUTING.md
[changes]: CHANGELOG.md
120 changes: 106 additions & 14 deletions snakefmt/formatter.py
@@ -65,7 +65,6 @@ def __init__(
self.result: str = ""
self.lagging_comments: str = ""
self.no_formatting_yet: bool = True
self.sort_directives = sort_directives
self.previous_result: str = ""
self.keyword_spec: list[str] = []
self.keywords: dict[str, str] = {} # cache to sort
@@ -75,7 +74,7 @@ def __init__(
if line_length is not None:
self.black_mode.line_length = line_length

super().__init__(snakefile) # Call to parse snakefile
super().__init__(snakefile, sort_directives=sort_directives)

def get_formatted(self) -> str:
return self.result
@@ -90,10 +89,13 @@ def flush_buffer(
from_python: bool = False,
final_flush: bool = False,
in_global_context: bool = False,
exiting_keywords: bool = False,
) -> None:
if len(self.buffer) == 0 or self.buffer.isspace():
self.result += self.buffer
self.buffer = ""
if exiting_keywords and self.no_formatting_yet and self.result.rstrip("\n"):
self.no_formatting_yet = False
return

if not from_python:
@@ -103,6 +105,9 @@ def flush_buffer(
else:
# Invalid python syntax, eg lone 'else:' between two rules, can occur.
# Below constructs valid code statements and formats them.
if self.fmt_off_expected_indent:
self.buffer += self.fmt_off_expected_indent
self.fmt_off_expected_indent = ""
re_match = contextual_matcher.match(self.buffer)
if re_match is not None:
callback_keyword = re_match.group(2)
@@ -119,11 +124,13 @@ def flush_buffer(
)
formatted = self.run_black_format_str(to_format, self.block_indent)
re_rematch = contextual_matcher.match(formatted)
if re_rematch is None:
raise ValueError(
"contextual_matcher failed to match for the given "
f"formatted string: {formatted}"
)
assert re_rematch, (
"This should always match as we just formatted it with the same "
"regex. If this error is raised, it's a bug in snakefmt's "
"handling of snakemake syntax. Please report this to the "
"developers with the code so we can fix it: "
"https://github.com/snakemake/snakefmt/issues"
)
if condition != "":
callback_keyword += re_rematch.group(3)
formatted = (
@@ -174,7 +181,7 @@ def process_keyword_param(
context=param_context,
)
param_formatted = self.format_params(param_context)
if self.sort_directives and not in_global_context and self.keyword_spec:
if self.sort_off_indent is None and not in_global_context and self.keyword_spec:
self.keywords[param_context.keyword_name] = self.result + param_formatted
self.result = ""
else:
@@ -188,13 +195,95 @@ def post_process_keyword(self):
for keyword in self.keyword_spec:
res = self.keywords.pop(keyword, "")
self.previous_result += res
if self.keywords:
raise InvalidParameterSyntax(
"Unexpected keywords when sorted keywords: "
+ (", ".join(self.keywords))
)
assert not self.keywords, (
"All directives should have been consumed; "
"if not, this is a bug in snakefmt's handling of snakemake syntax. "
"It must be the coder's fault, not the user's. "
"So please report this to the developers with the code so we can fix it: "
"https://github.com/snakemake/snakefmt/issues"
)
self.result = self.previous_result + self.result
self.previous_result = ""
# Keep no_formatting_yet when there is pending buffered content.
# This prevents premature separator insertion after fmt: off/on
# verbatim regions before the next flush occurs.
if self.no_formatting_yet and self.result.rstrip("\n") and not self.buffer:
self.no_formatting_yet = False

def flush_fmt_off_region(self, verbatim: str):
"""Blank-line rules:

applied before the verbatim block:
- At global indent (fmt_off[0] == 0) and result not empty:
result should end with exactly 2 blank lines (``\\n\\n\\n``)
(standard separation between top-level constructs).
- When the preceding Python code had a blank line before ``# fmt: off``
(``fmt_off_preceded_by_blank_line``):
result should end with >= 1 blank line.
- ``# fmt: off[next]`` nested inside a Python block:
another ``\\n`` is prepended to any lagging comment
so the following keyword gets its normal blank-line separator.

applied after the verbatim block:
- ``# fmt: off[next]``: sets ``no_formatting_yet := False``,
so the next formatted block gets its normal blank-line separator.
- Plain ``# fmt: off`` regions: sets ``no_formatting_yet := True``,
suppressing blank-line insertion in the next ``add_newlines`` call.
"""

if self.no_formatting_yet:
self.result = self.result.lstrip("\n")
self.result += self.buffer
self.buffer = ""
if self.fmt_off:
if self.fmt_off[0] == 0 and not self.no_formatting_yet:
while not self.result.endswith("\n\n\n"):
self.result += "\n"
# When fmt:off[next] is inside a Python block (e.g. `if 1:`), the
# directive ends up as a lagging_comment after flushing that block.
is_nested_next = self.fmt_off[1] == "next"
else:
is_nested_next = False
if self.lagging_comments:
# For nested fmt:off[next], add the same \n separator that
# process_keyword_context/add_newlines would normally provide
# before the first keyword inside the Python block.
if is_nested_next and not self.no_formatting_yet:
self.result += "\n"
self.result += self.lagging_comments
self.lagging_comments = ""
self.no_formatting_yet = not is_nested_next
if self.fmt_off_preceded_by_blank_line:
if self.result and not self.result.endswith("\n\n"):
self.result += "\n"
self.fmt_off_preceded_by_blank_line = False
self.result += verbatim
self.last_recognised_keyword = ""

def flush_sort_signal(self, verbatim):
"""
If a "fmt: on[sort]" directive appears inside a keyword block, e.g.:

rule:
directive1: ...
# fmt: off[sort]
directive2: ...
# fmt: on[sort] <-
# other comments
directive3: ...

the "other comments" should be kept with directive3.
This function is called when "fmt: on[sort]" is reached,
and it flushes the pending comments into self.result.
"""
if self.keywords:
pending = ""
for keyword in self.keyword_spec:
pending += self.keywords.pop(keyword, "")
self.previous_result += pending
self.previous_result += self.result + verbatim
self.result = ""
self.last_recognised_keyword = ""

def run_black_format_str(
self,
@@ -216,7 +305,6 @@ def run_black_format_str(
and len(string.strip().splitlines()) > 1
and not no_nesting
)

if artificial_nest:
string = f"if x:\n{textwrap.indent(string, TAB)}"

@@ -473,6 +561,10 @@ def add_newlines(
if comment_matches > 0:
self.lagging_comments = "\n".join(all_lines[comment_break:]) + "\n"
if final_flush:
# Preserve one intentional blank line before trailing
# comments at EOF (e.g. indented # fmt-like comments).
if comment_break > 0 and all_lines[comment_break - 1] == "":
self.result += "\n"
self.result += self.lagging_comments
else:
self.result += formatted_string