Skip to content

Add Yul AST comparator tool#16528

Open
clonker wants to merge 5 commits intodevelopfrom
yul_cmp_ast
Open

Add Yul AST comparator tool#16528
clonker wants to merge 5 commits intodevelopfrom
yul_cmp_ast

Conversation

@clonker
Copy link
Copy Markdown
Member

@clonker clonker commented Mar 17, 2026

Adds a Yul AST comparator tool yulASTComparator that structurally compares two Yul ASTs, treating variable and function names as equivalent if they correspond 1:1 via a scoped bidirectional map. This is useful for verifying that optimizer changes or internal renaming (e.g., switching name-generation schemes) preserve semantic equivalence. The tool reports the first point of divergence with a path and a reason.

@clonker clonker force-pushed the yul_cmp_ast branch 2 times, most recently from 05bedb8 to fc717f0 Compare March 17, 2026 13:08
@clonker clonker marked this pull request as ready for review March 17, 2026 13:13
Copy link
Copy Markdown
Contributor

@msooseth msooseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very cool! Maybe we could also write a printer that allows us to pretty-print the AST and then we can diff it? May be helpful in the long run to debug/understand differences?

I am stupid, that's yul.

@clonker clonker force-pushed the yul_cmp_ast branch 3 times, most recently from af8eb53 to 55bcd2b Compare March 19, 2026 10:02
@clonker clonker requested a review from cameel March 25, 2026 15:47
Copy link
Copy Markdown
Collaborator

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I only checked the overall structure. I will need another pass to actually check the implementation for correctness.

Generally the tool looks like a good idea. Would be even better if this was a proper diff that does not stop at the first difference, but that's much more work. For what we need this is already good enough.

TBH I don't like how long the Python script is, but it still looks useful. I'd just prefer it to be less fuzzy with that it does. I bet it has tons of false positives and false negatives in general usage outside of the AST ID PR.

I left some initial comments, but that's not a full review yet.

)
target_link_libraries(libYulASTComparator PUBLIC solidity)

add_executable(yulASTComparator main.cpp)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_executable(yulASTComparator main.cpp)
add_executable(yul-ast-comparator main.cpp)

Or maybe we should give the executable a shorter name. E.g. yulcmp oryuldiff?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to have a tool that spits out an actual diff in the end then i think i like yuldiff :) otherwise yulcmp seems most appropriate

return object;
}

int main(int argc, char* argv[])
Copy link
Copy Markdown
Collaborator

@cameel cameel Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All our executables should have standard top-level error handlers to catch failed asserts and unhandled exceptions from lower layers. See for example solc/main.cpp.

print(f" Skipped: {len(skipped)}")
print("=" * 50)

if mismatches:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truthy comparisons are evil.

Suggested change
if mismatches:
if len(mismatches) != 0:

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe fair fair, i wasn't sure we wanted to have this python script here at all, so i didn't put too much effort into it yet. since it seems like it is something we might want to have next to yuldiff (name tbd), i'll clean it up :)

equiv, msg = run_comparator(comparator, yul_a, yul_b)
if equiv:
continue
if "PARSE_ERROR" in msg or "TIMEOUT" in msg or "ERROR" in msg:
Copy link
Copy Markdown
Collaborator

@cameel cameel Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how fuzzy these comparisons are. Note how "PARSE_ERROR" in msg overlaps with "ERROR" in msg. If not for the fact that you handle both the same way, they'd not be even usable, because these two errors are not distinguishable this way. Why isn't the kind of error just a separate return variable?

Comment on lines +160 to +165
if "cmdlineTests" in path:
if path.endswith("output.json"):
return "json"
if path.endswith("/output") or path.endswith("/err"):
return "text"
return None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So e.g. input.json will not be classified as 'json'? Is that intentional?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks you have here are a bit too fuzzy for my taste.

You should at least use pathlib. For example your check will match somethingoutput.json while with pathlib a simple path.name == 'output.json' by default does the right thing. Same with matching the directory names - I'd not use in for those.

The fact that things that are not classified are just skipped is not great either.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So e.g. input.json will not be classified as 'json'? Is that intentional?

The naming is poor, what I wanted to match here was solely output.json produced from cmdline tests.

@clonker clonker force-pushed the yul_cmp_ast branch 3 times, most recently from 1d1c780 to e5dff4e Compare April 2, 2026 09:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants