Skip to content

Update Claude PR review instructions based on 6 months of review analysis#2998

Open
bcotrim wants to merge 3 commits intotrunkfrom
analyze-pr-review-feedback
Open

Update Claude PR review instructions based on 6 months of review analysis#2998
bcotrim wants to merge 3 commits intotrunkfrom
analyze-pr-review-feedback

Conversation

@bcotrim
Copy link
Copy Markdown
Contributor

@bcotrim bcotrim commented Apr 7, 2026

Related issues

  • Related to improving AI-assisted code review quality

How AI was used in this PR

Claude analyzed ~400 merged PRs from the last 6 months (Jul 2025 – Apr 2026), extracted inline review comments from ~65 PRs with the most review activity, and categorized the feedback patterns. It also explored the codebase across 5 dimensions (ESLint/linting, code patterns, docs/contribution guides, CLI/shared code, CI workflows). The final review instructions were written by Claude and iterated with human review.

Proposed Changes

Updates the @claude PR review workflow prompt from 6 generic categories to 14 data-driven review priorities, ordered by real-world frequency.

What changed:

  • 6 → 14 review categories, ordered by how often each type of feedback actually appears in Studio PRs
  • New sections: Scope & Intent, Simplicity, AI-Generated Code, Naming & Clarity, Type Safety & Schemas, Data Safety/File Locking, i18n, Architecture & Separation
  • Upgraded sections: Security now includes Zod validation and dependency review; Cross-Platform now includes socket paths and E2E platform info; Testing now references Vitest (not Jest) and co-located test patterns
  • Fixed inaccuracies: Removed references to docs/ai-instructions.md (now AGENTS.md), Jest → Vitest, removed "database query efficiency" (Studio uses JSON files, not a DB)
  • No positive reinforcement: AI reviewer won't pollute PRs with praise comments

What didn't change:

  • Workflow triggers, permissions, and claude_args remain the same
  • The @claude invocation pattern is unchanged

Testing Instructions

  1. Open a PR with some code changes
  2. Comment @claude to trigger the review
  3. Verify the review covers the new categories (simplicity, AI artifacts, naming, data safety, etc.)
  4. Verify no positive reinforcement comments are added

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

Research: PR Review Feedback Analysis (Jul 2025 – Apr 2026)

Analyzed ~400 merged PRs, ~130 had inline review comments. Deep-dived into ~65 PRs with the most review activity.

Top Review Priorities (by frequency)

# Category Frequency Key Principle
1 Simplification & Reducing Complexity Very High Write the minimum code needed
2 Naming & Clarity Very High Names should be specific and self-documenting
3 Dead Code & AI Artifacts High Remove unused code; scrutinize AI output
4 Error Handling & Logging High Proportional to risk; never silently swallow
5 Type Safety & Schemas High Strict types, no duplicated schemas
6 Architecture & Separation High Code lives where it belongs
7 Reuse Existing Code High Check before writing new utils
8 i18n Medium All user-facing strings translatable with sprintf
9 Performance / React Medium Memoize, cleanup effects, avoid re-renders
10 Visual Consistency Medium Light/dark mode, RTL, interaction states
11 PR Scope Medium Keep focused, split follow-ups
12 Data Safety / Locking Medium Always lock appdata
13 Testing Quality Medium Value > coverage; mock at right level
14 Security Low-Medium Address automated findings
15 Cross-Platform Low-Medium Guard platform-specific logic

1. Code Simplification & Reducing Complexity (Most Common — ~25+ instances)

The single most frequent feedback category. Reviewers consistently push back on unnecessary complexity, extra abstractions, and over-engineering.

Examples:

  • "We copy-pasted the whole method just to add this comment. Can we implement this with a boolean flag instead?" (PR#1596)
  • "Could we group or simplify the logic for useAddSite? Destructuring all the elements and passing them as props makes the code a bit difficult to follow." (PR#1630)
  • "It seems like we're adding overhead here without super clear gains." (PR#2807)
  • "The performance drawbacks of barrel files don't come into play much in this context, but I would still argue that we shouldn't use them." (PR#2807)
  • We should be able to achieve the same behavior using the existing sequential util (PR#2850)

2. Naming & Clarity (Very Common — ~20+ instances)

  • "I think it adds clarity with a more verbose name here: removeSiteFromConfig" (PR#2731)
  • "I'd include 'auth' in the name somehow: StoredAuthToken" (PR#2821)
  • "NIT - I would keep the name more specific instead of a generic name: DEFAULT_SKILLS_REPO_BRANCH" (PR#2729)
  • "Since this function is used outside, I propose to be more specific: filterUnsupportedBlueprintFeatures" (PR#1810)
  • "What does it mean that we are 'stopping' the UI? What would be a clearer name for this operation?" (PR#2775)

3. Removing Dead Code, Leftovers & AI Artifacts (Common — ~15+ instances)

  • "Looks like AI nonsense" (PR#2836)
  • "This looks like an AI leftover. Let's remove it." (PR#2924)
  • "I hate the AI's added try/catches so much, it's too defensive" (PR#2775)
  • "We don't use it" / "We don't use this code" — repeated across many PRs
  • "Not sure why this color change was made..." (PR#2785)

4. Error Handling & Logging (Common — ~12+ instances)

  • "The catch {} swallows all errors. Consider only ignoring ENOENT and logging/propagating other errors." (PR#2924)
  • "Promise.allSettled never rejects so any potential errors will go unnoticed. Consider some logging." (PR#2862)
  • "I hate the AI's added try/catches so much, it's too defensive" (PR#2775)

5. Type Safety & Schema Design (Common — ~12+ instances)

  • "Should we be more intentional about the status + pid options? We could use a union type." (PR#2712)
  • "The zod schema parsing should fail unless the version field matches what we expect. Use z.literal instead." (PR#2821)
  • "We should have a single definition for that [schema]." (PR#2821)

6–15. Additional categories

Architecture & Separation (~10+), Reuse Existing Code (~10+), i18n (~8+), Performance/React (~8+), UX/Visual Consistency (~8+), PR Scope (~7+), Testing (~7+), Data Safety/Locking (~5+), Security (~10+ automated), Cross-Platform (~5+).

Key Reviewers & Their Focus Areas

Reviewer Primary Focus
fredrikekelund Architecture, schema design, zod patterns, error recovery, scope
youknowriad Simplification, over-engineering pushback, AI code quality, naming
nightnei Dead code removal, naming specificity, unused exports, code clarity
sejas i18n, type safety, component reuse, error logging, UX
bcotrim Type unions, existing util reuse, cross-platform, feature flags
wojtekn Platform-specific bugs, upstream fixes, data integrity, release notes
epeicher Error logging, React performance (useMemo), readability
katinthehatsite i18n decisions, dark mode consistency, UI testing

Research: Codebase Patterns & Best Practices Analysis

Compiled from 5 parallel deep-dives into ESLint/linting, code patterns, docs/contribution guides, CLI/shared code, and CI workflows.

Enforced Rules (Automated)

Rule Severity Purpose
@typescript-eslint/no-floating-promises Error Must handle all Promises
@typescript-eslint/no-explicit-any Error No any types (except rest args)
@typescript-eslint/no-unused-vars Warn Unused vars must be prefixed _ or removed
import-x/order Error Import grouping and alphabetization
studio/require-lock-before-save Error Custom rule — save ops must be wrapped in lock/try/finally/unlock
no-restricted-globals (CLI) Error Forbids __dirname/__filename in ESM
prettier/prettier Error Code must pass Prettier formatting

CI Quality Gates

Gate Platform Blocks Merge
Lint macOS Yes
Unit Tests macOS + Windows Yes
E2E Tests macOS-arm64 + Windows-x64 Yes (skipped on drafts)
Performance Metrics macOS No (info only)

Key Architecture Patterns

  • Electron 3-process: Main → Preload → Renderer
  • IPC handlers: async, return Promises, names in constants.ts
  • Module structure: modules/<feature>/ with components/, hooks/, lib/
  • File locking: lock/try/finally/unlock for all config saves (ESLint enforced)
  • Zod schemas: z.literal() for versions, safeParse(), shared in tools/common/
  • Testing: Vitest + React Testing Library (unit), Playwright (E2E)
  • i18n: __() + sprintf(), 22 locales, never string concatenation

Critical Pitfalls

Pitfall Severity
Editing WordPress core files CRITICAL — PHP WASM, mods don't persist
Missing file locks on appdata CRITICAL — ESLint enforced
Running CLI without building IMPORTANT
Main process hot reload limits IMPORTANT — new IPC handlers need full restart
Synchronous IPC handlers ERROR — must return Promises
Direct Node.js in renderer ERROR — must use IPC

Final Review Instructions (full version)

Review Priorities

Ordered by real-world frequency — the first categories catch the most issues.

1. Scope & Intent

  • Does the PR do what it says? Flag unrelated changes, accidental file inclusions (e.g., .vscode settings, unintended lockfile diffs).
  • Is the scope right-sized? Suggest splitting if the PR crosses multiple architectural boundaries without justification.
  • Are follow-ups tracked? If the PR author deferred something, check that a Linear issue exists.

2. Simplicity

The single most common review feedback in Studio. Push back on:

  • Unnecessary abstractions — helpers, utilities, or wrappers for one-time operations. Three similar lines > a premature abstraction.
  • Over-engineering — extra configurability, barrel files, generic solutions when a specific one suffices.
  • Existing utilities ignored — check if tools/common/lib/ already has what's being written.
  • Over-defensive code — excessive try/catch blocks, especially AI-generated ones.

3. AI-Generated Code

  • Dead code / leftovers — empty catch blocks, unused imports, commented-out code, redundant type assertions.
  • Unnecessary defensiveness — try/catch wrapping code that can't throw, null checks on values guaranteed by types.
  • Unexplained changes — color changes, dependency reordering, or refactors with no clear purpose.
  • Redundant comments that describe what code does rather than why.

4. Naming & Clarity

  • Prefer removeSiteFromConfig over remove, filterUnsupportedBlueprintFeatures over filterFeatures.
  • Constants should include context: DEFAULT_SKILLS_REPO_BRANCH not DEFAULT_BRANCH.

5. Type Safety & Schemas

  • Prefer union types over optionals when data has clear states.
  • Zod schemas should use z.literal() for version fields.
  • Don't duplicate schemas — check tools/common/lib/.

6. Error Handling & Logging

  • Too little: empty catch blocks, Promise.allSettled without inspecting rejections, missing Sentry capture.
  • Too much: try/catch wrapping internal code that doesn't throw, defensive null checks on typed values.

7. Security

  • Electron: Node integration disabled, context isolation enabled, IPC sender validation.
  • CSP: no unsafe-eval/unsafe-inline without justification.
  • Input validation at system boundaries using Zod schemas.
  • No hardcoded secrets; credentials must not appear in logs.

8. Data Safety

  • File locking: save operations MUST be wrapped in lock/try/finally/unlock.
  • Atomic writes via the atomically package.
  • Studio app owns data migrations, not the standalone CLI.

9. Architecture & Separation

  • IPC handlers must be async, return Promises, names in constants.ts.
  • Renderer cannot access Node.js APIs directly.
  • Shared code belongs in tools/common/, not duplicated.

10. Cross-Platform Compatibility

  • path.join(), never hardcoded separators.
  • Platform guards for sockets, environment variables, file system differences.
  • Flag code needing testing on both platforms.

11. i18n

  • All user-facing strings: __() with sprintf(), never string concatenation.

12. React & Performance

  • useMemo/useCallback for objects passed as props/deps.
  • Effects need cleanup functions. Cancel strategies for async operations.
  • Bundle size: flag large new dependencies. Memory leaks: verify cleanup on unmount.

13. Visual & UX Consistency

  • Light and dark mode. RTL layout. Rapid user interaction handling.
  • Accessibility: watch for low color contrast, missing aria labels, and keyboard navigation issues.

14. Testing

  • Real value > coverage. Mock at the right level. Test error recovery paths.
  • Vitest for unit tests, Playwright for E2E.

Quick Checklist

Scope
[ ] PR does what the title/description says, nothing more
[ ] No accidental file inclusions or unrelated changes
[ ] AI usage is disclosed and output appears reviewed

Code Quality
[ ] No simpler way to achieve the same result
[ ] No dead code, unused imports, or AI leftovers
[ ] Names are specific and self-documenting
[ ] No duplicated schemas, types, or utilities (check tools/common/)

Safety & Security
[ ] File locking around all config save operations
[ ] IPC handlers are async and return Promises
[ ] Error handling is proportional
[ ] No hardcoded ports, paths, platform assumptions, or secrets

Cross-Platform
[ ] Paths use path.join(), not string concatenation
[ ] Platform-specific code has runtime guards

i18n & UX
[ ] User-facing strings use __() with sprintf()
[ ] UI works in light mode, dark mode, and RTL (if applicable)

Testing
[ ] Relevant tests exist and test meaningful behavior
[ ] CI passes (lint, typecheck, unit tests, E2E)

@bcotrim bcotrim self-assigned this Apr 7, 2026
@bcotrim bcotrim requested a review from a team April 7, 2026 13:59
Copy link
Copy Markdown
Contributor

@epeicher epeicher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach of using existing code review patterns, great idea @bcotrim! I have reviewed it and it looks good to me, as usual, this should be a live document that we should update as soon as we find some comments unnecessary, but I would say to give a try and iterate.

I would add a line asking for conciseness, for example, the following from Anthropic review-pr command

Instruct each to only provide noteworthy feedback. Once they finish, review the feedback and post only the feedback that you also deem noteworthy.


Provide feedback using inline comments for specific issues. <-- Already added 👍 
Use top-level comments for general observations or praise.. <-- Already added 👍 


Keep feedback concise.

LGTM!

13. **Visual & UX Consistency**
- Changes must look correct in both light and dark mode
- Check RTL layout if touching layout/positioning code
- Interactive states (loading, disabled, error) must handle rapid user interaction
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also add there something regarding accessibility e.g. watch for low color contast would be the most straightforward one

Copy link
Copy Markdown
Contributor

@katinthehatsite katinthehatsite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me, I added one small suggestion 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants