perf(hybridcloud): Skip correlated subqueries in user.serialize_many#112500

Merged
scttcper merged 4 commits into master from scttcper/skip-extra-subqueries
Apr 9, 2026
@scttcper scttcper commented Apr 8, 2026

def base_query(self, select_related: bool = True) -> QuerySet[User]:
    if not select_related:
        return User.objects.all()
    return User.objects.extra(
        select={
            "permissions": "select array_agg(permission) from sentry_userpermission where user_id=auth_user.id",
            "roles": """
                SELECT array_agg(permissions)
                FROM sentry_userrole
                JOIN sentry_userrole_users
                  ON sentry_userrole_users.role_id=sentry_userrole.id
               WHERE user_id=auth_user.id""",
            "useremails": "select array_agg(row_to_json(sentry_useremail)) from sentry_useremail where user_id=auth_user.id",
            "authenticators": "SELECT array_agg(row_to_json(auth_authenticator)) FROM auth_authenticator WHERE user_id=auth_user.id",
            "useravatar": "SELECT array_agg(row_to_json(sentry_useravatar)) FROM sentry_useravatar WHERE user_id = auth_user.id",
        }
    )

base_query() adds 5 correlated subqueries via .extra() (permissions, roles, useremails, authenticators, useravatar) that are only consumed by serialize_rpc_user() in the get_many path. serialize_many goes through the API serializer which re-queries the same data in get_attrs(), so the extra subqueries were pure overhead - the data was fetched, attached to the User objects, and then ignored.
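
To make the "fetched, attached, and then ignored" point concrete, here is a minimal sketch in plain Python with no Django involved. Every name in it (`FakeUser`, `FakeDB`, `api_serialize`, `run`) is hypothetical, and the counter tracks abstract units of lookup work rather than real SQL statements. In the real code the correlated subqueries run inside the single `SELECT`, so treat this only as an illustration of the redundancy.

```python
from dataclasses import dataclass, field


@dataclass
class FakeUser:
    id: int
    # Stand-in for the columns that .extra() attaches to each User (assumption).
    eager: dict = field(default_factory=dict)


class FakeDB:
    """Toy data source that counts lookups so redundant work is visible."""

    def __init__(self) -> None:
        self.lookups = 0

    def emails_for(self, user_id: int) -> list:
        self.lookups += 1
        return [f"user{user_id}@example.com"]

    def base_query(self, ids, select_related: bool = True) -> list:
        users = [FakeUser(id=i) for i in ids]
        if select_related:
            # Eager path: pay for the related data up front.
            for user in users:
                user.eager["useremails"] = self.emails_for(user.id)
        return users


def api_serialize(db: FakeDB, users: list) -> list:
    # Models the described serializer behaviour: it re-fetches the related
    # data itself and never reads user.eager.
    return [{"id": u.id, "emails": db.emails_for(u.id)} for u in users]


def run(select_related: bool):
    db = FakeDB()
    users = db.base_query([1, 2, 3], select_related=select_related)
    return api_serialize(db, users), db.lookups
```

In this toy model both settings produce the same serialized output; the eager path simply does extra lookups whose results are never read.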

Threads select_related through FilterQueryDatabaseImpl.serialize_many as an opt-in parameter and has the user service pass False. The select_related=False escape hatch already exists (added after #48088) and is used by get_many_ids and get_many_profiles, but serialize_many wasn't using it.
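
The threading described above can be sketched roughly as follows. This is not the actual diff: the class names, the list-of-dicts "queryset", and the `serialize_many` signature are all assumptions made for illustration. The default of `True` is inferred from the parameter being described as opt-in and from the existing `base_query` signature.

```python
from __future__ import annotations

from typing import Any


class FilterQuerySketch:
    """Hypothetical stand-in for FilterQueryDatabaseImpl."""

    def base_query(self, select_related: bool = True) -> list[dict[str, Any]]:
        raise NotImplementedError

    def serialize_many(self, ids, select_related: bool = True) -> list[dict[str, Any]]:
        # The flag is forwarded unchanged, so callers that do not pass it
        # keep the existing eager behaviour.
        rows = self.base_query(select_related=select_related)
        return [
            {"id": row["id"], "had_eager_columns": "permissions" in row}
            for row in rows
            if row["id"] in ids
        ]


class UserQuerySketch(FilterQuerySketch):
    def base_query(self, select_related: bool = True) -> list[dict[str, Any]]:
        rows: list[dict[str, Any]] = [{"id": 1}, {"id": 2}]
        if select_related:
            # Stand-in for the .extra() subquery columns.
            for row in rows:
                row["permissions"] = []
        return rows


class UserServiceSketch:
    """Hypothetical caller: the only place that opts out of the eager columns."""

    def __init__(self) -> None:
        self._query = UserQuerySketch()

    def serialize_many(self, ids) -> list[dict[str, Any]]:
        return self._query.serialize_many(ids, select_related=False)
```

Keeping the default at `True` means any other subclass that relies on the eager columns is unaffected; only the caller that explicitly passes `False` gets the plain queryset.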

Context: #45564 moved base_query into FilterQueryDatabaseImpl, #48088 added the eager-loaded subqueries to fix N+1s in the serialize_rpc path.

base_query() adds 5 correlated subqueries via .extra() (permissions,
roles, useremails, authenticators, useravatar) that are only consumed by
serialize_rpc_user() in the get_many path. serialize_many goes through
the API serializer which re-queries the same data via get_attrs(), so
the extra subqueries were pure overhead.

Pass select_related=False so serialize_many gets a plain queryset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@github-actions github-actions bot added the Scope: Backend label Apr 8, 2026

Instead of changing the default for all FilterQueryDatabaseImpl
subclasses, expose select_related as a parameter on serialize_many and
have the user service opt in. Safer since only the user service is
affected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@scttcper scttcper requested a review from a team April 8, 2026 19:06
@scttcper scttcper marked this pull request as ready for review April 8, 2026 19:06
@scttcper scttcper requested a review from a team as a code owner April 8, 2026 19:06
@scttcper scttcper changed the title perf(hybridcloud): Skip correlated subqueries in serialize_many perf(hybridcloud): Skip correlated subqueries in user.serialize_many Apr 8, 2026
@markstory markstory left a comment

Nice find 👏

@scttcper scttcper merged commit 49d44f8 into master Apr 9, 2026
79 checks passed
@scttcper scttcper deleted the scttcper/skip-extra-subqueries branch April 9, 2026 15:42
george-sentry pushed a commit that referenced this pull request Apr 9, 2026
…112500)

https://github.com/getsentry/sentry/blob/a8fcbce0cc94e981c42ef26c033fb9eb56b1b03d/src/sentry/users/services/user/impl.py#L358-L375

`base_query()` adds 5 correlated subqueries via `.extra()` (permissions,
roles, useremails, authenticators, useravatar) that are only consumed by
`serialize_rpc_user()` in the `get_many` path. `serialize_many` goes
through the API serializer which re-queries the same data in
`get_attrs()`, so the extra subqueries were pure overhead - the data was
fetched, attached to the User objects, and then ignored.

Threads `select_related` through
`FilterQueryDatabaseImpl.serialize_many` as an opt-in parameter and has
the user service pass `False`. The `select_related=False` escape hatch
already exists (added after #48088) and is used by `get_many_ids` and
`get_many_profiles`, but `serialize_many` wasn't using it.

Context: #45564 moved `base_query` into `FilterQueryDatabaseImpl`,
#48088 added the eager-loaded subqueries to fix N+1s in the
`serialize_rpc` path.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>