feat: optimize build list query and add database indexes for search #1203

Open
andrewlukoshko wants to merge 1 commit into master from optimize-build-query

andrewlukoshko commented Apr 7, 2026

Summary

  • Make the BuildTaskArtifact LEFT OUTER JOIN conditional in get_builds() — only applied when RPM filter params are provided, avoiding unnecessary JOIN + DISTINCT on every request
  • Move the Pulp get_rpm_packages() call outside generate_query() so it executes once instead of twice when paginating (data + count queries)
  • Reduce eager loading for paginated list queries (minimal=True) — skip linked_builds, performance_stats, and sign_tasks not needed for list view
  • Add Alembic migration with GIN trigram indexes (pg_trgm) on build_task_refs.url/git_ref and errata title fields for LIKE search, plus B-tree indexes on commonly filtered columns (owner_id, released, signed, finished_at, platform_id, href, errata platform_id/release_status/issued_date/cve_id)

Test plan

  • Full test suite passes (82 passed, 11 skipped)
  • Verify build list page loads faster without RPM filters
  • Verify build search by project name/ref uses trigram index (check EXPLAIN ANALYZE)
  • Verify RPM-filtered search still works correctly
  • Verify single build detail view still loads all relationships
  • Verify alembic upgrade head applies migration cleanly
  • Verify pg_trgm extension is created on fresh database

The build list endpoint (GET /builds/) has several performance
issues that make the frontend slow, especially when filtering
by project name, ref, or RPM parameters.

Query optimizations in get_builds():

- Make the BuildTaskArtifact LEFT OUTER JOIN conditional: only
  applied when RPM filter params (name, epoch, version, release,
  arch) are provided. Previously every request paid the cost of
  this JOIN plus a DISTINCT to deduplicate the multiplied rows.

- Move the Pulp API call (get_rpm_packages) outside of
  generate_query() so it executes once instead of twice when
  paginating (data query + count query both called generate_query
  independently).

- Reduce eager loading for paginated list queries: skip
  linked_builds, test_tasks.performance_stats,
  build_task.performance_stats, and sign_tasks which are not
  needed for the list view. Single build detail view still loads
  all relationships.
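
To illustrate the first two points, here is a minimal sketch that uses the standard-library `sqlite3` module rather than the project's actual ORM code. The schema is simplified and hypothetical (artifacts link straight to builds via an assumed `build_id` column), `fetch_rpm_hrefs` is a stand-in for the Pulp lookup, and the signatures of `generate_query` and `get_builds` are assumptions, not the real ones.

```python
import sqlite3

RPM_KEYS = ("name", "epoch", "version", "release", "arch")


def fetch_rpm_hrefs(rpm_params, call_log):
    """Hypothetical stand-in for the remote Pulp lookup; records each call."""
    call_log.append(dict(rpm_params))
    return ["href-a"]  # canned result for the sketch


def generate_query(hrefs, count=False):
    """Build (sql, params); join artifacts only when an RPM filter is active."""
    if hrefs is None:
        # No RPM filter: one row per build, so no JOIN and no DISTINCT.
        columns = "COUNT(b.id)" if count else "b.id"
        sql, params = f"SELECT {columns} FROM builds b", []
    else:
        # The JOIN can multiply build rows, so DISTINCT is needed here.
        columns = "COUNT(DISTINCT b.id)" if count else "DISTINCT b.id"
        marks = ", ".join("?" for _ in hrefs)
        sql = (
            f"SELECT {columns} FROM builds b "
            "LEFT OUTER JOIN build_artifacts a ON a.build_id = b.id "
            f"WHERE a.href IN ({marks})"
        )
        params = list(hrefs)
    if not count:
        sql += " ORDER BY b.id DESC LIMIT ? OFFSET ?"
    return sql, params


def get_builds(conn, rpm_params, call_log, page=1, page_size=10):
    """Return (build_ids, total) for one page of results."""
    active = {k: v for k, v in rpm_params.items() if k in RPM_KEYS and v}
    # The remote lookup runs once, before either query is generated.
    hrefs = fetch_rpm_hrefs(active, call_log) if active else None
    data_sql, data_params = generate_query(hrefs)
    count_sql, count_params = generate_query(hrefs, count=True)
    data_params += [page_size, (page - 1) * page_size]
    ids = [row[0] for row in conn.execute(data_sql, data_params)]
    total = conn.execute(count_sql, count_params).fetchone()[0]
    return ids, total
```

Because the lookup result is passed into `generate_query` as an argument, the data query and the count query share it, and the unfiltered path never touches the artifacts table.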

Database indexes (Alembic migration):

- GIN trigram indexes (pg_trgm) on build_task_refs.url and
  git_ref to accelerate LIKE '%pattern%' queries used for
  project and ref search. Regular B-tree indexes cannot help
  with infix LIKE patterns.

- B-tree indexes on builds (owner_id, released, signed,
  finished_at), build_tasks.platform_id, and
  build_artifacts.href for commonly used WHERE filters.

- GIN trigram indexes on new_errata_records title and
  original_title for errata title search.

- B-tree indexes on new_errata_records (platform_id,
  release_status, issued_date), plus both a GIN trigram index
  and a B-tree index on new_errata_references.cve_id for CVE
  search.

Note: the migration requires the pg_trgm PostgreSQL extension,
which it creates automatically if not already present.
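
As a rough sketch of the DDL such a migration would issue, the helper below generates the PostgreSQL statements for the tables and columns listed above. The index names and the `upgrade_statements` helper are hypothetical and not taken from the actual migration; in an Alembic `upgrade()` each statement could be passed to `op.execute()`, or expressed with `op.create_index()` and its PostgreSQL-specific options instead.

```python
# Columns searched with LIKE '%pattern%', which get GIN trigram indexes.
TRIGRAM_COLUMNS = [
    ("build_task_refs", "url"),
    ("build_task_refs", "git_ref"),
    ("new_errata_records", "title"),
    ("new_errata_records", "original_title"),
    ("new_errata_references", "cve_id"),
]

# Columns used in equality or range filters, which get B-tree indexes.
BTREE_COLUMNS = [
    ("builds", "owner_id"),
    ("builds", "released"),
    ("builds", "signed"),
    ("builds", "finished_at"),
    ("build_tasks", "platform_id"),
    ("build_artifacts", "href"),
    ("new_errata_records", "platform_id"),
    ("new_errata_records", "release_status"),
    ("new_errata_records", "issued_date"),
    ("new_errata_references", "cve_id"),
]


def upgrade_statements():
    """Return the DDL statements in the order they should run."""
    # The extension must exist before gin_trgm_ops can be referenced.
    statements = ["CREATE EXTENSION IF NOT EXISTS pg_trgm"]
    for table, column in TRIGRAM_COLUMNS:
        statements.append(
            f"CREATE INDEX IF NOT EXISTS ix_{table}_{column}_trgm "
            f"ON {table} USING gin ({column} gin_trgm_ops)"
        )
    for table, column in BTREE_COLUMNS:
        # B-tree is PostgreSQL's default index type, so no USING clause.
        statements.append(
            f"CREATE INDEX IF NOT EXISTS ix_{table}_{column} "
            f"ON {table} ({column})"
        )
    return statements
```

Creating the extension first matters because the `gin_trgm_ops` operator class is provided by pg_trgm and is unavailable until the extension is installed.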