Startup race condition: awx-task may start before awx-ee populates /etc/receptor/receptor.conf #2091

@Vergiley

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

Description
When deploying AWX using awx-operator on Kubernetes, we sometimes observe a startup race condition during cold pod starts.
In some cases, the awx-server-task container starts before the awx-server-ee container has created /etc/receptor/receptor.conf.
This results in a transient failure in awx-server-task with the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/etc/receptor/receptor.conf'

Root Cause
The awx-ee container entrypoint populates the runtime config:
if [ ! -f /etc/receptor/receptor.conf ]; then
    cp /etc/receptor/receptor-default.conf /etc/receptor/receptor.conf
    sed -i "s/HOSTNAME/$HOSTNAME/g" /etc/receptor/receptor.conf
fi
exec receptor --config /etc/receptor/receptor.conf
Because Kubernetes does not guarantee startup order for regular containers in a pod, awx-server-task may execute awx-manage provision_instance before awx-server-ee has copied the receptor configuration into the shared emptyDir volume.
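
The ordering problem can be illustrated outside a cluster. The following is only a sketch: a temporary directory stands in for the shared volume, a delayed background job stands in for the awx-ee entrypoint, and the variable names and the demo-host value are hypothetical.

```shell
#!/bin/sh
# Simulate the race with a temporary directory in place of the shared volume.
# Every path and name below is a stand-in, not the real pod layout.
shared=$(mktemp -d)
printf 'node: HOSTNAME\n' > "$shared/receptor-default.conf"

# "ee" side: writes the runtime config after a delay (e.g. a slow image pull).
(
    sleep 1
    sed "s/HOSTNAME/demo-host/g" "$shared/receptor-default.conf" \
        > "$shared/receptor.conf"
) &
ee_pid=$!

# "task" side: reads right away, with nothing enforcing the order.
if cat "$shared/receptor.conf" >/dev/null 2>&1; then
    first_read=ok
else
    first_read=missing
fi

# Once the "ee" side has finished, the same read succeeds.
wait "$ee_pid"
if cat "$shared/receptor.conf" >/dev/null 2>&1; then
    second_read=ok
else
    second_read=missing
fi

echo "first read: $first_read, after ee finished: $second_read"
rm -rf "$shared"
```

The early read fails only because of timing, which matches the transient, once-per-cold-start behavior described here.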

Impact
The error typically happens once during cold startup.
The pod usually recovers automatically after a restart.

Expected Behavior
awx-server-task should not attempt to read /etc/receptor/receptor.conf until the file is guaranteed to exist.
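
One possible way to get this behavior is a bounded wait before the task container's startup command runs. This is a minimal sketch, not the operator's actual implementation: wait_for_file is a hypothetical helper, and the 5-second and 1-second timeouts are arbitrary values chosen for the demo.

```shell
#!/bin/sh
# wait_for_file PATH TIMEOUT_SECONDS  (hypothetical helper)
# Polls once per second until PATH exists and is non-empty.
# Returns 0 when the file is ready, 1 if the timeout expires first.
wait_for_file() {
    path=$1
    timeout=$2
    elapsed=0
    while [ ! -s "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Demo against a temporary directory rather than /etc/receptor.
demo_dir=$(mktemp -d)
( sleep 1; echo "demo" > "$demo_dir/receptor.conf" ) &
writer_pid=$!

# The file appears after about one second, so this wait should succeed.
if wait_for_file "$demo_dir/receptor.conf" 5; then
    wait_result=ready
else
    wait_result=timeout
fi

# This file is never created, so the wait should give up.
if wait_for_file "$demo_dir/never-created.conf" 1 2>/dev/null; then
    miss_result=ready
else
    miss_result=timeout
fi

wait "$writer_pid"
echo "existing file: $wait_result, missing file: $miss_result"
rm -rf "$demo_dir"
```

In the real pod the call would presumably look like wait_for_file /etc/receptor/receptor.conf followed by a timeout, placed ahead of awx-manage provision_instance. Note that a non-empty file does not prove the sed substitution in the awx-ee entrypoint has already completed, so this narrows the race rather than strictly eliminating it.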

AWX Operator version

2.19.1

AWX version

24.1.0

Kubernetes platform

kubernetes

Kubernetes/Platform version

v1.32.5

Modifications

yes

Steps to reproduce

  1. Deploy AWX via awx-operator on Kubernetes.
  2. Trigger pod recreation.
  3. Occasionally observe FileNotFoundError: /etc/receptor/receptor.conf missing during awx-task startup.

Expected results

awx-task should wait until receptor.conf exists or handle missing config gracefully.

Actual results

During pod startup, awx-server-task fails with:

FileNotFoundError: [Errno 2] No such file or directory: '/etc/receptor/receptor.conf'

Additional information

We are using a customized awx-ee image with many additional modules installed.
The resulting image size is ~2GB.

This increases image pull time on cold starts and may make the startup race condition more likely to occur.

Operator Logs

No response
