fix: show host CPU/RAM in System Overview instead of Arcane container limits (#1110) #2343
GiulioSavini wants to merge 2 commits into getarcaneapp:main
Conversation
✅ Snyk checks have passed. No issues have been found so far.
The issue with this is that LXC containers need the cgroup logic in order to detect the correct limits.
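For reference, on cgroup v2 the limits in question live in `/sys/fs/cgroup/memory.max` and `/sys/fs/cgroup/cpu.max`. A minimal sketch of parsing those files — helper names are illustrative, this is not the actual `DetectCgroupLimits` implementation:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemoryMax parses the contents of /sys/fs/cgroup/memory.max.
// It returns (limitBytes, true) when a limit is set, or (0, false)
// when the file contains "max" (no limit configured).
func parseMemoryMax(s string) (uint64, bool) {
	s = strings.TrimSpace(s)
	if s == "max" {
		return 0, false
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

// parseCPUMax parses /sys/fs/cgroup/cpu.max ("<quota> <period>").
// It returns the effective CPU count (quota/period), or (0, false)
// when the quota is "max" (unthrottled).
func parseCPUMax(s string) (float64, bool) {
	fields := strings.Fields(s)
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err1 := strconv.ParseFloat(fields[0], 64)
	period, err2 := strconv.ParseFloat(fields[1], 64)
	if err1 != nil || err2 != nil || period == 0 {
		return 0, false
	}
	return quota / period, true
}

func main() {
	mem, ok := parseMemoryMax("536870912\n") // a 512 MB limit
	fmt.Println(mem, ok)                     // 536870912 true
	cpus, ok := parseCPUMax("200000 100000") // quota 200ms per 100ms period
	fmt.Println(cpus, ok)                    // 2 true
}
```

In an LXC guest these files reflect the guest's allocation, which is why they are the right values to display there.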
…er (getarcaneapp#1110)

The System Overview dashboard clamped CPU cores and memory totals to the Arcane container's own cgroup limits (set by the operator via `--cpus` / `--memory` in docker-compose). This caused nonsense readings like "Memory Usage 4.69 GB / 512 MB" and "2 CPUs" on a 16-core host.

Root cause: `applyCgroupLimits()` and the `GetDockerInfo` cgroup block applied the container's artificial scheduling limits as if they were hardware facts, overwriting the correct host values gopsutil and the Docker daemon already return.

The fix must preserve cgroup-limit application for LXC containers. In LXC, gopsutil reads the physical host's /proc (which may show 128 GB RAM) while the LXC guest was only allocated 8 GB — the cgroup limits are the correct values to display. Docker is the opposite: its limits are operator-defined throttles, not hardware reality.

Detection: Docker always creates `/.dockerenv`; LXC does not. A secondary check on `/proc/self/cgroup` patterns covers edge cases. This is exposed as `docker.IsDockerContainer()` in `cgroup_utils.go` and used in two places:

- `ws_handler.go` `applyCgroupLimits()`: returns early (no-op) when in Docker, allowing gopsutil host values through unchanged. Applies cgroup limits for LXC and other cgroup-managed environments as before.
- `system.go` `GetDockerInfo()`: same guard around the `DetectCgroupLimits` block — Docker daemon `NCPU`/`MemTotal` are already host values in Docker; in LXC the daemon may report the full physical machine, so the LXC cgroup budget must still be applied.

Fixes getarcaneapp#1110
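The detection described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the actual `cgroup_utils.go` code, and the exact `/proc/self/cgroup` patterns checked are an assumption:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cgroupLooksLikeDocker reports whether /proc/self/cgroup content
// contains Docker-style path components. This is the secondary
// heuristic; the pattern list here is illustrative.
func cgroupLooksLikeDocker(content string) bool {
	for _, line := range strings.Split(content, "\n") {
		if strings.Contains(line, "/docker/") ||
			strings.Contains(line, "/docker-") { // systemd-style scope names
			return true
		}
	}
	return false
}

// isDockerContainer mirrors the idea behind docker.IsDockerContainer():
// Docker always creates /.dockerenv; the cgroup check covers edge cases.
func isDockerContainer() bool {
	if _, err := os.Stat("/.dockerenv"); err == nil {
		return true
	}
	data, err := os.ReadFile("/proc/self/cgroup")
	if err != nil {
		return false // e.g. non-Linux: assume not Docker
	}
	return cgroupLooksLikeDocker(string(data))
}

func main() {
	// Output depends on where this runs, so none is asserted here.
	fmt.Println("running in Docker:", isDockerContainer())
}
```

LXC guests get no `/.dockerenv` and their cgroup paths typically contain `lxc` rather than `docker`, so both checks fall through and the cgroup-limit path stays active there.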
6ceb719 to 4fc6400 (Compare)
Good catch, @kmendell. The previous version removed the cgroup handling entirely; I've reworked the fix to be environment-aware.
Two things I'd like your eyes on before this merges:

1. If Arcane and Docker Engine are both running inside the same LXC guest (the common case), … The original PR removed this block entirely for …
2. Podman can create …
I will have to test on an LXC container later today.
Alright :)
Summary
Fixes #1110 (originally reported as #1052). The System Overview dashboard reports CPU cores and memory totals clamped to the Arcane container's own cgroup limits. When Arcane runs in a container with e.g. `cpus: 2` and `memory: 512MB`, the dashboard shows things like `Memory Usage 4.69 GB / 512 MB` — `memUsed` is read from the host while `memTotal` has been overwritten with the container limit — and `2 CPUs` even when the host has many more. The referenced earlier fix (80ccc9b...) did not remove the override that caused the regression on both code paths that feed the dashboard.

Fix
Two surgical deletions, one per path, both leaving the surrounding infrastructure intact.
- `backend/internal/api/ws_handler.go` — drop the `applyCgroupLimits` call from `collectSystemStats` and delete the now-orphaned helper. `collectSystemStats` now returns what gopsutil reports for the host (`cpu.Counts(true)`, `mem.VirtualMemory()`) unchanged. The remaining detection infrastructure (`cgroupLimitsDetector`, `getCachedCgroupLimitsInternal`, the static cache sampler) stays put — still exercised by `TestWebSocketHandler_GetCachedCgroupLimitsInternal_DeduplicatesRefresh`. If someone wants to add a future "Arcane container stats" view that intentionally shows the container's own limits, the infra is there to consume; the regression just isn't wired into the host dashboard any more.
- `backend/internal/huma/handlers/system.go` — drop the `DetectCgroupLimits` block in `GetDockerInfo` and its now-unused `dockerutil` import. `info.NCPU` and `info.MemTotal` from the Docker daemon (which already reflect the host the daemon is running on) are passed through unchanged.

A short comment explains the intent in `GetDockerInfo` so a future refactor doesn't re-introduce the clamp.

Why this is safe
- `applyCgroupLimits` had one caller (`collectSystemStats`) — confirmed by repo-wide grep. The helper is deleted cleanly, no dangling references.
- The `docker "github.com/getarcaneapp/arcane/backend/pkg/dockerutil"` alias in `system.go` was only used inside the removed cgroup block — confirmed by the same grep. Removing it keeps imports tidy.
- The `dockerutil.DetectCgroupLimits` function itself and the whole `backend/pkg/dockerutil` package are untouched.

Tests
- `go vet ./internal/api/... ./internal/huma/handlers/...` — clean.
- `go test ./internal/api/... -count=1` — `ok` in ~3.4s. All existing `ws_handler` tests pass, including the one that exercises `getCachedCgroupLimitsInternal` (which is still around and still reachable).
- `go test ./internal/huma/handlers/... -count=1` — `ok` in ~0.2s.

Out of scope
- #995 (container update behavior) — intentionally not touched here; that one is a design discussion and kmendell's `compose-restart` / `updater-restart` experimental images already explore it.

Fixes #1110
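For context on what the dashboard now shows: on Linux, gopsutil ultimately derives these host values from /proc. A stdlib-only sketch of the equivalent reads — the helper name is illustrative, not code from this PR:

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// parseMemTotalKB extracts the "MemTotal:" value (in kB) from
// /proc/meminfo content, which is what mem.VirtualMemory() reads
// on Linux to report total host memory.
func parseMemTotalKB(meminfo string) (uint64, bool) {
	for _, line := range strings.Split(meminfo, "\n") {
		if !strings.HasPrefix(line, "MemTotal:") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			return 0, false
		}
		kb, err := strconv.ParseUint(fields[1], 10, 64)
		return kb, err == nil
	}
	return 0, false
}

func main() {
	// Logical CPU count, comparable to what cpu.Counts(true) reports.
	fmt.Println("CPUs:", runtime.NumCPU())

	data, err := os.ReadFile("/proc/meminfo") // Linux-only
	if err == nil {
		if kb, ok := parseMemTotalKB(string(data)); ok {
			fmt.Printf("MemTotal: %.2f GB\n", float64(kb)/(1024*1024))
		}
	}
}
```

Since these values come from the host's /proc, they are correct for Docker containers sharing the host kernel — which is exactly why passing them through unchanged fixes the dashboard.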
Greptile Summary
This PR fixes a regression where the System Overview dashboard showed CPU/RAM clamped to the Arcane container's cgroup limits instead of the actual host values. The fix is two surgical deletions: removing the `applyCgroupLimits` call and its method from `ws_handler.go`, and removing the `DetectCgroupLimits` override block (plus its now-unused import) from `system.go`.

Confidence Score: 5/5
Safe to merge — changes are pure deletions of an incorrect override with no regressions introduced.
Both changes are straightforward removals of a faulty cgroup-clamping path; no new logic is introduced, the remaining infrastructure is untouched, all existing tests pass, and the fix directly targets the reported bug.
No files require special attention.
Reviews (1): Last reviewed commit: "fix: show host CPU/RAM in System Overvie..."