Fix NVPTX derivative codegen to use __nv_* device math functions instead of host libcalls #2777

Draft
Copilot wants to merge 2 commits into main from copilot/fix-flog-error-in-cuda-code

Copilot AI commented Apr 7, 2026

On Windows NVPTX (CUDA) targets, differentiating functions such as pow, sin, cos, tanh, sinh, and cosh fails with errors like "no libcall available for flog". The cause is that Enzyme generates derivative helper calls using host math names (cosh, llvm.log.f64, etc.) that do not exist as CUDA device functions.

Root causes

  1. SameTypesFunc path (tanh→cosh, sinh→cosh, cosh→sinh): SameTypesFunc<"cosh"> unconditionally inserts @cosh. When differentiating @__nv_tanh the derivative must call @__nv_cosh.

  2. LLVM intrinsic path (pow→log, sin→cos, cos→sin): Derivatives generate llvm.log.f64 / llvm.cos.f64 intrinsics. On NVPTX these must be replaced by __nv_log / __nv_cos via ReplaceFunctionImplementation, but on Windows the __nv_* functions are often absent from the module, so the substitution never fires and the NVPTX backend errors trying to lower the bare intrinsic.
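The function pairs listed above follow from the usual derivative identities: the derivative of tanh(x) is 1/cosh(x)^2, and the derivative of pow(x, y) with respect to y is pow(x, y) * log(x). The following is a minimal host-side sketch of why each derivative needs a second math function. It is not Enzyme code; the names dtanh, dpow_dy, and central_diff are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not Enzyme code): derivative of tanh expressed via
// cosh, which is why differentiating tanh requires a cosh callee.
double dtanh(double x) {
    double c = std::cosh(x);
    return 1.0 / (c * c);
}

// Hypothetical helper: derivative of pow(x, y) with respect to y.
// It needs log(x), which is why differentiating pow requires a log callee.
double dpow_dy(double x, double y) {
    assert(x > 0.0 && "log(x) is only real-valued for positive x");
    return std::pow(x, y) * std::log(x);
}

// Central finite difference, used only to cross-check the formulas above.
template <typename F>
double central_diff(F f, double x, double h = 1e-6) {
    return (f(x + h) - f(x - h)) / (2.0 * h);
}
```

On the host these helpers resolve to the C math library. On NVPTX the same roles have to be filled by device functions such as __nv_cosh and __nv_log, which is the gap this PR addresses.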

Changes

  • enzyme-tblgen.cpp (SameTypesFunc codegen): Generated code now checks whether the original call target carries the implements2 attribute (set by PreserveNVVM on all __nv_* functions). If present, it first searches the module for the corresponding implementing function (e.g. __nv_cosh where implements2="cosh"); if not found and the target triple contains nvptx, it falls back to declaring __nv_<funcname> directly.

  • FunctionUtils.cpp (ReplaceFunctionImplementation): Before the main substitution loop, on NVPTX targets the function now scans for LLVM math intrinsics (llvm.log.f64, llvm.sin.f32, etc.) present in the module and auto-declares their __nv_* counterparts (with implements/implements2/enzyme_math attributes) when they are missing. This allows the subsequent loop to correctly replace intrinsic uses with the proper device function.

  • PreserveNVVM.h: Exports isTargetNVPTX for use in FunctionUtils.cpp.

  • test/Enzyme/ReverseMode/nvvm_tanh.ll: New lit test verifying that differentiating @__nv_tanh on an NVPTX module produces a call to @__nv_cosh rather than @cosh.
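The two lookup rules described in the enzyme-tblgen.cpp and FunctionUtils.cpp changes can be modeled roughly as below. This is a stdlib-only C++17 sketch, not the actual patch: ToyModule, resolveHelper, and nvNameForIntrinsic are hypothetical names, the module is reduced to a map from function name to its implements2 value, and the short list of recognized operations is an assumption made for the example. The "f" suffix for f32 follows the libdevice naming convention (for example __nv_sinf); whether the PR derives names this way is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>

// Toy stand-in for an LLVM module (hypothetical, for illustration only):
// maps a function name to the value of its "implements2" attribute,
// with an empty string meaning the attribute is absent.
struct ToyModule {
    std::string triple;
    std::map<std::string, std::string> implements2;
};

// Choose the callee for a derivative helper such as "cosh", given the
// original call target (e.g. "__nv_tanh").
std::string resolveHelper(const ToyModule &m, const std::string &origCallee,
                          const std::string &helper) {
    auto it = m.implements2.find(origCallee);
    // Original target carries no implements2 attribute: keep the host name.
    if (it == m.implements2.end() || it->second.empty()) return helper;
    // First choice: a function already in the module that implements the helper.
    for (const auto &[name, impl] : m.implements2)
        if (impl == helper) return name;
    // Fallback on NVPTX: name the device function __nv_<helper>.
    if (m.triple.find("nvptx") != std::string::npos) return "__nv_" + helper;
    return helper;
}

// Map "llvm.<op>.f64" or "llvm.<op>.f32" to a libdevice-style name.
// Returns nullopt for anything outside the illustrative operation list.
std::optional<std::string> nvNameForIntrinsic(const std::string &intrinsic) {
    static const std::set<std::string> ops = {"log", "exp", "sin", "cos", "pow", "sqrt"};
    const std::string prefix = "llvm.";
    if (intrinsic.rfind(prefix, 0) != 0) return std::nullopt;
    auto dot = intrinsic.rfind('.');
    if (dot < prefix.size()) return std::nullopt;  // no type suffix present
    std::string op = intrinsic.substr(prefix.size(), dot - prefix.size());
    std::string ty = intrinsic.substr(dot + 1);
    if (ops.count(op) == 0) return std::nullopt;
    if (ty == "f64") return "__nv_" + op;
    if (ty == "f32") return "__nv_" + op + "f";
    return std::nullopt;
}
```

In this model, a module with triple nvptx64-nvidia-cuda that contains only __nv_tanh (tagged implements2="tanh") resolves the cosh helper to __nv_cosh through the fallback, while an untagged host tanh keeps plain cosh.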

Copilot AI linked an issue Apr 7, 2026 that may be closed by this pull request
…f plain libcalls

Agent-Logs-Url: https://github.com/EnzymeAD/Enzyme/sessions/d74331c1-24b3-4d41-ad79-ac6fad2beec2

Co-authored-by: minansys <149007967+minansys@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix error: no libcall available for flog in CUDA code" to "Fix NVPTX derivative codegen to use __nv_* device math functions instead of host libcalls" on Apr 7, 2026
Copilot AI requested a review from minansys April 7, 2026 20:58
Development

Successfully merging this pull request may close these issues.

error: no libcall available for flog error