Conversation
878f4b4 to 27703d5 (force-push)
Signed-off-by: Torch-TensorRT Github Bot <torch-tensorrt.github.bot@nvidia.com>
27703d5 to ef0662c (force-push)
narendasan left a comment:
Just reviewed the core stuff for now. I think this mostly isn't really solving the issue. The core idea is that we want a Python implementation of the torchbind endpoints (execute_engine / TRTEngine) that lets us run the same programs with either standard torch-trt or Python-only builds, rather than just having two implementations that are kind of mixed together.
> @torch.library.register_fake("tensorrt::execute_engine_python")  # type: ignore
Why do we need a separate operator for this? Aren't we just changing the implementation of TRTEngine to be either Python or C++?
The problem is that if it's a separate op then you can't interchange between C++ and Python-only builds.
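The interchangeability point above can be sketched as a registration pattern: one op name whose backend is chosen at import time, so a program that calls the op never needs to know which build it is running on. This is a minimal stand-in using a plain dict; `cpp_runtime_available`, `OP_REGISTRY`, and both backend functions are hypothetical placeholders for the real `torch.library` / torchbind machinery, not torch-tensorrt APIs.

```python
# Hypothetical registry standing in for torch.library's op table.
OP_REGISTRY = {}


def register_op(name, fn):
    """Bind an implementation to an op name (stand-in for real registration)."""
    OP_REGISTRY[name] = fn


def cpp_runtime_available():
    # Stand-in for probing the torchbind registration, e.g. checking
    # whether the C++ TRTEngine class was registered at import time.
    return False


def cpp_execute_engine(inputs, engine):
    # Placeholder for the C++-backed implementation.
    return engine.run(inputs)


def python_execute_engine(inputs, engine):
    # Placeholder for the pure-Python fallback implementation.
    return engine.run(inputs)


# One op name, two possible backends; callers never see the difference.
if cpp_runtime_available():
    register_op("tensorrt::execute_engine", cpp_execute_engine)
else:
    register_op("tensorrt::execute_engine", python_execute_engine)
```

With this shape there is never a second op name to branch on, so a graph serialized against `tensorrt::execute_engine` loads identically in either build.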
> class PythonTRTEngine:
I think this class should be "TRTEngine" and only "registered" if the C++ runtime is unavailable. It should also be a valid script object so that the same operator works with the Python and C++ versions of the objects, and it should use the exact same APIs as the ones we expose in the JIT_hooks file.
> register_opaque_type(PythonTRTEngine, typ="reference")
> @torch.library.custom_op(  # type: ignore[misc]
Same thing here: this operator should only get registered if the C++ library is not available, and it should take the name of the C++ op.
> def execute_engine_python(
>     input_tensors: List[torch.Tensor], engine: PythonTRTEngine
> ) -> List[torch.Tensor]:
>     outputs = engine.execute(input_tensors)
Would rather use a struct + function design rather than some masked call to a method, similar to the C++ structure.
It's cool that we have this, but we should look into whether there is a way to drop / mask registrations to change the runtime implementation rather than relying on distinct graph constructions.
> return
> if self._is_python_runtime:
>     self.engine = PythonTRTEngine(
Yeah, we should be trying to monitor torchbind registration and register a class if there is no C++ API, rather than having two code paths.
> metadata = pickle.loads(dumped_metadata)
> return metadata
> def decode_metadata(encoded_metadata: bytes | str) -> Any:
>     if isinstance(encoded_metadata, str):
Why was this rewritten?
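For context on the decode path being questioned: the quoted diff suggests metadata is pickled, base64-encoded, and may arrive as either bytes or str depending on which runtime serialized it. A self-contained round-trip sketch; the function names follow the quoted diff, but the exact encoding details are assumptions, not the confirmed torch-tensorrt implementation:

```python
import base64
import pickle
from typing import Any, Union


def encode_metadata(metadata: Any) -> bytes:
    # Assumed encode side: pickle then base64, yielding bytes.
    return base64.b64encode(pickle.dumps(metadata))


def decode_metadata(encoded_metadata: Union[bytes, str]) -> Any:
    # Accept str (e.g. metadata round-tripped through a text field)
    # by normalizing to bytes before base64-decoding.
    if isinstance(encoded_metadata, str):
        encoded_metadata = encoded_metadata.encode("utf-8")
    dumped_metadata = base64.b64decode(encoded_metadata)
    metadata = pickle.loads(dumped_metadata)
    return metadata
```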
> def set_extra_state(self, state: SerializedTorchTensorRTModuleFmt) -> None:
> def set_extra_state(self, state: TorchTensorRTModuleExtraState) -> None:
Why are we changing any of this? It should be the same.
> metadata["output_tensors_are_unowned"]
> )
> def __del__(self) -> None:
Is this necessary? Can't we just use del in the actual TRTEngine class?
Generally there are a lot of changes in this file that I don't really understand why we are making. Isn't the entire point of this feature to detect when the C++ runtime is not available and drop in compatible Python implementations of the C++ runtime APIs, rather than just folding two separate implementations into one class?
cc7d4b6 to 5b1bde7 (force-push)
694755f to 1755001 (force-push)
> SerializedTensorRTEngineFmt = List[
>     Union[str, bytes]
> ]  # Aligned with //core/runtime/register_jit_hooks.cpp
> SerializedTorchTensorRTModuleFmt = Tuple[
This should be with the other serialization info
> engine_info: List[str | bytes] = [""] * SERIALIZATION_LEN
> engine_info[ABI_TARGET_IDX] = torch.ops.tensorrt.ABI_VERSION()
> engine_info[ABI_TARGET_IDX] = (
We should centralize this so it's something like generate_engine_info that will choose the right source internally.
> return (
>     self.name,
>     self.engine.__getstate__(),
>     engine_info,
Does the Python self.engine not have __getstate__()?
> list(input_tensors), self.engine
> )
> outputs = list(torch.ops.tensorrt.execute_engine(input_tensors, self.engine))
Why do we need this casting?
> input_tensors: List[torch.Tensor] = [
>     (i if isinstance(i, torch.Tensor) else torch.tensor(i).cuda())
>     for i in inputs
> (value if isinstance(value, torch.Tensor) else torch.tensor(value).cuda())
Is this just renaming a variable?
Yes, I feel like this is better than "i", but if you think it is unnecessary I can change it back.
> engine_bytes = engine_info[ENGINE_IDX]
> engine_info[ENGINE_IDX] = base64.b64encode(engine_bytes).decode("utf-8")
> trt_node = gm.graph.call_function(
>     torch.ops.tensorrt.no_op_placeholder_for_execute_engine.default,
Why do we need to use this? This means that we break compatibility with torch.export.load
Also, does this even work if you try to load and run with a Python-only build?
I use this to decouple the dependency on the C++ build when serializing a C++ engine and loading it in the Python runtime.
This works with a Python-only build, with a Python-built engine loaded in the C++ runtime, and with a C++-built engine loaded in the Python runtime.
Let me look into the opaque object
Like, why can't we use the C++ version of this https://github.com/pytorch/pytorch/blob/e2584b2554d11fda4998a8d2be6145b0eded5049/torch/_library/opaque_object.py#L73 like you do in Python, and then align the serialization so we don't need the no-op?
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: