Conversation
878f4b4 to 27703d5 (force-push)
Signed-off-by: Torch-TensorRT Github Bot <torch-tensorrt.github.bot@nvidia.com>
27703d5 to ef0662c (force-push)
narendasan left a comment:
Just reviewed the core stuff for now. I think this mostly isn't really solving the issue. The core idea is that we want a Python implementation of the torchbind endpoints (execute_engine / TRTEngine) that lets us run the same programs with either standard torch-trt or Python-only builds, rather than just having two implementations that are kind of mixed together.
> @torch.library.register_fake("tensorrt::execute_engine_python")  # type: ignore
Why do we need a separate operator for this? Aren't we just changing the implementation of TRTEngine to be either Python or C++?
The problem is that if it's a separate op then you can't interchange between C++ and Python-only builds.
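The interchangeability point above can be sketched as a registration pattern: one op name whose backend is chosen at import time, so a program that calls the op never needs to know which build it is running on. This is a minimal stand-in using a plain dict; `cpp_runtime_available`, `OP_REGISTRY`, and both backend functions are hypothetical placeholders for the real `torch.library` / torchbind machinery, not torch-tensorrt APIs.

```python
# Hypothetical registry standing in for torch.library's op table.
OP_REGISTRY = {}


def register_op(name, fn):
    """Bind an implementation to an op name (stand-in for real registration)."""
    OP_REGISTRY[name] = fn


def cpp_runtime_available():
    # Stand-in for probing the torchbind registration, e.g. checking
    # whether the C++ TRTEngine class was registered at import time.
    return False


def cpp_execute_engine(inputs, engine):
    # Placeholder for the C++-backed implementation.
    return engine.run(inputs)


def python_execute_engine(inputs, engine):
    # Placeholder for the pure-Python fallback implementation.
    return engine.run(inputs)


# One op name, two possible backends; callers never see the difference.
if cpp_runtime_available():
    register_op("tensorrt::execute_engine", cpp_execute_engine)
else:
    register_op("tensorrt::execute_engine", python_execute_engine)
```

With this shape there is never a second op name to branch on, so a graph serialized against `tensorrt::execute_engine` loads identically in either build.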
> class PythonTRTEngine:
I think this class should be "TRTEngine" and only "registered" if the C++ runtime is unavailable. It should also be a valid script object so that the same operator works with the Python and C++ versions of the objects, and it should use the exact same APIs as the ones we expose in the JIT_hooks file.
> register_opaque_type(PythonTRTEngine, typ="reference")
> @torch.library.custom_op(  # type: ignore[misc]
Same thing here: this operator should only get registered if the C++ library is not available, and it should take the name of the C++ op.
> def execute_engine_python(
>     input_tensors: List[torch.Tensor], engine: PythonTRTEngine
> ) -> List[torch.Tensor]:
>     outputs = engine.execute(input_tensors)
Would rather use a struct + function design rather than some masked call to a method, similar to the C++ structure.
It's cool that we have this, but we should look into whether there is a way to drop / mask registrations to change the runtime implementation rather than relying on distinct graph constructions.
> return
> if self._is_python_runtime:
>     self.engine = PythonTRTEngine(
Yeah, we should be trying to monitor torchbind registration and register a class if there is no C++ API, rather than having two code paths.
> metadata = pickle.loads(dumped_metadata)
> return metadata
> def decode_metadata(encoded_metadata: bytes | str) -> Any:
>     if isinstance(encoded_metadata, str):
Why was this rewritten?
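For context on the decode path being questioned: the quoted diff suggests metadata is pickled, base64-encoded, and may arrive as either bytes or str depending on which runtime serialized it. A self-contained round-trip sketch; the function names follow the quoted diff, but the exact encoding details are assumptions, not the confirmed torch-tensorrt implementation:

```python
import base64
import pickle
from typing import Any, Union


def encode_metadata(metadata: Any) -> bytes:
    # Assumed encode side: pickle then base64, yielding bytes.
    return base64.b64encode(pickle.dumps(metadata))


def decode_metadata(encoded_metadata: Union[bytes, str]) -> Any:
    # Accept str (e.g. metadata round-tripped through a text field)
    # by normalizing to bytes before base64-decoding.
    if isinstance(encoded_metadata, str):
        encoded_metadata = encoded_metadata.encode("utf-8")
    dumped_metadata = base64.b64decode(encoded_metadata)
    metadata = pickle.loads(dumped_metadata)
    return metadata
```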
> def set_extra_state(self, state: SerializedTorchTensorRTModuleFmt) -> None:
> def set_extra_state(self, state: TorchTensorRTModuleExtraState) -> None:
Why are we changing any of this? It should be the same.
> metadata["output_tensors_are_unowned"]
> )
> def __del__(self) -> None:
Is this necessary? Can't we just use del in the actual TRTEngine class?
Generally there are a lot of changes in this file that I don't really understand why we are making. Isn't the entire point of this feature to detect when the C++ runtime is not available and drop in compatible Python implementations of the C++ runtime APIs, rather than just folding two separate implementations into one class?
cc7d4b6 to 5b1bde7 (force-push)
694755f to 1755001 (force-push)
> SerializedTensorRTEngineFmt = List[
>     Union[str, bytes]
> ]  # Aligned with //core/runtime/register_jit_hooks.cpp
> SerializedTorchTensorRTModuleFmt = Tuple[
This should be with the other serialization info
> engine_info: List[str | bytes] = [""] * SERIALIZATION_LEN
> engine_info[ABI_TARGET_IDX] = torch.ops.tensorrt.ABI_VERSION()
> engine_info[ABI_TARGET_IDX] = (
We should centralize this so it's something like generate_engine_info that will choose the right source internally.
> return (
>     self.name,
>     self.engine.__getstate__(),
>     engine_info,
Does the Python self.engine not have __getstate__()?
> list(input_tensors), self.engine
> )
> outputs = list(torch.ops.tensorrt.execute_engine(input_tensors, self.engine))
Why do we need this casting?
> input_tensors: List[torch.Tensor] = [
>     (i if isinstance(i, torch.Tensor) else torch.tensor(i).cuda())
>     for i in inputs
> (value if isinstance(value, torch.Tensor) else torch.tensor(value).cuda())
Is this just renaming a variable?
Yes, I feel like this is better than "i", but if you think it is unnecessary I can change it back.
> engine_bytes = engine_info[ENGINE_IDX]
> engine_info[ENGINE_IDX] = base64.b64encode(engine_bytes).decode("utf-8")
> trt_node = gm.graph.call_function(
>     torch.ops.tensorrt.no_op_placeholder_for_execute_engine.default,
Why do we need to use this? This means that we break compatibility with torch.export.load
Also, does this even work if you try to load and run with a Python-only build?
I use this to decouple the dependency on the C++ build when serializing a C++ engine and loading it in the Python runtime.
This works with a Python-only build, with a Python-built engine loaded in the C++ runtime, and with a C++-built engine loaded in the Python runtime.
Let me look into the opaque object
Like, why can't we use the C++ version of this https://github.com/pytorch/pytorch/blob/e2584b2554d11fda4998a8d2be6145b0eded5049/torch/_library/opaque_object.py#L73 like you do in Python, and then align the serialization so we don't need the no-op?
Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
Type of change
Please delete options that are not relevant and/or add your own.
Checklist: