【WIP】【Hackathon 10th Spring No.2】Add schema parser and related functionality for Torch compatibility#77938
Conversation
|
你的PR提交成功,感谢你对开源项目的贡献! |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive schema parser functionality to the TORCH_LIBRARY registration mechanism, enabling PyTorch-compatible operator definitions with full type information, default values, keyword arguments, and alias annotations. This is part of the Hackathon 10th Spring No.2 task and significantly enhances the compatibility layer between PaddlePaddle and PyTorch.
Changes:
- Implemented schema parsing infrastructure including FunctionSchemaParser and SchemaTypeParser to parse PyTorch-style operator schemas
- Extended FunctionArgs to support named/keyword arguments and automatic argument normalization based on parsed schemas
- Added comprehensive type system (FunctionSchema, Argument, AliasInfo, Type hierarchy) to represent parsed schema information
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/cpp/compat/torch_library_test.cc | Added 400+ lines of comprehensive tests for schema parsing, including default values, keyword arguments, optional types, tuples, and variadic functions |
| test/cpp/compat/schema_parser_type_test.cc | New test file (206 lines) specifically for testing schema type parser edge cases and torchcodec-like schemas |
| test/cpp/compat/CMakeLists.txt | Moved torch_library_test from GPU-only (nv_test) to general (cc_test) and added schema_parser_type_test |
| paddle/phi/api/include/compat/torch/library.h | Added schema binding to CppFunction, keyword argument support in FunctionArgs, and normalize_args_by_schema logic |
| paddle/phi/api/include/compat/torch/library.cpp | Updated OperatorRegistry to parse and bind schemas to implementations |
| paddle/phi/api/include/compat/torch/csrc/jit/function_schema_parser.h/cpp | New 574-line schema parser implementation handling full PyTorch schema grammar |
| paddle/phi/api/include/compat/torch/csrc/jit/schema_type_parser.h/cpp | New 240-line type parser for parsing schema types including tuples, optionals, and alias annotations |
| paddle/phi/api/include/compat/torch/csrc/jit/schema_parser_defs.h | Constants and macros for schema parsing (type names, keywords, characters) |
| paddle/phi/api/include/compat/ATen/core/function_schema.h/cpp | New FunctionSchema and Argument classes to represent parsed schemas |
| paddle/phi/api/include/compat/ATen/core/jit_type_base.h | Base Type class hierarchy with SingletonOrSharedTypePtr for type system |
| paddle/phi/api/include/compat/ATen/core/jit_type.h | Concrete type implementations (TensorType, IntType, FloatType, etc.) and schema-specific types |
| paddle/phi/api/include/compat/ATen/core/alias_info.h | AliasInfo class for representing aliasing metadata in schemas |
| paddle/phi/api/include/compat/ATen/core/type_ptr.h | SingletonTypePtr wrapper for singleton type instances |
| paddle/phi/api/include/compat/ATen/core/functional.h | Utility functions (fmap, filter) for working with collections |
| paddle/phi/api/include/compat/CMakeLists.txt | Updated build to include new schema parser source files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| const size_t idx = it->second; | ||
| if (assigned[idx]) { | ||
| throw std::runtime_error("Argument `" + name + "` is already provided"); |
There was a problem hiding this comment.
The error message "Argument name is already provided" could be more helpful by specifying whether the argument was provided both positionally and by keyword. Consider changing to: "Argument " + name + " provided both positionally and as keyword argument"
| throw std::runtime_error("Argument `" + name + "` is already provided"); | |
| throw std::runtime_error("Argument `" + name + | |
| "` provided both positionally and as keyword argument"); |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std::move(*alias_info)) | ||
| : nullptr), | ||
| kwarg_only_(kwarg_only) { | ||
| // this is an softly-enforced invariant for out arguments. |
There was a problem hiding this comment.
The comment contains a grammatical error. It should be "This is a softly-enforced invariant" instead of "this is an softly-enforced invariant". The article "an" should be "a" before "softly".
| // this is an softly-enforced invariant for out arguments. | |
| // This is a softly-enforced invariant for out arguments. |
|
@gouzil 测试哈,这个 PR 当下 🐁 |
| @@ -153,6 +192,10 @@ class FunctionArgs { | |||
|
|
|||
There was a problem hiding this comment.
[gh-llm test] second inline review comment: named args API 这里建议后续补充顺序稳定性说明。
| std::ostringstream oss; | ||
| oss << "FunctionArgs["; | ||
| for (size_t i = 0; i < args_.size(); ++i) { | ||
| if (i > 0) oss << ", "; |
There was a problem hiding this comment.
[gh-llm test] third inline review comment: to_string() 输出 kwargs 时可考虑 deterministic 排序,方便测试断言。
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (67.82%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## develop #77938 +/- ##
==========================================
Coverage ? 67.82%
==========================================
Files ? 11
Lines ? 805
Branches ? 0
==========================================
Hits ? 546
Misses ? 259
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. |
There was a problem hiding this comment.
参考 torch 的代码记得添加声明
// #The file has been adapted from pytorch project
// #Licensed under BSD-style license -
// https://github.com/pytorch/pytorch/blob/main/LICENSE
参考 ivalue.h
PR Category
Execute Infrastructure
PR Types
Improvements
Description
PaddlePaddle/community#1210 的实现
TORCH_LIBRARY注册机制添加 schema 支持, 修改了OperatorRegistry的 schema 数据结构单测调整
暂时还不行,原因可以查看 https://github.com/PaddlePaddle/Paddle/actions/runs/22081133427/job/63808517292?pr=77938 运行日志torch_library_test移出需要 gpuschema_parser_type_test用于测试 schema 解析器TODO
paddle/pir/src/core/utils.cc都用到了相同的 hash_combine 函数,可以考虑提到一个通用的地方还未验证 https://github.com/PFCCLab/paddlecodec
是否引起精度变化
否