APEX Upstreaming patch series #240
75d8b13 to 0cfd94c
MichielDerhaeg left a comment
Does the dg-lto-error patch need to include some tests?
We can probably submit this asap but then we need to prove that it works by itself.
Is there a precedent for adding tests to validate the testing infra?
It makes sense. I will create test cases for this patch only.
93eadb1 to 349226f
The LTO testsuite lacked support for testing expected link-time errors
and examining assembly output from the ltrans phase. This made it
difficult to write tests for LTO features that detect conflicts at link
time or need to verify code generation in specific LTRANS units.

This patch adds two capabilities to the LTO testsuite:
- The dg-lto-error directive allows tests to specify expected link-time
  errors. When present, lto.exp expects the link to fail and marks the
  test as PASS if it fails with matching diagnostics, or FAIL if the
  link unexpectedly succeeds. This parallels the existing dg-lto-warning
  and dg-lto-message directives.
- The new scan-ltrans-assembler procedures provide a way to examine the
  assembly output (.ltrans*.s files) generated during the ltrans phase.
  The scan-ltrans-assembler procedure checks for pattern presence, while
  scan-ltrans-assembler-times verifies exact occurrence counts.

Future patches will make use of this support for link-time diagnostics
and ltrans code generation testing.

gcc/testsuite/ChangeLog:
* lib/lto.exp (lto-link-and-maybe-run): Add dg_lto_has_error handling
to expect link failures when dg-lto-error is present. Pass test when
link fails as expected, fail when link succeeds despite errors.
(lto-can-handle-directive): Add dg-lto-error to recognized directive
list.
(lto-execute-1): Initialize dg_lto_has_error flag.
* lib/scanltrans.exp (scan-ltrans-assembler): New procedure to scan
for patterns in .ltrans*.s assembly files.
(scan-ltrans-assembler-times): New procedure to verify pattern count
across all ltrans assembly files.
* g++.dg/lto/scan-ltrans-asm-1_0.C: New test.
* g++.dg/lto/scan-ltrans-asm-1_1.C: New test.
* gcc.dg/lto/dg-lto-error-1_0.c: New test.
* gcc.dg/lto/dg-lto-error-1_1.c: New test.
* gcc.dg/lto/dg-lto-error-2_0.c: New test.
* gcc.dg/lto/dg-lto-error-2_1.c: New test.
* gcc.dg/lto/scan-ltrans-asm-1_0.c: New test.
* gcc.dg/lto/scan-ltrans-asm-1_1.c: New test.
Signed-off-by: Luis Silva <luiss@synopsys.com>
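
The PASS/FAIL rule for the link step described in this commit message can be summarized as a small decision table. The sketch below is only a C model of that rule: the real logic is Tcl code in lib/lto.exp, and the names `lto_link_verdict`, `LINK_PASS` and `LINK_FAIL` are hypothetical. The behaviour without dg-lto-error (a failed link is a FAIL) is assumed to be the usual testsuite behaviour, and diagnostic matching is not modeled.

```c
#include <stdbool.h>

/* Hypothetical model of the link-step verdict.  The real implementation
   is Tcl code in lib/lto.exp; every name here is illustrative only.  */
enum link_verdict { LINK_PASS, LINK_FAIL };

enum link_verdict
lto_link_verdict (bool has_lto_error, bool link_failed)
{
  /* With dg-lto-error present, a failing link is the expected outcome.  */
  if (has_lto_error)
    return link_failed ? LINK_PASS : LINK_FAIL;

  /* Assumed default: without the directive the link must succeed.  */
  return link_failed ? LINK_FAIL : LINK_PASS;
}
```

The inverted branch is the whole point of the directive: a link that succeeds when an error was expected is reported as a failure rather than silently passing.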
349226f to 74d9636
d586196 patch sent upstream: https://gcc.gnu.org/pipermail/gcc-patches/2026-May/716427.html
dd11ae8 to 5ed749d
5ed749d to 5b8f54d
Should the last patch not just be part of the LTO patch? The LTO patch would be incorrect without it. Patches are fine to send save for that.
I also thought about that. Perhaps that would make more sense.
This patch implements the backend machine description patterns and
constraints required for APEX custom instruction code generation.

The implementation provides RTL patterns covering all APEX instruction
signatures through define_insn and define_expand patterns in
arcv-apex.md. Each pattern uses UNSPEC/UNSPEC_VOLATILE to prevent
unwanted optimization of side-effecting instructions while allowing pure
operations to be optimized normally. Mode iterators (APEX_DEST,
APEX_SRC0, APEX_SRC1) generate variants for SI/DI/SF/DF modes from a
single pattern template, reducing code duplication.

Constraint predicates (xAVpXD, xAVpXS, xAVpXI, xAVpXC) validate that the
opcode index matches the instruction format at instruction selection
time, ensuring arcv_apex_asm_mnemonic can generate correct mnemonics
with format-specific suffixes when needed.

gcc/ChangeLog:
* config.gcc: Add arcv-builtins.o to extra_objs for RISC-V.
* config/riscv/constraints.md (Os08): New constraint for 8-bit signed
immediates.
(xAVpXD, xAVpXS, xAVpXI, xAVpXC): New constraints for APEX opcode
validation per instruction format.
* config/riscv/iterators.md (APEX_DEST, APEX_SRC0, APEX_SRC1): New mode
iterators for APEX instruction operands.
* config/riscv/riscv-protos.h (arcv_apex_asm_mnemonic): Declare.
(arcv_apex_format_enabled_p): Declare.
* config/riscv/riscv.h (RISCV_APEX_H): New header guard.
(enum apex_insn_format): New enum for APEX instruction formats.
* config/riscv/riscv.md: Include arcv-apex.md.
* config/riscv/t-riscv: Add build rules for arcv-builtins.o.
* config/riscv/arcv-apex.md: New file.
* config/riscv/arcv-builtins.cc: New file.
Signed-off-by: Luis Silva <luiss@synopsys.com>
This patch implements the builtin registration and RTL expansion logic
for APEX custom instructions, connecting the #pragma intrinsic frontend
to the backend machine description patterns.

Format inference analyzes function signatures to determine which
encoding formats (XD, XS, XI, XC) are valid based on parameter count and
types. The arcv_apex_infer_format function checks operand flags against
opcode field widths to compute compatible formats, while
arcv_apex_validate_format verifies user-provided formats match the
inferred set.

Builtin expansion follows the standard GCC pattern:
arcv_apex_expand_builtin dispatches to arcv_apex_expand_builtin_direct,
which prepares operands via arcv_apex_prepare_builtin_arg (checking
immediate ranges for XS/XI/XC formats) before calling
arcv_apex_expand_builtin_insn to emit the selected RTL pattern.

Assembly .extInstruction directives are emitted at registration time,
allowing the assembler to recognize and encode APEX mnemonics.

gcc/ChangeLog:
* config.gcc: Add arcv.o to extra_objs for RISC-V.
* config/riscv/arcv-builtins.cc (arcv_apex_infer_operand_flags): New
function to infer APEX_VOID/APEX_NO_SRC0/APEX_NO_SRC1 from signature.
(arcv_apex_infer_format): New function to compute valid formats from
operand flags and opcode.
(arcv_apex_validate_format): New function to check user-provided
formats against inferred formats.
(arcv_apex_icode): New function to map format to insn_code.
(arcv_apex_register_builtin): New function to register APEX builtin
and emit assembly directive.
(arcv_apex_immediate_argument_valid_p): New function to check immediate
ranges for XS/XI/XC formats.
(arcv_apex_prepare_builtin_arg): New function to prepare and validate
builtin arguments.
(arcv_apex_expand_builtin_insn): New function to emit RTL pattern.
(arcv_apex_expand_builtin_direct): New function to expand direct
builtin by preparing operands and emitting insn.
(arcv_apex_expand_builtin): New function to dispatch builtin expansion.
* config/riscv/riscv-builtins.cc (riscv_builtin_decl): Add
RISCV_BUILTIN_APEX case with sorry diagnostic for LTO.
(riscv_gimple_fold_builtin): Add RISCV_BUILTIN_APEX case.
(riscv_expand_builtin): Likewise.
* config/riscv/riscv-c.cc (riscv_check_builtin_call): Likewise.
(riscv_resolve_overloaded_builtin): Likewise.
* config/riscv/riscv-protos.h (enum riscv_builtin_class): Likewise.
(arcv_apex_expand_builtin): Declare.
(arcv_apex_emit_ext_directive): Likewise.
* config/riscv/riscv-vector-builtins.h (RVV_EXT_PARTITION_SHIFT):
Increase from 1 to 2 to match RISCV_BUILTIN_SHIFT. Update partition
encoding comment to reflect new bit layout.
* config/riscv/riscv.h (enum APEX_OPCODE_FIELD_MAX):
(APEX_FORMAT_MASK): New macro to extract format bits.
(enum apex_signature_mask): New enum for operand signature bits.
* config/riscv/t-riscv: Add build rules for arcv.o.
* config/riscv/arcv.cc: New file.
Signed-off-by: Luis Silva <luiss@synopsys.com>
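
The immediate range check performed while preparing builtin arguments can be illustrated with a short C sketch. It is not the code from arcv-builtins.cc: `fits_signed`, `apex_immediate_ok` and the `SKETCH_*` enumerators are hypothetical names. The widths used are the ones stated elsewhere in this series (8-bit signed for XS, matching the Os08 constraint, and 12-bit signed for XI); XC is left out because its immediate width is not stated in the descriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for two of the APEX formats.  */
enum apex_format_sketch { SKETCH_XS, SKETCH_XI };

/* Return true if VALUE fits in a BITS-wide two's-complement field.  */
bool
fits_signed (int64_t value, unsigned bits)
{
  assert (bits >= 1 && bits <= 63);
  int64_t min = -((int64_t) 1 << (bits - 1));
  int64_t max = ((int64_t) 1 << (bits - 1)) - 1;
  return value >= min && value <= max;
}

/* Assumed widths: XS takes an 8-bit signed immediate, XI a 12-bit one.  */
bool
apex_immediate_ok (enum apex_format_sketch fmt, int64_t value)
{
  switch (fmt)
    {
    case SKETCH_XS:
      return fits_signed (value, 8);   /* -128 .. 127 */
    case SKETCH_XI:
      return fits_signed (value, 12);  /* -2048 .. 2047 */
    }
  return false;
}
```

Rejecting an out-of-range constant at expansion time lets the compiler report a diagnostic instead of emitting an instruction the assembler cannot encode.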
This patch implements the C frontend support for APEX custom instructions
via the #pragma intrinsic directive. This allows users to bind existing
C function declarations to APEX custom instruction opcodes at compile time.
Syntax:
#pragma intrinsic(fn_name, "mnemonic", opcode, "format"...)
Where:
- fn_name: Previously declared C function to register as intrinsic
- mnemonic: Assembly instruction name (normalized to lowercase)
- opcode: Instruction opcode (0-255 for XD, 0-63 for XS, 0-31 for XI/XC)
- format: One or more of "XD", "XS", "XI", "XC", or "side_effect"
The pragma parser validates identifier syntax, looks up the function
declaration via lookup_name, extracts format flags from the pragma
arguments, and delegates to arcv_apex_register_builtin for actual
registration.
Format classes:
- XD: Default register-register format (up to 3 registers, 8-bit opcode)
- XS: Two-operand with 8-bit signed immediate (6-bit opcode)
- XI: One-operand with 12-bit signed immediate (5-bit opcode)
- XC: Accumulator format where destination is also source0 (5-bit opcode)
- side_effect: Marks instruction as volatile to prevent optimization
If no format is specified, the compiler infers it from the opcode value
and function signature (see arcv_apex_infer_format in the middle-end
patch).
Example:
int custom_add (int a, int b);
#pragma intrinsic(custom_add, "myadd", 42, "XD")
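
The opcode ranges listed above follow directly from the opcode field widths (8, 6 and 5 bits). The following C sketch shows how a set of compatible formats could be derived from the opcode value alone and how a requested format could be checked against it. The names `FMT_*`, `opcode_max`, `formats_for_opcode` and `format_request_ok` are hypothetical, and the function signature, which the real inference also considers, is deliberately not modeled.

```c
#include <stdbool.h>

/* Hypothetical format bits; the real encoding lives in riscv.h.  */
enum { FMT_XD = 1 << 0, FMT_XS = 1 << 1, FMT_XI = 1 << 2, FMT_XC = 1 << 3 };

/* Largest opcode that fits in an opcode field BITS wide.  */
unsigned
opcode_max (unsigned bits)
{
  return (1u << bits) - 1;
}

/* Mask of formats whose opcode field can encode OPCODE.  */
unsigned
formats_for_opcode (unsigned opcode)
{
  unsigned mask = 0;
  if (opcode <= opcode_max (8))   /* XD: 0-255 */
    mask |= FMT_XD;
  if (opcode <= opcode_max (6))   /* XS: 0-63 */
    mask |= FMT_XS;
  if (opcode <= opcode_max (5))   /* XI/XC: 0-31 */
    mask |= FMT_XI | FMT_XC;
  return mask;
}

/* A request is acceptable if it names at least one format and every
   requested format can encode OPCODE.  */
bool
format_request_ok (unsigned requested, unsigned opcode)
{
  return requested != 0 && (requested & ~formats_for_opcode (opcode)) == 0;
}
```

Under this model, opcode 42 from the example is compatible with XD and XS only, so the explicit "XD" request is accepted while an "XI" request would be rejected.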
gcc/ChangeLog:
* config/riscv/riscv-c.cc (arcv_apex_lookup_function): New function
to lookup function declarations for pragma processing.
(arcv_apex_valid_identifier_p): New function to validate APEX
mnemonic identifiers.
(arcv_apex_pragma_intrinsic): New function to parse #pragma intrinsic
and register APEX builtins.
(riscv_register_pragmas): Register #pragma intrinsic.
* config/riscv/riscv-protos.h (arcv_apex_register_builtin): Declare.
* doc/extend.texi: New subsection documenting #pragma intrinsic for
APEX custom instructions.
gcc/testsuite/ChangeLog:
* g++.target/riscv/apex/apex.exp: New test driver for APEX C++ tests.
* g++.target/riscv/apex/arcv-apex-test1.C: New test.
* g++.target/riscv/apex/arcv-apex-test2.C: New test.
* g++.target/riscv/apex/arcv-apex-test3.C: New test.
* g++.target/riscv/apex/arcv-apex-test4.C: New test.
* g++.target/riscv/apex/arcv-apex-test5.C: New test.
* gcc.target/riscv/apex/apex.exp: New test driver for APEX C tests.
* gcc.target/riscv/apex/arcv-apex-err1.c: New test.
* gcc.target/riscv/apex/arcv-apex-err10.c: New test.
* gcc.target/riscv/apex/arcv-apex-err11.c: New test.
* gcc.target/riscv/apex/arcv-apex-err12.c: New test.
* gcc.target/riscv/apex/arcv-apex-err13.c: New test.
* gcc.target/riscv/apex/arcv-apex-err14.c: New test.
* gcc.target/riscv/apex/arcv-apex-err15.c: New test.
* gcc.target/riscv/apex/arcv-apex-err16-32.c: New test.
* gcc.target/riscv/apex/arcv-apex-err16-64.c: New test.
* gcc.target/riscv/apex/arcv-apex-err17-32.c: New test.
* gcc.target/riscv/apex/arcv-apex-err17-64.c: New test.
* gcc.target/riscv/apex/arcv-apex-err18.c: New test.
* gcc.target/riscv/apex/arcv-apex-err19.c: New test.
* gcc.target/riscv/apex/arcv-apex-err2.c: New test.
* gcc.target/riscv/apex/arcv-apex-err20.c: New test.
* gcc.target/riscv/apex/arcv-apex-err3.c: New test.
* gcc.target/riscv/apex/arcv-apex-err4.c: New test.
* gcc.target/riscv/apex/arcv-apex-err5.c: New test.
* gcc.target/riscv/apex/arcv-apex-err6.c: New test.
* gcc.target/riscv/apex/arcv-apex-err7.c: New test.
* gcc.target/riscv/apex/arcv-apex-err8.c: New test.
* gcc.target/riscv/apex/arcv-apex-err9.c: New test.
* gcc.target/riscv/apex/arcv-apex-test1.c: New test.
* gcc.target/riscv/apex/arcv-apex-test10.c: New test.
* gcc.target/riscv/apex/arcv-apex-test11.c: New test.
* gcc.target/riscv/apex/arcv-apex-test12.c: New test.
* gcc.target/riscv/apex/arcv-apex-test13.c: New test.
* gcc.target/riscv/apex/arcv-apex-test14.c: New test.
* gcc.target/riscv/apex/arcv-apex-test15.c: New test.
* gcc.target/riscv/apex/arcv-apex-test16.c: New test.
* gcc.target/riscv/apex/arcv-apex-test2.c: New test.
* gcc.target/riscv/apex/arcv-apex-test3.c: New test.
* gcc.target/riscv/apex/arcv-apex-test4.c: New test.
* gcc.target/riscv/apex/arcv-apex-test5.c: New test.
* gcc.target/riscv/apex/arcv-apex-test6.c: New test.
* gcc.target/riscv/apex/arcv-apex-test7.c: New test.
* gcc.target/riscv/apex/arcv-apex-test8.c: New test.
* gcc.target/riscv/apex/arcv-apex-test9.c: New test.
Signed-off-by: Luis Silva <luiss@synopsys.com>
This patch adds Link-Time Optimization support for APEX custom
instructions by serializing pragma-registered builtin metadata across
compilation units. Since APEX intrinsics are registered dynamically via
#pragma intrinsic rather than predefined in the compiler, their metadata
(name, mnemonic, opcode, format) must be explicitly streamed to survive
LTO.

The implementation adds a custom LTO section (LTO_section_riscv_apex)
and hooks into the LTO write/read phases. During output
(arcv_apex_lto_write_section), referenced APEX builtins are serialized
to the LTO stream. During input (arcv_apex_lto_read_section), builtin
metadata from all compilation units is deserialized and re-registered.
The arcv_apex_lto_lookup_builtin function detects conflicting
definitions across translation units and reports errors.

However, LTO exposes a latent hash-equality inconsistency in
fold-const.cc for BUILT_IN_MD and BUILT_IN_FRONTEND builtins:
operand_equal_p considers two builtin FUNCTION_DECLs equal when they
share DECL_BUILT_IN_CLASS and DECL_UNCHECKED_FUNCTION_CODE, but
hash_operand only canonicalizes BUILT_IN_NORMAL builtins and falls back
to hashing DECL_UID for other classes. When APEX builtins are
re-registered across TUs with distinct FUNCTION_DECL nodes but identical
builtin metadata, verify_hash_value asserts due to the hash/equality
mismatch. This patch also fixes the issue by hashing
non-BUILT_IN_NORMAL builtins using (DECL_BUILT_IN_CLASS,
DECL_UNCHECKED_FUNCTION_CODE), matching what operand_equal_p compares.

gcc/ChangeLog:
* config/riscv/arcv-builtins.cc (arcv_apex_lto_lookup_builtin): New
function to find existing builtin and detect conflicts.
(arcv_apex_lto_register_builtin): New function to register builtin
during LTO read, skipping duplicates and reporting conflicts.
(arcv_apex_lto_write_section): New function to serialize referenced
APEX builtins to LTO stream.
(arcv_apex_lto_read_section): New function to deserialize and
re-register APEX builtins from all LTO input files.
* config/riscv/riscv-builtins.cc (riscv_builtin_decl): Remove
RISCV_BUILTIN_APEX sorry diagnostic for LTO.
* config/riscv/riscv.h (TARGET_RISCV_APEX): Define.
* lto-section-in.cc (lto_section_name): Add "riscv.apex" section name.
* lto-streamer-out.cc (produce_asm_for_decls): Call
arcv_apex_lto_write_section.
* lto-streamer.h (enum lto_section_type): Add LTO_section_riscv_apex.
(arcv_apex_lto_write_section): Declare.
(arcv_apex_lto_read_section): Declare.
gcc/lto/ChangeLog:
* lto-common.cc (read_cgraph_and_symbols): Call
arcv_apex_lto_read_section after reading all input files.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/apex/apex.exp: Add LTO test support.
* gcc.target/riscv/apex/arcv-apex-lto-err1_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err1_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err2_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err2_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err3_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err3_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err4_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-err4_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test1_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test1_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test2_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test2_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test3_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test3_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test4_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test4_1.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test5_0.c: New test.
* gcc.target/riscv/apex/arcv-apex-lto-test5_1.c: New test.
Signed-off-by: Luis Silva <luiss@synopsys.com>
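
The hash/equality mismatch described in this commit message comes down to one invariant: two values that compare equal must hash to the same value. The C sketch below is a simplified model of that invariant, not GCC code. `struct decl_sketch`, `decls_equal`, `hash_before` and `hash_after` are hypothetical names, the mixing function is arbitrary, and the canonicalization GCC performs for BUILT_IN_NORMAL is approximated by hashing (class, code).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a FUNCTION_DECL.  */
enum builtin_class_sketch
{
  SK_NOT_BUILT_IN, SK_BUILT_IN_FRONTEND, SK_BUILT_IN_MD, SK_BUILT_IN_NORMAL
};

struct decl_sketch
{
  enum builtin_class_sketch cls;  /* models DECL_BUILT_IN_CLASS */
  unsigned code;                  /* models DECL_UNCHECKED_FUNCTION_CODE */
  unsigned uid;                   /* models DECL_UID */
};

/* Arbitrary mixing function, chosen only for the sketch.  */
static uint32_t
mix (uint32_t cls, uint32_t code)
{
  return cls * 2654435761u + code;
}

/* Equality as described: builtins match on (class, code).  */
bool
decls_equal (const struct decl_sketch *a, const struct decl_sketch *b)
{
  if (a->uid == b->uid)
    return true;
  return a->cls != SK_NOT_BUILT_IN && a->cls == b->cls && a->code == b->code;
}

/* Old behaviour: only BUILT_IN_NORMAL avoids the DECL_UID fallback.  */
uint32_t
hash_before (const struct decl_sketch *d)
{
  return d->cls == SK_BUILT_IN_NORMAL ? mix (d->cls, d->code) : d->uid;
}

/* Fixed behaviour: every builtin class hashes what equality compares.  */
uint32_t
hash_after (const struct decl_sketch *d)
{
  return d->cls != SK_NOT_BUILT_IN ? mix (d->cls, d->code) : d->uid;
}
```

With two BUILT_IN_MD declarations that share a function code but have different UIDs, `decls_equal` returns true while `hash_before` returns two different values, which is the situation verify_hash_value rejects; `hash_after` returns the same value for both.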
5b8f54d to d79379d
  [(set (match_operand:APEX_DEST 0 "register_operand" "=r,r,r")
        (unspec_volatile:APEX_DEST [(match_operand:SI 1 "const_int_operand" "xAVpXS,xAVpXC,xAVpXD")
                                    (match_operand:APEX_SRC0 2 "register_operand" "r,0,r")
                                    (match_operand:APEX_SRC1 3 "nonmemory_operand" "Os08,I,r")]
Try to generate an APEX insn whose operand 3 is a large value. I think this will crash the compiler. In the expand it is advisable to have wide predicates, but in the define_insn the predicate should be the union of the constraints.
After you do this, check how the compiler handles the case where the immediate is large.
  [(set_attr "type" "arith,arith,arith")]
)

(define_expand "riscv_arcv_apex_dest_ftype_src0_src1_v"
This can be merged with riscv_arcv_apex_dest_ftype_src0_src1.
(define_insn "riscv_arcv_apex_<APEX_DEST:mode>_ftype_<APEX_SRC0:mode>_<APEX_SRC1:mode>"
  [(set (match_operand:APEX_DEST 0 "register_operand" "=r,r,r")
        (unspec:APEX_DEST [(match_operand:SI 1 "const_int_operand" "xAVpXS,xAVpXC,xAVpXD")
                           (match_operand:APEX_SRC0 2 "register_operand" "r,0,r")
Here (and anywhere else you tie an input operand to operand 0 with the "0" constraint) there is a problem.
After reload_completed it may not be possible for operand 2 to share a register with operand 0 of your instruction: imagine the case where input operand 2 is used later by another instruction, while your instruction overwrites it because operand 0 is an output.
You need to make a split, and in that split add a move from reg 2 to reg 0 next to your instruction. As this move is a set, it will be optimized away later if it is not needed.
You can look at my push for target arc64, where I have several define_insn_and_split patterns.
Signed-off-by: Luis Silva <luis.silva@globalfoundries.com>
No description provided.