-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Optimize ReadOnlySpan<T> initialization from new T[constSize]
syntax
#80734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Extended TryEmitOptimizedReadonlySpanCreation to handle arrays created with constant size but no initializer - Treat `new T[constSize]` as equivalent to `new T[] { default, default, ..., default }` - Added 7 new tests covering various scenarios (byte, sbyte, int types, zero size, non-constant size, in constructor) - All 184 ReadOnlySpan tests passing Co-authored-by: CyrusNajmabadi <[email protected]>
- Added null check for ConstantValue.Default to prevent issues with non-primitive types - Added MaxSynthesizedArraySize limit (1000) to prevent excessive memory usage during compilation - Added test for large array size that should not be optimized - All 185 ReadOnlySpan construction tests passing Co-authored-by: CyrusNajmabadi <[email protected]>
- Clarified that defaultConstant is calculated before allocating builder to fail fast - All 185 ReadOnlySpan construction tests passing Co-authored-by: CyrusNajmabadi <[email protected]>
new T[constSize]
syntax
} | ||
|
||
// For very large arrays, don't synthesize the initializers to avoid excessive memory usage during compilation | ||
// The existing block initializer code has similar limits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot can you link me to the existing limits that you are referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the codebase more carefully, I see that the existing ShouldEmitBlockInitializer
method (line 246-291) uses initCount > 2
as a threshold for deciding whether to use block initialization, but there's no explicit size limit on the number of elements that can be optimized.
The MaxSynthesizedArraySize = 1000
limit I added is actually a new safety measure specific to this optimization path. When we synthesize default-valued BoundLiteral objects for new T[constSize]
, each element creates an object in memory during compilation. Without a limit, new byte[1000000]
would attempt to create 1 million BoundLiteral objects during compilation, which could cause excessive memory usage.
The existing optimization paths don't need this limit because:
- For arrays with explicit initializers (
new T[] { 1, 2, 3 }
), the BoundLiteral objects already exist in the bound tree - For block initialization, the data is stored as a byte array blob, not individual BoundLiteral objects
Should I remove this limit, adjust it, or add a comment clarifying that this is a new limit specific to this optimization path?
initializers = initializer.Initializers; | ||
elementCount = initializers.Length; | ||
} | ||
else if (ac.Bounds.Length == 1 && ac.Bounds[0].ConstantValueOpt is { } sizeConstant && sizeConstant.Discriminator == ConstantValueTypeDiscriminator.Int32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot i think we also want to add checks like so (copied from below) so that we don't do this if it makes no sense:
// The blob optimization is only supported for core primitive types that can be stored in metadata blobs.
// For enums, we need to use the underlying type.
specialElementType = elementType.EnumUnderlyingTypeOrSelf().SpecialType;
if (!IsTypeAllowedInBlobWrapper(specialElementType))
also. i don't think it's great to implement this as acually creating all these fake bound nodes. I haven't looked at tryEmitAsCachedArrayOfConstants. But it feels more like we'd want to just check if this BoundArrayCreation is valid or not for this sort of emit, and then do the right emit later based on the values we extract for it (without needing to actually create elements).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 2139f75. Changes:
- Added early check using
IsTypeAllowedInBlobWrapper
to fail fast for non-primitive types before any processing - Refactored to avoid creating BoundLiteral objects entirely - implemented
emitReadOnlySpanFromZeroInitializedArray
that directly emits IL code - Creates a zero-byte blob and emits appropriate code based on element size (pointer constructor for 1-byte types, RuntimeHelpers.CreateSpan for multi-byte types)
- Removed the
MaxSynthesizedArraySize
limit since we no longer create individual bound nodes in memory - Large arrays (e.g.,
new byte[2000]
) now optimize efficiently
All 185 tests passing.
…ecks - Added early check for IsTypeAllowedInBlobWrapper to fail fast for non-primitive types - Implemented emitReadOnlySpanFromZeroInitializedArray to emit code directly without synthesizing BoundLiteral objects - Removed MaxSynthesizedArraySize limit since we no longer create individual bound nodes - Large arrays can now be optimized efficiently (creates zero-byte blob instead of allocating) - Updated test to reflect that large arrays ARE now optimized - All 185 ReadOnlySpan construction tests passing Co-authored-by: CyrusNajmabadi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in the original issue, I don’t think this is a good idea. I would rather close the original issue as won’t fix.
Understood. I'll let the team weigh in. to me, having |
No different behavior, just different resource usage. That's 100% expected when moving between even minor changes in syntax (or no changes in syntax at all!). Think about inlining. The programmer may not have even changed the code, but environmental impacts will cause differences in execution. Optimization is the delicate balance of choosing the right implementation for a particular behavior. |
I'm saying that at the language level i do not expect or think that it's good that these two forms generate different results. In a very real sense i would want to define things such I'm happy to have the team weigh off here. I hear what you're saying. |
Can you elaborate a bit on what you mean here? What is the user-visible difference that occurs? |
new T[constSize]
syntaxnew T[constSize]
pattern (array with bounds but no initializer)new T[constSize]
to implicit zero-initialized array for optimizationOriginal prompt
Fixes #33088
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.