Skip to content

Commit 7ff22d0

Browse files
Fix ModelProvider base-ctor virtual-dispatch anti-pattern (microsoft#10628)
Fixes microsoft#10626. ## Problem `ModelProvider`'s base constructor was reaching back into derived-class state via virtual dispatch — the classic [CA2214](https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca2214) anti-pattern. C# guarantees derived ctor bodies run only after `base(...)` returns, so any code reachable from `ModelProvider..ctor` that overrides `BuildBaseType()` (or other virtuals) and reads a derived-class field gets a `NullReferenceException` deep in framework code, with no obvious link back to "your derived ctor body has not run yet." This is especially harmful for explicit extension points like `BuildBaseType` — extension authors discover the contract only via runtime NREs. Two distinct call chains in `ModelProvider..ctor` triggered this anti-pattern: ### Site 1 — `AddTypeToKeep(this)` (filed in microsoft#10626) ``` ModelProvider..ctor └─ CodeModelGenerator.AddTypeToKeep(this) └─ this.Type └─ this.BaseType └─ virtual BuildBaseType() ← dispatches onto partially-constructed derived class ``` ### Site 2 — `EnsureDiscriminatorValueExpression()` Surfaced while validating the fix end-to-end against the `Azure.Provisioning.Cdn` migration. After fixing site 1 above, regen still NRE'd: ``` ModelProvider..ctor └─ if (_inputModel.BaseModel is not null) DiscriminatorValueExpression = EnsureDiscriminatorValueExpression() └─ BaseModelProvider └─ BuildBaseModelProvider └─ get_BaseType └─ virtual BuildBaseType() ← same anti-pattern, different path ``` Real stack trace from a `ProvisioningResourceProvider` derived from `ProvisioningModelProvider : ModelProvider`: ``` at Azure.Generator.Provisioning.Providers.ProvisioningResourceProvider.BuildBaseType() at TypeProvider.get_BaseType() at ModelProvider.BuildBaseModelProvider() at ModelProvider.get_BaseModelProvider() at ModelProvider.EnsureDiscriminatorValueExpression() at ModelProvider..ctor(InputModelType inputModel) ``` ## Fix Three complementary changes that together remove every ctor-time reach-into-derived-state in `ModelProvider`: ### 1. Move keep-set registration out of the constructor (site 1, root cause) Removed the `AddTypeToKeep(this)` call from `ModelProvider..ctor`. `TypeFactory.CreateModel` now performs the registration **after** `CreateModelCore` and `PreVisitModel` have completed: ```csharp // TypeFactory.cs modelProvider = CreateModelCore(model); foreach (var visitor in Visitors) modelProvider = visitor.PreVisitModel(model, modelProvider); InputTypeToModelProvider[model] = modelProvider; if (modelProvider != null) { if (model.Access == "public") CodeModelGenerator.Instance.AddTypeToKeep(modelProvider); // ← moved here ... } ``` This mirrors the existing `EnumProvider` / `TypeFactory.CreateEnum` lifecycle, where the same `if (Access == "public") AddTypeToKeep(...)` registration is already performed post-construction. ### 2. Defer FQN resolution in `AddTypeToKeep(TypeProvider)` (defense in depth) `AddTypeToKeep(TypeProvider)` previously called `type.Type.FullyQualifiedName` immediately, which is what triggered the virtual `BuildBaseType` dispatch. Now it stores the `TypeProvider` reference and resolves the FQN lazily when `AdditionalRootTypes` / `NonRootTypes` is enumerated. The keep sets are only consumed by `GeneratedCodeWorkspace.PostProcessAsync` (and unit tests), well after every provider has been fully constructed. This hardens the API against any future ctor-time caller — including third-party extensions that subclass other `TypeProvider` types and call `AddTypeToKeep` from their own constructors. The public API surface of `AddTypeToKeep(string)` and `AddTypeToKeep(TypeProvider)` is unchanged; the storage is split internally and unioned at read time. ### 3. Defer `DiscriminatorValueExpression` evaluation (site 2) Converted `DiscriminatorValueExpression` from an eager-init property assigned in the ctor to a `Lazy<ValueExpression?>` materialized on first read: ```csharp // before public ValueExpression? DiscriminatorValueExpression { get; init; } if (_inputModel.BaseModel is not null) DiscriminatorValueExpression = EnsureDiscriminatorValueExpression(); // virtual dispatch via BaseModelProvider // after private readonly Lazy<ValueExpression?> _discriminatorValueExpression; public ValueExpression? DiscriminatorValueExpression => _discriminatorValueExpression.Value; _discriminatorValueExpression = new Lazy<ValueExpression?>(() => _inputModel.BaseModel is not null ? EnsureDiscriminatorValueExpression() : null); ``` All `DiscriminatorValueExpression` consumers (`ModelProvider` ctor body for non-primary constructors, `ModelFactoryProvider`) read it during emission/serialization, far after construction. No external callers write to the property — verified across `microsoft/typespec` and `Azure/azure-sdk-for-net`. ## Tests Two regression tests in `ModelProviderTests`, both using a `DerivedModelProviderReadingOwnField` fixture whose `BuildBaseType()` override reads a derived-class field — would NRE before the fix: - `DerivedModelProviderConstructionDoesNotForceTypeEvaluation` — covers sites 1 & 2 (registration + lazy FQN). Constructs the derived provider, calls `AddTypeToKeep(TypeProvider)` explicitly, verifies FQN appears in `AdditionalRootTypes`. - `DerivedModelProviderConstructionDoesNotForceDiscriminatorEvaluation` — covers site 3. Constructs a derived provider for a model with a base + discriminator value, asserts no NRE during ctor and that `DiscriminatorValueExpression` is still computable on demand. Existing `PublicModelsAreIncludedInAdditionalRootTypes` / `InternalModelsAreNotIncludedInAdditionalRootTypes` tests continue to pass — they query the keep set after `MockHelpers.LoadMockGenerator(...)` builds the full output library, which goes through the new `TypeFactory.CreateModel` registration path. Test results: - `Microsoft.TypeSpec.Generator.Tests`: **1452 / 1452 passing** - `Microsoft.TypeSpec.Generator.ClientModel.Tests`: **1323 / 1323 passing** End-to-end validation: re-ran the `Azure.Provisioning.Cdn` regen (which initially exposed both NRE sites) with this PR's binaries overlaid. Generation now completes cleanly. ## Behavioral notes - Anyone instantiating `ModelProvider` directly via `new ModelProvider(...)` instead of `TypeFactory.CreateModel(...)` will no longer get the auto-keep behavior. This matches `EnumProvider`'s existing contract and is consistent with the framework's intended factory entry point. None of the in-tree generators (mgmt / provisioning / azure / scm) rely on the bypass path; the only direct `new ModelProvider(...)` calls are in unit tests, which assert behavior that does not depend on keep-set membership. - `DiscriminatorValueExpression` is now computed on first access. Result is identical for all current call sites; the only observable difference is timing, which is irrelevant because all readers run during emission. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent f1d33ef commit 7ff22d0

4 files changed

Lines changed: 177 additions & 16 deletions

File tree

packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/CodeModelGenerator.cs

Lines changed: 66 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -160,9 +160,41 @@ public virtual void AddSharedSourceDirectory(string sharedSourceDirectory)
160160
_sharedSourceDirectories.Add(sharedSourceDirectory);
161161
}
162162

163-
internal HashSet<string> AdditionalRootTypes { get; } = [];
163+
private record KeptTypesInfo(HashSet<string> TypeNames, HashSet<TypeProvider> TypeProviders);
164164

165-
internal HashSet<string> NonRootTypes { get; } = [];
165+
private readonly KeptTypesInfo _additionalRootTypeInfo = new([], []);
166+
private readonly KeptTypesInfo _nonRootTypeInfo = new([], []);
167+
168+
private HashSet<string>? _additionalRootTypes;
169+
private HashSet<string>? _nonRootTypes;
170+
171+
/// <summary>
172+
/// The set of fully qualified type names to keep as roots. Resolved lazily so that
173+
/// <see cref="TypeProvider"/> entries added via <see cref="AddTypeToKeep(TypeProvider, bool)"/>
174+
/// are not forced to materialize their <see cref="TypeProvider.Type"/> at registration time
175+
/// (which would dispatch virtual <c>Build*</c> methods on partially constructed providers).
176+
/// </summary>
177+
internal HashSet<string> AdditionalRootTypes => _additionalRootTypes ??= MaterializeKeepSet(_additionalRootTypeInfo);
178+
179+
/// <summary>
180+
/// The set of fully qualified type names to keep as non-roots. Resolved lazily; see
181+
/// <see cref="AdditionalRootTypes"/> for rationale.
182+
/// </summary>
183+
internal HashSet<string> NonRootTypes => _nonRootTypes ??= MaterializeKeepSet(_nonRootTypeInfo);
184+
185+
private static HashSet<string> MaterializeKeepSet(KeptTypesInfo info)
186+
{
187+
if (info.TypeProviders.Count == 0)
188+
{
189+
return info.TypeNames;
190+
}
191+
var result = new HashSet<string>(info.TypeNames);
192+
foreach (var provider in info.TypeProviders)
193+
{
194+
result.Add(provider.Type.FullyQualifiedName);
195+
}
196+
return result;
197+
}
166198

167199
/// <summary>
168200
/// Adds a type to the list of types to keep.
@@ -174,21 +206,50 @@ public void AddTypeToKeep(string typeName, bool isRoot = true)
174206
{
175207
if (isRoot)
176208
{
177-
AdditionalRootTypes.Add(typeName);
209+
if (_additionalRootTypeInfo.TypeNames.Add(typeName))
210+
{
211+
_additionalRootTypes = null;
212+
}
178213
}
179214
else
180215
{
181-
NonRootTypes.Add(typeName);
216+
if (_nonRootTypeInfo.TypeNames.Add(typeName))
217+
{
218+
_nonRootTypes = null;
219+
}
182220
}
183221
}
184222

185223
/// <summary>
186224
/// Adds a type to the list of types to keep.
187225
/// </summary>
226+
/// <remarks>
227+
/// The provider's fully qualified name is resolved lazily, when the keep list is consumed during
228+
/// post-processing. This makes it safe to call this method from a <see cref="TypeProvider"/>
229+
/// constructor (including base constructors that run before the derived constructor body), since
230+
/// it does not force evaluation of <see cref="TypeProvider.Type"/> — which would dispatch virtual
231+
/// <c>Build*</c> methods on a not-yet-fully-constructed instance.
232+
/// </remarks>
188233
/// <param name="type">The type provider representing the type.</param>
189234
/// <param name="isRoot">Whether to treat the type as a root type. Any dependencies of root types will
190235
/// not have their accessibility changed regardless of the 'unreferenced-types-handling' value.</param>
191-
public void AddTypeToKeep(TypeProvider type, bool isRoot = true) => AddTypeToKeep(type.Type.FullyQualifiedName, isRoot);
236+
public void AddTypeToKeep(TypeProvider type, bool isRoot = true)
237+
{
238+
if (isRoot)
239+
{
240+
if (_additionalRootTypeInfo.TypeProviders.Add(type))
241+
{
242+
_additionalRootTypes = null;
243+
}
244+
}
245+
else
246+
{
247+
if (_nonRootTypeInfo.TypeProviders.Add(type))
248+
{
249+
_nonRootTypes = null;
250+
}
251+
}
252+
}
192253

193254
/// <summary>
194255
/// Writes additional output files (e.g. configuration schemas) after the main code generation is complete.

packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/Providers/ModelProvider.cs

Lines changed: 6 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -72,22 +72,17 @@ public ModelProvider(InputModelType inputModel) : base(inputModel)
7272
_inputModel = inputModel;
7373
_isMultiLevelDiscriminator = ComputeIsMultiLevelDiscriminator();
7474
_useObjectAdditionalProperties = new Lazy<bool>(ShouldUseObjectAdditionalProperties);
75-
76-
if (_inputModel.BaseModel is not null)
77-
{
78-
DiscriminatorValueExpression = EnsureDiscriminatorValueExpression();
79-
}
80-
81-
if (_inputModel.Access == "public")
82-
{
83-
CodeModelGenerator.Instance.AddTypeToKeep(this);
84-
}
8575
}
8676

8777
public bool IsUnknownDiscriminatorModel => _inputModel.IsUnknownDiscriminatorModel;
8878

8979
public string? DiscriminatorValue => _inputModel.DiscriminatorValue;
90-
public ValueExpression? DiscriminatorValueExpression { get; init; }
80+
81+
private ValueExpression? _discriminatorValueExpression;
82+
public ValueExpression? DiscriminatorValueExpression =>
83+
_inputModel.BaseModel is not null
84+
? _discriminatorValueExpression ??= EnsureDiscriminatorValueExpression()
85+
: null;
9186

9287
private IReadOnlyList<ModelProvider>? _derivedModels;
9388
public IReadOnlyList<ModelProvider> DerivedModels => _derivedModels ??= BuildDerivedModels();

packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/TypeFactory.cs

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -191,6 +191,11 @@ protected internal TypeFactory()
191191

192192
if (modelProvider != null)
193193
{
194+
if (model.Access == "public")
195+
{
196+
CodeModelGenerator.Instance.AddTypeToKeep(modelProvider);
197+
}
198+
194199
CSharpTypeMap[modelProvider.Type] = modelProvider;
195200
TypeProvidersByName[modelProvider.Type.Name] = modelProvider;
196201
}

packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/test/Providers/ModelProviders/ModelProviderTests.cs

Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1480,6 +1480,106 @@ public void InternalModelsAreNotIncludedInAdditionalRootTypes()
14801480
Assert.IsFalse(rootTypes.Contains("Sample.Models.MockInputModel"));
14811481
}
14821482

1483+
[Test]
1484+
public void KeepSetsReflectTypeProvidersAddedAfterFirstAccess()
1485+
{
1486+
var inputModel = InputFactory.Model("MockInputModel", access: "public");
1487+
MockHelpers.LoadMockGenerator(inputModelTypes: [inputModel]);
1488+
var provider = new DerivedModelProviderReadingOwnField(inputModel);
1489+
1490+
_ = CodeModelGenerator.Instance.AdditionalRootTypes;
1491+
_ = CodeModelGenerator.Instance.NonRootTypes;
1492+
1493+
CodeModelGenerator.Instance.AddTypeToKeep(provider);
1494+
CodeModelGenerator.Instance.AddTypeToKeep(provider, isRoot: false);
1495+
1496+
var fullyQualifiedName = provider.Type.FullyQualifiedName;
1497+
Assert.IsTrue(CodeModelGenerator.Instance.AdditionalRootTypes.Contains(fullyQualifiedName));
1498+
Assert.IsTrue(CodeModelGenerator.Instance.NonRootTypes.Contains(fullyQualifiedName));
1499+
}
1500+
1501+
// Regression test for two complementary fixes:
1502+
//
1503+
// 1. ModelProvider no longer registers itself with AddTypeToKeep from its constructor;
1504+
// registration is performed by TypeFactory.CreateModel after construction completes.
1505+
// This mirrors the EnumProvider lifecycle and prevents a virtual call chain
1506+
// (AddTypeToKeep -> TypeProvider.Type -> BaseType -> virtual BuildBaseType()) from
1507+
// being dispatched on a partially-constructed derived ModelProvider whose override
1508+
// reads derived-class fields that are still uninitialized.
1509+
//
1510+
// 2. AddTypeToKeep(TypeProvider) defers FQN resolution until the keep set is consumed,
1511+
// so even ctor-time callers cannot force premature TypeProvider.Type evaluation.
1512+
[Test]
1513+
public void DerivedModelProviderConstructionDoesNotForceTypeEvaluation()
1514+
{
1515+
var inputModel = InputFactory.Model("MockInputModel", access: "public");
1516+
MockHelpers.LoadMockGenerator(inputModelTypes: [inputModel]);
1517+
1518+
// (1) Constructing a derived ModelProvider whose BuildBaseType reads a derived field
1519+
// must not throw.
1520+
DerivedModelProviderReadingOwnField? provider = null;
1521+
Assert.DoesNotThrow(() => provider = new DerivedModelProviderReadingOwnField(inputModel));
1522+
1523+
// (2) AddTypeToKeep(TypeProvider) must not throw and the provider's FQN must appear
1524+
// once the keep set is materialized.
1525+
Assert.DoesNotThrow(() => CodeModelGenerator.Instance.AddTypeToKeep(provider!));
1526+
var rootTypes = CodeModelGenerator.Instance.AdditionalRootTypes;
1527+
Assert.IsTrue(rootTypes.Contains(provider!.Type.FullyQualifiedName));
1528+
}
1529+
1530+
private sealed class DerivedModelProviderReadingOwnField : ModelProvider
1531+
{
1532+
private readonly InputModelType _derivedInputModel;
1533+
1534+
public DerivedModelProviderReadingOwnField(InputModelType inputModel) : base(inputModel)
1535+
{
1536+
_derivedInputModel = inputModel;
1537+
}
1538+
1539+
protected override CSharpType? BuildBaseType()
1540+
{
1541+
// Reading a derived-class field that base(...) cannot have populated yet.
1542+
// If the framework forces Type evaluation during base ctor, this NREs.
1543+
_ = _derivedInputModel.DiscriminatorValue;
1544+
return base.BuildBaseType();
1545+
}
1546+
}
1547+
1548+
// Regression for the second virtual-call-in-ctor offender: ModelProvider..ctor used to
1549+
// eagerly compute DiscriminatorValueExpression, which read BaseModelProvider and thus
1550+
// virtually dispatched BuildBaseType()/BuildBaseModel() onto a partially-constructed
1551+
// derived class. Surfaced while validating the Cdn provisioning migration (the keep-set
1552+
// fix alone was not sufficient when the model has a base + discriminator value).
1553+
[Test]
1554+
public void DerivedModelProviderConstructionDoesNotForceDiscriminatorEvaluation()
1555+
{
1556+
var discriminatorEnum = InputFactory.StringEnum("kindEnum", [("One", "one"), ("Two", "two")]);
1557+
var baseInputModel = InputFactory.Model(
1558+
"BaseModel",
1559+
properties:
1560+
[
1561+
InputFactory.Property("kind", discriminatorEnum, isRequired: false, isDiscriminator: true),
1562+
]);
1563+
var derivedInputModel = InputFactory.Model(
1564+
"DerivedModel",
1565+
baseModel: baseInputModel,
1566+
discriminatedKind: "one",
1567+
properties:
1568+
[
1569+
InputFactory.Property("kind", InputFactory.EnumMember.String("One", "one", discriminatorEnum), isRequired: true, isDiscriminator: true),
1570+
]);
1571+
MockHelpers.LoadMockGenerator(inputModelTypes: [baseInputModel, derivedInputModel]);
1572+
1573+
// Constructing a derived ModelProvider whose BuildBaseType reads a derived field
1574+
// must not throw, even when the input model has a base + discriminator value.
1575+
DerivedModelProviderReadingOwnField? provider = null;
1576+
Assert.DoesNotThrow(() => provider = new DerivedModelProviderReadingOwnField(derivedInputModel));
1577+
1578+
// The discriminator expression must still be available once consumed lazily
1579+
// (callers under emission/serialization rely on it).
1580+
Assert.DoesNotThrow(() => { _ = provider!.DiscriminatorValueExpression; });
1581+
}
1582+
14831583
[TestCase(true, true, InputModelTypeUsage.Output, true, false)]
14841584
[TestCase(true, false, InputModelTypeUsage.Output, true, false)]
14851585
[TestCase(false, true, InputModelTypeUsage.Output, true, false)]

0 commit comments

Comments
 (0)