Enhancing VectorStoreWriter for better RAG support by Copilot · Pull Request #7396 · dotnet/extensions

Copilot · 2026-03-15T08:44:07Z

adamsitnik

@copilot address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

...sions.AI.Templates/templates/AIChatWeb-CSharp/AIChatWeb-CSharp.Web/Services/IngestedChunk.cs

adamsitnik

@copilot please address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriterOptions.cs

test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/TestChunkRecordWithMetadata.cs

test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/VectorStoreWriterTests.cs

Copilot · 2026-03-16T12:33:56Z

> @copilot please address my feedback

All four feedback items addressed in c3c57e0: storage name consts made private, VectorStoreWriterOptions doc fixed, [JsonPropertyName] removed from test, and new CanWriteChunksWithCustomDefinition test added.

adamsitnik

@copilot address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/VectorStoreWriterTests.cs

adamsitnik

@copilot please address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

adamsitnik

LGTM, @roji the PR is ready for review.

adamsitnik · 2026-03-16T15:47:45Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

+/// </summary>
+/// <typeparam name="TChunk">The type of the chunk content.</typeparam>
+/// <remarks>
+/// When the vector dimension count is not known at compile time, use the <see cref="CreateCollectionDefinition"/>


To other reviewers: this is very important. My idea was the following:

* Introduce a non-sealed IngestedChunkRecord<TChunk> type that comes with all the default properties. It gives us the ability to query for chunks that point to the same document, and it also gives users an easy path to RAG (they don't need to provide their own type).
* Those who don't need a custom schema can just use this type and call CreateCollectionDefinition to create the definition (it's required, because they need to somehow provide the dimensionCount).
* Those who need a custom schema can create it and pass it to the VectorStoreCollection ctor.
* The other way to customize the schema is to derive from IngestedChunkRecord<TChunk> and override selected properties.

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

adamsitnik · 2026-03-16T15:50:03Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriterOptions.cs

-    /// <summary>
-    /// Gets or sets the name of the collection. When not provided, "chunks" will be used.
-    /// </summary>
-    public string CollectionName


To other reviewers: These settings are no longer configured by the writer; they are part of the collection creation process.

adamsitnik · 2026-03-16T15:50:40Z

...sions.AI.Templates/templates/AIChatWeb-CSharp/AIChatWeb-CSharp.Web/Services/IngestedChunk.cs

-    [VectorStoreVector(VectorDimensions, DistanceFunction = VectorDistanceFunction, StorageName = "embedding")]
-    [JsonPropertyName("embedding")]
-    public string? Vector => Text;
+    [VectorStoreVector(VectorDimensions, DistanceFunction = VectorDistanceFunction, StorageName = EmbeddingStorageName)]


To other reviewers: this is the simplest way of configuring the dimension count (overriding the virtual property and annotating it with the right attribute).

test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/TestChunkRecordWithMetadata.cs

adamsitnik · 2026-03-16T15:51:57Z

test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Writers/VectorStoreWriterTests.cs

+
+        // User creates their own definition without using CreateCollectionDefinition,
+        // using custom storage names to prove they can map to a pre-existing collection schema.
+        VectorStoreCollectionDefinition definition = new()


To other reviewers: this is an example of providing a custom schema (without the need to provide a dedicated mapper).

adamsitnik · 2026-03-16T15:52:50Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

+/// When the vector dimension count is known at compile time, derive from this class and add
+/// the <see cref="VectorStoreVectorAttribute"/> to the <see cref="Embedding"/> property.
+/// </remarks>
+public class IngestedChunkRecord<TChunk>


To other reviewers: I am going to work on removing this generic argument very soon (I want IngestionChunk to be able to represent any input without using a generic argument). But it's out of the scope of this PR.

I know you're going to remove the generic argument, but FYI the TChunk name is causing me a bit of confusion, also in IngestionChunk<TChunk> (as if it's a chunk over itself). When reading this code I wasn't sure if with IngestedChunkRecord<TChunk>, TChunk should be string or IngestionChunk<string>.

So maybe consider renaming TChunk to just T (or TContent) everywhere.

Yes, I really want to remove it and now I think I even know how (#7404)

Please keep in mind that this is the Preview2 branch, so whatever gets merged does not automatically get released to nuget.org. So I would prefer to keep TChunk here and just remove it completely in the next PR.

Is the plan to simply replace TChunk with AIContent? If so, it might be good to think about this a bit together.

For one thing, IEmbeddingGenerator doesn't restrict its input to be AIContent - it can be any TInput, so restricting IngestionChunk to be over AIContent may hinder some scenarios. For example, one of the design ideas behind IEmbeddingGenerator was to allow user-specific types (e.g. Product) to be passed as input, and then the custom IEmbeddingGenerator implementation knows how to generate embeddings from the Product's different fields. I'm not sure how relevant that is for the ingestion pipeline, but maybe worth thinking about (obviously a Product cannot just be written via MEVD in any case).

That also brings the question of whether we want to store (or allow storing) the media type for binary data in the vector database. For example, for an ingestion pipeline handling images, presumably chunks will have a DataContent as their content, which has a MediaType (to distinguish PNG vs. JPG). It would make sense to have that media type in the database, so that when the image is loaded on the consumption side, we know what the raw bytes actually represent...

Anyway, some things to think about...

> Is the plan to simply replace TChunk with AIContent?

I don't have a very clear plan in mind. I know that if we just replace TChunk with AIContent, plenty of things are going to stop working (for example, the embedding generation done by MEVD during upserts because I suppose it gets IEmbeddingGenerator<string> from vector store write options).

All of that could be done with some extra mapping, but one thing that is not clear to me is whether we can support a chunker that returns chunks of different types (for example text or image) and then store them in a single vector property. I suspect that text and image would use different models, or at least have different dimension counts.

And overall supporting the ability to return chunks of different types is one of our preview 2 goals.

roji

Hey @adamsitnik here are some first thoughts/questions about the new design... We should probably get past these before me reviewing the rest of the PR in detail. Feels like maybe we should jump into a call to discuss this stuff.

roji · 2026-03-19T11:18:03Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

roji · 2026-03-19T11:48:13Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

+/// </summary>
+/// <typeparam name="TChunk">The type of the chunk content.</typeparam>
+/// <remarks>
+/// When the vector dimension count is not known at compile time, use the <see cref="CreateCollectionDefinition"/>


OK... It has been a while since we discussed all this so I may have forgotten some considerations we talked about.

So if I understand correctly, this moves away from the previous dynamic approach (Dictionary<string, object?>) to a typed approach. Some comments:

Importantly, typed mapping with MEVD is (currently) not trimming (and therefore NativeAOT)-compatible... Serializing/deserializing a dictionary is easy enough, but doing it with a .NET type requires a source generator which we don't yet have. I think that's going to be a problem.

I admit it took me quite a while to understand how custom metadata is supposed to work (with SetMetadata()) and why; having a partly typed, partly dynamic story seems to introduce quite a bit of complexity/weirdness. Here's my understanding:

A user that wants custom metadata must extend IngestedChunkRecord (this already feels a bit heavy compared to just having a Dictionary<string, object?> as before).

On their CustomIngestedChunkRecord, they add .NET properties for the extra metadata.

But they must also override SetMetadata(), to copy the dynamic metadata properties from the incoming IngestionChunk to the strongly-typed .NET properties on CustomIngestedChunkRecord. That's boilerplate-y, tedious glue between the dynamic nature of IngestionChunk and the static nature of IngestedChunkRecord.

(aside from all the above, they must also call CreateCollectionDefinition() to get the VectorStoreCollectionDefinition, and mutate that to add their custom properties. But that's unrelated)

Note that their code in SetMetadata() can get out of sync... Like I can see someone adding a new static property on CustomIngestedChunkRecord, but then forgetting to update SetMetadata() (and then we silently write empty properties).
* In other words, users need to keep (1) the incoming IngestionChunk's metadata (wherever it's populated), (2) CustomIngestedChunkRecord's .NET properties, (3) SetMetadata() and (4) the record definition in sync, which seems quite brittle... With the previous, fully dynamic model only (1) and (4) needed to be kept in sync (which is the absolute minimum)

Because of all this, I'm trying to understand the value we get out of this partly static/partly dynamic design, compared to simply continuing to map Dictionary<string, object?> as before, and whether it's worth it...
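The synchronization risk described above can be sketched with simplified stand-in types. Everything below (`Chunk`, `ChunkRecord`, `CustomChunkRecord`, `SyncDemo`, and the `"author"`/`"category"` keys) is hypothetical and only mirrors the shape of the design under discussion; it is not the actual MEDI or MEVD API. The derived record declares two properties, but its `SetMetadata` override handles only one key, so the other property silently stays empty.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// (1) Pipeline side: metadata arrives as a dynamic property bag.
var chunk = new Chunk("Hello world");
chunk.Metadata["author"] = "Ada";
chunk.Metadata["category"] = "docs";

CustomChunkRecord record = SyncDemo.ToRecord(chunk);

Console.WriteLine("Author: " + (record.Author ?? "<null>"));
Console.WriteLine("Category: " + (record.Category ?? "<null>"));

// Hypothetical stand-in for the pipeline's chunk type.
public sealed class Chunk
{
    public Chunk(string content) => Content = content;
    public string Content { get; }
    public Dictionary<string, object?> Metadata { get; } = new();
}

// Hypothetical stand-in for the record base type with its metadata hook.
public class ChunkRecord
{
    public string? Content { get; set; }
    public virtual void SetMetadata(string key, object? value) { }
}

// (2) Typed properties and (3) the hook that must be kept in sync with them.
public sealed class CustomChunkRecord : ChunkRecord
{
    public string? Author { get; set; }
    public string? Category { get; set; } // added later...

    public override void SetMetadata(string key, object? value)
    {
        switch (key)
        {
            case "author":
                Author = value as string;
                break;
            // ...but the "category" case was forgotten, so Category stays null.
        }
    }
}

public static class SyncDemo
{
    // Copies the dynamic metadata onto the typed record through the hook.
    public static CustomChunkRecord ToRecord(Chunk chunk)
    {
        var record = new CustomChunkRecord { Content = chunk.Content };
        foreach (KeyValuePair<string, object?> pair in chunk.Metadata)
        {
            record.SetMetadata(pair.Key, pair.Value);
        }
        return record;
    }
}
```

In this sketch nothing fails at compile time or at run time: the `"category"` value is simply dropped, which is the silent failure mode the comment warns about.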

adamsitnik · 2026-03-19T13:16:14Z

> Because of all this, I'm trying to understand the value we get out of this partly static/partly dynamic design, compared to simply continuing to map Dictionary<string, object?> as before, and whether it's worth it...

So far, there was only one schema (created by us), so we knew how to do the mapping. If we continue to map to Dictionary<string, object?> but also accept a custom schema, the user needs to tell us how exactly we should perform the mapping.

And once they want to do RAG, they need to re-map the dictionary using these magic names.

With the strongly typed approach, they don't need to. And RAG is much easier. But it comes at the cost of a complicated metadata story (which is not a very common scenario).
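To illustrate the trade-off in plain C#, with no vector store involved: the names below (`"content"`, `"documentid"`, `ChunkRow`, `RagDemo`) are hypothetical and are not the library's actual storage names. Reading a dictionary-shaped record means repeating string keys and casts at every call site, whereas a typed record exposes the same data as properties.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Dynamic shape: the consumer must know the "magic" keys and cast each value.
var dynamicRow = new Dictionary<string, object?>
{
    ["content"] = "Hello world",
    ["documentid"] = "doc-1",
};
Console.WriteLine(RagDemo.FromDictionary(dynamicRow));

// Typed shape: the compiler checks property names and types.
var typedRow = new ChunkRow("Hello world", "doc-1");
Console.WriteLine(RagDemo.FromTyped(typedRow));

// Hypothetical typed record for a retrieved chunk.
public sealed record ChunkRow(string Content, string DocumentId);

public static class RagDemo
{
    public static string FromDictionary(IReadOnlyDictionary<string, object?> row)
    {
        // A typo in either key only surfaces at run time.
        string content = (string)row["content"]!;
        string documentId = (string)row["documentid"]!;
        return $"[{documentId}] {content}";
    }

    public static string FromTyped(ChunkRow row) => $"[{row.DocumentId}] {row.Content}";
}
```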

Let's have a call later today and discuss it.

roji · 2026-03-19T14:25:13Z

> If we continue to map to Dictionary<string, object?> but also accept a custom schema, the user needs to tell us how exactly we should perform the mapping.

Isn't that still the case even after this PR? The vector collection definition still needs to be tweaked by the user to include the custom metadata properties, and these must correspond 100% to what will actually be coming in on IngestionChunk, right?

> With the strongly typed approach, they don't need to. And RAG is much easier. But it comes at the cost of a complicated metadata story (which is not a very common scenario).

Right, I can see that.

The basic problem here - and I think the source of the complexity - is that we have a dynamic metadata scheme on the ingestion framework side (on IngestionChunk), and we're trying to shoehorn that into a static, typed scheme for MEVD.

Maybe an alternative here is to say that the built-in VectorStoreWriter only works with the universal/built-in fields (no custom metadata), and if you want custom metadata you need to write your own VectorStoreWriter; this effectively moves the complexity (like SetMetadata()) from IngestedChunkRecord into the writer. We may be able to think of making VectorStoreWriter extensible with hooks for this (again, instead of having an extensible IngestedChunkRecord with its SetMetadata() hook).

Just some ideas I haven't fully thought through yet. Let's discuss.

adamsitnik

@copilot please address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

adamsitnik · 2026-03-19T17:19:12Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreExtensions.cs

+    /// <returns>A vector store collection configured for ingested chunk records.</returns>
+    [RequiresDynamicCode("This API is not compatible with NativeAOT. You can implement your own IngestionChunkWriter that uses dynamic mapping via VectorStore.GetCollectionDynamic().")]
+    [RequiresUnreferencedCode("This API is not compatible with trimming. You can implement your own IngestionChunkWriter that uses dynamic mapping via VectorStore.GetCollectionDynamic().")]
+    public static VectorStoreCollection<Guid, TRecord> GetIngestionRecordCollection<TRecord, TChunk>(this VectorStore vectorStore,


To other reviewers: This is an alternative to exposing the entire schema. We can just expose a factory method that does the right thing.

The advantages:

* one method call instead of two
* a clear message about what needs to happen when you need NativeAOT

The disadvantages:

* you need to know that it exists (other docs point to this method, so I hope it won't be a big problem)
* it uses two generic arguments; in the near future ([MEDI] Make the IngestionChunk non-generic #7404) it will be only one

adamsitnik · 2026-03-19T17:23:16Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

+    /// Override this method in derived classes to store metadata as typed properties with
+    /// <see cref="VectorStoreDataAttribute"/> attributes.
+    /// </remarks>
+    protected virtual void SetMetadata(TRecord record, string key, object? value)


To other reviewers: So far, we have optimized for very easy ingestion. Now RAG is much simpler, but when you need to use metadata, you need to create a derived type and handle it on your own. We throw here to avoid silent errors.

Thanks.

Below are some design thoughts I'm dumping here, not necessarily actionable; just stuff that's in my mind that we can discuss if we want.

There's still a slight nagging question in my head - if we need/want the custom metadata to be strongly-typed, does it make sense to consider making it strongly typed on IngestionChunk as well? In other words, it's slightly weird that before the writing phase, the ingestion pipeline treats metadata as a dynamic Dictionary property bag, but then at the very end suddenly needs to convert it to a static .NET type via a custom user-provided hook (especially since I'm assuming most usages of MEDI will end in VectorStoreWriter).

Though to argue against myself, since we're proposing to have two distinct types - IngestionChunk (in the pipeline) and IngestedChunkRecord (only in the MEVD writer, representing the database record), there would need to be some custom hook in any case, to transfer the custom metadata from the former to the latter. So at that point it maybe doesn't matter whether the input (IngestionChunk) is dynamic or not.

The only way this becomes really simple, is if we have a single (strongly-typed, inheritable) IngestionChunk type rather than two; this would allow us to both use it in the ingestion pipeline and to directly map it to the database in the MEVD writer. At that point no more custom data transformation hook is needed between the two types.

We need to provide an abstraction and a default implementation. Having two different types makes that possible, and if for some reason our default implementation is not enough for a given scenario, the user can implement their own writer.

So the idea is that all the chunk processors use IngestionChunk and just insert any metadata they want and it's the job of the writer to somehow persist this information.
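The default-throws behavior of the `SetMetadata` hook discussed in this thread can be sketched with simplified stand-ins rather than the real MEDI types. The `SetMetadata(TRecord record, string key, object? value)` signature is taken from the diff; the `SketchWriter<TRecord>` base class, its `Map` helper, the record types, and the choice of `NotSupportedException` are assumptions made for illustration. The base hook fails loudly on metadata it cannot store, and a derived writer opts in by mapping keys to typed properties.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var metadata = new Dictionary<string, object?> { ["author"] = "Ada" };

// The derived writer knows how to store "author".
ArticleRecord stored = new ArticleWriter().Map("Hello world", metadata);
Console.WriteLine(stored.Author);

// The default writer refuses rather than dropping the value silently.
try
{
    new SketchWriter<BasicRecord>().Map("Hello world", metadata);
}
catch (NotSupportedException ex)
{
    Console.WriteLine(ex.Message);
}

public class BasicRecord
{
    public string? Content { get; set; }
}

public sealed class ArticleRecord : BasicRecord
{
    public string? Author { get; set; }
}

// Hypothetical, heavily simplified stand-in for a non-sealed writer.
public class SketchWriter<TRecord> where TRecord : BasicRecord, new()
{
    public TRecord Map(string content, IReadOnlyDictionary<string, object?> metadata)
    {
        var record = new TRecord { Content = content };
        foreach (KeyValuePair<string, object?> pair in metadata)
        {
            SetMetadata(record, pair.Key, pair.Value);
        }
        return record;
    }

    // Default: fail loudly (the exception type is an assumption).
    protected virtual void SetMetadata(TRecord record, string key, object? value) =>
        throw new NotSupportedException(
            $"Metadata key '{key}' is not supported. Override SetMetadata to store it.");
}

public sealed class ArticleWriter : SketchWriter<ArticleRecord>
{
    protected override void SetMetadata(ArticleRecord record, string key, object? value)
    {
        if (key == "author")
        {
            record.Author = value as string;
        }
        else
        {
            // Unknown keys still fail loudly through the base implementation.
            base.SetMetadata(record, key, value);
        }
    }
}
```

Falling back to `base.SetMetadata` for unrecognized keys keeps the loud failure for anything the derived writer has not been taught to store.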

roji

Here you go @adamsitnik, we can chat again on some of the stuff below if you want.

roji · 2026-03-21T07:54:26Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreExtensions.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestionChunkVectorRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

roji · 2026-03-21T08:49:05Z

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

adamsitnik

@copilot please address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestionChunkVectorRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreExtensions.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

…ied overload, fix typo, update README

- Renamed IngestedChunkRecord<TChunk> to IngestionChunkVectorRecord<TChunk> across all source, test, template, and snapshot files
- Fixed typo 'additiveDefintion' -> 'additiveDefinition' in VectorStoreExtensions.cs
- Added simplified non-generic GetIngestionRecordCollection overload for the common IngestionChunkVectorRecord<string> case
- Added XML remarks to generic GetIngestionRecordCollection method documenting usage patterns (simple, custom metadata, custom schema)
- Extended README.md with code samples for basic usage, custom metadata, and custom collection schema

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

adamsitnik

@copilot please address my feedback

src/Libraries/Microsoft.Extensions.DataIngestion/README.md

Copilot AI and others added 5 commits March 14, 2026 20:50

Initial plan

ea56d33

Refactor VectorStoreWriter to accept VectorStoreCollection, add IngestedChunkRecord base type

581bce0

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Update all test files, snapshot files, and template files for VectorStoreWriter refactoring

727bcd5

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Fix warnings and test failures - correct VectorStoreCollectionDefinition property names

e48fa9d

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Address code review feedback - fix error message wording

e9aa7fe

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Copilot AI assigned Copilot and adamsitnik Mar 15, 2026

github-actions bot added the area-ai-templates Microsoft.Extensions.AI.Templates label Mar 15, 2026

adamsitnik added area-data-ingestion and removed area-ai-templates Microsoft.Extensions.AI.Templates labels Mar 15, 2026

adamsitnik reviewed Mar 16, 2026

Copilot stopped work on behalf of adamsitnik due to an error March 16, 2026 11:57
Job was cancelled

Copilot started work on behalf of adamsitnik March 16, 2026 11:58

Address reviewer feedback: remove TKey generic, make properties virtual, throw in SetMetadata, make consts protected, remove JSON attributes

56fa7ad

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Copilot AI requested a review from adamsitnik March 16, 2026 12:09

Copilot finished work on behalf of adamsitnik March 16, 2026 12:09

adamsitnik reviewed Mar 16, 2026

Copilot started work on behalf of adamsitnik March 16, 2026 12:30

Address second round of reviewer feedback: make consts private, fix doc reference, remove JsonPropertyName, add custom schema test

c3c57e0

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Copilot AI requested a review from adamsitnik March 16, 2026 12:36

Copilot finished work on behalf of adamsitnik March 16, 2026 12:36

adamsitnik reviewed Mar 16, 2026

Copilot started work on behalf of adamsitnik March 16, 2026 12:41

Rename EmbeddingPropertyName to EmbeddingStorageName (protected), use custom storage names in test

b3153bf

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Copilot AI requested a review from adamsitnik March 16, 2026 12:45

Copilot finished work on behalf of adamsitnik March 16, 2026 12:45

adamsitnik reviewed Mar 16, 2026

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

Copilot started work on behalf of adamsitnik March 16, 2026 12:50

Rename private storage name consts from *PropertyName to *StorageName for consistency

550781c

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

adamsitnik marked this pull request as ready for review March 16, 2026 15:42

adamsitnik requested a review from a team as a code owner March 16, 2026 15:42

adamsitnik approved these changes Mar 16, 2026

adamsitnik requested a review from roji March 16, 2026 15:56

adamsitnik mentioned this pull request Mar 18, 2026

[MEDI] Allow Pipeline and Reader to work with any Source #7090

Closed

roji reviewed Mar 19, 2026

adamsitnik reviewed Mar 19, 2026

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/IngestedChunkRecord.cs

Copilot started work on behalf of adamsitnik March 19, 2026 15:57

Rename CreateCollectionDefinition to CreateDefaultCollectionDefinition, move SetMetadata to non-sealed VectorStoreWriter

f20bb41

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>

Copilot AI requested a review from adamsitnik March 19, 2026 16:08

Copilot finished work on behalf of adamsitnik March 19, 2026 16:08

provide a VectorStore extension method to create VectorStoreCollection with provided dimension count

4558e6c

adamsitnik reviewed Mar 19, 2026

src/Libraries/Microsoft.Extensions.DataIngestion/Writers/VectorStoreWriter.cs

adamsitnik reviewed Mar 19, 2026

adamsitnik requested a review from roji March 20, 2026 11:04

roji reviewed Mar 21, 2026

adamsitnik reviewed Mar 21, 2026

Copilot started work on behalf of adamsitnik March 21, 2026 11:47

Copilot AI requested a review from adamsitnik March 21, 2026 12:01

Copilot finished work on behalf of adamsitnik March 21, 2026 12:01

adamsitnik reviewed Mar 21, 2026

src/Libraries/Microsoft.Extensions.DataIngestion/README.md

Copilot started work on behalf of adamsitnik March 21, 2026 12:11

Use explicit type names instead of var in README.md code samples

f34bce2

Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com> Agent-Logs-Url: https://github.com/dotnet/extensions/sessions/4dce2c93-8680-43ae-a663-bcd22316c92b

Copilot AI requested a review from adamsitnik March 21, 2026 12:13

Copilot finished work on behalf of adamsitnik March 21, 2026 12:13

adamsitnik requested a review from roji March 21, 2026 13:47
