Skip to content

Add BuildHost APIs for loading an in-memory project #78303

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 28, 2025

Conversation

RikkiGibson
Copy link
Member

@RikkiGibson RikkiGibson commented Apr 24, 2025

Part of dotnet/sdk#48176
Related to #78208

We need to be able to create a project (XML) in memory and run a design time build on it. Adding these APIs is a prerequisite for us doing that.

@ghost ghost added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Apr 24, 2025
@RikkiGibson RikkiGibson marked this pull request as ready for review April 24, 2025 22:18
@RikkiGibson RikkiGibson requested a review from a team as a code owner April 24, 2025 22:18
@RikkiGibson
Copy link
Member Author

I'm wondering if this PR is not correctly handling batch builds, or whether that scenario is relevant.

}

// load project file
var stream = new MemoryStream(Encoding.UTF8.GetBytes(projectContent));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to tell MSBuild somewhere else that we're still encoding with UTF8? Or will we have a BOM there for it to figure out?

Copy link
Member Author

@RikkiGibson RikkiGibson Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to tell MSBuild somewhere else that we're still encoding with UTF8?

That is not clear to me. It is also not clear to me how that would be happening in the ordinary on-disk project case.

Or will we have a BOM there for it to figure out?

It looks like the Unicode standard (ctrl+f for "neither required nor recommended") does not recommend using a BOM with UTF8. I think that GetBytes does not include a BOM.

The overload of XmlReader.Create we are using says:

The XmlReader scans the first bytes of the stream looking for a byte order mark or other sign of encoding. When encoding is determined, the encoding is used to continue reading the stream, and processing continues parsing the input as a stream of (Unicode) characters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we have some way to thread the encoding through to avoid mojibake if the XmlReader doesn't infer it correctly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or should we have a UTF-8 BOM precisely so it can infer in this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The XML standard (that sure is a nice anchor name..) says:

In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8. Note that since ASCII is a subset of UTF-8, ordinary ASCII entities do not strictly need an encoding declaration.

In other words, when the document lacks a BOM, and lacks <?xml encoding='something'?>, the standard requires UTF-8 be used to decode it.

Therefore there should be no need to add a BOM or encoding tag. I will add a comment.

@jasonmalinowski
Copy link
Member

I'm wondering if this PR is not correctly handling batch builds, or whether that scenario is relevant.

It might not be -- I think we use the batch build case if we're loading an entire solution at once, and that would only happen in an OpenSolutionAsync() case. So that might not be needed right now.

But that said, at some point we'll probably need to ask how we're going to add support to MSBuildWorkspace for these projects too. And you'll need it then.

@RikkiGibson RikkiGibson force-pushed the buildhost-in-memory-project branch from 2e973ae to 883d36b Compare April 25, 2025 22:26
@RikkiGibson
Copy link
Member Author

I was struggling to deal with the number of overloads with the same name here. So I tried to improve organization marginally by separating the public and private methods.

@@ -33,6 +33,16 @@ public async Task<RemoteProjectFile> LoadProjectFileAsync(string projectFilePath
return new RemoteProjectFile(_client, remoteProjectFileTargetObject);
}

/// <summary>Permits loading a project file which only exists in-memory, for example, for file-based program scenarios.</summary>
/// <param name="projectFilePath">A path to a project file which may or may not exist on disk. Note that an extension that is known by MSBuild, such as .csproj or .vbproj, should be used here.</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent warning to add! 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...except this should go on the interface since that's what's actually seen by consumers.

Copy link
Member Author

@RikkiGibson RikkiGibson Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good place to document it. This is the API which the project system will call. But, I'm copying to doc to IBuildHost.

Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with one concern that I'm still not entirely sure how encoding is going to be handled when we convert the string to a UTF8 byte array. Having a test that ensures we round-trip that might be good to add. I'm OK though if we want to defer checking that and merging this, with the promise we file a bug and promise to fix it. 😄

}

// load project file
var stream = new MemoryStream(Encoding.UTF8.GetBytes(projectContent));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we have some way to thread the encoding through to avoid mojibake if the XmlReader doesn't infer it correctly?

}

// load project file
var stream = new MemoryStream(Encoding.UTF8.GetBytes(projectContent));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or should we have a UTF-8 BOM precisely so it can infer in this case?

@@ -27,4 +29,18 @@ public async Task<ProjectFile> LoadProjectFileAsync(string path, ProjectBuildMan

return this.CreateProjectFile(project, buildManager, log);
}

public ProjectFile LoadProjectFile(string path, string projectContent, ProjectBuildManager buildManager)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public ProjectFile LoadProjectFile(string path, string projectContent, ProjectBuildManager buildManager)
public ProjectFile LoadProject(string path, string projectContent, ProjectBuildManager buildManager)

Just to match your naming elsewhere.

_ => throw ExceptionUtilities.UnexpectedValue(languageName)
};

_logger.LogInformation($"Loading {projectFilePath}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_logger.LogInformation($"Loading {projectFilePath}");
_logger.LogInformation($"Loading an in-memory project with the path {projectFilePath}");

@@ -187,6 +196,23 @@ private async Task<int> LoadProjectFileCoreAsync(string projectFilePath, string
return _server.AddTarget(projectFile);
}

[MethodImpl(MethodImplOptions.NoInlining)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy the comment for the NoInlining use from the method above.

var log = new DiagnosticLog();
try
{
var projectCollection = new MSB.Evaluation.ProjectCollection(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit I don't know the significance of this not using the batch project collection, or whether this creates problems around the loggers conflicting like you saw. That said, I'm not sure how to answer that question either. If we already had an existing bug there, then perhaps that can be counted as tracking the issue and this is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RikkiGibson RikkiGibson enabled auto-merge (squash) April 28, 2025 21:14
@RikkiGibson RikkiGibson merged commit 02a8b35 into dotnet:main Apr 28, 2025
24 of 25 checks passed
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Apr 28, 2025
@RikkiGibson RikkiGibson deleted the buildhost-in-memory-project branch April 29, 2025 00:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead VSCode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants