AltChunk containing docx file corrupts document #1670

@kemsky

Description

NPOI Version

2.7.5

File Type

  • DOCX

Reproduce Steps

When you create an AltChunk part, you have to provide a content type. For a docx file it should be application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml.

This is the same content type that is used for word/document.xml part.

ContentTypeManager contains the following code (ContentTypeManager.cs):

        public void AddContentType(PackagePartName partName, String contentType)
        {
            bool defaultCTExists = false;
            String extension = partName.Extension.ToLower();
            if ((extension.Length == 0)
                    || (this.defaultContentType.ContainsKey(extension) && !(defaultCTExists = this.defaultContentType
                            .ContainsValue(contentType))))
                this.AddOverrideContentType(partName, contentType);
            else if (!defaultCTExists)
                this.AddDefaultContentType(extension, contentType);
        }

POI reference: ContentTypeManager.java

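When package.CreatePart is called for the AltChunk part, AddContentType registers application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml as the default content type for the docx extension. Later, when the document is saved, AddContentType is called for the word/document.xml part. The xml extension already has a default, and defaultContentType.ContainsValue(contentType) now returns true because of the docx entry, so neither branch runs and the content type for word/document.xml is silently discarded.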

As a result, [Content_Types].xml does not contain <Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml" /> and the saved file cannot be opened.
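
The branching described above can be replayed without NPOI. The following is only a sketch: ContentTypeSketch, its two dictionaries and the application/xml default for the xml extension are stand-ins assumed for illustration, not the real ContentTypeManager internals. Only the if/else logic is copied from the method quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for ContentTypeManager; only the branching matches the quoted code.
public class ContentTypeSketch
{
    public const string MainDocument =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    // extension -> content type (the <Default> entries)
    public readonly Dictionary<string, string> Defaults = new Dictionary<string, string>();
    // part name -> content type (the <Override> entries)
    public readonly Dictionary<string, string> Overrides = new Dictionary<string, string>();

    public void AddContentType(string partName, string contentType)
    {
        bool defaultCTExists = false;
        string extension = Path.GetExtension(partName).TrimStart('.').ToLowerInvariant();
        if ((extension.Length == 0)
                || (Defaults.ContainsKey(extension)
                    && !(defaultCTExists = Defaults.ContainsValue(contentType))))
            Overrides[partName] = contentType;
        else if (!defaultCTExists)
            Defaults[extension] = contentType;
    }

    public static void Main()
    {
        var types = new ContentTypeSketch();
        // Assumption: the package already has a default for the xml extension.
        types.Defaults["xml"] = "application/xml";

        // Step 1: CreatePart for the AltChunk registers a default for "docx".
        types.AddContentType("/word/word.docx", MainDocument);
        // Step 2: saving registers the main part; the same value already exists as a default.
        types.AddContentType("/word/document.xml", MainDocument);

        Console.WriteLine(types.Defaults.ContainsKey("docx"));                 // True
        Console.WriteLine(types.Overrides.ContainsKey("/word/document.xml")); // False
    }
}
```

The second call falls through both branches: the first condition fails because ContainsValue returns true, and the else branch is skipped because defaultCTExists was set by that same call.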


Minimal reproduction:

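    // Requires: System, System.IO, NPOI.OpenXml4Net.OPC, NPOI.XWPF.UserModel
    private void AppendAltChunk_Npoi(string source, string target)
    {
        using var inputStream = new FileStream(source, FileMode.Open, FileAccess.Read, FileShare.Read);
        using var xwpfDocument = new XWPFDocument(inputStream);

        var package = xwpfDocument.Package;

        // Part that would hold the embedded docx (the AltChunk target).
        var partName = new PackagePartName(new Uri("/word/word.docx", UriKind.Relative), true);

        // Creating the part is enough to trigger the bug: it registers the main
        // document content type as the default for the "docx" extension.
        var part = package.CreatePart(partName, XWPFRelation.DOCUMENT.ContentType);

        using var fileStream = new FileStream(target, FileMode.Create, FileAccess.ReadWrite, FileShare.Read);

        // The written file lacks the Override entry for /word/document.xml.
        xwpfDocument.Write(fileStream);
    }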

This is a working example that uses DocumentFormat.OpenXml:

    private void AppendAltChunk_OpenXml(string targetFilename, string tailFilename)
    {
        using (var myDoc = WordprocessingDocument.Open(targetFilename, true))
        {
            var altChunkId = "_" + Guid.NewGuid().ToString("d");

            var mainPart = myDoc.MainDocumentPart!;

            var chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);

            using (var fileStream = File.Open(tailFilename, FileMode.Open))
            {
                chunk.FeedData(fileStream);
            }

            var altChunk = new AltChunk();
            altChunk.Id = altChunkId;
            mainPart.Document.Body!.InsertAfter(altChunk, mainPart.Document.Body.Elements().Last());
            mainPart.Document.Save();
        }
    }

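Hackish workaround: change the case of the MIME type passed to CreatePart, e.g. Application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml. The default registered for docx then no longer matches the content type of word/document.xml exactly, so the override is written.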
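
Why the case change helps can be checked in isolation. This is a minimal sketch that assumes defaultContentType compares values with ordinary case-sensitive string equality, as a plain Dictionary<string, string> does; CaseWorkaroundSketch and ShadowsMainDocument are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CaseWorkaroundSketch
{
    public const string MainDocument =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";

    // True when registering altChunkContentType as the "docx" default would make
    // ContainsValue match the main document content type, which suppresses the override.
    public static bool ShadowsMainDocument(string altChunkContentType)
    {
        var defaults = new Dictionary<string, string> { ["docx"] = altChunkContentType };
        return defaults.ContainsValue(MainDocument);
    }

    public static void Main()
    {
        // Same case: the override for word/document.xml would be dropped.
        Console.WriteLine(ShadowsMainDocument(MainDocument));                       // True
        // Capitalised first letter, as in the workaround: no match, override is kept.
        Console.WriteLine(ShadowsMainDocument("A" + MainDocument.Substring(1)));    // False
    }
}
```

The workaround relies on consumers of the file treating the media type as case-insensitive, which is why it is a hack rather than a fix.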
