Updating transformer priority for libreoffice updates all officeToImageViaPdf target media priorities #1155

@morgan-patou

# Issue summary

Updating the priority of the libreoffice docx-to-pdf transformation path also overrides the priority of all officeToImageViaPdf transformation paths.

From my point of view, there are 2 sub-problems with that issue:

  1. Setting a priority on a TransformerA shouldn't update the configuration of another TransformerB
  2. Setting a priority on a very specific source and target media type shouldn't cause other media types to be updated

(I assume officeToImageViaPdf uses libreoffice in its pipeline, but still)

And there are 2 other sub-sub-problems that I could observe during the testing:

  1. It's apparently not possible to use overrideSupported for officeToImageViaPdf
  2. supportedDefaults seems to be ignored when deployed on the ACS side, while the same JSON is taken into account when deployed on the Transform side

# ACS & Transform versions tested

The same behavior/issue is seen on both of these stacks:

  • Alfresco Community Repo 23.4.1 // Transform Core AIO 5.1.5
  • Alfresco Community Repo 25.2.0 // Transform Core AIO 5.2.2

It most probably affects many other versions as well, both earlier than and between these two.

# Environment details

To replicate the issue, I used the Alfresco ACS Deployment GitHub Project:

  1. Pulled the latest updates from the project
  2. Created my own image to include the Order Of The Bee v1.2.3.0 addon, with no other changes to the custom image
  3. Added a volume for ACS and for Transform Core AIO, to be able to deploy my configuration for the transformers
  4. Updated the transformer configuration file (cf. the cases below) and restarted/redeployed to have the config applied

Below, you will see that I tried to deploy the transformer configuration on both ACS and Transform sides.

When deploying on ACS side, the docker compose was modified with a new volume:

services:
  alfresco:
    #image: docker.io/alfresco/alfresco-content-repository-community:25.2.0
    image: dbi/alfresco-community:25.2.0
    ...
    volumes:
      - ./repository/1-transform-custom.json:/usr/local/tomcat/shared/classes/alfresco/extension/transform/pipelines/1-transform-custom.json

When deploying on Transform side, the docker compose was modified with a new environment variable and a new volume:

services:
  ...
  transform-core-aio:
    image: alfresco/alfresco-transform-core-aio:5.2.2
    environment:
      TRANSFORMER_ROUTES_ADDITIONAL_PDFOVERRIDES: "/2-transform-custom.json"
    ...
    volumes:
      - ./transform/2-transform-custom.json:/2-transform-custom.json

When one side was enabled (e.g. ACS), the other side (e.g. Transform) was disabled/commented out, and vice versa.
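Before mounting either file, it can help to rule out a malformed configuration as the cause. The following is a minimal sketch of such a check. The function name `check_config` is hypothetical, and treating the listed keys as required is an assumption based only on the example configurations in this issue, not on a documented Alfresco schema.

```python
import json

# Keys seen in the example configurations of this issue. Treating them as
# required is an assumption, not the documented schema.
REQUIRED_KEYS = {
    "overrideSupported": {"transformerName", "sourceMediaType", "targetMediaType"},
    "supportedDefaults": {"transformerName", "sourceMediaType"},
}


def check_config(text):
    """Return a list of human-readable problems found in a config document."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for section, required in REQUIRED_KEYS.items():
        for index, entry in enumerate(config.get(section, [])):
            if not isinstance(entry, dict):
                problems.append(f"{section}[{index}] is not a JSON object")
                continue
            missing = sorted(required - set(entry))
            if missing:
                problems.append(f"{section}[{index}] is missing {', '.join(missing)}")
    return problems


# The configuration from Case N°1 of this issue.
CASE_1 = """
{
  "overrideSupported": [
    {
      "transformerName": "libreoffice",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "targetMediaType": "application/pdf",
      "priority": 49,
      "maxSourceSizeBytes": -1
    }
  ]
}
"""

print(check_config(CASE_1))  # prints []
```

An empty list only means the file parses and has the keys used in this issue. It says nothing about whether ACS or Transform Core AIO will accept the entries.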

# Issue background

At a customer running ACS 23.4.1, the transformers behaved rather strangely. For some specific documents (docx), the Share previews were never generated properly after using the Edit in Microsoft Office action. For these documents, the Repository always tried to use the officeToImageViaPdf transformer. That transform pipeline started with libreoffice (docx-pdf), then used pdfrenderer (pdf-png), before finally trying imagemagick (png-pdf) to go back to a PDF (so doing docx-to-pdf-to-png-to-pdf). That last imagemagick step always failed, since png-to-pdf is not a supported source and target combination.

Inside the same repository, downloading an impacted document and creating a new node from the exact same source file gave the expected behavior: the libreoffice transformer was used, doing only the docx-to-pdf transformation and stopping there. The same was true for other documents with the same mimetype (confirmed via the JavaScript Console dump and the Node Browser details).

So, while trying to debug that issue, I wanted to change the priority of the libreoffice docx-to-pdf path, so that I could try to force all transformations for the Share previews to use libreoffice only. That is when I ran into the issue I'm describing here.

# Issue details

As mentioned, the behavior is the same on 23.4.1 and 25.2.0. Below are several configurations, each with the actual result compared against the one I expected.

### Case N°1 - overrideSupported on specific source and target for libreoffice

{
  "overrideSupported": [
    {
      "transformerName": "libreoffice",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "targetMediaType": "application/pdf",
      "priority": 49,
      "maxSourceSizeBytes": -1
    }
  ]
}

Result for libreoffice:

libreoffice:
  From: application/vnd.openxmlformats-officedocument.wordprocessingml.document
    To: application/pdf, Priority: 49, Size limit: -1
    To: application/msword, Priority: 50, Size limit: -1
    To: application/rtf, Priority: 50, Size limit: -1
    To: text/html, Priority: 50, Size limit: -1
    To: application/vnd.oasis.opendocument.text, Priority: 50, Size limit: -1

Expected? Yes

Result for officeToImageViaPdf:

officeToImageViaPdf:
  From: application/vnd.openxmlformats-officedocument.wordprocessingml.document
    To: image/ief, Priority: 49, Size limit: -1
    To: image/cgm, Priority: 49, Size limit: -1
    ...
    To: image/png, Priority: 49, Size limit: -1
    ...
    To: image/x-raw-fuji, Priority: 49, Size limit: -1
    To: image/x-raw-panasonic, Priority: 49, Size limit: -1

Expected? No... All 32 target media are set with "priority: 49" while they should have the default (50)
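Checking 32 targets by eye is error-prone, so a listing like the ones above can be compared against the intended priorities with a small script. This is a minimal sketch that assumes the listing is available as plain text in exactly the `From:` / `To:` layout shown above. The helper names `parse_listing` and `unexpected` are hypothetical, and the sample is an abbreviated copy of the Case N°1 results.

```python
import re

# Matches lines such as "To: image/png, Priority: 49, Size limit: -1".
TO_LINE = re.compile(r"To:\s*(?P<target>[^,]+),\s*Priority:\s*(?P<priority>-?\d+)")


def parse_listing(text):
    """Parse a listing into {(transformer, source, target): priority}."""
    priorities = {}
    transformer = source = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("From:"):
            source = line[len("From:"):].strip()
        elif line.startswith("To:"):
            match = TO_LINE.match(line)
            if match:
                key = (transformer, source, match["target"].strip())
                priorities[key] = int(match["priority"])
        elif line.endswith(":"):
            # A line such as "libreoffice:" starts a new transformer section.
            transformer = line[:-1]
        # Anything else (for example an elided "..." line) is skipped.
    return priorities


def unexpected(priorities, expected, default=50):
    """Return the entries whose priority differs from the expected value."""
    return {
        key: value
        for key, value in priorities.items()
        if value != expected.get(key, default)
    }


DOCX = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

LISTING = f"""
libreoffice:
  From: {DOCX}
    To: application/pdf, Priority: 49, Size limit: -1
    To: application/msword, Priority: 50, Size limit: -1
officeToImageViaPdf:
  From: {DOCX}
    To: image/ief, Priority: 49, Size limit: -1
    ...
    To: image/png, Priority: 49, Size limit: -1
"""

# Only the libreoffice docx-to-pdf path was overridden in Case N°1.
EXPECTED = {("libreoffice", DOCX, "application/pdf"): 49}

for (name, _source, target), priority in unexpected(parse_listing(LISTING), EXPECTED).items():
    print(f"{name} -> {target}: priority {priority}")
```

With this sample, only the officeToImageViaPdf targets are reported, which matches the observation above.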

### Case N°2 - overrideSupported on specific source and target for libreoffice and officeToImageViaPdf

{
  "overrideSupported": [
    {
      "transformerName": "libreoffice",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "targetMediaType": "application/pdf",
      "priority": 49,
      "maxSourceSizeBytes": -1
    },
    {
      "transformerName": "officeToImageViaPdf",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "targetMediaType": "image/png",
      "priority": 51,
      "maxSourceSizeBytes": -1
    }
  ]
}

Result for libreoffice: same as Case N°1

Expected? Yes

Result for officeToImageViaPdf: same as Case N°1, and these warnings are displayed in the logs:

When deployed on ACS:
alfresco-1  | 2025-10-24T08:12:01,058 [] WARN  [content.transform.LocalTransformServiceRegistry] [QuartzScheduler_Worker-2] Unable to process "overrideSupported": [{"transformerName": "officeToImageViaPdf", "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "targetMediaType": "image/png", "maxSourceSizeBytes": "-1", "priority": "51"}]. Read from file shared/classes/alfresco/extension/transform/pipelines/1-transform-custom.json
alfresco-1  | 2025-10-24T08:12:01,057 [] WARN  [content.transform.LocalTransformServiceRegistry] [QuartzScheduler_Worker-1] Unable to process "overrideSupported": [{"transformerName": "officeToImageViaPdf", "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "targetMediaType": "image/png", "maxSourceSizeBytes": "-1", "priority": "51"}]. Read from file shared/classes/alfresco/extension/transform/pipelines/1-transform-custom.json

When deployed on Transform Core AIO:
transform-core-aio-1  | 2025-10-24T09:05:07.905Z  WARN 1 --- [cTaskExecutor-1] o.a.t.base.registry.TransformRegistry    : Unable to process "overrideSupported": [{"transformerName": "officeToImageViaPdf", "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "targetMediaType": "image/png", "maxSourceSizeBytes": "-1", "priority": "51"}]. Read from 2-transform-custom.json

Expected? No... All 32 target media are set with "priority: 49" while they should have the default value (50), except for image/png, which should be the only one with the updated value (51). ACS/Transform aren't able to load/use the configuration for officeToImageViaPdf.
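To see at a glance which entries were rejected, the warnings can be extracted from the logs. This is a minimal sketch that relies only on the `Unable to process "overrideSupported"` message format shown in the log lines above. The function name `rejected_overrides` is hypothetical, and the sample contains the two ACS log lines from this case.

```python
import json
import re

# Captures the JSON array of rejected entries and the file they were read from.
WARNING = re.compile(
    r'Unable to process "overrideSupported": (?P<entries>\[.*?\])\. '
    r"Read from (?:file )?(?P<origin>\S+)"
)


def rejected_overrides(log_text):
    """Return the distinct (transformer, source, target, origin) tuples rejected."""
    rejected = set()
    for match in WARNING.finditer(log_text):
        for entry in json.loads(match["entries"]):
            rejected.add((
                entry.get("transformerName"),
                entry.get("sourceMediaType"),
                entry.get("targetMediaType"),
                match["origin"],
            ))
    return sorted(rejected)


SAMPLE_LOG = '''
alfresco-1  | 2025-10-24T08:12:01,058 [] WARN  [content.transform.LocalTransformServiceRegistry] [QuartzScheduler_Worker-2] Unable to process "overrideSupported": [{"transformerName": "officeToImageViaPdf", "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "targetMediaType": "image/png", "maxSourceSizeBytes": "-1", "priority": "51"}]. Read from file shared/classes/alfresco/extension/transform/pipelines/1-transform-custom.json
alfresco-1  | 2025-10-24T08:12:01,057 [] WARN  [content.transform.LocalTransformServiceRegistry] [QuartzScheduler_Worker-1] Unable to process "overrideSupported": [{"transformerName": "officeToImageViaPdf", "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "targetMediaType": "image/png", "maxSourceSizeBytes": "-1", "priority": "51"}]. Read from file shared/classes/alfresco/extension/transform/pipelines/1-transform-custom.json
'''

for name, source, target, origin in rejected_overrides(SAMPLE_LOG):
    print(f"{name}: {source} -> {target} (from {origin})")
```

The two ACS lines differ only in timestamp and worker thread, so they collapse into a single rejected entry.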

### Case N°3 - overrideSupported on specific source and target for libreoffice and supportedDefaults for officeToImageViaPdf

{
  "supportedDefaults": [
    {
      "transformerName": "officeToImageViaPdf",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "priority": 51,
      "maxSourceSizeBytes": -1
    }
  ],
  "overrideSupported": [
    {
      "transformerName": "libreoffice",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "targetMediaType": "application/pdf",
      "priority": 49,
      "maxSourceSizeBytes": -1
    }
  ]
}

Result for libreoffice: same as Case N°1

Expected? Yes

Result for officeToImageViaPdf: same as Case N°1 (no warnings in the logs this time)

Expected? No... All 32 target media are set with "priority: 49" while they should have the updated value (51).

### Case N°4 - supportedDefaults on specific source for libreoffice and officeToImageViaPdf

{
  "supportedDefaults": [
    {
      "transformerName": "officeToImageViaPdf",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "priority": 51,
      "maxSourceSizeBytes": -1
    },
    {
      "transformerName": "libreoffice",
      "sourceMediaType": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "priority": 49,
      "maxSourceSizeBytes": -1
    }
  ]
}

Result for libreoffice:

  • When deployed on ACS side: This configuration is ignored
  • When deployed on Transform side: All 5 target types are set with "priority: 49"

Expected?

  • When deployed on ACS side: No... It shouldn't be ignored
  • When deployed on Transform side: Yes

Result for officeToImageViaPdf:

  • When deployed on ACS side: This configuration is ignored
  • When deployed on Transform side: same as Case N°1 (no warnings in the logs this time)

Expected?

  • When deployed on ACS side: No... It shouldn't be ignored
  • When deployed on Transform side: No... All 32 target media are set with "priority: 49" while they should have the updated value (51).
