
feat: Add support for Tika MIME Types #142

Closed
wants to merge 8 commits into from

halostatue
Member

@halostatue halostatue commented Apr 8, 2025

No description provided.

@halostatue halostatue changed the title from "tika mimetypes" to "feat: Add support for Tika MIME Types" Apr 8, 2025
This is the data half of priority sorting support.

For most of its existence, MIME::Types has supported a fairly complex
sorting algorithm that does not necessarily perform well at runtime,
especially with a growing dataset of MIME types.

This experimental branch is based on the idea that we can quantize the
sorting of similar types into a byte-packed bucket. For full details of
the sort mechanism, see the corresponding experimental branch in
ruby-mime-types.

The changes here provide the necessary updates to support the new fields
(sort priority and extension priority) and to do an in-place update of
the data.
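
As a rough illustration of what quantizing sort order into a byte-packed bucket can look like, here is a minimal Ruby sketch. The flag names, bit positions, and the `sort_priority` helper are assumptions made for this example; they are not taken from this branch or from ruby-mime-types, whose experimental branch remains the reference for the real mechanism.

```ruby
# Hypothetical illustration only: the flag names and bit positions below are
# assumptions, not the layout used by mime-types.
OBSOLETE     = 0b1000_0000 # bit 7: obsolete types sort last
UNREGISTERED = 0b0100_0000 # bit 6: unregistered types sort after registered ones
INCOMPLETE   = 0b0010_0000 # bit 5: types without extensions sort after complete ones

# Pack the three flags into one byte; a lower value means a higher priority.
def sort_priority(obsolete:, registered:, complete:)
  priority = 0
  priority |= OBSOLETE if obsolete
  priority |= UNREGISTERED unless registered
  priority |= INCOMPLETE unless complete
  priority
end

Entry = Struct.new(:content_type, :priority)

entries = [
  Entry.new("application/x-old", sort_priority(obsolete: true, registered: false, complete: true)),
  Entry.new("text/plain", sort_priority(obsolete: false, registered: true, complete: true)),
  Entry.new("application/x-custom", sort_priority(obsolete: false, registered: false, complete: true))
]

# The precomputed byte is the primary sort key; the type name breaks ties.
sorted = entries.sort_by { |e| [e.priority, e.content_type] }
puts sorted.map(&:content_type).inspect
```

The point of the sketch is that the priority can be computed once when the data is built and stored as a field, so runtime sorting compares small integers instead of re-evaluating several attributes per comparison.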
This mostly adds extensions from the Tika data.

It is not clear whether this will be acceptable because of the
difference between the MIT and Apache 2 licences, and it is far too late
to switch to the Apache 2 licence for this data since I am not the only
contributor.
@halostatue halostatue force-pushed the priority-extensions branch 2 times, most recently from 0358032 to 52c8efa Compare May 7, 2025 01:20
halostatue added a commit that referenced this pull request May 7, 2025
This mostly adds extensions from the Tika data.

It is not clear whether this will be acceptable because of the
difference between the MIT and Apache 2 licences, and it is far too late
to switch to the Apache 2 licence for this data since I am not the only
contributor.

Updated documentation for Tika parser.

Closes: #142
@halostatue halostatue force-pushed the priority-extensions branch from 52c8efa to b54825d Compare May 7, 2025 03:17
halostatue added a commit that referenced this pull request May 7, 2025
This mostly adds extensions from the Tika data.

Updated documentation for Tika parser.

Closes: #142
@halostatue halostatue force-pushed the priority-extensions branch 2 times, most recently from 63f07ed to 180c8ee Compare May 7, 2025 03:20
@halostatue halostatue mentioned this pull request May 7, 2025
@halostatue halostatue force-pushed the priority-extensions branch from 180c8ee to c07d1cd Compare May 7, 2025 03:27
halostatue added a commit that referenced this pull request May 7, 2025
This mostly adds extensions from the Tika data.

Updated documentation for Tika parser.

Closes: #142