feat: Add support for Tika MIME Types #142

Closed
wants to merge 8 commits into from
470 changes: 177 additions & 293 deletions CHANGELOG.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Manifest.txt
@@ -16,6 +16,7 @@ data/mime.encoding.column
data/mime.flags.column
data/mime.friendly.column
data/mime.pext.column
+data/mime.spri.column
data/mime.use_instead.column
data/mime.xrefs.column
lib/mime-types-data.rb
25 changes: 16 additions & 9 deletions README.md
@@ -3,7 +3,8 @@
 - home :: https://github.com/mime-types/mime-types-data/
 - issues :: https://github.com/mime-types/mime-types-data/issues
 - code :: https://github.com/mime-types/mime-types-data/
-- changelog :: https://github.com/mime-types/mime-types-data/blob/main/CHANGELOG.md
+- changelog ::
+  https://github.com/mime-types/mime-types-data/blob/main/CHANGELOG.md

## Description

@@ -20,7 +21,8 @@ provided in mime-types-data contains detailed information about MIME entities.
There are many types defined by RFCs and vendors, so the list is long but
invariably incomplete; don't hesitate to offer additional type definitions for
consideration. MIME type definitions found in mime-types are from RFCs, W3C
-recommendations, the [IANA Media Types registry][registry], and user
+recommendations, the [IANA Media Types registry][registry], the
+[Apache httpd registry][httpd], the [Apache Tika media registry][tika] and user
contributions. It conforms to RFCs 2045 and 2231.

### Data Formats Supported in this Registry
@@ -51,24 +53,29 @@ This registry contains the MIME media types in four formats:

## mime-types-data Modified Semantic Versioning

-mime-types-data uses a heavily modified [Semantic Versioning][semver] scheme to
-indicate that the data formats compatibility based on a `SCHEMA` version and the
-date of the data update: `SCHEMA.YEAR.MONTHDAY`.
+mime-types-data uses a [Semantic Versioning][semver] scheme, heavily modified
+with [Calendar Versioning][calver] aspects, to indicate data format
+compatibility based on a `SCHEMA` version and the date of the data update:
+`SCHEMA.YEAR.MONTHDAY`.

1. If an incompatible data format change is made to any of the supported
formats, `SCHEMA` will be incremented. The current `SCHEMA` is 3, supporting
-the YAML, JSON, and columnar formats required for Ruby mime-types 3.0.
+the YAML, JSON, columnar, and mini-mime formats required for Ruby mime-types
+3.0.

2. When the data is updated, the `YEAR.MONTHDAY` combination will be updated. An
-update on the last day of October 2015 would be written as `2015.1031`,
-resulting in the full version of `3.2015.1031`.
+update on the last day of October 2025 would be written as `2025.1031`,
+resulting in the full version of `3.2025.1031`.

3. If multiple versions of the data need to be released on the same day due to
error, there will be an additional `REVISION` field incremented on the end of
the version. Thus, if three revisions need to be published on October 31st,
2015, the last release would be `3.2015.1031.2` (remember that the first
release has an implied `0`.)

-[registry]: https://www.iana.org/assignments/media-types/media-types.xhtml
+[registry]: https://www.iana.org/assignments/media-types/media-types.xml
[semver]: http://semver.org/
[minimime]: https://github.com/discourse/mini_mime
+[httpd]: https://svn.apache.org/repos/asf/httpd/httpd/trunk/docs/conf/mime.types
+[tika]: https://github.com/apache/tika/blob/main/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
+[calver]: https://calver.org
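
The versioning rules in the README hunk above can be illustrated with a minimal Ruby sketch. The `data_version` helper is hypothetical and not part of mime-types-data; it also assumes that `MONTHDAY` is the zero-padded month followed by the zero-padded day, which the README text implies but does not state for single-digit months or days.

```ruby
require "date"

# Hypothetical helper (not part of mime-types-data): formats a version as
# SCHEMA.YEAR.MONTHDAY, appending .REVISION only for same-day re-releases.
def data_version(schema:, date:, revision: 0)
  # Assumption: MONTHDAY is zero-padded month plus zero-padded day.
  monthday = date.strftime("%m%d")
  base = "#{schema}.#{date.year}.#{monthday}"
  # The first release of a day has an implied revision of 0, so it is omitted.
  revision.zero? ? base : "#{base}.#{revision}"
end

puts data_version(schema: 3, date: Date.new(2025, 10, 31))
# prints 3.2025.1031
puts data_version(schema: 3, date: Date.new(2015, 10, 31), revision: 2)
# prints 3.2015.1031.2
```

Keeping the revision off the first release of a day mirrors the "implied `0`" wording in the third rule.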
20 changes: 15 additions & 5 deletions Rakefile
@@ -4,8 +4,7 @@ require "rubygems"
require "hoe"
require "rake/clean"

-$LOAD_PATH.unshift("lib")
-$LOAD_PATH.unshift("support")
+$LOAD_PATH.unshift("lib", "support")

Hoe.plugin :halostatue

@@ -29,7 +28,7 @@ Hoe.spec "mime-types-data" do

extra_dev_deps << ["hoe", "~> 4.0"]
extra_dev_deps << ["hoe-halostatue", "~> 2.0"]
-extra_dev_deps << ["mime-types", ">= 3.4.0", "< 4"]
+extra_dev_deps << ["mime-types", ">= 3.7.0.pre2", "< 4"]
extra_dev_deps << ["nokogiri", "~> 1.6"]
extra_dev_deps << ["rake", ">= 10.0", "< 14"]
extra_dev_deps << ["standard", "~> 1.0"]
@@ -42,11 +41,17 @@ namespace :mime do
IANARegistry.download(to: args.destination)
end

-desc "Download the current MIME type configuration from Apache."
+desc "Download the current MIME type configuration from Apache httpd."
task :apache, [:destination] do |_, args|
require "apache_mime_types"
ApacheMIMETypes.download(to: args.destination)
end
+
+desc "Download the current MIME type configuration from Apache Tika."
+task :tika, [:destination] do |_, args|
+require "tika_mime_types"
+TikaMIMETypes.download(to: args.destination)
+end
end

task :version do
@@ -81,13 +86,18 @@ namespace :release do
end
end

-desc "Default conversion from YAML to JSON and Columnar"
+desc "Full data conversion for release"
task :convert do
require "prepare_release"

PrepareRelease.new.convert_types
end

+task "convert:upgrade" do
+require "convert"
+Convert.from_yaml_to_yaml
+end
+
Rake::Task["gem"].prerequisites.unshift("convert")
Rake::Task["gem"].prerequisites.unshift("git:manifest")
Rake::Task["gem"].prerequisites.unshift("gemspec")
18 changes: 3 additions & 15 deletions data/content_type_mime.db

Some generated files are not rendered by default.