Skip to content

Conversation

@jake-low
Copy link

@jake-low jake-low commented Sep 17, 2025

Hello! This is my first contribution to Martin.

This PR adds pack and unpack subcommands to the mbtiles CLI tool. Pack reads a directory tree of tiles and creates a new MBTiles archive, and unpack does the inverse, extracting the contents of an MBTiles archive into a directory tree. The directory tree is organized like {z}/{x}/{y}.{ext}, matching how URLs on a tile server are typically structured.

For planet-sized data, such a directory tree is impractically large (it contains millions of files, and simply traversing it can be very slow). But for developers, who are often working with smaller test datasets, it can be convenient to work with tilesets in their directory form: it lets you easily inspect individual tile files, and also permits serving a tileset with any HTTP server (such as python -m http.server or npx serve). Tools like tippecanoe and GDAL also support generating tilesets in directory form.

Mapbox's mbutil tool supports packing and unpacking MBTiles archives, but it is no longer maintained. I also found it tricky to use (it has some surprising behaviors, like only compressing pbf tiles if you pass a special flag, despite compression being required by the spec). So I thought it would be nice if Martin's mbtiles tool had the same functionality.

This PR adds two new dependencies: flate2 for compression (already required by other packages in the workspace), and walkdir for directory traversal (needed for pack).

Thanks for considering this feature! Please let me know if you have any questions or feedback.

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good! I made a few comments. Overall, I don't see any reasons not to have this, but others might chime in?

Comment on lines +393 to +404
if !input_directory.exists() {
anyhow::bail!(
"Input directory does not exist: {}",
input_directory.display()
);
}
if !input_directory.is_dir() {
anyhow::bail!(
"Input path is not a directory: {}",
input_directory.display()
);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the WalkDir report these errors as part of iteration?

.iter()
.filter_map(|c| {
c.to_str()
.and_then(|s| s.split('.').next())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you removing file extension this way? Isn't there a Path-fn for this? Or am I misunderstanding?

Comment on lines +464 to +468
format = match entry.path().extension().and_then(|s| s.to_str()) {
Some("pbf" | "mvt") => Some("pbf".to_string()),
Some("jpg" | "jpeg") => Some("jpg".to_string()),
Some("webp") => Some("webp".to_string()),
Some("png") => Some("png".to_string()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably want to use

pub fn parse(value: &str) -> Option<Self> {
Some(match value.to_ascii_lowercase().as_str() {
"gif" => Self::Gif,
"jpg" | "jpeg" => Self::Jpeg,
to convert extension -> format -> back to format value. This way you will also support future formats

TileScheme::Tms => y,
};

mbt.insert_tiles(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing it one by one will likely kill performance - while this is ok for the first pass, you may want to batch up files, likely even using some parallel IO reading, and once the batch is big enough, insert it. Since sqlite is in-proc, you might actually achieve the same result with one by one inserting if you set some internal sqlite flags to do insertions in bulk (i.e. transaction), but you would have to research that


// Get the format from metadata to determine file extension and compression
let format = mbt.get_metadata_value(&mut conn, "format").await?;
let (extension, decompress) = match format.as_deref() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing - see utils for all this


while let Some(tile) = tiles.try_next().await? {
let Some(z) = tile.zoom_level else {
log::warn!("Skipping tile with missing zoom level");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while this would be mega-rare, you might still have some sort of a flag to avoid flooding the output with duplicate warnings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is actually possible, right?

tokio = { workspace = true, features = ["rt-multi-thread"] }
xxhash-rust.workspace = true

walkdir = "2.5.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this to workspace and fix newline spacing

Comment on lines +121 to +145
/// Tile ID scheme for input directory
#[arg(long, value_enum, default_value = "xyz")]
scheme: TileScheme,
},
/// Unpack an MBTiles file into a directory tree of tiles
#[command(name = "unpack")]
Unpack {
/// MBTiles file to read
input_file: PathBuf,
/// directory to write
output_directory: PathBuf,
/// Tile ID scheme for output directory
#[arg(long, value_enum, default_value = "xyz")]
scheme: TileScheme,
},
}

#[derive(Clone, Copy, PartialEq, Debug, clap::ValueEnum)]
enum TileScheme {
/// XYZ (aka. "slippy map") scheme where Y=0 is at the top
#[value(name = "xyz")]
Xyz,
/// TMS scheme where Y=0 is at the bottom
#[value(name = "tms")]
Tms,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: imo, this is a bit more ergonomic, but both are fine.

Suggested change
/// Tile ID scheme for input directory
#[arg(long, value_enum, default_value = "xyz")]
scheme: TileScheme,
},
/// Unpack an MBTiles file into a directory tree of tiles
#[command(name = "unpack")]
Unpack {
/// MBTiles file to read
input_file: PathBuf,
/// directory to write
output_directory: PathBuf,
/// Tile ID scheme for output directory
#[arg(long, value_enum, default_value = "xyz")]
scheme: TileScheme,
},
}
#[derive(Clone, Copy, PartialEq, Debug, clap::ValueEnum)]
enum TileScheme {
/// XYZ (aka. "slippy map") scheme where Y=0 is at the top
#[value(name = "xyz")]
Xyz,
/// TMS scheme where Y=0 is at the bottom
#[value(name = "tms")]
Tms,
/// Tile ID scheme for input directory
#[arg(long, value_enum, default)]
scheme: TileScheme,
},
/// Unpack an MBTiles file into a directory tree of tiles
#[command(name = "unpack")]
Unpack {
/// MBTiles file to read
input_file: PathBuf,
/// directory to write
output_directory: PathBuf,
/// Tile ID scheme for output directory
#[arg(long, value_enum, default)]
scheme: TileScheme,
},
}
#[derive(Clone, Copy, Default, PartialEq, Debug, clap::ValueEnum)]
enum TileScheme {
/// XYZ (aka. "slippy map") scheme where Y=0 is at the top
#[value(name = "xyz")]
#[default]
Xyz,
/// TMS scheme where Y=0 is at the bottom
#[value(name = "tms")]
Tms,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Copy link
Member

@CommanderStorm CommanderStorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.
I have done a few comments below.

What is missing from this PR is having tests as currently there are none.

Comment on lines +412 to +439
let walker = WalkDir::new(input_directory);
let entries = walker.into_iter().filter_entry(|entry| {
let should_include = if entry.file_type().is_dir() {
// skip directories except the root unless they have numeric names
entry.depth() == 0
|| entry
.file_name()
.to_str()
.is_some_and(|s| s.parse::<u32>().is_ok())
} else {
// skip files that do not have a numeric basename
entry
.file_name()
.to_str()
.and_then(|s| s.split('.').next().map(|b| b.parse::<u32>().is_ok()))
.unwrap_or(false)
};

if !should_include {
log::info!(
"Skipping {}{}",
entry.path().display(),
if entry.file_type().is_dir() { "/" } else { "" }
);
}

should_include
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should warn on every occurance.

Lets also activate follow_links or make this an option at least 🤔

Suggested change
let walker = WalkDir::new(input_directory);
let entries = walker.into_iter().filter_entry(|entry| {
let should_include = if entry.file_type().is_dir() {
// skip directories except the root unless they have numeric names
entry.depth() == 0
|| entry
.file_name()
.to_str()
.is_some_and(|s| s.parse::<u32>().is_ok())
} else {
// skip files that do not have a numeric basename
entry
.file_name()
.to_str()
.and_then(|s| s.split('.').next().map(|b| b.parse::<u32>().is_ok()))
.unwrap_or(false)
};
if !should_include {
log::info!(
"Skipping {}{}",
entry.path().display(),
if entry.file_type().is_dir() { "/" } else { "" }
);
}
should_include
});
let mut should_warn_for_invalid_directory_names = true;
let mut should_warn_for_invalid_file_names = true;
let walker = WalkDir::new(input_directory).follow_links(true);
let entries = walker.into_iter().filter_entry(|entry| {
if entry.file_type().is_dir() {
// skip directories except the root unless they have numeric names
let filename_is_numeric_or_root = entry
.file_name()
.to_str()
.is_some_and(|s| s.parse::<u32>().is_ok() ||
entry.depth() == 0;
if should_warn_for_invalid_directory_names && !filename_is_numeric_or_root {
log::info!("Skipping directory {}/ and similar occurances. Please make sure that all directories under {} are named numericaly",
entry.path().display()
input_directory.display()
);
should_warn_for_invalid_directory_names = false;
}
filename_is_numeric_or_root
} else {
// skip files that do not have a numeric basename
let filename_is_numeric_except_extensions = entry
.file_name()
.to_str()
.and_then(|s| s.split('.').next().map(|b| b.parse::<u32>().is_ok()))
.unwrap_or(false)
if should_warn_for_invalid_file_names && !filename_is_numeric_except_extensions {
log::info!("Skipping file {} and similar occurances. Please make sure that all filesnames under {} are numeric with optional exstensions",
entry.path().display()
input_directory.display()
);
should_warn_for_invalid_file_names = false;
}
filename_is_numeric_except_extensions
}
});

Comment on lines +474 to +476
if format == Some("pbf".to_string()) {
compress = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think we should compress this indiscriminately.
The tile might already be compressed => we might compress it twice.

On another hand, lets give the user to opt out of compression, or?

Comment on lines +509 to +510
// TODO: set minzoom, maxzoom, and bbox?
// either compute them, or possibly read them from {input_directory}/metadata.json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets just compute them instead.
It is cleaner this way.

For other metadata (atrribution, ...) it is might still be worthwhile to have this.

Comment on lines +543 to +545
// Query all tiles from the database
let mut tiles = sqlx::query!("SELECT zoom_level, tile_column, tile_row, tile_data FROM tiles ORDER BY zoom_level, tile_column, tile_row")
.fetch(&mut conn);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


while let Some(tile) = tiles.try_next().await? {
let Some(z) = tile.zoom_level else {
log::warn!("Skipping tile with missing zoom level");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is actually possible, right?

@jake-low
Copy link
Author

Thanks for the reviews. I'll update the PR to address your feedback as soon as I have a chance.

@CommanderStorm CommanderStorm marked this pull request as draft September 24, 2025 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants