Commit 41ae7ff

Add DEVELOPMENT.md doc (#1717)
This introduces a document describing how to get started as a contributor to Mountpoint. This partially replaces existing internal documentation that we have. By moving it into the repository, it is:

- In an expected location
- Available to internal and external contributors alike
- Available to both developers and AI agents, should a developer want to make use of AI tooling

### Does this change impact existing behavior?

No, project docs change only.

### Does this change need a changelog entry? Does it require a version change?

No, docs change only.

---

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the [Developer Certificate of Origin (DCO)](https://developercertificate.org/).

---------

Signed-off-by: Daniel Carl Jones <djonesoa@amazon.com>
1 parent 726b0b6 commit 41ae7ff

1 file changed

Lines changed: 169 additions & 0 deletions

doc/DEVELOPMENT.md

@@ -0,0 +1,169 @@
# Developer Guide

This guide introduces developers to the Mountpoint for Amazon S3 project and covers common development tasks.
Read this document if you are new to the project
and would like to become familiar with its purpose and development.

## What is Mountpoint for Amazon S3?

Mountpoint for Amazon S3 (also known as Mountpoint or mountpoint-s3) is a high-throughput file client
that mounts Amazon S3 buckets as local file systems.
You can mount the file system and see the objects in your bucket as files,
with a directory structure created by interpreting the `/` delimiter in object keys.
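
To illustrate the delimiter-based mapping, here is a minimal Python sketch. It is not Mountpoint's implementation: the `list_directory` helper and the sample keys are hypothetical, and the real behavior, including edge cases this sketch ignores, is described in [Mountpoint's semantics doc](./SEMANTICS.md).

```python
def list_directory(keys, prefix=""):
    """Return (subdirectories, files) that sit directly under `prefix`.

    Hypothetical helper: `keys` is a plain list of S3 object keys and
    `prefix` is either empty or ends with the `/` delimiter.
    """
    dirs, files = set(), set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue
        # Everything before the first `/` is the entry name at this level.
        name, sep, _ = rest.partition("/")
        if sep:
            dirs.add(name)   # more path components follow: a directory
        else:
            files.add(name)  # no delimiter left: a file
    return sorted(dirs), sorted(files)


keys = ["example/file.txt", "example/logs/a.log", "readme.md"]
print(list_directory(keys))              # (['example'], ['readme.md'])
print(list_directory(keys, "example/"))  # (['logs'], ['file.txt'])
```

Note that no directory objects exist in this sketch; directories are inferred purely from the keys.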

Take a look at the root `README.md` for an overview of the project.

More specifically for developers, Mountpoint is a Linux FUSE file system.
Using FUSE, Mountpoint can run a file system in userspace rather than directly in the Linux kernel.
The kernel dispatches file system requests it receives to Mountpoint,
which translates file operations like `open` and `read` into S3 object API calls.
This gives applications access to S3 through a familiar file interface.

As a concrete example, customers can write the following Python code to interact with an S3 object:

```python
with open('/mnt/bucket/example/file.txt', 'r') as f:
    file_content = f.read()
    print(file_content)
```
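
One way to picture the translation from file operations to S3 calls: a `read` of `size` bytes at `offset` corresponds to a byte range of the object, which HTTP expresses with an inclusive `Range` header on a GET request. The `range_header` helper below is hypothetical and only illustrates the arithmetic. Mountpoint's actual request pattern is likely to differ, since its filesystem layer includes prefetching.

```python
def range_header(offset: int, size: int) -> str:
    """Build an HTTP Range header value covering `size` bytes from `offset`."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    # HTTP byte ranges are inclusive on both ends, hence the -1.
    return f"bytes={offset}-{offset + size - 1}"


print(range_header(0, 4096))     # bytes=0-4095
print(range_header(4096, 1024))  # bytes=4096-5119
```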

### Limitations

Mountpoint isn't the right fit for every solution involving files and S3.

Mountpoint avoids implementing support for operations that do not have a good equivalent in the S3 API.
For instance, it does not support random writes to files, since there's no good way to do that
without downloading significant amounts of data and reuploading it.
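
A rough back-of-the-envelope sketch shows why. It assumes a naive approach that downloads the entire object, patches it locally, and uploads it again; the `rewrite_cost` helper and the sizes are hypothetical and chosen only for illustration.

```python
def rewrite_cost(object_size: int, write_size: int) -> tuple[int, float]:
    """Return (bytes transferred, amplification) for download-modify-reupload.

    Assumes the write lands inside the existing object, so the object
    size does not change.
    """
    if object_size <= 0 or not 0 < write_size <= object_size:
        raise ValueError("expected 0 < write_size <= object_size")
    # Download the whole object once, then upload the whole object once.
    transferred = 2 * object_size
    return transferred, transferred / write_size


GiB = 1024 ** 3
KiB = 1024
print(rewrite_cost(1 * GiB, 4 * KiB))  # (2147483648, 524288.0)
```

Under these assumptions, changing 4 KiB in a 1 GiB object moves 2 GiB over the network.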

When onboarding to the project, reading [Mountpoint's semantics doc](./SEMANTICS.md)
will help highlight the challenges faced in building this translation layer.

### Additional resources

* Jeff Barr's blog post about Mountpoint: https://aws.amazon.com/blogs/aws/mountpoint-for-amazon-s3-generally-available-and-ready-for-production-workloads/
* Section 2 of the paper "To FUSE or Not to FUSE: Performance of User-Space File Systems" has
  a nice introduction to FUSE with diagrams: https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf
* Linux FUSE documentation: https://github.com/torvalds/linux/blob/master/Documentation/filesystems/fuse/fuse.rst
* Definition of the `Filesystem` trait implemented by Mountpoint: https://docs.rs/fuser/latest/fuser/trait.Filesystem.html
* Equivalent of the `Filesystem` trait in libfuse,
  the reference implementation for the userspace FUSE daemon: https://libfuse.github.io/doxygen/structfuse__lowlevel__ops.html
* FUSE protocol defined in the Linux kernel. Git blame can help provide additional context. https://github.com/torvalds/linux/blame/master/include/uapi/linux/fuse.h
* Mountpoint's semantics documentation: https://github.com/awslabs/mountpoint-s3/blob/main/doc/SEMANTICS.md

## Architecture & Code Organization

Mountpoint for Amazon S3 is organized as a Rust workspace with six crates,
each handling a specific layer of functionality.

### Crate Structure

```
mountpoint-s3/          # Main binary crate (CLI, mount logic)
mountpoint-s3-fs/       # Core filesystem implementation
mountpoint-s3-client/   # S3 API client
mountpoint-s3-fuser/    # FUSE bindings (fork of fuser)
mountpoint-s3-crt/      # AWS Common Runtime bindings
mountpoint-s3-crt-sys/  # Low-level CRT system bindings
```

**Dependency Hierarchy:**
```
mountpoint-s3                          # Main binary crate (CLI, mount logic)
└── mountpoint-s3-fs                   # Filesystem implementation
    ├── mountpoint-s3-client           # SDK-like S3 client
    │   └── mountpoint-s3-crt          # High-level AWS Common Runtime (CRT) bindings
    │       └── mountpoint-s3-crt-sys  # Low-level CRT system bindings
    └── mountpoint-s3-fuser            # FUSE bindings (fork of fuser)
```
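
The hierarchy can help when deciding which crates are worth retesting after a change. The Python sketch below encodes only the edges drawn in the tree above; the `DEPENDS_ON` table and the `affected_by` helper are hypothetical, and the actual `Cargo.toml` files may declare additional dependencies.

```python
# Direct dependencies, transcribed from the tree above (an assumption:
# the real manifests may contain more edges than the diagram shows).
DEPENDS_ON = {
    "mountpoint-s3": {"mountpoint-s3-fs"},
    "mountpoint-s3-fs": {"mountpoint-s3-client", "mountpoint-s3-fuser"},
    "mountpoint-s3-client": {"mountpoint-s3-crt"},
    "mountpoint-s3-crt": {"mountpoint-s3-crt-sys"},
    "mountpoint-s3-crt-sys": set(),
    "mountpoint-s3-fuser": set(),
}


def affected_by(changed):
    """Return the crates that depend on `changed`, directly or transitively."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for crate, deps in DEPENDS_ON.items():
            if current in deps and crate not in affected:
                affected.add(crate)
                frontier.append(crate)
    return sorted(affected)


print(affected_by("mountpoint-s3-crt"))
# ['mountpoint-s3', 'mountpoint-s3-client', 'mountpoint-s3-fs']
```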

### Layers

**Customer-facing Layer (`mountpoint-s3`)**
- CLI argument parsing and configuration
- Main entry point (`mount-s3` binary)

**Filesystem Layer (`mountpoint-s3-fs`)**
- FUSE filesystem implementation
- File/directory abstraction over S3 objects
- Caching and prefetching logic
- Inode management and metadata handling

**S3 Client Layer (`mountpoint-s3-client`)**
- Presents an SDK-like interface for talking to S3
- Implements construction of S3 requests such as HeadObject and DeleteObject
- Provides streaming abstractions for reading and writing to S3

**FUSE Bindings (`mountpoint-s3-fuser`)**
- Fork of the `fuser` crate with Mountpoint-specific optimizations
- The crate README.md provides information on how it's maintained: https://github.com/awslabs/mountpoint-s3/blob/main/mountpoint-s3-fuser/README.md#fork-maintenance
- Low-level FUSE protocol handling
- Kernel interface for filesystem operations

**AWS Common Runtime (`mountpoint-s3-crt`, `mountpoint-s3-crt-sys`)**
- Rust bindings for AWS Common Runtime (CRT)
- `mountpoint-s3-crt` aims to provide a 'safe', idiomatic interface to the bindings in `mountpoint-s3-crt-sys`.

## Development Environment Setup

You will need the Rust and C toolchains installed.
`doc/INSTALL.md` has a section on building from source that can get you started.
For running tests, you should [install cargo-nextest](https://nexte.st/docs/installation/pre-built-binaries/).

You will need a Linux environment that has FUSE support.

You should also have AWS credentials available for testing.
Short-term AWS credentials are recommended.
Internal contributors can use the options recommended by the team that owns Mountpoint.
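
As a quick sanity check before running tests, the sketch below inspects the standard AWS environment variables to guess whether the credentials in the environment are short-term. The `describe_env_credentials` helper is hypothetical, and it only looks at environment variables, so credentials supplied through other mechanisms such as profiles are not detected.

```python
import os


def describe_env_credentials(env=None):
    """Classify environment-variable credentials as none, short-term, or long-term."""
    env = os.environ if env is None else env
    if not env.get("AWS_ACCESS_KEY_ID") or not env.get("AWS_SECRET_ACCESS_KEY"):
        return "none"
    # Temporary credentials come with a session token; long-term keys do not.
    return "short-term" if env.get("AWS_SESSION_TOKEN") else "long-term"


print(describe_env_credentials())
```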

## Working with the repository

Mountpoint for Amazon S3 is primarily developed on GitHub,
via its public open-source repository: https://github.com/awslabs/mountpoint-s3.
For new feature launches that depend on AWS service features that are not yet public,
the Mountpoint team works with an internal fork of the repository.

Many team members use [Visual Studio Code](https://code.visualstudio.com/) to work with the project.

The project contains a `Makefile` providing quick ways to run common tasks
while excluding the `fuser` crate, which is not typically modified.
For example, `make pre-pr-check` runs a group of commands such as `cargo fmt` on every crate except `fuser`.

### Running Unit Tests

Unit tests are defined either in the modules themselves inside a `test` mod block,
or in the `tests/` directory of the crate.

This project uses [nextest](https://nexte.st/) as its preferred test runner.

Tests can be run with commands as simple as the following.

```bash
# Run all unit tests
cargo nextest run

# Run only the S3 client unit tests
cargo nextest run -p mountpoint-s3-client
```

### Running Integration Tests

Mountpoint has a suite of integration tests, covering individual components
such as the S3 client as well as end-to-end behavior as a mounted FUSE file system.

The integration tests require an AWS account with resources provisioned, such as S3 buckets and IAM roles.
For internal contributors, there are setup instructions available internally.

> TODO: Add details on what resources need to be created.

Integration tests are gated behind Rust compile-time feature flags.
The list of these may change over time.

The command below shows how to run the integration tests including FUSE-based tests and tests that require S3.

```bash
cargo nextest run --features fuse_tests,s3_tests
```

## Contributing Guidelines

For detailed contribution information, review [CONTRIBUTING.md](doc/CONTRIBUTING.md).
