| 1 | +# Developer Guide |
| 2 | + |
| 3 | +This guide introduces developers to the Mountpoint for Amazon S3 project and covers common development tasks. |
| 4 | +Read this document if you are new to the project |
| 5 | +and want to become familiar with its purpose and development. |
| 6 | + |
| 7 | +## What is Mountpoint for Amazon S3? |
| 8 | + |
| 9 | +Mountpoint for Amazon S3 (also known as Mountpoint or mountpoint-s3) is a high-throughput file client |
| 10 | +that mounts Amazon S3 buckets as local file systems. |
| 11 | +You can mount the file system and see the objects in your bucket as files, |
| 12 | +with a directory structure created by interpreting the `/` delimiter. |
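To make the `/` delimiter interpretation concrete, here is a minimal Rust sketch that derives one directory level from a flat list of object keys. The `DirListing` type and the `list_directory` function are hypothetical names made up for this illustration and are not taken from Mountpoint's code. The sketch ignores edge cases such as empty path components; see the semantics documentation for the actual rules.

```rust
use std::collections::BTreeSet;

/// Entries found directly under a directory prefix (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct DirListing {
    /// Immediate subdirectories, implied by keys that continue past a `/`.
    pub dirs: BTreeSet<String>,
    /// Files: keys with no further `/` after the prefix.
    pub files: BTreeSet<String>,
}

/// Illustrative only: derive one directory level from a flat list of keys.
/// `prefix` is either empty (bucket root) or ends with `/`.
pub fn list_directory(keys: &[&str], prefix: &str) -> DirListing {
    let mut listing = DirListing::default();
    for key in keys {
        // Skip keys that live outside this directory.
        let Some(rest) = key.strip_prefix(prefix) else {
            continue;
        };
        if rest.is_empty() {
            continue;
        }
        match rest.split_once('/') {
            // Something follows a `/`: the first component is a subdirectory.
            Some((dir, _)) if !dir.is_empty() => {
                listing.dirs.insert(dir.to_string());
            }
            // Empty component (e.g. `a//b`): ignored in this sketch.
            Some(_) => {}
            // No `/` left: the remainder is a file name.
            None => {
                listing.files.insert(rest.to_string());
            }
        }
    }
    listing
}
```

For the keys `example/file.txt`, `example/nested/a.bin`, and `top.txt`, listing the empty prefix yields the directory `example` and the file `top.txt`, while listing `example/` yields the directory `nested` and the file `file.txt`.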
| 13 | + |
| 14 | +Take a look at the root `README.md` for an overview of the project. |
| 15 | + |
| 16 | +More specifically for developers, Mountpoint is a Linux FUSE file system. |
| 17 | +Using FUSE, Mountpoint can run a file system in userspace rather than directly in the Linux kernel. |
| 18 | +The kernel dispatches file system requests it receives to Mountpoint, |
| 19 | +which translates file operations like `open` and `read` into S3 object API calls. |
| 20 | +This provides applications access to S3 through a familiar file interface. |
| 21 | + |
| 22 | +As a concrete example, customers can write the following Python code to interact with an S3 object: |
| 23 | + |
| 24 | +```python |
| 25 | +with open('/mnt/bucket/example/file.txt', 'r') as f: |
| 26 | +    file_content = f.read() |
| 27 | +    print(file_content) |
| 28 | +``` |
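Behind a call like `f.read()`, a file system read carries an offset and a size, which map naturally onto a ranged object read. The Rust sketch below shows that translation in its simplest form. `GetObjectRequest` and `read_to_get_object` are hypothetical names for this illustration only, and the sketch assumes one request per read; Mountpoint's file system layer adds prefetching and caching on top rather than working this way.

```rust
/// Hypothetical description of a ranged GetObject request.
#[derive(Debug, PartialEq)]
pub struct GetObjectRequest {
    pub bucket: String,
    pub key: String,
    /// HTTP `Range` header value, e.g. `bytes=0-1023` (the end is inclusive).
    pub range: String,
}

/// Illustrative only: translate a file read into a ranged object request.
/// Returns `None` for zero-length reads or if the range would overflow.
pub fn read_to_get_object(
    bucket: &str,
    path_in_mount: &str,
    offset: u64,
    size: u32,
) -> Option<GetObjectRequest> {
    if size == 0 {
        return None;
    }
    // The last byte is inclusive in an HTTP byte range.
    let last_byte = offset.checked_add(u64::from(size) - 1)?;
    Some(GetObjectRequest {
        bucket: bucket.to_string(),
        // The path relative to the mount point becomes the object key.
        key: path_in_mount.trim_start_matches('/').to_string(),
        range: format!("bytes={offset}-{last_byte}"),
    })
}
```

Reading 1024 bytes at offset 4096 from `/example/file.txt` would produce a request for key `example/file.txt` with the range `bytes=4096-5119`.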
| 29 | + |
| 30 | +### Limitations |
| 31 | + |
| 32 | +Mountpoint isn't the right fit for every solution involving files and S3. |
| 33 | + |
| 34 | +Mountpoint avoids implementing support for operations that do not have a good equivalent in the S3 API. |
| 35 | +For instance, it does not support random writes to files, because the S3 API offers no good way to do so |
| 36 | +without downloading significant amounts of data and re-uploading it. |
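The cost of emulating a random write can be sketched with an in-memory stand-in for a bucket that only offers whole-object reads and writes. `FakeBucket` and `emulate_random_write` are hypothetical names for this illustration and do not exist in Mountpoint; the sketch shows one naive emulation strategy, not something the project does.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a bucket with whole-object reads and writes only.
#[derive(Default)]
pub struct FakeBucket {
    objects: HashMap<String, Vec<u8>>,
    /// Total bytes moved in either direction, to make the cost visible.
    pub bytes_transferred: u64,
}

impl FakeBucket {
    /// Upload (or replace) a whole object.
    pub fn put_object(&mut self, key: &str, body: Vec<u8>) {
        self.bytes_transferred += body.len() as u64;
        self.objects.insert(key.to_string(), body);
    }

    /// Download a whole object.
    pub fn get_object(&mut self, key: &str) -> Option<Vec<u8>> {
        let body = self.objects.get(key)?.clone();
        self.bytes_transferred += body.len() as u64;
        Some(body)
    }
}

/// Naive random write: download everything, patch in memory, re-upload everything.
pub fn emulate_random_write(
    bucket: &mut FakeBucket,
    key: &str,
    offset: usize,
    data: &[u8],
) -> Option<()> {
    let mut body = bucket.get_object(key)?;
    let end = offset.checked_add(data.len())?;
    if end > body.len() {
        // Writing past the end grows the object, zero-filling any gap.
        body.resize(end, 0);
    }
    body[offset..end].copy_from_slice(data);
    bucket.put_object(key, body);
    Some(())
}
```

In this model, changing four bytes in the middle of a 1,000,000-byte object transfers 2,000,000 bytes: the full download plus the full re-upload.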
| 37 | + |
| 38 | +When onboarding to the project, reading [Mountpoint's semantics doc](./SEMANTICS.md) |
| 39 | +will help highlight the challenges faced in building this translation layer. |
| 40 | + |
| 41 | +### Additional resources |
| 42 | + |
| 43 | +* Jeff Barr's blog post about Mountpoint: https://aws.amazon.com/blogs/aws/mountpoint-for-amazon-s3-generally-available-and-ready-for-production-workloads/ |
| 44 | +* Section 2 of the paper "To FUSE or Not to FUSE: Performance of User-Space File Systems" has |
| 45 | + a nice introduction to FUSE with diagrams: https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf |
| 46 | +* Linux FUSE documentation: https://github.com/torvalds/linux/blob/master/Documentation/filesystems/fuse/fuse.rst |
| 47 | +* Definition of the `Filesystem` trait implemented by Mountpoint: https://docs.rs/fuser/latest/fuser/trait.Filesystem.html |
| 48 | +* Equivalent definition (`fuse_lowlevel_ops`) in libfuse, |
| 49 | + the reference implementation for the userspace FUSE daemon: https://libfuse.github.io/doxygen/structfuse__lowlevel__ops.html |
| 50 | +* FUSE protocol as defined in the Linux kernel (Git blame can provide additional context): https://github.com/torvalds/linux/blame/master/include/uapi/linux/fuse.h |
| 51 | +* Mountpoint's semantic documentation: https://github.com/awslabs/mountpoint-s3/blob/main/doc/SEMANTICS.md |
| 52 | + |
| 53 | +## Architecture & Code Organization |
| 54 | + |
| 55 | +Mountpoint for Amazon S3 is organized as a Rust workspace with six crates, |
| 56 | +each handling a specific layer of functionality. |
| 57 | + |
| 58 | +### Crate Structure |
| 59 | + |
| 60 | +``` |
| 61 | +mountpoint-s3/ # Main binary crate (CLI, mount logic) |
| 62 | +mountpoint-s3-fs/ # Core filesystem implementation |
| 63 | +mountpoint-s3-client/ # S3 API client |
| 64 | +mountpoint-s3-fuser/ # FUSE bindings (fork of fuser) |
| 65 | +mountpoint-s3-crt/ # AWS Common Runtime bindings |
| 66 | +mountpoint-s3-crt-sys/ # Low-level CRT system bindings |
| 67 | +``` |
| 68 | + |
| 69 | +**Dependency Hierarchy:** |
| 70 | +``` |
| 71 | +mountpoint-s3 # Main binary crate (CLI, mount logic) |
| 72 | +└── mountpoint-s3-fs # Filesystem implementation |
| 73 | +    ├── mountpoint-s3-client # SDK-like S3 client |
| 74 | +    │   └── mountpoint-s3-crt # High-level AWS Common Runtime (CRT) bindings |
| 75 | +    │       └── mountpoint-s3-crt-sys # Low-level CRT system bindings |
| 76 | +    └── mountpoint-s3-fuser # FUSE bindings (fork of fuser) |
| 77 | +``` |
| 78 | + |
| 79 | +### Layers |
| 80 | + |
| 81 | +**Customer-facing Layer (`mountpoint-s3`)** |
| 82 | +- CLI argument parsing and configuration |
| 83 | +- Main entry point (`mount-s3` binary) |
| 84 | + |
| 85 | +**Filesystem Layer (`mountpoint-s3-fs`)** |
| 86 | +- FUSE filesystem implementation |
| 87 | +- File/directory abstraction over S3 objects |
| 88 | +- Caching and prefetching logic |
| 89 | +- Inode management and metadata handling |
| 90 | + |
| 91 | +**S3 Client Layer (`mountpoint-s3-client`)** |
| 92 | +- Presents an SDK-like interface for talking to S3 |
| 93 | +- Implements construction of S3 requests such as HeadObject and DeleteObject |
| 94 | +- Provides streaming abstractions for reading and writing to S3 |
| 95 | + |
| 96 | +**FUSE Bindings (`mountpoint-s3-fuser`)** |
| 97 | +- Fork of the `fuser` crate with Mountpoint-specific optimizations |
| 98 | +  - The crate's `README.md` describes how the fork is maintained: https://github.com/awslabs/mountpoint-s3/blob/main/mountpoint-s3-fuser/README.md#fork-maintenance |
| 99 | +- Low-level FUSE protocol handling |
| 100 | +- Kernel interface for filesystem operations |
| 101 | + |
| 102 | +**AWS Common Runtime (`mountpoint-s3-crt`, `mountpoint-s3-crt-sys`)** |
| 103 | +- Rust bindings for AWS Common Runtime (CRT) |
| 104 | +- `mountpoint-s3-crt` aims to provide a 'safe', idiomatic interface to the bindings in `mountpoint-s3-crt-sys`. |
| 105 | + |
| 106 | +## Development Environment Setup |
| 107 | + |
| 108 | +You will need the Rust and C toolchains installed. |
| 109 | +`doc/INSTALL.md` has a section on building from source that can get you started. |
| 110 | +For running tests, you should [install cargo-nextest](https://nexte.st/docs/installation/pre-built-binaries/). |
| 111 | + |
| 112 | +You will need a Linux environment that has FUSE support. |
| 113 | + |
| 114 | +You should also have AWS credentials available for testing. |
| 115 | +Short-term AWS credentials are recommended. |
| 116 | +For internal contributors, there are options recommended by the team that owns Mountpoint. |
| 117 | + |
| 118 | +## Working with the repository |
| 119 | + |
| 120 | +Mountpoint for Amazon S3 is primarily developed on GitHub, |
| 121 | +via its public open-source repository: https://github.com/awslabs/mountpoint-s3. |
| 122 | +For new feature launches that depend on AWS service features that are not yet public, |
| 123 | +the Mountpoint team works with an internal fork of the repository. |
| 124 | + |
| 125 | +Many team members use [Visual Studio Code](https://code.visualstudio.com/) to work with the project. |
| 126 | + |
| 127 | +The project contains a `Makefile` providing quick ways to run common tasks |
| 128 | +while excluding the `fuser` crate, which is not typically modified. |
| 129 | +For example, `make pre-pr-check` runs a group of commands such as `cargo fmt`. |
| 130 | + |
| 131 | +### Running Unit Tests |
| 132 | + |
| 133 | +Unit tests are defined either in the modules themselves inside a `test` mod block, |
| 134 | +or in the `tests/` directory of the crate. |
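As a minimal sketch of the first placement, a unit test module can sit next to the code it exercises and is compiled only for test builds. The `parent_prefix` helper is hypothetical and exists only to give the tests something to check; the module is named `test` here to match the wording above.

```rust
/// Hypothetical helper: return the directory-like prefix of an object key,
/// including the trailing `/`, or an empty string for top-level keys.
pub fn parent_prefix(key: &str) -> &str {
    match key.rfind('/') {
        // `/` is a single byte, so slicing up to and including it is safe.
        Some(idx) => &key[..=idx],
        None => "",
    }
}

// Compiled only when building tests, e.g. via `cargo nextest run`.
#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn parent_of_nested_key() {
        assert_eq!(parent_prefix("example/file.txt"), "example/");
    }

    #[test]
    fn parent_of_top_level_key() {
        assert_eq!(parent_prefix("file.txt"), "");
    }
}
```

Tests that only need a crate's public API can instead live in files under that crate's `tests/` directory, where Cargo compiles each file as a separate test target.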
| 135 | + |
| 136 | +In this project, we use [nextest](https://nexte.st/) as our preferred test runner. |
| 137 | + |
| 138 | +Tests can be run with commands as simple as the following: |
| 139 | + |
| 140 | +```bash |
| 141 | +# Run all unit tests |
| 142 | +cargo nextest run |
| 143 | + |
| 144 | +# Run only the S3 client unit tests |
| 145 | +cargo nextest run -p mountpoint-s3-client |
| 146 | +``` |
| 147 | + |
| 148 | +### Running Integration Tests |
| 149 | + |
| 150 | +Mountpoint has a suite of integration tests, covering individual components |
| 151 | +such as the S3 client as well as end-to-end behavior as a mounted FUSE file system. |
| 152 | + |
| 153 | +The integration tests require an AWS account with provisioned resources, such as S3 buckets and IAM roles. |
| 154 | +For internal contributors, there are setup instructions available internally. |
| 155 | + |
| 156 | +> TODO: Add details on what resources need to be created. |
| 157 | + |
| 158 | +Integration tests are gated behind Rust compile-time feature flags. |
| 159 | +The list of these may change over time. |
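As a rough sketch of what such gating can look like, the `cfg` attribute below compiles a test module only when a feature named `s3_tests` is enabled. This assumes the crate declares that feature in its `Cargo.toml`. The `test_prefix` helper, the module layout, and the `EXAMPLE_TEST_BUCKET` environment variable are hypothetical and not taken from Mountpoint's test suite.

```rust
/// Hypothetical helper: build a per-test key prefix so that tests sharing a
/// bucket do not overwrite each other's objects.
pub fn test_prefix(base: &str, test_name: &str) -> String {
    format!("{}/{}/", base.trim_end_matches('/'), test_name)
}

// Compiled only for test builds with the feature enabled,
// e.g. `cargo nextest run --features s3_tests`.
#[cfg(all(test, feature = "s3_tests"))]
mod s3_integration {
    use super::*;

    #[test]
    fn prefix_is_scoped_to_test() {
        // Hypothetical variable name; a real test would go on to talk to S3.
        let bucket = std::env::var("EXAMPLE_TEST_BUCKET")
            .expect("EXAMPLE_TEST_BUCKET must be set for S3 tests");
        assert!(!bucket.is_empty());
        assert!(test_prefix("mountpoint-test", "demo").ends_with("demo/"));
    }
}
```

Without the feature flag, the gated module is not compiled at all, so a plain `cargo nextest run` neither builds nor runs tests that need AWS resources.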
| 160 | + |
| 161 | +The command below shows how to run the integration tests including FUSE-based tests and tests that require S3. |
| 162 | + |
| 163 | +```bash |
| 164 | +cargo nextest run --features fuse_tests,s3_tests |
| 165 | +``` |
| 166 | + |
| 167 | +## Contributing Guidelines |
| 168 | + |
| 169 | +For detailed contribution information, review [CONTRIBUTING.md](doc/CONTRIBUTING.md). |