A distributed file system with semantics similar to both google file system and hadoop file system
- Git clone the repository using the command
git clone https://github.com/PramodRaoB/hybrid-distributed-file-system/ - Enter into the project directory using the command
cd hybrid-distributed-file-system - Optionally set up a python virtual environment
- Install the required python packages:
pip install -r requirements.txt - Compile the proto headers by running the command
cd src && make - Done!
The file system consists of three components, the master server, one or more chunk servers and one or more clients
To run the master server, run the command python master_server.py
To run the chunk servers, run the command python chunk_server.py <0-3>.
The file system supports multiple chunk-servers. The <0-3> option is to be relaced by an integer between 0 and 3
supporting 4 simultaneous chunk servers. This can be increased in config.py.
The client supports multiple commands all of which can be executed simultaneously across multiple client applications.
The syntax to run a client command looks like: python client.py command args...
- Copying a new file onto hybrid-DFS
python client.py create <local-file-path> <hybrid-dfs-file-path>
- Listing the files on hybrid-DFS. The option
1lists even the files being currently processed (copy/deletion).python client.py ls <0-1>
- Reading a file. Setting the last option to
-1reads until the end of filepython client.py read <file-path> <start-position> <number-of-bytes-to-read>
- Deleting a file
python client.py delete <file-path>