This project is an implementation of a multi-threaded HTTP/1.1 server built from scratch using Python's low-level socket programming. It is designed to handle multiple concurrent clients using a thread pool architecture, serving both static HTML and binary files (GET requests) and processing JSON uploads (POST requests). The server adheres to strict HTTP protocol requirements, including Connection Keep-Alive and robust security measures like Host header validation and Path Traversal protection.
Python 3.x
Basic Linux/macOS command-line environment for testing (using curl or netcat/nc).
Before running the server, ensure the following directory structure is set up:
    project/
    ├── server.py           # The main server code
    └── resources/          # Root directory for serving content
        ├── index.html      # Default HTML file (Required)
        ├── about.html      # HTML file (Required)
        ├── contact.html    # HTML file (Required)
        ├── sample.txt      # Text file for binary transfer (Required)
        ├── logo.png        # Image file for binary transfer (Required)
        ├── large.png       # Large image file (>1MB) (Required)
        ├── photo.jpg       # JPEG image file (Required)
        └── uploads/        # Directory for POST-ed JSON files
The server accepts up to three optional command-line arguments: port, host, and max_threads.
(Runs on 127.0.0.1:8080 with 10 threads)
python3 server.py
(Runs on 0.0.0.0:8000 with a thread pool size of 20)
python3 server.py 8000 0.0.0.0 20
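The positional arguments above could be read along these lines. This is a minimal sketch, not the project's actual parsing code: `parse_args` and the `DEFAULT_*` names are hypothetical, and the defaults are the ones implied by the first example (127.0.0.1:8080, 10 threads).

```python
# Defaults taken from the usage examples above.
DEFAULT_PORT = 8080
DEFAULT_HOST = "127.0.0.1"
DEFAULT_MAX_THREADS = 10


def parse_args(argv):
    """Return (port, host, max_threads) from positional arguments.

    Hypothetical helper: `argv` excludes the script name, and any
    argument that is missing falls back to its default.
    """
    port = int(argv[0]) if len(argv) > 0 else DEFAULT_PORT
    host = argv[1] if len(argv) > 1 else DEFAULT_HOST
    max_threads = int(argv[2]) if len(argv) > 2 else DEFAULT_MAX_THREADS
    return port, host, max_threads
```

A caller would typically pass `sys.argv[1:]`, so `python3 server.py 8000 0.0.0.0 20` yields `(8000, "0.0.0.0", 20)`.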
Use curl or nc in a separate terminal to test the functionality.
| Test Method/Path | Command Example | Expected Status |
|---|---|---|
| Basic GET / | curl -i http://127.0.0.1:8080/ | 200 OK (text/html) |
| Binary Download /logo.png | curl -O http://127.0.0.1:8080/logo.png | 200 OK (application/octet-stream) |
| JSON POST /upload | curl -i -X POST -H "Content-Type: application/json" -d '{"data": "test"}' http://127.0.0.1:8080/upload | 201 Created |
| Path Traversal /../etc/passwd | curl -i --path-as-is http://127.0.0.1:8080/../etc/passwd | 403 Forbidden |
| Host Mismatch Host: evil.com | curl -i -H "Host: evil.com" http://127.0.0.1:8080/index.html | 403 Forbidden |

Note: the path traversal test needs --path-as-is because curl otherwise collapses /../ segments before sending the request, so the server would never see the traversal attempt.
The server uses a Producer-Consumer model implemented with Python's built-in threading and queue modules for concurrency (Requirement 3).
Producer (Main Thread): The main thread is responsible for socket listening (server.accept()). When a new connection arrives, the main thread acts as the producer, placing the (socket, address) tuple onto the shared Connection Queue.
Consumer (Worker Threads): A pool of configurable worker threads (MAX_THREADS default 10) continuously monitors the Connection Queue. When a connection is available, a worker thread consumes the task (CONNECTION_QUEUE.get()), and calls handle_client(conn, addr).
Synchronization: The queue.Queue automatically handles synchronization (locks/mutexes) for safe multi-threaded access, preventing race conditions.
Saturation: If the thread pool is busy and the queue capacity (LISTEN_QUEUE_SIZE, default 50) is exceeded, the server immediately returns a 503 Service Unavailable response with a Retry-After header to the client, preventing resource exhaustion.
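The producer-consumer model described above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: `CONNECTION_QUEUE`, `MAX_THREADS`, `LISTEN_QUEUE_SIZE`, and `handle_client` come from the text, while `worker`, `start_pool`, `enqueue`, and the `None` shutdown sentinel are hypothetical.

```python
import queue
import threading

LISTEN_QUEUE_SIZE = 50   # queue capacity named in the text
MAX_THREADS = 10         # default pool size named in the text

CONNECTION_QUEUE = queue.Queue(maxsize=LISTEN_QUEUE_SIZE)


def worker(handle_client):
    """Consumer: block on the queue and handle one connection at a time."""
    while True:
        item = CONNECTION_QUEUE.get()
        try:
            if item is None:        # sentinel (an assumption) stops the worker
                return
            conn, addr = item
            handle_client(conn, addr)
        finally:
            CONNECTION_QUEUE.task_done()


def start_pool(handle_client, size=MAX_THREADS):
    """Start `size` daemon worker threads and return them."""
    threads = []
    for i in range(size):
        t = threading.Thread(target=worker, args=(handle_client,),
                             name=f"Worker-{i + 1}", daemon=True)
        t.start()
        threads.append(t)
    return threads


def enqueue(conn, addr):
    """Producer: queue a connection; return False when the queue is full."""
    try:
        CONNECTION_QUEUE.put_nowait((conn, addr))
        return True
    except queue.Full:
        return False
```

In this sketch the accept loop would call `enqueue(conn, addr)` for every new connection and send the 503 response itself whenever it returns `False`.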
Binary file transfer is designed for efficiency and data integrity (Requirement 5B).
Files (.txt, .png, .jpg, .jpeg) are opened in binary mode ('rb') so that the raw bytes are sent unchanged, with no text decoding or newline translation that could corrupt the data.
The Content-Type is set to application/octet-stream.
The Content-Disposition: attachment; filename="..." header is included to instruct the client (browser) to download the content as a file rather than attempting to display it inline.
The exact file size is calculated using os.path.getsize() and set in the Content-Length header.
Instead of reading the entire file into memory (which is inefficient for large files), the file content is read and sent to the socket in 4KB chunks (f.read(4096)), ensuring efficient buffer management and supporting the seamless transfer of large files (>1MB).
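The transfer steps above might look roughly like this. It is a trimmed sketch: `build_download_headers` and `send_file` are hypothetical names, `conn` is assumed to be a connected socket, file names are assumed to be ASCII, and a real response would carry further headers (Date, Connection, and so on).

```python
import os

CHUNK_SIZE = 4096  # 4 KB, as described above


def build_download_headers(path):
    """Build the response head for a binary download of `path`."""
    size = os.path.getsize(path)
    name = os.path.basename(path)
    lines = [
        "HTTP/1.1 200 OK",
        "Content-Type: application/octet-stream",
        f"Content-Length: {size}",
        f'Content-Disposition: attachment; filename="{name}"',
        "",   # the two empty strings produce the blank line
        "",   # that terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def send_file(conn, path):
    """Send the headers, then stream the file in fixed-size chunks."""
    conn.sendall(build_download_headers(path))
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:          # empty bytes object means end of file
                break
            conn.sendall(chunk)
            sent += len(chunk)
    return sent
```

Only one chunk is held in memory at a time, so memory use stays flat regardless of file size.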
The server adheres to strict security protocols to prevent common web vulnerabilities (Requirement 7).
Mechanism: The server uses os.path.realpath() to convert the requested path (/../etc/passwd) and the server's document root (resources/) into their canonical, absolute forms.
Validation: It then checks that the resolved request path lies inside the absolute path of the resources directory, comparing whole path components with os.path.commonpath so that a sibling directory sharing the same name prefix is not accepted. If the request resolves to any file outside this root (e.g., via /../), the check fails.
Response: All such attempts are logged, and the server returns 403 Forbidden.
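A minimal sketch of this containment check, assuming a POSIX file system. `resolve_safe_path` is a hypothetical name; the use of `os.path.realpath` and `os.path.commonpath` follows the description in this document.

```python
import os


def resolve_safe_path(root_dir, request_path):
    """Return the absolute file path, or None if it escapes root_dir."""
    root = os.path.realpath(root_dir)
    # Strip the leading slash so join() does not discard the root.
    candidate = os.path.realpath(os.path.join(root, request_path.lstrip("/")))
    # commonpath() compares whole path components, so a sibling such as
    # "resources_backup" is not mistaken for a child of "resources".
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate
```

A `None` result would map to the 403 Forbidden response. Because `realpath` also resolves symbolic links, a link inside resources/ that points elsewhere is rejected as well.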
Mechanism: The server extracts the Host header immediately after parsing the initial request.
Validation: It compares the received Host value against a list of explicitly permitted hosts (localhost, 127.0.0.1, localhost:PORT, 127.0.0.1:PORT).
Missing Host: If the header is missing (mandatory for HTTP/1.1), the server responds with 400 Bad Request and closes the connection.
Mismatched Host: If the header is present but invalid, the server responds with 403 Forbidden and logs the violation.
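The Host rules above can be expressed as a small function. This is a sketch: `allowed_hosts` and `check_host` are hypothetical names, `headers` is assumed to be a dict with lower-cased header names, and the case-insensitive comparison is an assumption not stated in the text.

```python
def allowed_hosts(port):
    """Hosts the server accepts, per the list in the text."""
    return {"localhost", "127.0.0.1", f"localhost:{port}", f"127.0.0.1:{port}"}


def check_host(headers, port):
    """Return None if the Host header is acceptable, else an error status."""
    host = headers.get("host")
    if host is None:
        return 400    # Host is mandatory in HTTP/1.1
    if host.strip().lower() not in allowed_hosts(port):
        return 403    # present but not on the allow-list
    return None
```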
HTTP Version: Only supports the core features of HTTP/1.1 (Keep-Alive, Host header). It does not support features like compression (gzip), chunked transfer encoding, or pipelining.
Method Support: Only GET and POST methods are implemented. All others result in a 405 Method Not Allowed response.
Supported MIME Types: GET requests only support a limited set of file extensions (.html, .txt, .png, .jpg, .jpeg). Any other file type results in a 415 Unsupported Media Type error.
Error Handling: While robust for I/O and protocol errors, it does not include advanced signaling or resource management for high-load production environments.
The server imports Python modules for networking (socket), concurrency (threading, queue), file system access (os), JSON handling (json), timestamps (datetime), and logging (logging). Logging is configured to include timestamps and thread names for better traceability.
ROOT_DIR → Folder from which files are served (resources/)
MAX_THREADS → Default size of the thread pool (10, overridable with the max_threads command-line argument)
CONNECTION_QUEUE → Queue to manage incoming client connections
Generates the current date in the correct HTTP format (RFC 7231) for response headers.
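One way to produce such a date string with the `datetime` module is sketched below. `http_date` is a hypothetical name, not necessarily the helper used in server.py, and `%a`/`%b` are assumed to run under the default C locale so that day and month names come out in English.

```python
from datetime import datetime, timezone


def http_date(moment=None):
    """Format a timezone-aware datetime (default: now) as an HTTP-date."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    # HTTP dates are always expressed in GMT.
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

For example, `http_date(datetime(1994, 11, 6, 8, 49, 37, tzinfo=timezone.utc))` returns `"Sun, 06 Nov 1994 08:49:37 GMT"`.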
This is the core request handler that runs inside each worker thread. It performs the following major tasks:
a. Request Parsing
- Reads raw HTTP data from the client.
- Extracts method, path, and headers.
- Validates the Host header to prevent unauthorized access.
  - Returns 400 if missing.
  - Returns 403 if invalid.
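The parsing step can be sketched as below. `parse_request` is a hypothetical helper that assumes the full request head has already been read from the socket; the Host check itself would follow as a separate step.

```python
def parse_request(raw):
    """Parse a raw request head into (method, path, version, headers).

    Returns None for a malformed request, which the caller can answer
    with 400 Bad Request.
    """
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return None
    method, path, version = parts
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:                 # a header line without a colon
            return None
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers
```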
b. GET Request Handling
Handles requests for static and binary files:
- Maps request paths to files inside the resources/ directory.
- Protects against path traversal attacks using os.path.realpath and os.path.commonpath.
- Serves:
  - .html → text/html; charset=utf-8
  - .txt, .png, .jpg, .jpeg → application/octet-stream
- Sends files with Content-Disposition headers for downloads.
- Returns:
  - 404 if file not found
  - 415 for unsupported file types
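The extension-to-type mapping and the two error cases above could be combined as in this sketch. `CONTENT_TYPES` and `classify` are hypothetical names, and checking the extension before the file's existence is an assumption; the text does not say which check comes first.

```python
import os

# Extension to Content-Type mapping, as listed above.
CONTENT_TYPES = {
    ".html": "text/html; charset=utf-8",
    ".txt": "application/octet-stream",
    ".png": "application/octet-stream",
    ".jpg": "application/octet-stream",
    ".jpeg": "application/octet-stream",
}


def classify(path):
    """Return (status, content_type) for an already-validated file path."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in CONTENT_TYPES:
        return 415, None           # unsupported file type
    if not os.path.isfile(path):
        return 404, None           # supported type, but no such file
    return 200, CONTENT_TYPES[ext]
```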
c. POST Request Handling
- Only accepts application/json requests.
- Reads and validates the JSON body.
- Saves data in resources/uploads/ as upload_[timestamp]_[randomid].json.
- Returns a JSON response with status 201 Created.
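The POST flow can be sketched as follows. `save_upload` is a hypothetical name, and the exact timestamp format, random-id length, and response fields are assumptions; only the `upload_[timestamp]_[randomid].json` pattern and the status codes come from this document.

```python
import json
import os
import secrets
import time


def save_upload(body, content_type, upload_dir):
    """Validate a JSON body and store it; return (status, response_dict)."""
    if content_type.split(";")[0].strip().lower() != "application/json":
        return 415, {"error": "Content-Type must be application/json"}
    try:
        data = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400, {"error": "Invalid JSON"}
    os.makedirs(upload_dir, exist_ok=True)
    # Timestamp and random-id formats are assumptions for this sketch.
    stamp = time.strftime("%Y%m%d_%H%M%S")
    name = f"upload_{stamp}_{secrets.token_hex(4)}.json"
    with open(os.path.join(upload_dir, name), "w", encoding="utf-8") as f:
        json.dump(data, f)
    return 201, {"status": "success", "filepath": f"/uploads/{name}"}
```

The random suffix keeps two uploads that arrive within the same second from overwriting each other.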
d. Error Handling
- Handles invalid methods (405), malformed requests (400), and unsupported types (415), logging every event.
Each worker thread continuously listens for connections from the shared queue (CONNECTION_QUEUE):
- When a connection arrives, it calls handle_client().
- Marks the task as complete once handled.
The main server startup function:
- Creates and binds a TCP socket.
- Starts listening for incoming connections (queue size 50).
- Spawns worker threads (MAX_THREADS) to form the thread pool.
- Gracefully shuts down on Ctrl + C.
When new connections arrive:
- They are added to the queue if space is available.
- If the queue is full, the server responds with the status line HTTP/1.1 503 Service Unavailable and the header Retry-After: 5.
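A sketch of how the saturation response might be built and sent. `build_503` and `reject` are hypothetical names; the status line and `Retry-After: 5` come from the text, while the plain-text body and the remaining headers are assumptions.

```python
def build_503(retry_after=5):
    """Return the raw bytes of a 503 response with a Retry-After header."""
    body = b"Service Unavailable"
    head = (
        "HTTP/1.1 503 Service Unavailable\r\n"
        f"Retry-After: {retry_after}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def reject(conn):
    """Send the 503 and close the socket without queueing it."""
    try:
        conn.sendall(build_503())
    finally:
        conn.close()
```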
Comprehensive logging throughout:
- Server startup and configuration
- Connection assignments to threads
- File transfers and responses
- Queue saturation warnings
- Security violations (invalid Host, path traversal, etc.)
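A logging setup that includes timestamps and thread names, as described in this document, might be configured like this. The format string, the `LOG_FORMAT`/`DATE_FORMAT` constants, and the "server" logger name are assumptions chosen for illustration.

```python
import logging

# %(asctime)s and %(threadName)s are standard LogRecord attributes.
LOG_FORMAT = "[%(asctime)s] [%(threadName)s] %(levelname)s: %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def configure_logging(level=logging.INFO):
    """Configure root logging once and return a named logger."""
    logging.basicConfig(level=level, format=LOG_FORMAT, datefmt=DATE_FORMAT)
    return logging.getLogger("server")
```

With worker threads given descriptive names, each log line then shows which thread handled a given connection.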
- Host Header Validation → Prevents forged requests
- Path Traversal Protection → Ensures files are only served from resources/
- Error Codes → Prevents information leakage by returning generic HTTP errors
| Status Code | Description |
|---|---|
| 200 OK | File served successfully |
| 201 Created | JSON file created on POST |
| 400 Bad Request | Malformed request or missing Host header |
| 403 Forbidden | Path or Host violation |
| 404 Not Found | Missing file |
| 405 Method Not Allowed | Unsupported method |
| 415 Unsupported Media Type | Wrong Content-Type or file type |
| 503 Service Unavailable | Connection queue full (thread pool saturated) |
This implementation:
- Uses TCP sockets for communication
- Employs a multi-threaded architecture with a fixed-size thread pool
- Safely serves both HTML and binary content
- Handles POST JSON uploads
- Implements key HTTP protocol features, including Host validation, connection handling, and status responses
- Includes comprehensive logging and security protections