This document outlines the coding standards and best practices for Python development across all Bayat projects. Following these guidelines ensures code consistency, quality, and maintainability.
- Python Version
- Code Style
- Project Structure
- Dependencies Management
- Documentation
- Testing
- Error Handling
- Performance Considerations
- Security Best Practices
- Packaging and Deployment
- Environment Management
- IDE Configuration
- Use Python 3.9+ for all new projects
- Python 3.8 is the minimum supported version for existing projects
- Document Python version requirements in the project README and in `pyproject.toml` or `setup.py`
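Beyond documenting the requirement, a project can fail fast on an unsupported interpreter. The sketch below is optional and illustrative (the `MINIMUM_PYTHON` constant and message are not prescribed); keep the threshold in sync with the declared minimum version.

```python
import sys

# Fail fast on unsupported interpreters; keep this tuple in sync with the
# Python version requirement documented in pyproject.toml / README.
MINIMUM_PYTHON = (3, 9)

if sys.version_info < MINIMUM_PYTHON:
    raise RuntimeError(
        f"Python {MINIMUM_PYTHON[0]}.{MINIMUM_PYTHON[1]}+ is required, "
        f"but this is {sys.version_info.major}.{sys.version_info.minor}."
    )
```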
All Python code should follow PEP 8 with the following specifics:
- Use 4 spaces for indentation (no tabs)
- Maximum line length of 88 characters (consistent with Black formatter)
- Use Black for automated code formatting
- Use isort for import sorting with the Black-compatible configuration
- Packages: lowercase, short, no underscores (e.g., `utils`, `models`)
- Modules: lowercase with underscores (e.g., `data_processor.py`)
- Classes: CamelCase (e.g., `DataProcessor`)
- Functions/Methods: lowercase with underscores (e.g., `process_data()`)
- Variables: lowercase with underscores (e.g., `user_input`)
- Constants: uppercase with underscores (e.g., `MAX_CONNECTIONS`)
- Private attributes/methods: prefixed with underscore (e.g., `_internal_method()`)
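The sketch below shows these conventions together in one module; all names (`DataProcessor`, `_normalize_record`, etc.) are invented for illustration.

```python
"""Illustrative module applying the naming conventions above."""

MAX_CONNECTIONS = 10  # Constant: uppercase with underscores


class DataProcessor:  # Class: CamelCase
    """Processes raw records into cleaned output."""

    def __init__(self, source_name: str) -> None:
        self.source_name = source_name  # Variable: lowercase with underscores

    def process_data(self, records: list[dict]) -> list[dict]:
        """Public method: lowercase with underscores."""
        return [self._normalize_record(record) for record in records]

    def _normalize_record(self, record: dict) -> dict:
        """Private helper: prefixed with a single underscore."""
        return {key.lower(): value for key, value in record.items()}
```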
- Group imports in the following order:
- Standard library imports
- Related third-party imports
- Local application/library specific imports
- Use absolute imports for external modules and relative imports for internal modules
- Avoid wildcard imports (`from module import *`)
Example:

```python
# Standard library
import os
import sys
from datetime import datetime

# Third-party
import numpy as np
import pandas as pd
from sqlalchemy import Column, Integer

# Local
from .models import User
from ..utils import helpers
```
- Use type hints for all function parameters and return values
- Use the `typing` module for complex types
- For improved readability, consider using type aliases for complex types
Example:

```python
from typing import Dict, List, Optional, Tuple, Union

# Type alias
UserData = Dict[str, Union[str, int, List[str]]]


def process_user_data(user_id: int, data: UserData) -> Tuple[bool, Optional[str]]:
    """Process user data and return success status with optional error message."""
    # Implementation
    return True, None
```
Standardized project structure for different types of Python applications:
Library or package project:

```
my_package/
├── pyproject.toml          # Project metadata and build configuration
├── setup.py                # (Optional) Setup script if not using pyproject.toml
├── README.md               # Project documentation
├── LICENSE                 # License information
├── .gitignore              # Git ignore file
├── src/                    # Source code directory
│   └── my_package/         # Package directory
│       ├── __init__.py     # Package initialization
│       ├── module1.py      # Module file
│       └── subpackage/     # Subpackage directory
│           ├── __init__.py
│           └── module2.py
├── tests/                  # Test directory
│   ├── __init__.py
│   ├── test_module1.py
│   └── test_subpackage/
│       └── test_module2.py
└── docs/                   # Documentation directory
    └── index.md
```
Web application (Django) project:

```
my_webapp/
├── pyproject.toml          # Project metadata and build configuration
├── requirements.txt        # Dependencies
├── README.md               # Project documentation
├── .env.example            # Example environment variables
├── .gitignore              # Git ignore file
├── manage.py               # Django management script
├── app/                    # Application directory
│   ├── __init__.py
│   ├── settings.py         # Django settings
│   ├── urls.py             # URL configuration
│   └── wsgi.py             # WSGI configuration
├── apps/                   # Django applications
│   ├── core/               # Core application
│   │   ├── __init__.py
│   │   ├── models.py
│   │   ├── views.py
│   │   ├── urls.py
│   │   └── tests.py
│   └── users/              # Users application
│       ├── __init__.py
│       ├── models.py
│       ├── views.py
│       ├── urls.py
│       └── tests.py
├── static/                 # Static files
│   ├── css/
│   ├── js/
│   └── images/
├── templates/              # HTML templates
│   ├── base.html
│   └── pages/
└── docs/                   # Documentation directory
```
Data science project:

```
data_project/
├── pyproject.toml          # Project metadata and build configuration
├── requirements.txt        # Dependencies
├── README.md               # Project documentation
├── .gitignore              # Git ignore file
├── data/                   # Data directory (often gitignored)
│   ├── raw/                # Raw data
│   ├── processed/          # Processed data
│   └── external/           # External data sources
├── notebooks/              # Jupyter notebooks
│   ├── exploratory/        # Exploratory analysis
│   └── final/              # Final analysis
├── src/                    # Source code
│   ├── __init__.py
│   ├── data/               # Data processing scripts
│   ├── features/           # Feature engineering scripts
│   ├── models/             # Model definition and training
│   └── visualization/      # Visualization scripts
├── tests/                  # Tests
│   ├── __init__.py
│   └── test_data.py
├── models/                 # Saved models
└── reports/                # Generated reports
    └── figures/            # Generated figures
```
- Use `pip` with a `requirements.txt` file for simple projects
- Use `poetry` for complex projects with intricate dependency management
- Use `pyproject.toml` for project configuration when possible
- Pin exact versions of dependencies for applications (`requests==2.28.1`)
- Use version ranges for libraries (`requests>=2.27.0,<2.29.0`)
- Separate development dependencies from production dependencies
- Include a `requirements-dev.txt` for development dependencies if using pip
- Document all dependencies with purpose and usage in comments
Example `pyproject.toml` with Poetry:

```toml
[tool.poetry]
name = "my-project"
version = "0.1.0"
description = "My awesome Python project"
authors = ["Your Name <your.email@example.com>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.9"
requests = "^2.28.1"
pandas = "^1.5.0"
numpy = "^1.23.3"

[tool.poetry.group.dev.dependencies]
pytest = "^7.1.3"
black = "^22.8.0"
isort = "^5.10.1"
mypy = "^0.981"
flake8 = "^5.0.4"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.black]
line-length = 88

[tool.isort]
profile = "black"
line_length = 88

[tool.mypy]
python_version = "3.9"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
disallow_incomplete_defs = true
```
Use Google-style docstrings for all modules, classes, methods, and functions:
```python
from typing import Dict, List


def calculate_summary_statistics(
    data: List[float], include_outliers: bool = True
) -> Dict[str, float]:
    """Calculate summary statistics for a list of numeric values.

    This function computes basic statistical measures including mean,
    median, standard deviation, min, and max for the provided data.

    Args:
        data: A list of numeric values to analyze.
        include_outliers: Whether to include outlier values in calculations.
            If False, values outside 3 standard deviations are excluded.

    Returns:
        A dictionary containing the summary statistics with the following keys:

        - mean: arithmetic mean
        - median: median value
        - std_dev: standard deviation
        - min: minimum value
        - max: maximum value

    Raises:
        ValueError: If the input data is empty or contains non-numeric values.

    Examples:
        >>> calculate_summary_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
        {'mean': 3.0, 'median': 3.0, 'std_dev': 1.41, 'min': 1.0, 'max': 5.0}
    """
    # Implementation...
```
- Include a comprehensive README.md with:
- Project description and purpose
- Installation instructions
- Usage examples
- Configuration options
- Development setup
- Contribution guidelines
- Use MkDocs or Sphinx for generating documentation websites
- Include in-line comments for complex logic that isn't self-explanatory
- Use pytest as the primary testing framework
- Organize tests to mirror the structure of the application code
- Name test files with the prefix `test_` (e.g., `test_models.py`)
- Name test functions with the prefix `test_` (e.g., `test_user_authentication`)
- Aim for at least 80% code coverage for application code
- 90% coverage for critical paths and core functionality
- 100% coverage for utility functions and helpers
- Unit Tests: Test individual functions and methods in isolation
- Integration Tests: Test interactions between components
- Functional Tests: Test complete features from user perspective
- Parametrized Tests: Use pytest's parametrize for testing multiple inputs
Example (the sample addresses are illustrative placeholders):

```python
import pytest

from my_package.utils import validate_email


@pytest.mark.parametrize("email,expected", [
    ("user@example.com", True),
    ("invalid-email", False),
    ("user@example", False),
    ("user@example..com", False),
    ("@example.com", False),
])
def test_validate_email(email, expected):
    """Test that email validation correctly identifies valid and invalid emails."""
    assert validate_email(email) == expected


def test_validate_email_with_empty_input():
    """Test that email validation raises ValueError for empty input."""
    with pytest.raises(ValueError) as excinfo:
        validate_email("")
    assert "Email cannot be empty" in str(excinfo.value)
```
- Be explicit about exceptions; avoid bare `except:` clauses
- Use specific built-in exceptions when appropriate
- Create custom exceptions for application-specific errors
- Include meaningful error messages in exceptions
- Log exceptions with appropriate context and stack traces
```python
import logging
from typing import Any, Dict


class DatabaseConnectionError(Exception):
    """Raised when database connection fails."""
    pass


class UserNotFoundError(Exception):
    """Raised when requested user is not found."""
    pass


def get_user(user_id: int) -> Dict[str, Any]:
    """Retrieve user data from database.

    Args:
        user_id: The ID of the user to retrieve

    Returns:
        User data dictionary

    Raises:
        UserNotFoundError: If the user doesn't exist
        DatabaseConnectionError: If database connection fails
    """
    logger = logging.getLogger(__name__)
    try:
        # Database access logic; get_database_connection() is assumed to be
        # defined elsewhere. The query is parameterized, per the security
        # guidelines below.
        connection = get_database_connection()
        user = connection.query("SELECT * FROM users WHERE id = %s", (user_id,))
        if not user:
            logger.warning(f"User with ID {user_id} not found")
            raise UserNotFoundError(f"User with ID {user_id} does not exist")
        return user
    except ConnectionError as e:
        logger.error(f"Database connection failed: {str(e)}")
        raise DatabaseConnectionError(f"Failed to connect to database: {str(e)}") from e
```
- Optimize for readability first, then performance when necessary
- Use appropriate data structures for the task (e.g., sets for membership testing)
- Prefer list comprehensions over loops for simple transformations
- Use generators for processing large datasets
- Take advantage of built-in functions and standard library
- Be aware of memory usage for large data structures
- Use generators and iterators for processing large datasets
- Consider chunking large files during processing (see the sketch below)
- Avoid creating unnecessary copies of large data
- Use the `time` module for basic timing
- Use `cProfile` for more detailed performance profiling
- Use `memory_profiler` to monitor memory usage
- Document performance benchmarks for critical operations
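As a hedged sketch of the chunking and timing advice above (the file name and chunk size are arbitrary), a generator can stream a large file without loading it into memory, and `time.perf_counter` provides a quick baseline measurement:

```python
import time
from typing import Iterator


def read_in_chunks(path: str, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a large file in fixed-size chunks instead of reading it all at once."""
    with open(path, "rb") as file:
        while chunk := file.read(chunk_size):
            yield chunk


def count_bytes(path: str) -> int:
    """Consume the generator lazily; memory use stays bounded by chunk_size."""
    return sum(len(chunk) for chunk in read_in_chunks(path))


if __name__ == "__main__":
    start = time.perf_counter()
    total = count_bytes("large_input.bin")  # hypothetical input file
    elapsed = time.perf_counter() - start
    print(f"Processed {total} bytes in {elapsed:.3f}s")
```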
- Never store sensitive information (passwords, API keys) in code
- Use environment variables or secure vaults for sensitive configuration
- Sanitize all user input before processing
- Use parameterized queries for database operations
- Regularly scan dependencies for security vulnerabilities
- Use tools like `safety` or `bandit` in CI/CD pipelines
- Keep dependencies updated to latest secure versions
- Use established libraries for authentication (e.g., `authlib`, `python-jose`)
- Implement proper password hashing with `bcrypt` or `argon2` (a bcrypt sketch follows the examples below)
- Use secure cookies with proper flags (httponly, secure, samesite)
- Implement proper session management
```python
# BAD: SQL injection vulnerability
query = f"SELECT * FROM users WHERE username = '{username}'"

# GOOD: Parameterized query
query = "SELECT * FROM users WHERE username = %s"
cursor.execute(query, (username,))

# BAD: Command injection vulnerability
os.system(f"convert {filename} output.png")

# GOOD: Use subprocess with safe arguments
subprocess.run(["convert", filename, "output.png"], check=True)
```
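A minimal password-hashing sketch with the `bcrypt` library; the function names and sample passwords are illustrative, not a prescribed API:

```python
import bcrypt


def hash_password(plain_password: str) -> bytes:
    """Hash a password with a per-password salt; store the returned value."""
    return bcrypt.hashpw(plain_password.encode("utf-8"), bcrypt.gensalt())


def verify_password(plain_password: str, hashed: bytes) -> bool:
    """Check a login attempt against the stored hash."""
    return bcrypt.checkpw(plain_password.encode("utf-8"), hashed)


if __name__ == "__main__":
    stored = hash_password("correct horse battery staple")
    assert verify_password("correct horse battery staple", stored)
    assert not verify_password("wrong password", stored)
```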
- Follow the src-layout pattern for packages
- Include necessary package metadata in `pyproject.toml` or `setup.py`
- Create proper `__init__.py` files with appropriate imports
- Define `__all__` in `__init__.py` to control exported symbols (see the sketch below)
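A minimal `__init__.py` sketch, assuming hypothetical internal modules `loader` and `processor` inside the package:

```python
"""Public API for my_package; only names listed in __all__ are exported via *."""

from .loader import load_data
from .processor import DataProcessor

__all__ = ["DataProcessor", "load_data"]
```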
- Use Poetry or setuptools for building packages
- Generate wheel packages for distribution
- Publish to PyPI or private package repository for shared libraries
- Document installation and usage instructions
- Use multi-stage builds for smaller images
- Create appropriate Dockerfiles for different environments
- Use non-root users in containers
- Pin specific base image versions
Example Dockerfile:

```dockerfile
FROM python:3.9-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /app/wheels -r requirements.txt

# Final stage
FROM python:3.9-slim

WORKDIR /app

# Create non-root user
RUN adduser --disabled-password --gecos "" appuser

# Copy wheels from builder stage
COPY --from=builder /app/wheels /app/wheels

# Install dependencies
RUN pip install --no-cache-dir /app/wheels/*

# Copy application code
COPY src/ /app/src/

# Switch to non-root user
USER appuser

# Run application
CMD ["python", "-m", "src.main"]
```
- Always use virtual environments for Python projects
- Use `venv` for simple projects or `poetry` for complex ones
- Document environment setup instructions in README
- Use environment variables for configuration
- Use `.env` files for local development (never commit to version control)
- Provide a `.env.example` file with required variables (but no sensitive values)
- Use a library like `python-dotenv` or `pydantic` for environment variable loading (a dotenv sketch follows this list)
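A minimal `python-dotenv` sketch; the `DATABASE_URL` and `DEBUG` variables and their defaults are illustrative:

```python
import os

from dotenv import load_dotenv

# Load variables from a local .env file into the process environment.
# Existing environment variables are not overridden by default.
load_dotenv()

DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///local.db")
DEBUG = os.getenv("DEBUG", "false").lower() == "true"
```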
- Keep configuration separate from code
- Use a hierarchical approach for configuration (defaults, env vars, config files)
- Validate configuration at startup
- Use strong typing for configuration objects
Example with Pydantic (v1-style `BaseSettings`; in Pydantic v2 this class lives in the separate `pydantic-settings` package):

```python
from pydantic import BaseSettings, Field, PostgresDsn, validator


class Settings(BaseSettings):
    APP_NAME: str = "MyApp"
    DEBUG: bool = False
    DATABASE_URL: PostgresDsn
    API_KEY: str
    MAX_CONNECTIONS: int = 10
    CACHE_TTL: int = Field(default=300, gt=0)

    @validator("API_KEY")
    def api_key_must_be_valid(cls, v):
        if len(v) < 32:
            raise ValueError("API key must be at least 32 characters")
        return v

    class Config:
        env_file = ".env"
        env_file_encoding = "utf-8"


# Use settings
settings = Settings()
```
Recommended extensions:
- Python (Microsoft)
- Pylance
- Python Docstring Generator
- Jupyter
- Python Test Explorer
Workspace settings (`settings.json`):
```json
{
    "python.formatting.provider": "black",
    "python.formatting.blackArgs": ["--line-length", "88"],
    "python.linting.enabled": true,
    "python.linting.pylintEnabled": false,
    "python.linting.flake8Enabled": true,
    "python.linting.flake8Args": ["--max-line-length", "88"],
    "python.linting.mypyEnabled": true,
    "python.sortImports.args": ["--profile", "black"],
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
        "source.organizeImports": true
    },
    "python.testing.pytestEnabled": true,
    "python.testing.unittestEnabled": false,
    "python.testing.nosetestsEnabled": false
}
```
Recommended plugins:
- Python Security
- Requirements
- Mypy
- Black
Configuration:
- Enable Black as external tool for formatting
- Configure isort as external tool for import sorting
- Set up mypy for type checking
- Configure pytest as the default test runner
| Version | Date       | Description     |
|---------|------------|-----------------|
| 1.0     | 2025-03-20 | Initial version |