Skip to content

Latest commit

 

History

History
329 lines (245 loc) · 10.1 KB

code-generation.md

File metadata and controls

329 lines (245 loc) · 10.1 KB

Code Generation Standards

This document outlines Bayat's standards and best practices for using AI-assisted code generation tools in the development process.

Table of Contents

Introduction to Code Generation

Purpose and Benefits

  • Accelerated Development: AI-assisted code generation can speed up routine coding tasks
  • Reduced Boilerplate: Automating repetitive code patterns
  • Knowledge Augmentation: Accessing patterns and solutions beyond individual knowledge
  • Learning Tool: Discovering new approaches and techniques

Scope of Use

  • Appropriate Use Cases:

    • Generating routine code (CRUD operations, standard patterns)
    • Creating test cases
    • Converting between similar formats or languages
    • Documentation generation
    • Boilerplate reduction
  • Inappropriate Use Cases:

    • Security-critical components
    • Complex business logic without review
    • Performance-critical code without optimization
    • Replacing understanding of fundamental concepts

Supported Tools

Primary Tools

  1. GitHub Copilot:

    • Available for VS Code, Visual Studio, JetBrains IDEs, and Neovim
    • Business license required for commercial projects
    • Configuration standards:
      • Enable only on approved repositories
      • Configure .copilotignore for sensitive code areas
  2. Amazon CodeWhisperer:

    • Available for AWS Cloud9, VS Code, and JetBrains IDEs
    • Usage tracking requirements
    • Configuration standards documented in internal tools wiki
  3. Internal Tools:

    • Custom code generators for project-specific patterns
    • Template-based generation tools
    • Documentation found on internal developer portal

Configuration Management

  • Store tool configurations in version control
  • Document settings and configurations in project README files
  • Standardize settings across team members
  • Review settings as part of security audits

Usage Guidelines

General Principles

  • Understand Before Accepting: Always understand generated code before accepting it
  • Verify Correctness: Test and verify all generated code
  • Incremental Use: Generate small, focused sections rather than large blocks
  • Guide with Comments: Use comments to guide generation in the right direction
  • Review with Skepticism: Be particularly skeptical of complex logic or security-related code

Prompting Techniques

  • Be Specific: Provide clear, detailed requirements in comments
  • Context Matters: Ensure relevant context is visible to the generation tool
  • Progressive Refinement: Start with a high-level description and refine
  • Style Guidance: Include style examples in visible code or comments
  • Function Signatures: Define input/output types before generating implementation

Documentation Requirements

  • Comment when code is AI-generated but substantially modified
  • Include reference links for generated code based on specific sources
  • Document assumptions or limitations in generated code
  • Follow standard documentation guidelines

Quality Control

Code Review Process

  • Additional Scrutiny: Apply extra scrutiny during review of generated code
  • Generated Code Indicators: Consider using comments to mark generated sections
  • Specialized Checklists: Use AI-specific review checklists
  • Understanding Verification: Reviewers should verify the author understands the generated code

Testing Requirements

  • Higher test coverage requirements for generated code (target 90%+)
  • Include edge case tests for generated algorithms
  • Verify performance characteristics match requirements
  • Test error handling explicitly

Refactoring Guidelines

  • Refactor generated code to follow team conventions
  • Break down complex generated functions into smaller, testable units
  • Eliminate redundant or unnecessary code
  • Apply consistent naming conventions

Security Considerations

Data Privacy

  • Never share sensitive or proprietary code as context for generation
  • Be aware of context window limitations and what might be inadvertently included
  • Disable telemetry when working with sensitive codebases
  • Use private instances for highly confidential projects

Common Vulnerabilities

  • Special Review Areas:

    • Authentication and authorization code
    • Input validation and sanitization
    • SQL or database query generation
    • File system operations
    • Network request handling
    • Encryption and key management
  • Mitigation Strategies:

    • Run static analysis tools on generated code
    • Perform security-focused code reviews
    • Apply standard security testing procedures

Dependency Management

  • Verify licenses of suggested dependencies
  • Check security vulnerabilities in suggested packages
  • Follow Dependency Management Standards standards
  • Prefer suggesting established, well-maintained libraries

Training and Onboarding

Developer Training

  • Required training modules for AI code generation tools
  • Understanding limitations and best practices
  • Security awareness for code generation
  • How to effectively prompt for better results

Team Knowledge Sharing

  • Share effective prompts and patterns
  • Document project-specific generation techniques
  • Regular review of generation effectiveness
  • Collect examples of successful and problematic generations

Ethical and Legal Considerations

Intellectual Property

  • Understand the training data sources for your tools
  • Be aware of license implications for generated code
  • Document generation usage for compliance purposes
  • Follow company policies on attribution and licensing

Code Attribution

  • When to attribute generated code:
    • Direct use of generated code from documented sources
    • Generated solutions to standard algorithms
    • Generated implementations of patented techniques
  • When attribution is not needed:
    • Standard language syntax and patterns
    • Common, non-unique implementations

Bias and Fairness

  • Be aware of potential biases in generated code
  • Review for inclusive language and approaches
  • Challenge generated assumptions and defaults
  • Report problematic generations to tool providers

Measuring Effectiveness

Productivity Metrics

  • Time saved vs. manual implementation
  • Reduction in boilerplate code
  • Decrease in time spent on routine tasks
  • Impact on overall delivery timelines

Quality Metrics

  • Bug rates in generated vs. manual code
  • Technical debt introduced
  • Maintainability index comparisons
  • Security issue discovery rate

Adoption Measurement

  • Tool usage across teams
  • Developer satisfaction surveys
  • Improvement in adoption over time
  • Training effectiveness

Trend Analysis

  • Measuring quality impact
  • Trend analysis

Best Practices by Language

JavaScript/TypeScript

  • Guide type generation with explicit type definitions

  • Use JSDoc comments to improve generation quality

  • Be explicit about framework being used (React, Angular, etc.)

  • Include imports to guide library usage

  • Examples:

    // Generate a React component that displays a sortable table of users
    // with columns for name, email, and role. Include pagination.
    // The component should handle loading and error states.
    type User = {
      id: string;
      name: string;
      email: string;
      role: 'admin' | 'user' | 'guest';
    };

Python

  • Use type hints to guide generation

  • Include docstrings for function signatures

  • Be explicit about framework context (Django, Flask, etc.)

  • Reference style guide (PEP 8) in comments

  • Examples:

    # Generate a function that processes a CSV file,
    # extracts specific columns, and returns a pandas DataFrame
    # with proper error handling for missing files or malformed data
    def process_csv_file(file_path: str, columns: List[str]) -> pd.DataFrame:
        """
        Process a CSV file and return specified columns as a DataFrame.
        
        Args:
            file_path: Path to the CSV file
            columns: List of column names to extract
            
        Returns:
            DataFrame containing the specified columns
            
        Raises:
            FileNotFoundError: If the file doesn't exist
            ValueError: If the CSV is malformed or specified columns don't exist
        """

Java

  • Include package and import statements

  • Use interface definitions to guide implementations

  • Be explicit about framework being used (Spring, etc.)

  • Reference standard patterns in comments

  • Examples:

    // Generate a Spring Data JPA repository interface
    // for a User entity with custom methods to find users
    // by email and role with pagination support
    package com.bayat.user.repository;
    
    import com.bayat.user.model.User;
    import org.springframework.data.jpa.repository.JpaRepository;
    
    public interface UserRepository extends JpaRepository<User, Long> {
        // Custom methods will be generated here
    }

Code Generation in CI/CD

Automated Generation

  • Use case scenarios for automated generation
  • Integration with build processes
  • Validation requirements for generated artifacts
  • Handling generation failures

Pre-commit Hooks

  • Implementing automated checks for generated code
  • Security and quality validations
  • Standards enforcement
  • Documentation requirements

Metrics Collection

  • Tracking generation usage in CI/CD
  • Reporting on effectiveness
  • Measuring quality impact
  • Trend analysis

Version History

Version Date Description
1.0 2025-03-20 Initial version