Create Index.json Metadata System for CSV registries. #156

@undead2146

Description

Implement an index.json metadata system that provides dynamic discovery and integrity verification for the CSV registries.

Requirements

• Create index.json with metadata about available CSVs
• Include file counts, sizes, checksums, supported languages
• Enable CSVDiscoverer to load registries dynamically
• Support versioning and cache invalidation
• Follow JSON schema for metadata structure
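
One possible reading of the versioning and cache invalidation requirement is sketched below. The rule it uses (a cached index is stale when the version string differs or the remote lastUpdated is newer) is an assumption, and IndexCachePolicy and IsStale are hypothetical names, not existing GenHub APIs.

```csharp
using System;

public static class IndexCachePolicy
{
    // Assumed rule: the cached index is stale when the remote version string
    // differs, or when the remote lastUpdated timestamp is newer than the cached one.
    public static bool IsStale(
        string cachedVersion, DateTimeOffset cachedLastUpdated,
        string remoteVersion, DateTimeOffset remoteLastUpdated)
    {
        return !string.Equals(cachedVersion, remoteVersion, StringComparison.Ordinal)
            || remoteLastUpdated > cachedLastUpdated;
    }
}
```

The same comparison could be applied per registry entry, using each entry's version and generatedAt fields, if finer-grained invalidation is wanted.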

Metadata Validation

Validation Process:

  • Validate JSON schema (required fields present)
  • Verify fileCount matches the number of entries actually present in the CSV file
  • Check that checksum.sha256 matches the SHA-256 hash computed locally from the CSV
  • Ensure URLs are properly formatted
  • Validate gameType values ("Generals" or "ZeroHour")
  • Verify language codes are supported (EN, DE, FR, etc.)
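
The structural checks in this list could be sketched roughly as follows, using System.Text.Json. RegistryEntryValidator is a hypothetical name, the set of required fields and the list of supported languages are assumptions taken from the example in this issue, and requiring https is one assumed interpretation of "properly formatted". The fileCount and checksum checks are left out because they need the CSV file itself.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class RegistryEntryValidator
{
    // Assumed value sets, copied from the values shown in this issue.
    private static readonly HashSet<string> GameTypes = new() { "Generals", "ZeroHour" };
    private static readonly HashSet<string> Languages = new(StringComparer.OrdinalIgnoreCase)
    {
        "All", "EN", "DE", "FR", "ES", "IT", "KO", "PL", "PT-BR", "ZH-CN", "ZH-TW"
    };

    // Assumed set of required fields for one registry entry.
    private static readonly string[] RequiredFields =
        { "id", "gameType", "version", "url", "fileCount", "languages", "checksum" };

    // Returns a list of problems; an empty list means the entry passed these checks.
    public static List<string> Validate(JsonElement entry)
    {
        var errors = new List<string>();
        if (entry.ValueKind != JsonValueKind.Object)
        {
            errors.Add("entry is not a JSON object");
            return errors;
        }

        foreach (var field in RequiredFields)
            if (!entry.TryGetProperty(field, out _))
                errors.Add($"missing required field '{field}'");

        if (entry.TryGetProperty("gameType", out var gameType) &&
            (gameType.ValueKind != JsonValueKind.String || !GameTypes.Contains(gameType.GetString()!)))
            errors.Add("gameType must be \"Generals\" or \"ZeroHour\"");

        // Assumption: "properly formatted" means an absolute https URL.
        if (entry.TryGetProperty("url", out var url) &&
            (url.ValueKind != JsonValueKind.String ||
             !Uri.TryCreate(url.GetString(), UriKind.Absolute, out var parsed) ||
             parsed.Scheme != Uri.UriSchemeHttps))
            errors.Add("url must be an absolute https URL");

        if (entry.TryGetProperty("languages", out var langs) && langs.ValueKind == JsonValueKind.Array)
            foreach (var lang in langs.EnumerateArray())
                if (lang.ValueKind != JsonValueKind.String || !Languages.Contains(lang.GetString()!))
                    errors.Add($"unsupported language code '{lang}'");

        return errors;
    }
}
```

Collecting every problem into a list, instead of throwing on the first one, lets a generator or CI job report all issues in an index.json in a single pass.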

Acceptance Criteria

  • index.json created with complete metadata (primary discovery source)
  • CSVDiscoverer can load and parse index.json
  • Checksums (MD5 + SHA256) for integrity verification
  • Version information for cache invalidation
  • Language lists for each registry
  • File counts and sizes for performance estimation

Technical Details

  • Location: docs/GameInstallationFilesRegistry/index.json
  • Schema: JSON with registries array and metadata
  • Usage: CSVDiscoverer loads index to discover available CSVs

JSON Schema Example

{
  "version": "1.0.0",
  "lastUpdated": "2025-09-17T10:30:00Z",
  "description": "Index of CSV registries for Command & Conquer Generals and Zero Hour validation",
  "registries": [
    {
      "id": "generals-1.08",
      "gameType": "Generals",
      "version": "1.08",
      "url": "https://raw.githubusercontent.com/Community-Outpost/GenHub/main/docs/GameInstallationFilesRegistry/Generals-1.08.csv",
      "fileCount": 45230,
      "totalSizeBytes": 45678901,
      "languages": ["All", "EN", "DE", "FR", "ES", "IT", "KO", "PL", "PT-BR", "ZH-CN", "ZH-TW"],
      "checksum": {
        "md5": "a1b2c3d4e5f678901234567890123456",
        "sha256": "abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890"
      },
      "generatedAt": "2025-09-17T09:15:00Z",
      "generatorVersion": "1.0.0",
      "isActive": true
    },
    {
      "id": "zerohour-1.04",
      "gameType": "ZeroHour",
      "version": "1.04",
      "url": "https://raw.githubusercontent.com/Community-Outpost/GenHub/main/docs/GameInstallationFilesRegistry/ZeroHour-1.04.csv",
      "fileCount": 38750,
      "totalSizeBytes": 38901234,
      "languages": ["All", "EN", "DE", "FR", "ES", "IT", "KO", "PL", "PT-BR", "ZH-CN", "ZH-TW"],
      "checksum": {
        "md5": "f6e5d4c3b2a198765432109876543210",
        "sha256": "fedcba0987654321fedcba0987654321fedcba0987654321fedcba0987654321"
      },
      "generatedAt": "2025-09-17T09:20:00Z",
      "generatorVersion": "1.0.0",
      "isActive": true
    }
  ]
}
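
As a rough illustration of how the example above might be loaded, the sketch below maps it onto C# records with System.Text.Json. The type names (RegistryIndex, RegistryEntry, RegistryChecksum, RegistryIndexParser) are hypothetical and not taken from the GenHub code base; only the JSON field names come from the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical model types mirroring the example index.json.
public sealed record RegistryChecksum(string Md5, string Sha256);

public sealed record RegistryEntry(
    string Id, string GameType, string Version, string Url,
    int FileCount, long TotalSizeBytes, List<string> Languages,
    RegistryChecksum Checksum, DateTimeOffset GeneratedAt,
    string GeneratorVersion, bool IsActive);

public sealed record RegistryIndex(
    string Version, DateTimeOffset LastUpdated, string Description,
    List<RegistryEntry> Registries);

public static class RegistryIndexParser
{
    // Case-insensitive matching lets camelCase JSON names bind to PascalCase properties.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true
    };

    // Deserializes the index and fails loudly if the document is the JSON literal null.
    public static RegistryIndex Parse(string json) =>
        JsonSerializer.Deserialize<RegistryIndex>(json, Options)
        ?? throw new JsonException("index.json was empty or null");
}
```

Deserialization alone does not enforce required fields, so a separate validation pass would still be needed before the index is trusted.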

Key Implementation Points

Checksum Generation:

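using System;
using System.IO;
using System.Security.Cryptography;

// Compute both checksums once the CSV has been written to disk.
string md5Hex;
string sha256Hex;
using (var md5 = MD5.Create())
using (var sha256 = SHA256.Create())
using (var stream = File.OpenRead(csvFilePath))
{
    var md5Hash = md5.ComputeHash(stream);
    stream.Position = 0; // rewind so SHA-256 also hashes the whole file
    var sha256Hash = sha256.ComputeHash(stream);

    // Lowercase hex without separators, matching the format used in index.json.
    md5Hex = BitConverter.ToString(md5Hash).Replace("-", "").ToLowerInvariant();
    sha256Hex = BitConverter.ToString(sha256Hash).Replace("-", "").ToLowerInvariant();
}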
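
On the consuming side, the stored checksum would be compared against a locally computed one. The following is a minimal sketch of that comparison, assuming .NET 5 or later for Convert.ToHexString; ChecksumVerifier and MatchesSha256 are hypothetical names.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChecksumVerifier
{
    // Returns true when the SHA-256 digest of the stream, read from its current
    // position to the end, equals the expected hex string (compared case-insensitively).
    public static bool MatchesSha256(Stream content, string expectedHex)
    {
        using var sha256 = SHA256.Create();
        string actualHex = Convert.ToHexString(sha256.ComputeHash(content));
        return string.Equals(actualHex, expectedHex, StringComparison.OrdinalIgnoreCase);
    }
}
```

A caller might open the downloaded CSV with File.OpenRead and pass the registry entry's checksum.sha256 value as expectedHex, refusing to process the file when the method returns false.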

CSVDiscoverer Integration (#139):
This metadata will be parsed by CSVDiscoverer to decide which CSV to fetch. The CSVDiscoverer will:

  • Load index.json from the repository
  • Filter registries by game type and language
  • Select the appropriate CSV URL for downloading
  • Use checksums to validate CSV integrity before processing
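
The filter and select steps above could look roughly like the sketch below. RegistryCandidate and RegistrySelector are hypothetical names, and skipping entries whose isActive flag is false is an assumption about how that field is meant to be used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down entry type holding only the fields used for selection.
public sealed record RegistryCandidate(
    string GameType, string Version, string Url,
    IReadOnlyList<string> Languages, bool IsActive);

public static class RegistrySelector
{
    // Returns the URL of the first active registry that matches the game type and
    // lists the requested language, or null when nothing matches.
    public static string? SelectUrl(
        IEnumerable<RegistryCandidate> registries, string gameType, string language)
    {
        return registries
            .Where(r => r.IsActive) // assumption: inactive registries are ignored
            .Where(r => string.Equals(r.GameType, gameType, StringComparison.Ordinal))
            .Where(r => r.Languages.Any(l =>
                string.Equals(l, language, StringComparison.OrdinalIgnoreCase)))
            .Select(r => r.Url)
            .FirstOrDefault();
    }
}
```

If several versions of the same game are listed, a real implementation would also need a rule for choosing between them; this sketch simply takes the first match in index order.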

Implementation Steps

  1. Define JSON schema for index.json
  2. Create index generator utility
  3. Integrate with CSV generation process
  4. Update CSVDiscoverer to use index.json
  5. Add metadata validation

Cross-Cutting Sub-Issues (EPIC #108)
