Overview
This proposal outlines several GraphQL-specific performance optimizations that could significantly improve the efficiency of @octokit/graphql. These optimizations are standard patterns in modern GraphQL clients (Apollo, URQL, Relay) and would benefit applications that make repeated or similar queries.
Motivation
Currently, @octokit/graphql executes every query as a fresh network request with no caching layer. This means:
- Identical queries are re-executed unnecessarily
- Query parsing and validation happen on every request
- Large query strings are sent over the network repeatedly
- Error responses aren't cached, leading to redundant failed requests
- No mechanism to reduce bandwidth for frequently-used queries
Proposed Optimizations
1. Query Result Caching (HIGH IMPACT)
Problem: Identical queries with the same variables fetch data multiple times.
Solution: Implement a configurable cache layer that memoizes query results based on query string + variables.
```javascript
const graphqlWithCache = graphql.defaults({
  cache: {
    enabled: true,
    ttl: 60000, // 60 seconds default
    maxSize: 100, // maximum cache entries
    keyStrategy: "query+variables", // cache key generation
  },
});
```
```javascript
// First call - hits the API
const result1 = await graphqlWithCache(
  `query ($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      stargazerCount
    }
  }`,
  { owner: "octokit", repo: "graphql.js" }
);

// Second call within the TTL - returns the cached result
const result2 = await graphqlWithCache(
  `query ($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      stargazerCount
    }
  }`,
  { owner: "octokit", repo: "graphql.js" }
);
```
Benefits:
- Reduces API rate limit consumption
- Faster response times for repeated queries
- Lower network bandwidth usage
- Configurable per-query or globally
Implementation Considerations:
- Use LRU (Least Recently Used) cache eviction
- Support cache invalidation by query pattern
- Respect HTTP cache headers from GitHub API
- Allow manual cache clearing
- Thread-safe for concurrent requests
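The LRU-with-TTL behavior described above could be sketched with a plain `Map`, relying on its insertion order for recency tracking. Everything here is hypothetical: the `QueryResultCache` name and its methods are illustrative, not part of @octokit/graphql.

```javascript
// Minimal sketch of the proposed result cache: entries are keyed by
// query string + serialized variables, expire after `ttl` milliseconds,
// and the least recently used entry is evicted once `maxSize` is exceeded.
class QueryResultCache {
  constructor({ ttl = 60000, maxSize = 100 } = {}) {
    this.ttl = ttl;
    this.maxSize = maxSize;
    this.entries = new Map(); // Map preserves insertion order -> LRU order
  }

  key(query, variables) {
    return query + "\u0000" + JSON.stringify(variables ?? {});
  }

  get(query, variables) {
    const k = this.key(query, variables);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (Date.now() - entry.at > this.ttl) {
      this.entries.delete(k); // expired
      return undefined;
    }
    // Refresh recency by re-inserting the entry at the end of the Map
    this.entries.delete(k);
    this.entries.set(k, entry);
    return entry.value;
  }

  set(query, variables, value) {
    const k = this.key(query, variables);
    this.entries.delete(k);
    this.entries.set(k, { value, at: Date.now() });
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry (first key in insertion order)
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}
```

A production version would also need invalidation hooks and respect for HTTP cache headers, but the key/TTL/eviction mechanics are the core of it.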
2. Persistent Queries Support (MEDIUM-HIGH IMPACT)
Problem: Large GraphQL queries consume bandwidth, especially for complex queries sent repeatedly.
Solution: Implement Automatic Persisted Queries (APQ) where queries are hashed and only the hash is sent after the first request.
```javascript
const graphqlWithAPQ = graphql.defaults({
  persistedQueries: {
    enabled: true,
    hashAlgorithm: "sha256",
    useGETForHashedQueries: true,
  },
});

// First request: sends the full query + hash
// Subsequent requests: send only the hash (saves bandwidth)
const result = await graphqlWithAPQ(LARGE_QUERY, variables);
```
Benefits:
- Reduces payload size by ~80-95% for large queries
- Enables GET requests for cached queries (better CDN caching)
- Improves performance on slow networks
- Standard GraphQL pattern (Apollo spec)
Implementation Notes:
- Generate SHA-256 hash of query string
- Store query-to-hash mapping client-side
- Fallback to full query if server doesn't recognize hash
- Only effective if the GitHub GraphQL API supports APQ
3. Query Parsing/Validation Cache (MEDIUM IMPACT)
Problem: Query parsing and validation overhead for frequently-used queries.
Solution: Cache parsed query AST and validation results.
```javascript
// Internal implementation sketch. A bare Map would grow without bound,
// so cap its size and evict the oldest entry (simple FIFO stand-in for LRU).
const MAX_PARSE_CACHE_SIZE = 100;
const queryParseCache = new Map();

function getOrParseQuery(queryString) {
  if (queryParseCache.has(queryString)) {
    return queryParseCache.get(queryString);
  }
  const parsed = parseQuery(queryString);
  if (queryParseCache.size >= MAX_PARSE_CACHE_SIZE) {
    queryParseCache.delete(queryParseCache.keys().next().value);
  }
  queryParseCache.set(queryString, parsed);
  return parsed;
}
```
Benefits:
- Eliminates redundant parsing overhead
- Faster query preparation
- Lower CPU usage for repeated queries
- Memory-efficient with LRU eviction
4. Variable Serialization Cache (LOW-MEDIUM IMPACT)
Problem: JSON serialization of variables happens on every request, even for identical variable objects.
Solution: Cache serialized variables by reference or content hash.
```javascript
// Keyed by object reference: WeakMap entries are garbage-collected together
// with the variables object, so no explicit eviction is needed. Note that
// WeakMap keys must be objects, so primitive/undefined variables bypass it.
const variableCache = new WeakMap();

function serializeVariables(variables) {
  if (variableCache.has(variables)) {
    return variableCache.get(variables);
  }
  const serialized = JSON.stringify(variables);
  variableCache.set(variables, serialized);
  return serialized;
}
```
Benefits:
- Reduces serialization overhead for complex variable objects
- WeakMap allows garbage collection
- Minimal memory footprint
5. Error Response Caching (LOW-MEDIUM IMPACT)
Problem: Failed queries (e.g., permission errors, not found) are retried unnecessarily.
Solution: Cache error responses for a short TTL to prevent retry storms.
```javascript
const graphqlWithErrorCache = graphql.defaults({
  errorCache: {
    enabled: true,
    ttl: 5000, // 5 seconds for errors
    statusCodes: [403, 404], // only cache these errors
  },
});
```
Benefits:
- Prevents redundant failed requests
- Protects against accidental retry loops
- Reduces rate limit consumption for known failures
- Improves error handling UX
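A minimal sketch of such a negative cache, assuming the configuration shape proposed above (the `makeErrorCache` name and its methods are hypothetical):

```javascript
// Short-TTL negative cache: only the configured status codes are remembered,
// and entries expire quickly so transient failures are retried soon.
function makeErrorCache({ ttl = 5000, statusCodes = [403, 404] } = {}) {
  const entries = new Map();
  return {
    remember(key, status, error) {
      if (statusCodes.includes(status)) {
        entries.set(key, { error, expires: Date.now() + ttl });
      }
    },
    recall(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (Date.now() > entry.expires) {
        entries.delete(key); // expired, allow a fresh attempt
        return undefined;
      }
      return entry.error;
    },
  };
}
```

Filtering by status code matters here: 5xx responses are often transient and should not be negatively cached, while 403/404 tend to stay wrong for at least a few seconds.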
6. Smart Cache Invalidation (BONUS)
Problem: Knowing when to invalidate cached results.
Solution: Provide hooks for cache invalidation based on mutations or time.
```javascript
// Invalidate cached entries after a mutation
await graphql.mutate(CREATE_ISSUE_MUTATION, variables);
graphql.cache.invalidate({ pattern: "repository.*issues" });

// Or schedule time-based invalidation
graphql.cache.invalidateAfter(60000); // 60 s
```
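Pattern-based invalidation could be as simple as matching cache keys against a regular expression. This sketch assumes cache keys encode the dotted field paths a query touched, which is a hypothetical convention rather than an existing API:

```javascript
// Delete every cache entry whose key matches the given pattern and
// return how many entries were removed.
function invalidate(cacheEntries, pattern) {
  const re = new RegExp(pattern);
  let removed = 0;
  for (const key of [...cacheEntries.keys()]) {
    if (re.test(key)) {
      cacheEntries.delete(key);
      removed++;
    }
  }
  return removed;
}
```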
Proposed API Design
```javascript
import { graphql } from "@octokit/graphql";

// Global defaults with caching
const cachedGraphql = graphql.defaults({
  cache: {
    enabled: true,
    ttl: 60000,
    maxSize: 100,
    strategy: "memory", // or "redis", "custom"
  },
  persistedQueries: {
    enabled: true,
    hashAlgorithm: "sha256",
  },
  errorCache: {
    enabled: true,
    ttl: 5000,
  },
});

// Per-query cache control
const result = await cachedGraphql(QUERY, {
  variables: { owner, repo },
  cache: {
    ttl: 300000, // override to 5 minutes
    key: "custom-cache-key", // custom cache key
  },
});

// Cache utilities
cachedGraphql.cache.clear(); // clear all entries
cachedGraphql.cache.invalidate({ key: "specific-key" });
cachedGraphql.cache.stats(); // cache hit/miss statistics
```
Implementation Priority
- High Priority: Query Result Caching (#1)
- Medium Priority: Persistent Queries (#2), Query Parsing Cache (#3)
- Low Priority: Variable Serialization Cache (#4), Error Caching (#5)
Compatibility Considerations
- All optimizations should be opt-in to maintain backward compatibility
- Default behavior remains unchanged (no caching)
- Cache strategies should be pluggable (memory, Redis, custom)
- TypeScript types must be maintained and enhanced
- Should work with existing authentication methods
Performance Benchmarks (Expected)
Based on similar GraphQL client implementations:
- Query Result Caching: 90-99% latency reduction for cached hits
- Persistent Queries: 70-90% bandwidth reduction for large queries
- Query Parsing Cache: 5-15% CPU reduction for repeated queries
- Error Caching: 50-100% reduction in failed request retries
References
- Apollo Client Caching
- Automatic Persisted Queries Spec
- URQL Document Caching
- GraphQL Query Complexity
Questions for Maintainers
- Is there interest in adding caching capabilities to this library?
- Should caching be built into core or offered as a separate @octokit/graphql-cache plugin?
- Does GitHub's GraphQL API support Automatic Persisted Queries?
- Are there concerns about cache invalidation complexity?
I'm happy to contribute a PR for any of these optimizations if there's interest!