Overview
This proposal outlines several GraphQL-specific performance optimizations that could significantly improve the efficiency of @octokit/graphql. These optimizations are standard patterns in modern GraphQL clients (Apollo, URQL, Relay) and would benefit applications that make repeated or similar queries.
Motivation
Currently, @octokit/graphql executes every query as a fresh network request with no caching layer. This means:
- Identical queries are re-executed unnecessarily
- Query parsing and validation happen on every request
- Large query strings are sent over the network repeatedly
- Error responses aren't cached, leading to redundant failed requests
- No mechanism to reduce bandwidth for frequently-used queries
Proposed Optimizations
1. Query Result Caching (HIGH IMPACT)
Problem: Identical queries with the same variables fetch data multiple times.
Solution: Implement a configurable cache layer that memoizes query results based on query string + variables.
```javascript
const graphqlWithCache = graphql.defaults({
  cache: {
    enabled: true,
    ttl: 60000, // 60 seconds default
    maxSize: 100, // maximum cache entries
    keyStrategy: "query+variables", // cache key generation
  },
});
```
```javascript
// First call - hits the API
const result1 = await graphqlWithCache(
  `query ($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      stargazerCount
    }
  }`,
  { owner: "octokit", repo: "graphql.js" }
);

// Second call within the TTL - returns the cached result
const result2 = await graphqlWithCache(
  `query ($owner: String!, $repo: String!) {
    repository(owner: $owner, name: $repo) {
      stargazerCount
    }
  }`,
  { owner: "octokit", repo: "graphql.js" }
);
```
Benefits:
- Reduces API rate limit consumption
- Faster response times for repeated queries
- Lower network bandwidth usage
- Configurable per-query or globally
Implementation Considerations:
- Use LRU (Least Recently Used) cache eviction
- Support cache invalidation by query pattern
- Respect HTTP cache headers from GitHub API
- Allow manual cache clearing
- Thread-safe for concurrent requests
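The LRU-with-TTL behavior described above could be sketched with a plain `Map`, relying on its insertion order for recency tracking. Everything here is hypothetical: the `QueryResultCache` name and its methods are illustrative, not part of @octokit/graphql.

```javascript
// Minimal sketch of the proposed result cache: entries are keyed by
// query string + serialized variables, expire after `ttl` milliseconds,
// and the least recently used entry is evicted once `maxSize` is exceeded.
class QueryResultCache {
  constructor({ ttl = 60000, maxSize = 100 } = {}) {
    this.ttl = ttl;
    this.maxSize = maxSize;
    this.entries = new Map(); // Map preserves insertion order -> LRU order
  }

  key(query, variables) {
    return query + "\u0000" + JSON.stringify(variables ?? {});
  }

  get(query, variables) {
    const k = this.key(query, variables);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (Date.now() - entry.at > this.ttl) {
      this.entries.delete(k); // expired
      return undefined;
    }
    // Refresh recency by re-inserting the entry at the end of the Map
    this.entries.delete(k);
    this.entries.set(k, entry);
    return entry.value;
  }

  set(query, variables, value) {
    const k = this.key(query, variables);
    this.entries.delete(k);
    this.entries.set(k, { value, at: Date.now() });
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry (first key in insertion order)
      this.entries.delete(this.entries.keys().next().value);
    }
  }
}
```

A production version would also need invalidation hooks and respect for HTTP cache headers, but the key/TTL/eviction mechanics are the core of it.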
2. Persistent Queries Support (MEDIUM-HIGH IMPACT)
Problem: Large GraphQL queries consume bandwidth, especially for complex queries sent repeatedly.
Solution: Implement Automatic Persisted Queries (APQ) where queries are hashed and only the hash is sent after the first request.
```javascript
const graphqlWithAPQ = graphql.defaults({
  persistedQueries: {
    enabled: true,
    hashAlgorithm: "sha256",
    useGETForHashedQueries: true,
  },
});

// First request: sends the full query + hash
// Subsequent requests: send only the hash (saves bandwidth)
const result = await graphqlWithAPQ(LARGE_QUERY, variables);
```
Benefits:
- Reduces payload size by ~80-95% for large queries
- Enables GET requests for cached queries (better CDN caching)
- Improves performance on slow networks
- Standard GraphQL pattern (Apollo spec)
Implementation Notes:
- Generate SHA-256 hash of query string
- Store query-to-hash mapping client-side
- Fallback to full query if server doesn't recognize hash
- Only effective if the GitHub GraphQL API supports APQ
3. Query Parsing/Validation Cache (MEDIUM IMPACT)
Problem: Query parsing and validation overhead for frequently-used queries.
Solution: Cache parsed query AST and validation results.
```javascript
// Internal implementation sketch. A bare Map would grow without bound,
// so cap its size and evict the oldest entry (simple FIFO stand-in for LRU).
const MAX_PARSE_CACHE_SIZE = 100;
const queryParseCache = new Map();

function getOrParseQuery(queryString) {
  if (queryParseCache.has(queryString)) {
    return queryParseCache.get(queryString);
  }
  const parsed = parseQuery(queryString);
  if (queryParseCache.size >= MAX_PARSE_CACHE_SIZE) {
    queryParseCache.delete(queryParseCache.keys().next().value);
  }
  queryParseCache.set(queryString, parsed);
  return parsed;
}
```
Benefits:
- Eliminates redundant parsing overhead
- Faster query preparation
- Lower CPU usage for repeated queries
- Memory-efficient with LRU eviction
4. Variable Serialization Cache (LOW-MEDIUM IMPACT)
Problem: JSON serialization of variables happens on every request, even for identical variable objects.
Solution: Cache serialized variables by reference or content hash.
```javascript
// Keyed by object reference: WeakMap entries are garbage-collected together
// with the variables object, so no explicit eviction is needed. Note that
// WeakMap keys must be objects, so primitive/undefined variables bypass it.
const variableCache = new WeakMap();

function serializeVariables(variables) {
  if (variableCache.has(variables)) {
    return variableCache.get(variables);
  }
  const serialized = JSON.stringify(variables);
  variableCache.set(variables, serialized);
  return serialized;
}
```
Benefits:
- Reduces serialization overhead for complex variable objects
- WeakMap allows garbage collection
- Minimal memory footprint
5. Error Response Caching (LOW-MEDIUM IMPACT)
Problem: Failed queries (e.g., permission errors, not found) are retried unnecessarily.
Solution: Cache error responses for a short TTL to prevent retry storms.
```javascript
const graphqlWithErrorCache = graphql.defaults({
  errorCache: {
    enabled: true,
    ttl: 5000, // 5 seconds for errors
    statusCodes: [403, 404], // only cache these errors
  },
});
```
Benefits:
- Prevents redundant failed requests
- Protects against accidental retry loops
- Reduces rate limit consumption for known failures
- Improves error handling UX
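A minimal sketch of such a negative cache, assuming the configuration shape proposed above (the `makeErrorCache` name and its methods are hypothetical):

```javascript
// Short-TTL negative cache: only the configured status codes are remembered,
// and entries expire quickly so transient failures are retried soon.
function makeErrorCache({ ttl = 5000, statusCodes = [403, 404] } = {}) {
  const entries = new Map();
  return {
    remember(key, status, error) {
      if (statusCodes.includes(status)) {
        entries.set(key, { error, expires: Date.now() + ttl });
      }
    },
    recall(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (Date.now() > entry.expires) {
        entries.delete(key); // expired, allow a fresh attempt
        return undefined;
      }
      return entry.error;
    },
  };
}
```

Filtering by status code matters here: 5xx responses are often transient and should not be negatively cached, while 403/404 tend to stay wrong for at least a few seconds.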
6. Smart Cache Invalidation (BONUS)
Problem: Knowing when to invalidate cached results.
Solution: Provide hooks for cache invalidation based on mutations or time.
```javascript
// Invalidate cached entries after a mutation
await graphql.mutate(CREATE_ISSUE_MUTATION, variables);
graphql.cache.invalidate({ pattern: "repository.*issues" });

// Or schedule time-based invalidation
graphql.cache.invalidateAfter(60000); // 60 s
```
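Pattern-based invalidation could be as simple as matching cache keys against a regular expression. This sketch assumes cache keys encode the dotted field paths a query touched, which is a hypothetical convention rather than an existing API:

```javascript
// Delete every cache entry whose key matches the given pattern and
// return how many entries were removed.
function invalidate(cacheEntries, pattern) {
  const re = new RegExp(pattern);
  let removed = 0;
  for (const key of [...cacheEntries.keys()]) {
    if (re.test(key)) {
      cacheEntries.delete(key);
      removed++;
    }
  }
  return removed;
}
```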
Proposed API Design
```javascript
import { graphql } from "@octokit/graphql";

// Global defaults with caching
const cachedGraphql = graphql.defaults({
  cache: {
    enabled: true,
    ttl: 60000,
    maxSize: 100,
    strategy: "memory", // or "redis", "custom"
  },
  persistedQueries: {
    enabled: true,
    hashAlgorithm: "sha256",
  },
  errorCache: {
    enabled: true,
    ttl: 5000,
  },
});

// Per-query cache control
const result = await cachedGraphql(QUERY, {
  variables: { owner, repo },
  cache: {
    ttl: 300000, // override to 5 minutes
    key: "custom-cache-key", // custom cache key
  },
});

// Cache utilities
cachedGraphql.cache.clear(); // clear all entries
cachedGraphql.cache.invalidate({ key: "specific-key" });
cachedGraphql.cache.stats(); // cache hit/miss statistics
```
Implementation Priority
- High Priority: Query Result Caching (#1)
- Medium Priority: Persistent Queries (#2), Query Parsing Cache (#3)
- Low Priority: Variable Serialization Cache (#4), Error Caching (#5)
Compatibility Considerations
- All optimizations should be opt-in to maintain backward compatibility
- Default behavior remains unchanged (no caching)
- Cache strategies should be pluggable (memory, Redis, custom)
- TypeScript types must be maintained and enhanced
- Should work with existing authentication methods
Performance Benchmarks (Expected)
Based on similar GraphQL client implementations:
- Query Result Caching: 90-99% latency reduction for cached hits
- Persistent Queries: 70-90% bandwidth reduction for large queries
- Query Parsing Cache: 5-15% CPU reduction for repeated queries
- Error Caching: 50-100% reduction in failed request retries
References
- Apollo Client Caching
- Automatic Persisted Queries Spec
- URQL Document Caching
- GraphQL Query Complexity
Questions for Maintainers
- Is there interest in adding caching capabilities to this library?
- Should caching be built into core or offered as a separate @octokit/graphql-cache plugin?
- Does GitHub's GraphQL API support Automatic Persisted Queries?
- Are there concerns about cache invalidation complexity?
I'm happy to contribute a PR for any of these optimizations if there's interest!