ADR-010: Caching Strategy — Redis (OCI Cache)
Status: Proposed
Date: 2026-05-17
Context
BayanCore requires a caching layer to improve performance and reduce load on backend services. Caching is needed for:
- Session management: User sessions and authentication tokens
- API response caching: Frequently accessed data (product catalogs, settings, reference data)
- RAG response caching: Cached AI responses for repeated queries
- ERPNext cache: Frappe framework cache, document cache, query cache
- Rate limiting: Token bucket counters for API rate limiting
ERPNext already uses Redis extensively for its internal caching and as the backend for Frappe's background job queue (RQ in current Frappe releases, rather than Celery). We need to formalize and extend this for BayanCore's broader caching needs.
Decision
Redis (OCI Cache managed service) is selected as the unified caching layer for BayanCore.
Rationale:
- ERPNext already depends on Redis — no new technology to introduce
- OCI Cache is a fully managed Redis service — minimal operational overhead
- Available in the OCI Riyadh region, so cached data can stay in KSA in support of PDPL data-residency requirements
- Supports all required caching patterns: key-value, pub/sub, sorted sets (rate limiting)
- High availability with automatic failover
- In-memory performance for low-latency access
Cache Strategy:
| Cache Type | TTL | Eviction | Purpose |
|---|---|---|---|
| Session Cache | 24h | LRU | User sessions, auth tokens |
| API Response Cache | 5min | LRU | Frequently accessed reference data |
| RAG Response Cache | 1h | LRU | Cached AI responses for repeated queries |
| ERPNext Document Cache | 30min | LRU | Frappe document and query cache |
| Rate Limit Counters | 1min | TTL expiry | API rate limiting (token bucket) |
| Workflow State Cache | 10min | LRU | Active workflow instance state |
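One way to keep these TTLs in a single place is a small policy module shared by all services. The sketch below copies the TTL values from the table; the dictionary keys, the `bayancore` key prefix, and both helper names are illustrative assumptions. Eviction is not modeled, because Redis applies a single `maxmemory-policy` per instance rather than per group of keys.

```python
from datetime import timedelta

# TTLs copied from the Cache Strategy table; the key names are illustrative.
CACHE_TTLS = {
    "session": timedelta(hours=24),
    "api_response": timedelta(minutes=5),
    "rag_response": timedelta(hours=1),
    "erpnext_document": timedelta(minutes=30),
    "rate_limit": timedelta(minutes=1),
    "workflow_state": timedelta(minutes=10),
}

KEY_PREFIX = "bayancore"  # hypothetical namespace for all BayanCore keys


def ttl_seconds(cache_type: str) -> int:
    """Return the TTL for a cache type as whole seconds."""
    return int(CACHE_TTLS[cache_type].total_seconds())


def cache_key(cache_type: str, *parts: object) -> str:
    """Build a namespaced key such as 'bayancore:session:u42'."""
    if cache_type not in CACHE_TTLS:
        raise KeyError(f"unknown cache type: {cache_type}")
    return ":".join([KEY_PREFIX, cache_type, *(str(p) for p in parts)])
```

With this layout, `cache_key("session", "u42")` yields `bayancore:session:u42`, and `ttl_seconds("api_response")` yields `300`.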
Cache Invalidation:
- Write-through for session data
- Cache-aside for API responses
- Event-driven invalidation via OCI Streaming (when data changes, the owning service publishes a cache invalidation event and consumers delete the affected keys)
- TTL-based expiry as fallback
Consequences
- Positive: Already part of the ERPNext stack, fully managed, supports all required caching patterns, and keeps cached data in KSA
- Trade-offs: Redis is in-memory first, so a restart can lose cached data (mitigated by persistence configuration); concurrent misses on a hot key can cause a cache stampede (mitigated by distributed locks)
- Risks: Memory limits require monitoring; cache invalidation bugs can serve stale data
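The stampede mitigation mentioned under trade-offs can be sketched as a lock acquired with "set if not exists, with expiry" semantics, the same shape as the Redis `SET key value NX EX seconds` command. `FakeRedis` below is an in-memory, single-threaded stand-in; the function name, the `lock:` key prefix, and the retry parameters are assumptions. With a real client this sketch assumes string responses (for example `decode_responses=True` in redis-py), and the check-then-delete release would need to be made atomic, typically with a small Lua script.

```python
import time
import uuid


class FakeRedis:
    """In-memory stand-in mirroring get / set(nx, ex) / delete semantics only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex=None, nx=False):
        if nx and self.get(key) is not None:
            return None  # key already exists: NX refuses to overwrite
        self._data[key] = (value, None if ex is None else time.monotonic() + ex)
        return True

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def get_with_lock(client, key, loader, ttl_seconds,
                  lock_seconds=10, wait_seconds=0.05, max_waits=100):
    """Let one caller rebuild a missing entry while the others wait and re-read."""
    cached = client.get(key)
    if cached is not None:
        return cached

    lock_key = f"lock:{key}"
    token = uuid.uuid4().hex  # identifies this caller as the lock owner
    for _ in range(max_waits):
        if client.set(lock_key, token, nx=True, ex=lock_seconds):
            try:
                cached = client.get(key)  # another caller may have filled it
                if cached is not None:
                    return cached
                value = loader()
                client.set(key, value, ex=ttl_seconds)
                return value
            finally:
                # Release only our own lock (not atomic in this sketch).
                if client.get(lock_key) == token:
                    client.delete(lock_key)
        time.sleep(wait_seconds)
        cached = client.get(key)
        if cached is not None:
            return cached
    return loader()  # gave up waiting: serve from the source without caching
```

The lock expiry bounds how long a crashed holder can block others, and the final fallback trades one extra backend call for availability when the lock never frees up.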
Alternatives Considered
- Memcached: Simpler but lacks persistence, pub/sub, and advanced data structures needed for rate limiting
- OCI API Gateway caching: Limited to API responses, doesn't cover session or application-level caching
- CDN caching: Only for static assets, not suitable for dynamic API responses or session data
- Application-level in-process cache: Faster, but each app instance holds its own copy, so entries are neither shared nor invalidated consistently once the application scales horizontally