ADR-008: Vector Store — OCI Search with OpenSearch

Status: Proposed
Date: 2026-05-17

Context

BayanCore's AI features (Ask, Act, Automate tiers) require a vector database for the RAG (Retrieval-Augmented Generation) pipeline. The system must store embeddings of company documents, policies, and business data to enable semantic search and context-aware AI responses. All vector data must remain in KSA to comply with PDPL data residency requirements.

Because ADR-003 selected MariaDB as the primary ERP database, pgvector (a PostgreSQL extension) is not available without adding a second database engine. We need a dedicated vector search solution that is fully managed, KSA-compliant, and integrates well with our OCI-hosted AI inference models.

Decision

OCI Search with OpenSearch is selected as the vector search engine for BayanCore's RAG pipeline.

Rationale:

Fully managed service by OCI — minimal operational overhead
Available in the OCI Riyadh region, which keeps vector data in KSA as PDPL data residency requires
Native vector search with k-NN plugin for billion-scale similarity search
Hybrid search: combines vector similarity with full-text search for better results
Built-in RAG pipeline support with OCI Generative AI Agents integration
LangChain integration for simplified AI application development
Supports Cohere and Llama embedding models (both OCI-hosted)
Enterprise-grade security with encryption, access controls, and auditing
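
The k-NN and hybrid search points above can be illustrated with a minimal sketch of the request bodies involved. It only builds JSON-serializable dictionaries and does not contact a cluster. The field names `text` and `embedding`, the function names, and the 1024-dimension figure (assumed to be the output size of Cohere embed-multilingual-v3) are assumptions made for illustration, not part of this decision. Combining a `match` clause and a `knn` clause under `bool`/`should` is one simple way to approximate hybrid search; the service version actually deployed may offer other mechanisms.

```python
from typing import Dict, List

# Assumption: Cohere embed-multilingual-v3 returns 1024-dimensional vectors.
EMBEDDING_DIM = 1024


def build_index_body(dim: int = EMBEDDING_DIM) -> Dict:
    """Index definition with one full-text field and one k-NN vector field."""
    return {
        "settings": {"index": {"knn": True}},  # enable the k-NN plugin for this index
        "mappings": {
            "properties": {
                "text": {"type": "text"},  # hypothetical field holding the chunk text
                "embedding": {"type": "knn_vector", "dimension": dim},
            }
        },
    }


def build_hybrid_query(query_text: str, query_vector: List[float], k: int = 5) -> Dict:
    """Search body that scores documents by keyword match and vector similarity."""
    return {
        "size": k,
        "query": {
            "bool": {
                "should": [
                    # Keyword side: classic full-text relevance on the chunk text.
                    {"match": {"text": {"query": query_text}}},
                    # Vector side: approximate nearest neighbours of the query embedding.
                    {"knn": {"embedding": {"vector": query_vector, "k": k}}},
                ]
            }
        },
    }
```

In use, the first body would be sent when creating the index and the second as the body of a search request, with `query_vector` produced by the same embedding model used at ingestion time.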

Deployment Configuration:

Single OpenSearch cluster in OCI Riyadh
Vector indexes for company document embeddings
Full-text indexes for keyword search fallback
Integration with OCI Generative AI for embedding generation (Cohere embed-multilingual-v3)
Data ingestion pipeline: documents → chunking → embedding → OpenSearch index
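
The ingestion flow listed above (documents → chunking → embedding → OpenSearch index) could look roughly like the sketch below. It is a sketch under stated assumptions, not the planned implementation: word-based chunking with overlap is only one possible strategy, `placeholder_embed` is a hypothetical stand-in for the real call to the OCI-hosted embedding model, and the index name `company-docs` is invented for the example. The output is newline-delimited JSON in the shape the OpenSearch `_bulk` API accepts (an action line followed by a document line).

```python
import hashlib
import json
from typing import Callable, List


def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of max_words that overlap by `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def placeholder_embed(chunk: str, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for the embedding model: a deterministic fake vector."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def build_bulk_payload(
    doc_id: str,
    text: str,
    index_name: str = "company-docs",  # hypothetical index name
    embed: Callable[[str], List[float]] = placeholder_embed,
    max_words: int = 200,
    overlap: int = 20,
) -> str:
    """Chunk a document, embed each chunk, and emit bulk-API NDJSON."""
    lines = []
    for n, chunk in enumerate(chunk_text(text, max_words, overlap)):
        action = {"index": {"_index": index_name, "_id": f"{doc_id}-{n}"}}
        source = {"doc_id": doc_id, "text": chunk, "embedding": embed(chunk)}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, ensure_ascii=False))
    return ("\n".join(lines) + "\n") if lines else ""
```

Deriving each chunk's `_id` from the document ID and chunk position means re-ingesting the same document overwrites its earlier chunks instead of duplicating them, provided the chunking parameters stay the same.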

Consequences

Positive: Fully managed, KSA-compliant, RAG-ready, hybrid search capabilities, less ops overhead than self-hosted alternatives
Trade-offs: Tied to OCI ecosystem, OpenSearch-specific query syntax
Risks: Service availability depends on the health of the single OCI Riyadh region; embedding model usage adds ongoing cost

Alternatives Considered

Qdrant on OKE: More flexible vector features but requires managing Kubernetes deployment and increases operational complexity
Oracle AI Vector Search (Oracle DB 26ai): Powerful but requires an expensive Oracle Database license and conflicts with the MariaDB decision (ADR-003)
pgvector on PostgreSQL: Would require running PostgreSQL alongside MariaDB, adding unnecessary complexity
Milvus self-hosted: Apache-licensed and scalable but requires full self-management on OCI compute
