# ADR 001: Architecture Design for Enhanced Semantic & Hybrid Search

## 1. Context and Problem Statement

The "Trend-Scout AI" bot currently uses a basic synchronous implementation of ChromaDB to serve both categorical retrieval (`/latest`) and free-text queries (`/search`). Three major issues have severely impacted the user experience:

1. **Incorrect Categories in `/latest`**: The system performs a dense vector search using the requested category name (e.g., "AI") rather than a deterministic exact match. This returns semantically related news items regardless of their assigned category, yielding false positives.
2. **Poor Semantic Matches in `/search`**:
   - The default English-centric embedding model (e.g., `all-MiniLM-L6-v2`) handles Russian summaries and specialized technical acronyms poorly.
   - Pure vector search ignores exact keyword matches, so searches for specific entities (e.g., "OpenAI o1" or specific version numbers) often miss the items users expect.
3. **Blocking I/O Operations**: The `ChromaStore` executes blocking synchronous operations within `async def` wrappers, potentially starving the `asyncio` event loop and violating asynchronous data flow requirements.

## 2. Decision Drivers

* **Accuracy & Relevance**: Strict categorization, plus high recall for both exact keywords and conceptual similarity.
* **Multilingual Support**: Strong performance on both English source texts and Russian summaries.
* **Performance & Concurrency**: Fully non-blocking (async) operations.
* **Adherence to SOLID**: Maintain strict interface boundaries, dependency inversion, and the existing Data Transfer Objects (DTOs).
* **Alignment with Agent Architecture**: Ensure the Vector Storage Agent focuses strictly on storage/retrieval coordination without leaking AI processing duties.

## 3. Proposed Architecture

### 3.1. Asynchronous Data Flow (I/O)

* **Decision**: Migrate the local ChromaDB calls to run in a thread pool executor. Alternatively, if ChromaDB is hosted as a standalone server, use `chromadb.AsyncHttpClient`.
* **Implementation**: Wrap blocking calls such as `self.collection.upsert()` and `self.collection.query()` in `asyncio.to_thread()` (available from Python 3.9) so they do not block the Telegram bot's main event loop.

### 3.2. Interface Segregation (ISP) for Storage

The current `IVectorStore` interface conflates generic vector searching, exact categorical retrieval, and database administration.

* **Action**: Segregate the interfaces to adhere to ISP.
* **Refactored Interfaces**:

```python
from abc import ABC, abstractmethod
from typing import List, Optional

# EnrichedNewsItemDTO is the project's existing DTO; import it from its current module.


class IStoreCommand(ABC):
    """Write side: persists enriched news items."""

    @abstractmethod
    async def store(self, item: EnrichedNewsItemDTO) -> None: ...


class IStoreQuery(ABC):
    """Read side: hybrid search and deterministic retrieval."""

    @abstractmethod
    async def search_hybrid(self, query: str, limit: int = 5) -> List[EnrichedNewsItemDTO]: ...

    @abstractmethod
    async def get_latest_by_category(self, category: Optional[str], limit: int = 10) -> List[EnrichedNewsItemDTO]: ...

    @abstractmethod
    async def get_top_ranked(self, limit: int = 10) -> List[EnrichedNewsItemDTO]: ...
```

### 3.3. Strict Metadata Filtering for `/latest`

* **Mechanism**: The `/latest` command must bypass vector similarity search entirely. Instead, it will use ChromaDB's `.get()` method with a strict `where` metadata filter: `where={"category": {"$eq": category}}`.
* **Sorting Architecture**: Because ChromaDB does not natively support sorting results by a metadata field (like `timestamp`), the `get_latest_by_category` method will over-fetch (e.g., up to 100 items matching the metadata filter) and perform a deterministic in-memory sort by `timestamp` descending before slicing to the requested `limit`. Since `.get()` does not promise to return the newest items first, the over-fetch cap should be chosen generously or combined with a `timestamp` range filter.

### 3.4. Hybrid Search Architecture (Keyword + Vector)

* **Mechanism**: Implement a hybrid search strategy based on **Reciprocal Rank Fusion (RRF)**.
* **Sparse Retrieval (Keyword)**: Integrate a lightweight keyword index alongside ChromaDB. Given the bot's scale, **SQLite FTS5 (Full-Text Search)** is the preferred choice. It provides persistent, fast token matching without the operational overhead of Elasticsearch.
* **Dense Retrieval (Vector)**: ChromaDB semantic search.
* **Fusion Strategy**:
  1. The new `HybridSearchStrategy` issues queries to both the SQLite FTS index and ChromaDB concurrently using `asyncio.gather`.
  2. The two ranked lists are fused using the RRF formula: `Score = 1 / (k + rank_sparse) + 1 / (k + rank_dense)`, where ranks are 1-based and `k` is typically 60. An item that appears in only one list contributes only that list's term.
  3. The combined list of DTOs is sorted by the fused score in descending order and returned.

### 3.5. Embedding Model Evaluation & Upgrade

* **Decision**: Replace the default ChromaDB embedding function with a dedicated, explicitly configured multilingual model.
* **Recommendation**: Use `intfloat/multilingual-e5-small` (for lightweight CPU environments) or `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`. Both are multilingual models that cover English and Russian, which is the cross-lingual alignment this bot needs; the final choice should be confirmed on a sample of real queries.
* **Integration (DIP)**: Apply the Dependency Inversion Principle by injecting the embedding function (or an `IEmbeddingProvider` interface) into the `ChromaStore` constructor. This allows A/B testing of embedding models without touching the core storage logic.

## 4. Application to the Agent Architecture

* **Vector Storage Agent (Database)**: This agent's responsibility shifts from "pure vector storage" to "Hybrid Storage Management." It coordinates the `ChromaStore` (dense) and `SQLiteStore` (sparse) implementations.
* **AI Processor Agent**: To maintain the Single Responsibility Principle (SRP), embedding generation can be shifted from the storage layer to the AI Processor Agent. The AI Processor generates the vector using an Ollama-hosted embedding model and attaches it directly to the `EnrichedNewsItemDTO`. The Storage Agent then only stores the pre-calculated vector, which reduces the dependency weight of the storage module.

## 5. Next Steps for Implementation

1. Add `sqlite3` FTS5 table initialization to the project scaffolding.
2. Refactor `src/storage/base.py` to segregate `IStoreQuery` and `IStoreCommand`.
3. Update `ChromaStore` to accept pre-calculated embeddings and use `asyncio.to_thread`.
4. Implement the RRF sorting algorithm in a new `search_hybrid` pipeline.
5. Update `src/bot/handlers.py` to route `/latest` through `get_latest_by_category`.
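
As an illustration of the thread-offloading decision in Section 3.1, the following is a minimal sketch of wrapping blocking collection calls in `asyncio.to_thread()`. The `BlockingCollection` class is a hypothetical stand-in for a synchronous ChromaDB collection so that the example runs without the database, and `AsyncStore` with its `store`/`fetch` methods is an illustrative name, not the project's actual `ChromaStore`.

```python
import asyncio
import time
from typing import Any, Dict, List


class BlockingCollection:
    """Hypothetical stand-in for a synchronous ChromaDB collection."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def upsert(self, ids: List[str], documents: List[str]) -> None:
        time.sleep(0.05)  # simulate blocking disk/network I/O
        for item_id, document in zip(ids, documents):
            self._rows[item_id] = document

    def get(self, ids: List[str]) -> Dict[str, Any]:
        time.sleep(0.05)  # simulate blocking disk/network I/O
        found = [item_id for item_id in ids if item_id in self._rows]
        return {"ids": found, "documents": [self._rows[i] for i in found]}


class AsyncStore:
    """Illustrative async facade: every blocking call runs in a worker thread."""

    def __init__(self, collection: BlockingCollection) -> None:
        self._collection = collection

    async def store(self, item_id: str, text: str) -> None:
        # to_thread forwards positional and keyword arguments to the callable.
        await asyncio.to_thread(self._collection.upsert, ids=[item_id], documents=[text])

    async def fetch(self, item_id: str) -> Dict[str, Any]:
        return await asyncio.to_thread(self._collection.get, ids=[item_id])
```

While a call is running in the worker thread, the event loop stays free to serve other handlers. The default executor has a bounded number of threads, so heavy write bursts may still queue.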
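
The over-fetch and in-memory sort from Section 3.3 could look roughly like the sketch below. It operates on the dictionary shape returned by `collection.get()` (parallel `ids` and `metadatas` lists). The helper name `latest_from_get_result` is hypothetical, and the code assumes `timestamp` is stored as a numeric value such as epoch seconds, which the ADR does not specify.

```python
from typing import Any, Dict, List


def latest_from_get_result(result: Dict[str, Any], limit: int = 10) -> List[Dict[str, Any]]:
    """Sort an over-fetched `.get()` result by timestamp, newest first.

    The result is expected to come from a call such as:
        collection.get(where={"category": {"$eq": category}}, limit=100,
                       include=["metadatas", "documents"])
    """
    rows = [
        {"id": item_id, **(metadata or {})}
        for item_id, metadata in zip(result["ids"], result["metadatas"])
    ]
    # Assumption: "timestamp" is numeric; items without one sort last.
    rows.sort(key=lambda row: row.get("timestamp", 0), reverse=True)
    return rows[:limit]
```

Mapping the sorted rows back to `EnrichedNewsItemDTO` instances is left out here because the DTO's fields are not shown in this document.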
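
The fusion strategy from Section 3.4 can be sketched as follows. The function names `rrf_fuse` and `search_hybrid`, and the idea of passing the two retrievers in as async callables that return ranked ID lists, are assumptions made for illustration; the real `HybridSearchStrategy` would work with DTOs and the actual SQLite and ChromaDB stores.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Dict, List, Sequence

Retriever = Callable[[str], Awaitable[List[str]]]


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60, limit: int = 5) -> List[str]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per item, with 1-based ranks.
    Items missing from a list simply receive no contribution from it.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    ordered = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, _ in ordered[:limit]]


async def search_hybrid(
    query: str,
    sparse_search: Retriever,
    dense_search: Retriever,
    limit: int = 5,
) -> List[str]:
    """Run both retrievers concurrently, then fuse their rankings."""
    sparse_ids, dense_ids = await asyncio.gather(
        sparse_search(query), dense_search(query)
    )
    return rrf_fuse([sparse_ids, dense_ids], limit=limit)
```

Because RRF uses only ranks, the raw BM25 and cosine scores never need to be brought onto a common scale, which is the main reason it suits combining FTS5 and vector results.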