# Agents Architecture and Tasks

This document outlines the architecture, responsibilities, and tasks for the subagents working on the **"Trend-Scout AI"** Telegram bot.

## Core Principles

- **Test-Driven Development (TDD):** Write tests before implementing features. Use `pytest`.
- **SOLID Principles:** Ensure code is modular, maintainable, and follows Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion principles.
- **Asynchronous I/O:** Use `asyncio` for network requests, database operations, and bot handling.

## 1. Crawler Agent (Data Collection)

**Responsibility:** Collect data from various sources (RSS feeds, plus HTML parsing for protected sites such as Samsung/Sony Newsroom).

**Inputs:**

- Target URLs and source types (RSS, HTML).
- Configuration for Scrapy/Playwright.

**Outputs:**

- Standardized DTOs (Data Transfer Objects) containing: `title`, `url`, `content_text`, `source`, `timestamp`.

**Tasks:**

1. Set up a TDD environment for crawlers (mocking HTTP responses).
2. Implement an RSS parser for standard sources (e.g., Nature, Habr).
3. Implement an HTML parser (Playwright/Scrapy) for complex/protected sources.
4. Ensure SRP: crawlers only fetch and parse, returning raw text data.
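
As a rough illustration of the crawler output and the RSS task, the sketch below parses a canned RSS 2.0 string into DTOs using only the standard library. The names `ArticleDTO` and `parse_rss`, and the sample feed, are assumptions made for this example; only the five field names come from the Outputs list. Fetching is deliberately left out, so the parser can be tested against a mocked HTTP response body.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class ArticleDTO:
    """Hypothetical DTO; field names follow the Outputs list."""
    title: str
    url: str
    content_text: str
    source: str
    timestamp: datetime


def parse_rss(xml_text: str, source: str) -> list[ArticleDTO]:
    """Parse RSS 2.0 markup into DTOs. No network access happens here."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        # RSS 2.0 dates use the RFC 822 format; fall back to "now" if absent.
        timestamp = (
            parsedate_to_datetime(pub_date)
            if pub_date
            else datetime.now(timezone.utc)
        )
        articles.append(
            ArticleDTO(
                title=(item.findtext("title") or "").strip(),
                url=(item.findtext("link") or "").strip(),
                content_text=(item.findtext("description") or "").strip(),
                source=source,
                timestamp=timestamp,
            )
        )
    return articles


# Canned feed standing in for a mocked HTTP response body (made-up data).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<item>
  <title>WebGPU lands in browsers</title>
  <link>https://example.com/webgpu</link>
  <description>Short article text.</description>
  <pubDate>Mon, 06 Jan 2025 10:00:00 GMT</pubDate>
</item>
</channel></rss>"""

articles = parse_rss(SAMPLE_FEED, source="example")
```

Keeping the parser a pure function of the response text is one way to satisfy SRP here: the HTTP client (or Playwright/Scrapy for protected sites) can be swapped or mocked without touching the parsing logic.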

## 2. AI Processor Agent (NLP & LLM)

**Responsibility:** Analyze collected data using the Ollama API (`gpt-oss:120b-cloud` model).

**Inputs:**

- Standardized DTOs from the Crawler Agent.
- Prompts for relevance scoring (0-10) and summarization in Russian.
- Keywords for anomaly detection (e.g., "WebGPU", "NPU acceleration", "Edge AI").

**Outputs:**

- Enriched DTOs containing: `relevance_score`, `summary_ru`, `anomalies_detected`.

**Tasks:**

1. Set up TDD with mocked Ollama API responses.
2. Implement an `ILLMProvider` interface (Dependency Inversion).
3. Implement the concrete Ollama provider.
4. Create prompt templates for relevance, summarization, and anomaly detection.
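
The Dependency Inversion task could look roughly like the sketch below, where the processing function depends only on an `ILLMProvider` protocol and a test double stands in for Ollama. The method name `complete`, the `EnrichedDTO` class, the `enrich` function, the prompt wording, and the JSON reply shape are all assumptions for illustration; the real Ollama provider is not shown.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Protocol


class ILLMProvider(Protocol):
    """Abstraction the processor depends on (method name is an assumption)."""

    async def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class EnrichedDTO:
    """Hypothetical DTO; the last three fields follow the Outputs list."""
    title: str
    relevance_score: int
    summary_ru: str
    anomalies_detected: list[str]


# Hypothetical prompt template; the real wording is left to the project.
PROMPT_TEMPLATE = (
    "Rate the relevance of this article from 0 to 10, summarize it in Russian, "
    "and list which of these keywords it mentions: {keywords}.\n"
    'Reply as JSON with keys "relevance_score", "summary_ru", '
    '"anomalies_detected".\n\nTitle: {title}\nText: {text}'
)


class FakeLLMProvider:
    """Test double that returns a canned reply instead of calling Ollama."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: list[str] = []

    async def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


async def enrich(
    provider: ILLMProvider, title: str, text: str, keywords: list[str]
) -> EnrichedDTO:
    prompt = PROMPT_TEMPLATE.format(
        keywords=", ".join(keywords), title=title, text=text
    )
    data = json.loads(await provider.complete(prompt))
    # Clamp the score so a misbehaving model cannot leave the 0-10 range.
    score = max(0, min(10, int(data["relevance_score"])))
    return EnrichedDTO(
        title, score, data["summary_ru"], list(data["anomalies_detected"])
    )


canned = json.dumps(
    {
        "relevance_score": 12,
        "summary_ru": "Краткое содержание.",
        "anomalies_detected": ["WebGPU"],
    }
)
fake = FakeLLMProvider(canned)
result = asyncio.run(
    enrich(fake, "WebGPU lands", "Short text.", ["WebGPU", "Edge AI"])
)
```

Because `enrich` only sees the protocol, the same tests can run against the fake provider first and against the concrete Ollama provider later without changes to the processing code.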

## 3. Vector Storage Agent (Database)

**Responsibility:** Store and retrieve processed data using a vector database (ChromaDB).

**Inputs:**

- Enriched DTOs from the AI Processor Agent.
- Search queries.

**Outputs:**

- Stored records.
- Search results (similar news/trends).

**Tasks:**

1. Set up TDD for in-memory ChromaDB operations.
2. Define interfaces for `IVectorStore` (Interface Segregation).
3. Implement embedding generation (using a lightweight local model or Ollama).
4. Implement storage and semantic search.
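
To make the store-and-search contract concrete, here is a minimal sketch of an `IVectorStore` protocol with an in-memory implementation. It is a stand-in, not ChromaDB: the method names `add` and `search`, the `InMemoryVectorStore` class, and the bag-of-words `toy_embed` function are assumptions chosen so the example runs with the standard library alone. A real implementation would delegate to ChromaDB and a proper embedding model.

```python
import asyncio
import math
from collections import Counter
from typing import Protocol


class IVectorStore(Protocol):
    """Hypothetical interface; method names are assumptions."""

    async def add(self, doc_id: str, text: str) -> None: ...

    async def search(self, query: str, top_k: int = 3) -> list[str]: ...


def toy_embed(text: str) -> Counter:
    """Sparse bag-of-words vector; a placeholder for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Keeps vectors in a dict and ranks them by cosine similarity."""

    def __init__(self) -> None:
        self._vectors: dict[str, Counter] = {}

    async def add(self, doc_id: str, text: str) -> None:
        self._vectors[doc_id] = toy_embed(text)

    async def search(self, query: str, top_k: int = 3) -> list[str]:
        query_vec = toy_embed(query)
        ranked = sorted(
            self._vectors,
            key=lambda doc_id: cosine(query_vec, self._vectors[doc_id]),
            reverse=True,
        )
        return ranked[:top_k]


async def demo() -> list[str]:
    store: IVectorStore = InMemoryVectorStore()
    await store.add("a", "WebGPU brings GPU compute to browsers")
    await store.add("b", "New battery chemistry extends phone life")
    return await store.search("GPU compute in browsers", top_k=1)


top = asyncio.run(demo())
```

A store like this can also serve as the fake in unit tests for code that depends on `IVectorStore`, while the ChromaDB-backed class is exercised in its own tests.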

## 4. Telegram Bot Agent (Frontend)

**Responsibility:** Handle user interactions, display summaries, and send alerts for anomalies.

**Inputs:**

- Telegram updates.
- Enriched DTOs (for alerts).

**Outputs:**

- Formatted Telegram messages.

**Tasks:**

1. Set up TDD for `aiogram` handlers (mocking the Telegram API).
2. Implement `/start`, `/help`, and `/latest` commands.
3. Implement a notification service for high-relevance news and anomalies.
4. Ensure OCP: routers should easily accept new commands without modifying core logic.
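
The OCP requirement can be illustrated with a small, framework-free registry in which each command registers its own handler. This is a sketch of the idea only and does not use `aiogram`; `CommandRouter`, `should_notify`, the reply texts, and the relevance threshold of 8 are assumptions for the example.

```python
import asyncio
from typing import Awaitable, Callable

Handler = Callable[[], Awaitable[str]]


class CommandRouter:
    """Maps command names to async handlers; new commands register themselves."""

    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def command(self, name: str) -> Callable[[Handler], Handler]:
        def register(handler: Handler) -> Handler:
            self._handlers[name] = handler
            return handler

        return register

    async def dispatch(self, text: str) -> str:
        parts = text.split()
        handler = self._handlers.get(parts[0]) if parts else None
        if handler is None:
            return "Unknown command. Try /help."
        return await handler()


router = CommandRouter()


@router.command("/start")
async def start() -> str:
    return "Welcome to Trend-Scout AI."


@router.command("/help")
async def help_command() -> str:
    return "Available commands: /start, /help, /latest"


def should_notify(
    relevance_score: int, anomalies: list[str], threshold: int = 8
) -> bool:
    """Alert on high relevance or any detected anomaly (threshold is assumed)."""
    return relevance_score >= threshold or bool(anomalies)


reply = asyncio.run(router.dispatch("/help"))
```

Adding `/latest` would mean writing one more decorated function; `CommandRouter.dispatch` stays untouched, which is the property the OCP task asks for.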