:Release Notes:
- Added a new Telegram command `/get_hottest <number> [format]` to export the top `N` trends as a CSV or Markdown file.

:Detailed Notes:
- Created `ITrendExporter` interface and concrete `CsvTrendExporter` and `MarkdownTrendExporter` implementations for formatting DTOs.
- Updated `src/bot/handlers.py` to include `command_get_hottest_handler` mapping to `/get_hottest`.
- Used `BufferedInputFile` to stream generated files asynchronously directly to Telegram without disk I/O.
- Fixed unrelated pipeline test failures regarding `EphemeralClient` usage with ChromaDB.

:Testing Performed:
- Implemented TDD with `pytest` for parsing parameters, exporting logic, and handling empty DB scenarios.
- Ran the full test suite (90 tests) which completed successfully.

:QA Notes:
- Fully covered the new handler using `pytest-asyncio` and `aiogram` mocked objects.

:Issues Addressed:
- Resolves request to export high-relevance parsed entries.

Change-Id: I25dd90f1e4491ba298682518d835259bffab4190

Trend-Scout AI
Trend-Scout AI is an intelligent Telegram bot designed for automated monitoring, analysis, and summarization of technological trends. It was developed to support R&D activities (specifically within the context of LG Electronics R&D Lab in St. Petersburg) by scanning the environment for emerging technologies, competitive benchmarks, and scientific breakthroughs.
🚀 Key Features
- Automated Multi-Source Crawling: Monitors RSS feeds, scientific journals (Nature, Science), IT conferences (CES, CVPR), and corporate newsrooms using Playwright and Scrapy.
- AI-Powered Analysis: Utilizes LLMs (via Ollama API) to evaluate the relevance of news articles based on specific R&D landscapes (e.g., WebOS, Chromium, Edge AI).
- Russian Summarization: Automatically generates concise summaries in Russian for quick review.
- Anomaly Detection: Alerts users when there is a significant surge in mentions of specific technologies (e.g., "WebGPU", "NPU acceleration").
- Semantic Search: Employs a vector database (ChromaDB) to allow searching for trends and news by meaning rather than just keywords.
- Telegram Interface: Simple and effective interaction via Telegram for receiving alerts and querying the latest trends.
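The anomaly detection feature above can be illustrated with a minimal sketch. The README does not describe the actual detection rule, so the function names `count_mentions` and `detect_surges`, the ratio threshold, and the minimum-count floor are all assumptions chosen for illustration: a term is flagged when its mention count in the current window is a multiple of its baseline average.

```python
from collections import Counter
from typing import Iterable


def count_mentions(texts: Iterable[str], terms: Iterable[str]) -> Counter:
    """Count how many texts mention each term (case-insensitive substring match)."""
    counts: Counter = Counter()
    term_list = list(terms)
    for text in texts:
        lowered = text.lower()
        for term in term_list:
            if term.lower() in lowered:
                counts[term] += 1
    return counts


def detect_surges(
    current: Counter,
    baseline_avg: dict[str, float],
    ratio: float = 3.0,      # hypothetical threshold: 3x the baseline
    min_mentions: int = 3,   # hypothetical floor to ignore tiny samples
) -> list[str]:
    """Return terms whose current count is at least `ratio` times the baseline."""
    surging = []
    for term, count in current.items():
        if count < min_mentions:
            continue
        # Floor the baseline at 1.0 so rare or brand-new terms
        # never cause a division by zero.
        base = max(baseline_avg.get(term, 0.0), 1.0)
        if count / base >= ratio:
            surging.append(term)
    return sorted(surging)
```

A real detector would likely compare rolling time windows and might use a statistical test rather than a fixed ratio; the fixed ratio is used here only because it is easy to follow.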
🏗 Architecture
The project follows a modular, agent-based architecture designed around SOLID principles and asynchronous I/O:
- Crawler Agent: Responsible for fetching and parsing data from various sources into standardized DTOs.
- AI Processor Agent: Enriches data by scoring relevance, summarizing content, and detecting technological anomalies using LLMs.
- Vector Storage Agent: Manages persistent storage and semantic retrieval using ChromaDB.
- Telegram Bot Agent: Handles user interaction, command processing (`/start`, `/latest`, `/help`), and notification delivery.
- Orchestrator: Coordinates the flow between crawling, processing, and storage in periodic background iterations.
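One way to picture how these agents fit together is the sketch below. It is not the project's actual code: the `NewsItem` fields, the `Crawler`, `Processor`, and `Storage` protocols, the `run_iteration` function, and the relevance cut-off are hypothetical, and the in-memory fakes stand in for Playwright, Ollama, and ChromaDB.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class NewsItem:
    """Hypothetical DTO passed between agents."""
    title: str
    url: str
    relevance: float = 0.0
    summary: str = ""


class Crawler(Protocol):
    async def fetch(self) -> list[NewsItem]: ...


class Processor(Protocol):
    async def enrich(self, item: NewsItem) -> NewsItem: ...


class Storage(Protocol):
    async def save(self, item: NewsItem) -> None: ...


async def run_iteration(
    crawlers: list[Crawler],
    processor: Processor,
    storage: Storage,
    min_relevance: float = 0.5,  # assumed cut-off, not taken from the project
) -> int:
    """Run one crawl -> enrich -> store pass; return the number of stored items."""
    # Crawl all sources concurrently.
    batches = await asyncio.gather(*(c.fetch() for c in crawlers))
    stored = 0
    for batch in batches:
        for item in batch:
            enriched = await processor.enrich(item)
            if enriched.relevance >= min_relevance:
                await storage.save(enriched)
                stored += 1
    return stored


# In-memory stand-ins so the sketch runs without external services.
class FakeCrawler:
    def __init__(self, items: list[NewsItem]) -> None:
        self._items = items

    async def fetch(self) -> list[NewsItem]:
        return list(self._items)


class KeywordProcessor:
    async def enrich(self, item: NewsItem) -> NewsItem:
        # A real processor would call an LLM; this one only checks a keyword.
        item.relevance = 1.0 if "WebGPU" in item.title else 0.1
        item.summary = item.title
        return item


class MemoryStorage:
    def __init__(self) -> None:
        self.items: list[NewsItem] = []

    async def save(self, item: NewsItem) -> None:
        self.items.append(item)
```

Because each agent is described by a small protocol, the orchestrator depends only on behavior, which is what makes it straightforward to swap in mocks during testing.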
🛠 Tech Stack
- Language: Python 3.12+
- Frameworks: `aiogram` (Telegram Bot), `playwright` (Web Crawling), `pydantic` (Data Validation)
- Database: `ChromaDB` (Vector Store)
- AI/LLM: `Ollama` (local or cloud models)
- Testing: `pytest`, `pytest-asyncio`
- Environment: Docker-ready, `.env` for configuration
📋 Prerequisites
- Python 3.12 or higher
- Ollama installed and running (for AI processing)
- Playwright browsers installed (`playwright install chromium`)
⚙️ Installation & Setup
- Clone the repository:
    git clone https://github.com/your-repo/trend-scout-ai.git
    cd trend-scout-ai
- Create and activate a virtual environment:
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
    playwright install chromium
- Configure environment variables: Create a `.env` file in the root directory:
    TELEGRAM_BOT_TOKEN=your_bot_token_here
    TELEGRAM_CHAT_ID=your_chat_id_here
    OLLAMA_API_URL=http://localhost:11434/api/generate
    CHROMA_DB_PATH=./chroma_db
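At startup the application presumably reads these four values from the environment. The sketch below shows one way to do that with only the standard library. The `Settings` class and `load_settings` function are hypothetical, and the project may well use `pydantic` or a dotenv loader instead. The sketch also assumes the `.env` values have already been exported into the process environment, and that falling back to the example values for the two non-secret variables is acceptable.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    telegram_chat_id: str
    ollama_api_url: str
    chroma_db_path: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing secrets."""
    required = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return Settings(
        telegram_bot_token=env["TELEGRAM_BOT_TOKEN"],
        telegram_chat_id=env["TELEGRAM_CHAT_ID"],
        # Defaulting to the example values is an assumption of this sketch.
        ollama_api_url=env.get(
            "OLLAMA_API_URL", "http://localhost:11434/api/generate"
        ),
        chroma_db_path=env.get("CHROMA_DB_PATH", "./chroma_db"),
    )
```

Passing the environment in as a mapping keeps the function easy to test: a plain dictionary can be supplied instead of the real process environment.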
🏃 Usage
Start the Bot and Background Crawler
To run the full system (bot + periodic crawler):
python -m src.main
Run Manual Update
To trigger a manual crawl and update of the vector store:
python update_chroma_store.py
🧪 Testing
The project maintains high test coverage and follows TDD principles.
Run all tests:
pytest
Run specific test categories:
pytest tests/crawlers/
pytest tests/processor/
pytest tests/storage/
📂 Project Structure
- `src/`: Core application logic.
  - `bot/`: Telegram bot handlers and setup.
  - `crawlers/`: Web scraping modules and factory.
  - `processor/`: LLM integration and prompt logic.
  - `storage/`: Vector database operations.
  - `orchestrator/`: Main service coordination.
- `tests/`: Comprehensive test suite.
- `docs/`: Architecture Decision Records (ADR) and methodology.
- `chroma_db/`: Persistent vector storage (local).
- `requirements.txt`: Python dependencies.