AI-Trend-Scout

Author	SHA1	Message	Date
Artur Mukhamadiev	9daf07b72d	Update Ollama prompt and crawler sources - crawlers.yml: added more Google Scholar topics, removed Habr AI - LLM prompt: removed the C++ trends relation and changed web rendering to web engine	2026-03-16 13:45:20 +03:00
Artur Mukhamadiev	7490970a93	Update Ollama prompt categories to include System Tools and match R&D targets	2026-03-16 13:36:55 +03:00
Artur Mukhamadiev	66399f23ab	Update Ollama prompt to a unified Strategic Tech Scout format with stricter AI penalty	2026-03-16 13:30:28 +03:00
Artur Mukhamadiev	fbdb7d7806	feat(ai): optimize processor for academic content - Add specialized prompt branch for research papers and SOTA detection - Improve Russian summarization quality for technical abstracts - Update relevance scoring to prioritize NPU/Edge AI breakthroughs - Add README.md with project overview	2026-03-16 00:11:19 +03:00
Artur Mukhamadiev	a304ae9cd2	feat(crawler): add academic and research sources - Implement crawlers for Microsoft Research, SciRate, and Google Scholar - Use Playwright with stealth for Google Scholar anti-bot mitigation - Update CrawlerFactory to support new research crawler types - Add unit and integration tests for all academic sources with high coverage	2026-03-16 00:11:15 +03:00
Artur Mukhamadiev	65fccbc614	feat(storage): implement hybrid search and fix async chroma i/o - Add ADR 001 for Hybrid Search Architecture - Implement Phase 1 (Exact Match) and Phase 2 (Semantic Fallback) in ChromaStore - Wrap blocking ChromaDB calls in asyncio.to_thread - Update IVectorStore interface to support category filtering and thresholds - Add comprehensive tests for hybrid search logic	2026-03-16 00:11:07 +03:00
Artur Mukhamadiev	217037f72e	feat(crawlers): convert multiple sources from Playwright to Static/RSS - Added `StaticCrawler` for generic aiohttp+BS4 parsing. - Added `SkolkovoCrawler` for specialized Next.js parsing of sk.ru. - Converted ICRA 2025, RSF, CES 2025, and Telegram Addmeto to `static`. - Converted Horizon Europe to `rss` using its native feed. - Updated `CrawlerFactory` to support new crawler types. - Validated changes with unit tests.	2026-03-15 21:21:14 +03:00
Artur Mukhamadiev	a363ca41cf	feat(crawlers): implement specialized CppConf crawler and AI analysis - Added CppConfCrawler using aiohttp and regex to parse Next.js JSON data, skipping the Playwright bottleneck. - Added C++ specific prompts to OllamaProvider for trend analysis (identifying C++26, memory safety, coroutines). - Created offline pytest fixtures and TDD unit tests for the parser. - Created end-to-end pipeline test mapping Crawler -> AI Processor -> Vector DB.	2026-03-15 20:34:39 +03:00
Artur Mukhamadiev	a0eeba0918	Enhance /hottest command with optional limit	2026-03-15 01:34:33 +03:00
Artur Mukhamadiev	9fdb4b35cd	Implement 'Top Ranked' feature and expand Habr sources	2026-03-15 01:32:25 +03:00
Artur Mukhamadiev	019d9161de	Update crawler selectors and add comprehensive tests	2026-03-15 00:48:27 +03:00
Artur Mukhamadiev	87af585e1b	Refactor crawlers configuration and add new sources - Move hard-coded crawlers from main.py to crawlers.yml - Use CrawlerFactory to load configuration - Add 9 new sources: C++ Russia, ICRA 2025, Technoprom, INNOPROM, Hannover Messe, RSF, Skolkovo, Horizon Europe, Addmeto - Update task list	2026-03-15 00:45:04 +03:00
Artur Mukhamadiev	9c31977e98	[feat] playwright crawler - QA Notes: as always AI generated	2026-03-14 20:13:53 +03:00
Artur Mukhamadiev	4bf7cb4331	[perf] stabilization of previous release	2026-03-13 13:23:30 +03:00
Artur Mukhamadiev	9c8e4c7345	[tg] stats/search features - Processed data is not written back to user	2026-03-13 12:50:49 +03:00
Artur Mukhamadiev	5f093075f7	[ai] mvp generated by gemini	2026-03-13 11:48:37 +03:00

16 Commits
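
Commit 65fccbc614 describes a two-phase hybrid search (exact match first, semantic fallback second) with blocking store calls wrapped in asyncio.to_thread. The following is a minimal sketch of that pattern only, not the project's ChromaStore code: the DOCS list and the exact_match, semantic_search, and hybrid_search functions are hypothetical stand-ins, and a simple word-overlap score takes the place of real embedding similarity.

```python
import asyncio
from typing import List

# Hypothetical in-memory corpus standing in for a vector store collection.
DOCS = ["NPU inference on edge devices", "C++26 coroutines overview"]


def exact_match(query: str) -> List[str]:
    """Phase 1 (blocking): case-insensitive substring match."""
    q = query.lower()
    return [d for d in DOCS if q in d.lower()]


def semantic_search(query: str, threshold: float) -> List[str]:
    """Phase 2 (blocking): placeholder scoring by word overlap.

    A real store would compare embeddings; this is an assumption made
    purely to keep the sketch self-contained.
    """
    q_words = set(query.lower().split())
    results = []
    for d in DOCS:
        d_words = set(d.lower().split())
        score = len(q_words & d_words) / max(len(q_words), 1)
        if score >= threshold:
            results.append(d)
    return results


async def hybrid_search(query: str, threshold: float = 0.5) -> List[str]:
    # Run each blocking call in a worker thread so the event loop stays free.
    hits = await asyncio.to_thread(exact_match, query)
    if hits:
        return hits
    # Fall back to the semantic phase only when phase 1 finds nothing.
    return await asyncio.to_thread(semantic_search, query, threshold)
```

Calling asyncio.run(hybrid_search("coroutines")) resolves in phase 1, while a query such as "edge NPU" has no exact substring match and falls through to the thresholded phase 2. asyncio.to_thread requires Python 3.9 or later.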