Update Ollama prompt and crawler sources

- crawlers.yml: added more Google Scholar topics, removed Habr AI
- LLM prompt: removed the C++ Trends references and changed "web rendering"
  to "web engines"
Artur Mukhamadiev 2026-03-16 13:40:46 +03:00
parent 7490970a93
commit 9daf07b72d
2 changed files with 18 additions and 9 deletions

@@ -1,7 +1,4 @@
 crawlers:
-  - type: rss
-    url: "https://habr.com/ru/rss/hubs/artificial_intelligence/articles/?fl=ru"
-    source: "Habr AI"
   - type: rss
     url: "https://www.nature.com/nature.rss"
     source: "Nature"
@@ -102,9 +99,21 @@ crawlers:
     source: "SciRate"
   - type: scholar
     url: "https://scholar.google.com/"
-    source: "Google Scholar"
-    query: "WebGPU machine learning"
+    source: "Google Scholar WebGPU"
+    query: "WebGPU"
   - type: scholar
     url: "https://scholar.google.com/"
-    source: "Google Scholar"
+    source: "Google Scholar NPU"
     query: "NPU acceleration"
+  - type: scholar
+    url: "https://scholar.google.com/"
+    source: "Google Scholar Browsers"
+    query: "Browsers | Lightweight Web Engine"
+  - type: scholar
+    url: "https://scholar.google.com/"
+    source: "Google Scholar Performance"
+    query: "Software Optimization"
+  - type: scholar
+    url: "https://scholar.google.com/"
+    source: "Google Scholar BMI"
+    query: "Brain-machine interface (IoT|Webengine|Linux)"
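
The hunk above gives every Google Scholar entry its own `source` label, where two entries previously shared the label "Google Scholar". A minimal sketch of a sanity check for entries of this shape, assuming they have already been parsed into Python dictionaries (for example by a YAML loader). The helper name `validate_crawlers`, the list of required keys, and the rule that `scholar` entries need a `query` are illustrative assumptions inferred from this diff, not the project's actual loader:

```python
from collections import Counter

# Hypothetical: required keys are inferred from the entries shown in the diff.
REQUIRED_KEYS = ("type", "url", "source")


def validate_crawlers(entries):
    """Return a list of human-readable problems found in crawler entries."""
    problems = []
    for index, entry in enumerate(entries):
        for key in REQUIRED_KEYS:
            if not entry.get(key):
                problems.append(f"entry {index}: missing '{key}'")
        # Assumption: a scholar crawler cannot run without a search query.
        if entry.get("type") == "scholar" and not entry.get("query"):
            problems.append(f"entry {index}: scholar entry needs 'query'")
    # Duplicate source labels make results from different queries
    # indistinguishable downstream.
    counts = Counter(e.get("source") for e in entries if e.get("source"))
    for source, count in counts.items():
        if count > 1:
            problems.append(f"source '{source}' is used {count} times")
    return problems


SCHOLAR_URL = "https://scholar.google.com/"

# The two scholar entries as they were before this commit.
old_entries = [
    {"type": "scholar", "url": SCHOLAR_URL,
     "source": "Google Scholar", "query": "WebGPU machine learning"},
    {"type": "scholar", "url": SCHOLAR_URL,
     "source": "Google Scholar", "query": "NPU acceleration"},
]

# The same two entries after this commit.
new_entries = [
    {"type": "scholar", "url": SCHOLAR_URL,
     "source": "Google Scholar WebGPU", "query": "WebGPU"},
    {"type": "scholar", "url": SCHOLAR_URL,
     "source": "Google Scholar NPU", "query": "NPU acceleration"},
]

print(validate_crawlers(old_entries))
print(validate_crawlers(new_entries))
```

Run against the old entries, the check reports the shared "Google Scholar" label; the renamed entries pass without problems.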

@@ -32,11 +32,11 @@ class OllamaProvider(ILLMProvider):
  "1. 'relevance_score' (integer 0-10): Score the potential impact on our R&D targets.\n"
  "2. 'summary_ru' (string): A concise technical summary in Russian (2-3 sentences). Explain methodology, core innovation, and practical relevance.\n"
  "3. 'anomalies_detected' (list of strings): Identify state-of-the-art (SOTA) breakthroughs, strategic disruptions, new standards, or unexpected results. Return [] if none.\n"
- "4. 'category' (string): Must be exactly one of: 'WebEngines/Browsers', 'System Tools (SWE)', 'Middleware Platforms', 'Cross-Platform', 'SmartTV/IoT', 'Samsung New Technologies', 'C++ Trends', 'Competitors', 'Academic/SOTA', 'Other'.\n\n"
+ "4. 'category' (string): Must be exactly one of: 'WebEngines/Browsers', 'System Tools (SWE)', 'Middleware Platforms', 'Cross-Platform', 'SmartTV/IoT', 'Samsung New Technologies', 'Competitors', 'Academic/SOTA', 'Other'.\n\n"
  "SCORING GUIDELINES ('relevance_score'):\n"
  "Start with a base score:\n"
- "- 9-10 (Core R&D): Breakthroughs in web rendering engines, cross-platform frameworks, modern C++ paradigms relevant to system tools, or SOTA research in web/middleware.\n"
+ "- 9-10 (Core R&D): Breakthroughs in web engines, cross-platform frameworks, system tools, or SOTA research in web engines/middleware.\n"
  "- 7-8 (Ecosystem): Solid improvements applicable to Automotive Content Platforms, IoT ecosystems, SmartTV OS, or major SWE tool improvements.\n"
  "- 4-6 (Peripheral): Theoretical work, general programming news, or technologies with distant industrial application.\n"
  "- 0-3 (Out of Scope): Pure medicine, social sciences, consumer electronics reviews, pure audio/acoustics.\n\n"
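
The prompt fragment in this hunk fixes the shape of the model's reply: four named fields, a closed category list, and base-score bands. A minimal sketch of a checker for such a reply, assuming the model returns a single JSON object with those four fields. `validate_analysis` and `score_tier` are hypothetical names, and how the actual `OllamaProvider` parses responses is not shown in this diff:

```python
import json

# Categories copied from the updated prompt (the list without 'C++ Trends').
ALLOWED_CATEGORIES = {
    "WebEngines/Browsers", "System Tools (SWE)", "Middleware Platforms",
    "Cross-Platform", "SmartTV/IoT", "Samsung New Technologies",
    "Competitors", "Academic/SOTA", "Other",
}


def validate_analysis(raw):
    """Parse a JSON reply and return (data, errors); data is None if unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["reply is not a JSON object"]

    errors = []
    score = data.get("relevance_score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (not isinstance(score, int) or isinstance(score, bool)
            or not 0 <= score <= 10):
        errors.append("relevance_score must be an integer 0-10")
    if not isinstance(data.get("summary_ru"), str):
        errors.append("summary_ru must be a string")
    anomalies = data.get("anomalies_detected")
    if (not isinstance(anomalies, list)
            or not all(isinstance(item, str) for item in anomalies)):
        errors.append("anomalies_detected must be a list of strings")
    category = data.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        errors.append(f"category {category!r} is not allowed")
    return data, errors


def score_tier(score):
    """Map a score to the band names used in the scoring guidelines."""
    if score >= 9:
        return "Core R&D"
    if score >= 7:
        return "Ecosystem"
    if score >= 4:
        return "Peripheral"
    return "Out of Scope"


# A reply that still uses the category this commit removed from the prompt.
stale_reply = json.dumps({
    "relevance_score": 9,
    "summary_ru": "Пример.",
    "anomalies_detected": [],
    "category": "C++ Trends",
})
data, errors = validate_analysis(stale_reply)
print(errors)
print(score_tier(data["relevance_score"]))
```

A check like this would flag replies that still use the removed 'C++ Trends' category, which a model may keep producing if older prompts or cached outputs are in play.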