3 Commits

Author SHA1 Message Date
a49df98191 fix(tests): QA fixes for test suite verification
:Release Notes:
- Fix AsyncMock usage in mock_sqlite_store fixture (test_chroma_store.py)
- Add GitHubTrendingCrawler to isinstance check (test_factory.py)
- Replace live network calls with mocks (test_new_crawlers.py)

:Detailed Notes:
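- ChromaStore tests failed with TypeError because the mock_sqlite_store fixture used a synchronous MagicMock where an AsyncMock was needed
- GitHubTrendingCrawler was missing from the allowed types in the isinstance check, causing an AssertionError
- Live crawler tests failed whenever the network was unreliable; they now use mocks instead of real requests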

:Testing Performed:
- python3 -m pytest tests/ -v (112 passed, 0 failed)

:QA Notes:
- All 112 tests passed after fixes
- Verified by Python QA Engineer subagent

:Issues Addressed:
- TypeError: 'list' object can't be awaited
- AssertionError: GitHubTrendingCrawler not in allowed types
- Live network tests flaky/failing

Change-Id: I3c77a186b5fcca6778c7bbb102c50bc6951bb37a
2026-03-30 13:54:53 +03:00
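
The AsyncMock fix described in the commit above can be illustrated with a minimal sketch. The names `load_titles` and `fetch_all` are hypothetical and are not taken from test_chroma_store.py; the sketch only shows why awaiting the result of a plain MagicMock raises TypeError and how AsyncMock avoids it.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def load_titles(store):
    # Hypothetical consumer: awaits an async method on the store.
    rows = await store.fetch_all()
    return [row["title"] for row in rows]


def run_with(store):
    # Drive the coroutine to completion on a fresh event loop.
    return asyncio.run(load_titles(store))


# A plain MagicMock returns the list itself, so `await` gets a
# non-awaitable object and raises TypeError.
sync_store = MagicMock()
sync_store.fetch_all.return_value = [{"title": "a"}]
try:
    run_with(sync_store)
    sync_error = None
except TypeError as exc:
    sync_error = exc

# AsyncMock returns a coroutine that resolves to return_value,
# so the same consumer code works unchanged.
async_store = MagicMock()
async_store.fetch_all = AsyncMock(return_value=[{"title": "a"}])
titles = run_with(async_store)
```

The same pattern also removes live network access from a test: the awaited call is served entirely by the mock, so no request leaves the process.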
a304ae9cd2 feat(crawler): add academic and research sources
- Implement crawlers for Microsoft Research, SciRate, and Google Scholar
- Use Playwright with stealth for Google Scholar anti-bot mitigation
- Update CrawlerFactory to support new research crawler types
- Add unit and integration tests for all academic sources with high coverage
2026-03-16 00:11:15 +03:00
87af585e1b Refactor crawler configuration and add new sources
- Move hard-coded crawlers from main.py to crawlers.yml
- Use CrawlerFactory to load configuration
- Add 9 new sources: C++ Russia, ICRA 2025, Technoprom, INNOPROM, Hannover Messe, RSF, Skolkovo, Horizon Europe, Addmeto
- Update task list
2026-03-15 00:45:04 +03:00
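
Moving hard-coded crawlers into a configuration file read by a factory, as the commit above describes, might look roughly like the following sketch. Everything here is an assumption for illustration: the class names `RSSCrawler` and `PageCrawler`, the `type`/`name`/`url` fields, and the `from_config` method are hypothetical, and the real CrawlerFactory and crawlers.yml layout may differ. The config is shown as the already-parsed list that a YAML loader would typically return, so the example needs no third-party package.

```python
from dataclasses import dataclass


@dataclass
class RSSCrawler:
    # Hypothetical crawler type, for illustration only.
    name: str
    url: str


@dataclass
class PageCrawler:
    # Hypothetical crawler type, for illustration only.
    name: str
    url: str


class CrawlerFactory:
    # Assumed registry mapping the config "type" field to a class.
    _registry = {"rss": RSSCrawler, "page": PageCrawler}

    @classmethod
    def from_config(cls, entries):
        crawlers = []
        for entry in entries:
            params = dict(entry)  # copy so the caller's config is not mutated
            kind = params.pop("type")
            try:
                crawler_cls = cls._registry[kind]
            except KeyError:
                raise ValueError(f"unknown crawler type: {kind!r}") from None
            crawlers.append(crawler_cls(**params))
        return crawlers


# Placeholder entries in an assumed layout; not the real crawlers.yml.
config = [
    {"type": "rss", "name": "example-feed", "url": "https://example.com/feed"},
    {"type": "page", "name": "example-page", "url": "https://example.com/news"},
]
crawlers = CrawlerFactory.from_config(config)
```

With a layout like this, adding a source becomes a config change rather than an edit to main.py, and an unknown `type` fails early with a clear error.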