Worker
thoth.ingestion.worker
Ingestion worker HTTP server for Cloud Run.
This module provides the HTTP application with routing to workflow endpoints. All business logic has been extracted to the flows/ package for modularity:

- health.py: Health check endpoint
- clone.py: Clone handbook to GCS
- ingest.py: Main ingestion workflow (file listing, batching, Cloud Tasks)
- batch.py: Batch processing with idempotency checks
- merge.py: Merge isolated batch LanceDB tables into main store
- job_status.py: Job status and listing endpoints
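
The split between a thin routing layer and the flow modules can be illustrated with a minimal sketch. It is not the worker's actual code: the URL paths, the handler names (`handle_health`, `handle_ingest`), the `ROUTES` table, and the response shapes are all assumptions made for illustration, and no web framework is implied.

```python
from typing import Callable, Dict, Optional, Tuple

# A response is modeled as (status_code, json_body). Hypothetical shape.
Response = Tuple[int, dict]
Handler = Callable[[dict], Response]


def handle_health(payload: dict) -> Response:
    # Stand-in for a handler that would live in flows/health.py.
    return 200, {"status": "healthy"}


def handle_ingest(payload: dict) -> Response:
    # Stand-in for flows/ingest.py; the real flow lists files,
    # builds batches, and enqueues Cloud Tasks.
    source = payload.get("source", "handbook")
    return 202, {"accepted": True, "source": source}


# The routing table holds no business logic; it only maps
# (method, path) to a flow handler. Paths are assumed, not documented.
ROUTES: Dict[Tuple[str, str], Handler] = {
    ("GET", "/health"): handle_health,
    ("POST", "/ingest"): handle_ingest,
}


def dispatch(method: str, path: str, payload: Optional[dict] = None) -> Response:
    """Look up the handler for a request and call it, or return 404."""
    handler = ROUTES.get((method.upper(), path))
    if handler is None:
        return 404, {"error": f"no route for {method} {path}"}
    return handler(payload or {})
```

With this layout, adding an endpoint means writing one handler in its own module and registering one entry in the table, which is the modularity benefit the module description refers to.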
The worker maintains singleton instances for shared services:

- SourceRegistry: Multi-source configuration (handbook, dnd, personal)
- JobManager: Firestore job tracking with sub-job aggregation
- TaskQueueClient: Cloud Tasks batch distribution
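
One common way to keep a single shared instance per process is a cached accessor function. The sketch below assumes that approach; how the worker actually creates its singletons is not stated here. The `JobManager` class is a simplified in-memory stand-in for the Firestore-backed one, and `get_job_manager` is a hypothetical name.

```python
from functools import lru_cache
from typing import Dict


class JobManager:
    """In-memory stand-in for the Firestore-backed job tracker (hypothetical)."""

    def __init__(self) -> None:
        self.jobs: Dict[str, str] = {}

    def set_status(self, job_id: str, status: str) -> None:
        self.jobs[job_id] = status


@lru_cache(maxsize=1)
def get_job_manager() -> JobManager:
    # The body runs only on the first call; later calls return the
    # cached instance, so every request handler shares the same object.
    return JobManager()
```

Creating the instance lazily avoids opening client connections at import time, which can help keep container startup fast. In tests, `get_job_manager.cache_clear()` resets the cached instance.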