Worker

thoth.ingestion.worker

Ingestion worker HTTP server for Cloud Run.

This module provides the HTTP application with routing to workflow endpoints. All business logic has been extracted to the flows/ package for modularity:

- health.py: Health check endpoint
- clone.py: Clone handbook to GCS
- ingest.py: Main ingestion workflow (file listing, batching, Cloud Tasks)
- batch.py: Batch processing with idempotency checks
- merge.py: Merge isolated batch LanceDB tables into main store
- job_status.py: Job status and listing endpoints

The worker maintains singleton instances for shared services:

- SourceRegistry: Multi-source configuration (handbook, dnd, personal)
- JobManager: Firestore job tracking with sub-job aggregation
- TaskQueueClient: Cloud Tasks batch distribution
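
One common way to keep a single shared instance per process is a cached accessor function. The sketch below is illustrative only: the `JobManager` class here is a hypothetical stand-in, `get_job_manager` is an assumed name, and the real module may build its singletons differently.

```python
from functools import lru_cache


class JobManager:
    """Hypothetical stand-in for the real Firestore-backed JobManager."""

    def __init__(self):
        # The real class would hold a Firestore client; this stub keeps a dict.
        self.jobs = {}


@lru_cache(maxsize=None)
def get_job_manager():
    """Create the JobManager on first call and return the same object afterwards."""
    return JobManager()
```

Because the accessor takes no arguments, `lru_cache` stores exactly one instance, so every request handler in the process sees the same object.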

logger = setup_logger(__name__) module-attribute

BATCH_PREFIX_PATTERN = 'lancedb_batch_' module-attribute
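
The page does not say how this constant is used. A plausible use, given that merge.py combines isolated batch tables, is picking batch tables out of a listing by name. The helper below is a hypothetical sketch under that assumption; `select_batch_tables` is not part of the documented API.

```python
BATCH_PREFIX_PATTERN = "lancedb_batch_"


def select_batch_tables(names):
    """Return the names that start with the batch prefix, in sorted order.

    Assumption: batch tables are named by appending an identifier to the prefix.
    """
    return sorted(name for name in names if name.startswith(BATCH_PREFIX_PATTERN))
```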

create_app() -> Starlette

Create the Starlette application with all routes.

main() -> None

Run the uvicorn server.
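
As a rough sketch of an entry point for Cloud Run, which passes the listening port in the `PORT` environment variable: the helper `resolve_port`, the default of 8080, and the use of uvicorn's factory mode are assumptions here, not a description of the actual implementation.

```python
import os


def resolve_port(environ=os.environ, default=8080):
    """Read the port from PORT, falling back to a default when it is unset."""
    return int(environ.get("PORT", default))


def main() -> None:
    # Imported lazily so the helper above can be used without uvicorn installed.
    import uvicorn

    uvicorn.run(
        "thoth.ingestion.worker:create_app",
        factory=True,  # tells uvicorn to call create_app() to obtain the app
        host="0.0.0.0",
        port=resolve_port(),
    )
```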