thoth.ingestion.worker

Ingestion worker HTTP server for Cloud Run.

This module provides the HTTP application with routing to workflow endpoints. All business logic has been extracted to the flows/ package for modularity:

- health.py: Health check endpoint
- clone.py: Clone handbook to GCS
- ingest.py: Main ingestion workflow (file listing, batching, Cloud Tasks)
- batch.py: Batch processing with idempotency checks
- merge.py: Merge isolated batch LanceDB tables into main store
- job_status.py: Job status and listing endpoints

The worker maintains singleton instances for shared services:

- SourceRegistry: Multi-source configuration (handbook, dnd, personal)
- JobManager: Firestore job tracking with sub-job aggregation
- TaskQueueClient: Cloud Tasks batch distribution
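One common way to keep a single shared instance of a service per process is a cached accessor function. The sketch below illustrates that pattern only; the `JobManager` class here is a hypothetical stand-in, the `get_job_manager` name is an assumption, and the real worker may create its singletons differently.

```python
from functools import lru_cache


class JobManager:
    """Hypothetical stand-in for the real Firestore-backed JobManager."""

    def __init__(self) -> None:
        # Assumption: the real class would create its Firestore client here.
        self.jobs: dict[str, str] = {}


@lru_cache(maxsize=1)
def get_job_manager() -> JobManager:
    """Build the JobManager on first call and return the same object afterwards."""
    return JobManager()


first = get_job_manager()
second = get_job_manager()
print(first is second)  # True: both names refer to the one cached instance
```

Because the instance is created lazily, importing the module stays cheap and the expensive client setup happens only when a request first needs the service.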

Functions

configure_root_logger([level, json_output])

Configure the root logger for the application.

create_app()

Create the Starlette application with all routes.

main()

Run the uvicorn server.

setup_logger(name[, level, simple, json_output])

Create and configure a logger with structured JSON output.
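As an illustration of structured JSON logging with the standard library, the sketch below attaches a JSON formatter to a named logger. It is not the actual `setup_logger` implementation: the helper name `make_json_logger`, its `stream` parameter, and the field names in the payload are all assumptions made for the example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "severity": record.levelname,  # assumed field name
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def make_json_logger(name: str, level: int = logging.INFO, stream=None) -> logging.Logger:
    """Hypothetical helper: return a logger that emits JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate output through the root logger
    if not logger.handlers:  # do not stack handlers on repeated calls
        handler = logging.StreamHandler(stream)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_json_logger("thoth.example", stream=buffer)
log.info("batch %d queued", 3)
print(buffer.getvalue().strip())
```

Emitting one JSON object per line lets a log collector parse fields such as the severity without regular expressions.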

Classes

Route(path, endpoint, *[, methods, name, ...])

Starlette([debug, routes, middleware, ...])

Creates a Starlette application.

thoth.ingestion.worker.create_app() → Starlette

Create the Starlette application with all routes.

thoth.ingestion.worker.main() → None

Run the uvicorn server.
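For orientation, here is one way an entry point like this could be written for Cloud Run, which passes the listening port to the container in the `PORT` environment variable. This is a sketch under assumptions, not the module's actual code: `resolve_port`, `example_main`, the 8080 fallback, and the host binding are all choices made for the example.

```python
import os


def resolve_port(default: int = 8080) -> int:
    """Read the listening port from PORT, falling back to `default`."""
    return int(os.environ.get("PORT", default))


def example_main() -> None:
    """Hypothetical entry point: serve the app factory with uvicorn."""
    # Imported here so resolve_port can be used without uvicorn installed.
    import uvicorn

    uvicorn.run(
        "thoth.ingestion.worker:create_app",
        factory=True,  # the import string names a factory, not an app object
        host="0.0.0.0",  # listen on all interfaces inside the container
        port=resolve_port(),
    )


if __name__ == "__main__":
    example_main()
```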