Ingest

thoth.ingestion.flows.ingest

Main ingestion workflow.

Handles the /ingest endpoint, which:

1. Lists files from GCS
2. Creates sub-jobs for each batch
3. Enqueues batches to Cloud Tasks for parallel processing
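
The batching step can be illustrated with a minimal sketch. It only shows how a file listing might be split into sub-jobs; the `SubJob` record, the `chunk` and `plan_sub_jobs` helpers, the sub-job ID format, and the batch size are all hypothetical and not taken from thoth. The GCS listing and the Cloud Tasks enqueue are left out on purpose.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence


@dataclass(frozen=True)
class SubJob:
    """Hypothetical record for one batch; not thoth's actual schema."""

    parent_job_id: str
    sub_job_id: str
    files: tuple[str, ...]


def chunk(files: Sequence[str], batch_size: int) -> Iterator[tuple[str, ...]]:
    """Yield consecutive slices of at most batch_size files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield tuple(files[start:start + batch_size])


def plan_sub_jobs(job_id: str, files: Sequence[str], batch_size: int = 2) -> list[SubJob]:
    """Create one sub-job per batch; each would then be enqueued separately."""
    return [
        SubJob(job_id, f"{job_id}-batch-{index}", batch)
        for index, batch in enumerate(chunk(files, batch_size))
    ]
```

With three files and a batch size of 2, this plans two sub-jobs: one holding the first two files and one holding the remaining file.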

logger = setup_logger(__name__) module-attribute

ingest(request: Request) -> JSONResponse async

Start an ingestion job.

Creates a job record and starts background processing. Returns immediately with job_id for status tracking.

Request body

source: Source identifier (required); one of 'handbook', 'dnd', or 'personal'

force: Force full re-ingestion (optional, default: false)
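
As a rough sketch of how a body with these two fields could be validated once it has been decoded from JSON: the `parse_ingest_body` helper and its error messages are hypothetical, and rejecting non-boolean `force` values is an assumption rather than documented behavior.

```python
# The three source identifiers listed in the request body description.
ALLOWED_SOURCES = {"handbook", "dnd", "personal"}


def parse_ingest_body(body: dict) -> tuple[str, bool]:
    """Return (source, force) from a decoded JSON body, or raise ValueError."""
    source = body.get("source")
    if not isinstance(source, str) or source not in ALLOWED_SOURCES:
        raise ValueError(
            f"source must be one of {sorted(ALLOWED_SOURCES)}, got {source!r}"
        )
    force = body.get("force", False)  # optional; defaults to false
    if not isinstance(force, bool):
        raise ValueError("force must be a boolean")
    return source, force
```

For example, `parse_ingest_body({"source": "dnd"})` yields `("dnd", False)`, while a missing or unknown source raises `ValueError`.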

Returns:

JSONResponse: 202 Accepted with job_id for status polling