Environment Configuration for Thoth
This document describes the environment variables used by the Thoth application.
Required Environment Variables
Python Configuration
PYTHONUNBUFFERED=1 - Disable Python output buffering for real-time logging
PYTHONDONTWRITEBYTECODE=1 - Prevent Python from writing .pyc files
Google Cloud Platform
GCP_PROJECT_ID - Your GCP project ID (e.g., thoth-dev-485501)
GCS_BUCKET_NAME - Name of the GCS bucket for vector DB persistence (e.g., thoth-storage-bucket)
Application Settings
GCS_BUCKET_NAME / local path - LanceDB stores data at gs://bucket/lancedb in the cloud, or in a local directory otherwise
LOG_LEVEL - Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL (default: INFO)
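The storage rule above can be sketched as a small helper. This is only an illustration: the function name resolve_lancedb_uri, the ./lancedb fallback path, and the precedence (bucket first, local path second) are assumptions, not Thoth's actual code.

```python
from typing import Mapping


def resolve_lancedb_uri(env: Mapping[str, str], local_path: str = "./lancedb") -> str:
    """Pick a LanceDB location from the environment (hypothetical helper).

    Assumption: a non-empty GCS_BUCKET_NAME selects cloud storage,
    otherwise the local directory is used.
    """
    bucket = env.get("GCS_BUCKET_NAME", "").strip()
    if bucket:
        # Cloud case: the database lives under the "lancedb" prefix of the bucket.
        return f"gs://{bucket}/lancedb"
    # Local case: fall back to a directory on disk.
    return local_path
```

Passing os.environ as env applies the rule to the current process; passing a plain dict makes the behavior easy to test.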
Optional Environment Variables
GitLab Integration
GITLAB_URL - GitLab instance URL (default: https://gitlab.com)
GITLAB_TOKEN - Personal access token for the GitLab API
GITLAB_PROJECT_ID - GitLab project ID to sync
Repository Configuration
REPO_LOCAL_PATH - Local path for cloning repositories (default: ./handbook_repo)
SYNC_SCHEDULE - Cron schedule for automatic syncing (default: 0 */6 * * *, every 6 hours)
Model Configuration
EMBEDDING_MODEL - Sentence transformer model name (default: all-MiniLM-L6-v2)
CHUNK_SIZE - Document chunk size in characters (default: 1000)
CHUNK_OVERLAP - Chunk overlap in characters (default: 200)
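To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal character-based sliding-window sketch. The function name chunk_text is hypothetical, and Thoth's real chunker may split differently (for example on sentence or heading boundaries); the sketch only illustrates that each new chunk starts CHUNK_SIZE - CHUNK_OVERLAP characters after the previous one.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap (illustrative only)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the documented defaults, a 2000-character document yields windows starting at offsets 0, 800, and 1600, and the last 200 characters of each window repeat at the start of the next.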
GCS Backup
GCS_AUTO_BACKUP - Enable automatic backups to GCS (default: false)
GCS_BACKUP_SCHEDULE - Cron schedule for backups (default: 0 0 * * *, daily at midnight)
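The optional variables and their documented defaults can be gathered in one place, roughly as sketched below. The OptionalSettings class, the load_optional_settings and parse_bool helpers, and the set of accepted "true" spellings are assumptions made for illustration; only the variable names and default values come from the lists above.

```python
from dataclasses import dataclass
from typing import Mapping

# Assumption: these spellings count as "enabled"; the application may accept others.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def parse_bool(raw: str) -> bool:
    """Interpret an environment string as a boolean flag."""
    return raw.strip().lower() in _TRUE_VALUES


@dataclass(frozen=True)
class OptionalSettings:
    gitlab_url: str
    repo_local_path: str
    sync_schedule: str
    embedding_model: str
    chunk_size: int
    chunk_overlap: int
    gcs_auto_backup: bool
    gcs_backup_schedule: str


def load_optional_settings(env: Mapping[str, str]) -> OptionalSettings:
    """Read optional variables, falling back to the documented defaults."""
    return OptionalSettings(
        gitlab_url=env.get("GITLAB_URL", "https://gitlab.com"),
        repo_local_path=env.get("REPO_LOCAL_PATH", "./handbook_repo"),
        sync_schedule=env.get("SYNC_SCHEDULE", "0 */6 * * *"),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        gcs_auto_backup=parse_bool(env.get("GCS_AUTO_BACKUP", "false")),
        gcs_backup_schedule=env.get("GCS_BACKUP_SCHEDULE", "0 0 * * *"),
    )
```

Environment values are always strings, so numeric and boolean settings need an explicit conversion step like the one shown here.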
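Cloud Run Configuration
When deploying to Google Cloud Run, set these environment variables in the service configuration. Note that --set-env-vars replaces the service's entire set of environment variables; use --update-env-vars to change only the listed ones: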
gcloud run services update thoth-mcp-server \
--region=us-central1 \
--set-env-vars="PYTHONUNBUFFERED=1,GCP_PROJECT_ID=thoth-dev-485501,GCS_BUCKET_NAME=thoth-storage-bucket,LOG_LEVEL=INFO"
Or use the Cloud Console:
Navigate to Cloud Run → Select service → Edit & Deploy New Revision
Go to “Variables & Secrets” tab
Add environment variables as key-value pairs
Terraform Configuration
Environment variables are set in infra/cloud_run.tf:
env {
  name  = "GCS_BUCKET_NAME"
  value = google_storage_bucket.thoth_bucket.name
}
To update:
Modify cloud_run.tf
Run terraform plan to preview changes
Run terraform apply to deploy
Local Development
Create a .env file in the project root:
# .env
GCP_PROJECT_ID=thoth-dev-485501
GCS_BUCKET_NAME=thoth-storage-bucket
# LanceDB: local path via --db-path or gs://bucket/lancedb when GCS_BUCKET_NAME set
LOG_LEVEL=DEBUG
GITLAB_TOKEN=your_gitlab_token_here
Load environment variables:
# Using direnv
direnv allow
# Or manually with bash
# (set -a auto-exports every assignment; comment lines in .env are ignored)
set -a; . ./.env; set +a
# Or with Python dotenv
# Already configured in the application
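For readers curious what loading a .env file involves, the following stdlib-only sketch parses KEY=VALUE lines and skips comments. It is a simplified approximation, not the python-dotenv implementation and not Thoth's code; the function name parse_dotenv is hypothetical, and features such as multi-line values or variable expansion are deliberately left out.

```python
from typing import Dict


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # nothing to assign on this line
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of simple quoting.
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

The result can be merged into the process environment with os.environ.update(parse_dotenv(contents)), where contents is the text read from the .env file.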
Service Account Credentials
For local development with GCS:
Create a service account in GCP Console
Grant roles:
roles/storage.objectAdmin - For GCS bucket access
roles/logging.logWriter - For Cloud Logging
Download JSON key file
Set environment variable:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
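Before starting the application locally, it can help to confirm that the variable points at a usable key file. The sketch below is a hypothetical pre-flight check (check_credentials_file is not part of Thoth); it assumes the standard layout of a downloaded service account key, which contains the fields type, client_email, and private_key.

```python
import json
import os
from typing import List, Mapping


def check_credentials_file(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the configured key file (empty if none found)."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"key file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError) as exc:  # unreadable file or invalid JSON
        return [f"key file is not readable JSON: {exc}"]
    if not isinstance(data, dict):
        return ["key file does not contain a JSON object"]

    problems: List[str] = []
    if data.get("type") != "service_account":
        problems.append('expected "type" to be "service_account"')
    for field in ("client_email", "private_key"):
        if not data.get(field):
            problems.append(f"missing field: {field}")
    return problems
```

Calling check_credentials_file(os.environ) and printing the returned list gives a quick diagnosis without contacting GCP.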
For Cloud Run deployment:
Service account is automatically configured via Terraform
Uses workload identity (no key file needed)
Verification
Check environment configuration:
# Run health check
python -c "from thoth.shared.health import health_check_cli; health_check_cli()"
# Verify GCS access
python -c "from google.cloud import storage; client = storage.Client(); print([b.name for b in client.list_buckets()])"
Security Best Practices
Never commit secrets to version control
Use .env files for local development (add to .gitignore)
Use Secret Manager for sensitive values in production
Rotate credentials regularly
Use least-privilege service accounts
Enable audit logging for GCS access
Troubleshooting
GCS Access Issues
# Check credentials
gcloud auth application-default login
# Verify bucket access
gsutil ls gs://thoth-storage-bucket/
Cloud Run Environment
# View current environment variables
gcloud run services describe thoth-mcp-server \
--region=us-central1 \
--format="value(spec.template.spec.containers[0].env)"
# Check logs for startup errors
gcloud logging read "resource.type=cloud_run_revision" --limit=50
Missing Variables
The application logs a warning for each missing optional variable but fails when a required one is missing. Check the logs:
# Local
python -m thoth.mcp.server
# Cloud Run
gcloud logging read "resource.labels.service_name=thoth-mcp-server" --limit=100
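The warn-for-optional, fail-for-required behavior described in this section can be sketched as follows. This is an assumption-laden illustration, not Thoth's startup code: the names validate_environment and MissingEnvironmentError, the logger name thoth.env, and the exact variables placed in each tuple are all hypothetical choices.

```python
import logging
from typing import List, Mapping, Optional

# Assumption: which variables are checked, and in which group, is illustrative.
REQUIRED_VARS = ("GCP_PROJECT_ID", "GCS_BUCKET_NAME")
OPTIONAL_VARS = ("GITLAB_TOKEN", "GITLAB_PROJECT_ID")


class MissingEnvironmentError(RuntimeError):
    """Raised when one or more required variables are unset or empty."""


def validate_environment(
    env: Mapping[str, str],
    logger: Optional[logging.Logger] = None,
) -> List[str]:
    """Warn about missing optional variables; raise if a required one is missing.

    Returns the names of the optional variables that were not set.
    """
    log = logger or logging.getLogger("thoth.env")  # hypothetical logger name
    missing_optional = [name for name in OPTIONAL_VARS if not env.get(name)]
    for name in missing_optional:
        log.warning("optional variable %s is not set", name)

    missing_required = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing_required:
        raise MissingEnvironmentError(
            "missing required variables: " + ", ".join(missing_required)
        )
    return missing_optional
```

Running a check like this at startup reports every missing required variable in a single error, so a misconfigured deployment can be fixed in one pass.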