thoth.ingestion.gitlab_api

GitLab API client with rate limiting, caching, and error handling.

Functions

get_secret_manager()

Return the global SecretManagerClient singleton, creating it if needed.

quote()

Each part of a URL, e.g. the path info, the query, etc., has a different set of reserved characters that must be quoted.

setup_logger(name[, level, simple, json_output])

Create and configure a logger with structured JSON output.

Classes

Any(*args, **kwargs)

Special type indicating an unconstrained type.

CacheEntry(data[, ttl])

Represents a cached API response.

GitLabAPIClient([token, base_url, timeout, ...])

GitLab API client with rate limiting, caching, and error handling.

HTTPAdapter([pool_connections, ...])

The built-in HTTP Adapter for urllib3.

Retry([total, connect, read, redirect, ...])

Retry configuration.

datetime(year, month, day[, hour[, minute[, ...)

The year, month and day arguments are required.

timedelta

Difference between two datetime values.

Exceptions

GitLabAPIError

Raised for GitLab API errors.

RateLimitError

Raised when rate limit is exceeded.

exception thoth.ingestion.gitlab_api.RateLimitError

Bases: Exception

Raised when rate limit is exceeded.

exception thoth.ingestion.gitlab_api.GitLabAPIError

Bases: Exception

Raised for GitLab API errors.
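A caller will usually want to treat the two exceptions differently: wait and retry on RateLimitError, and let GitLabAPIError propagate. The sketch below illustrates that pattern with a local stand-in class so that it runs without the thoth package. The helper call_with_rate_limit_retry, its parameters, and the linear wait schedule are hypothetical and are not part of this module.

```python
import time


class RateLimitError(Exception):
    """Local stand-in mirroring the documented exception (assumption)."""


def call_with_rate_limit_retry(func, attempts=3, wait_seconds=1.0, sleep=time.sleep):
    """Call func(), retrying only when RateLimitError is raised.

    Any other exception (for example GitLabAPIError) is not caught and
    propagates to the caller unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == attempts:
                raise  # Out of attempts: surface the rate limit error.
            # Hypothetical linear wait: 1x, 2x, ... the base delay.
            sleep(wait_seconds * attempt)
```

With the real client, func could be something like lambda: client.get_project("42").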

class thoth.ingestion.gitlab_api.CacheEntry(data: Any, ttl: int = 300)

Bases: object

Represents a cached API response.

__init__(data: Any, ttl: int = 300)

Initialize cache entry.

Parameters:
  • data – Data to cache

  • ttl – Time to live in seconds

is_expired() → bool

Check if cache entry is expired.
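As an illustration of how a TTL-based entry can work, here is a minimal sketch. It is not the library's source: the class name CacheEntrySketch and the created_at attribute are hypothetical, and the real CacheEntry may well use datetime arithmetic rather than a monotonic clock.

```python
import time
from typing import Any


class CacheEntrySketch:
    """Hypothetical stand-in for CacheEntry, for illustration only."""

    def __init__(self, data: Any, ttl: int = 300) -> None:
        self.data = data
        self.ttl = ttl
        # A monotonic clock is unaffected by wall-clock adjustments.
        self.created_at = time.monotonic()

    def is_expired(self) -> bool:
        # Expired once at least ttl seconds have elapsed since creation.
        return time.monotonic() - self.created_at >= self.ttl


fresh = CacheEntrySketch({"id": 1}, ttl=300)
stale = CacheEntrySketch({"id": 2}, ttl=0)
```

In this sketch a ttl of 0 makes the entry expire immediately, which is one simple way to bypass caching for a single value.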

class thoth.ingestion.gitlab_api.GitLabAPIClient(token: str | None = None, base_url: str = 'https://gitlab.com/api/v4', timeout: int = 30, max_retries: int = 3, backoff_factor: float = 2, logger: Logger | None = None)

Bases: object

GitLab API client with rate limiting, caching, and error handling.

__init__(token: str | None = None, base_url: str = 'https://gitlab.com/api/v4', timeout: int = 30, max_retries: int = 3, backoff_factor: float = 2, logger: Logger | None = None)

Initialize GitLab API client.

Parameters:
  • token – GitLab personal access token. If not provided, it is read from Secret Manager or the GITLAB_TOKEN environment variable.

  • base_url – Base URL for the GitLab API. If not provided, it is read from Secret Manager or the GITLAB_BASE_URL environment variable.

  • timeout – Request timeout in seconds

  • max_retries – Maximum number of retries for failed requests

  • backoff_factor – Backoff factor for exponential backoff

  • logger – Logger instance
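To make the token fallback and the max_retries / backoff_factor pair more concrete, here is a minimal sketch under stated assumptions. The helpers resolve_token and backoff_delays are hypothetical. The Secret Manager lookup is left out, and the delay formula (backoff_factor * 2**attempt) is only a common convention; the actual schedule depends on the underlying retry implementation.

```python
import os
from typing import List, Optional


def resolve_token(explicit: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: prefer the explicit argument, then GITLAB_TOKEN.

    The Secret Manager lookup mentioned in the parameter description is
    omitted here; where it would sit in the order is not documented.
    """
    if explicit:
        return explicit
    return os.environ.get("GITLAB_TOKEN")


def backoff_delays(max_retries: int = 3, backoff_factor: float = 2) -> List[float]:
    """Assumed schedule: backoff_factor * 2**attempt for each retry."""
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]
```

Under these assumptions the defaults would give waits of 2, 4, and 8 seconds across the three retries.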

clear_cache() → None

Clear all cached data.

get(endpoint: str, params: dict | None = None, use_cache: bool = True, cache_ttl: int = 300) → Any

Make GET request.

Parameters:
  • endpoint – API endpoint

  • params – Query parameters

  • use_cache – Whether to use caching

  • cache_ttl – Cache time to live in seconds

Returns:

Response data
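The interaction between use_cache and cache_ttl can be sketched as follows. This is an illustration only: cache_key, cached_get, the tuple-based cache layout, and the fetch callable are all hypothetical, and the real client may build its keys and store its entries differently.

```python
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


def cache_key(endpoint: str, params: Optional[dict] = None) -> str:
    """Hypothetical key: endpoint plus params serialized in sorted order."""
    return f"{endpoint}?{json.dumps(params or {}, sort_keys=True)}"


def cached_get(
    cache: Dict[str, Tuple[float, Any]],
    fetch: Callable[[str, Optional[dict]], Any],
    endpoint: str,
    params: Optional[dict] = None,
    use_cache: bool = True,
    cache_ttl: int = 300,
) -> Any:
    key = cache_key(endpoint, params)
    now = time.monotonic()
    if use_cache and key in cache:
        expires_at, data = cache[key]
        if now < expires_at:
            return data  # Fresh hit: no request is made.
    data = fetch(endpoint, params)
    if use_cache:
        # Store the response together with its expiry time.
        cache[key] = (now + cache_ttl, data)
    return data
```

Sorting the parameters before serializing them means that two calls with the same parameters in a different order share one cache entry.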

post(endpoint: str, data: dict | None = None) → Any

Make POST request.

Parameters:
  • endpoint – API endpoint

  • data – Request body data

Returns:

Response data

put(endpoint: str, data: dict | None = None) → Any

Make PUT request.

Parameters:
  • endpoint – API endpoint

  • data – Request body data

Returns:

Response data

delete(endpoint: str) → Any

Make DELETE request.

Parameters:
  • endpoint – API endpoint

Returns:

Response data

get_project(project_id: str, use_cache: bool = True) → dict[str, Any]

Get project details.

Parameters:
  • project_id – Project ID or URL-encoded path

  • use_cache – Whether to use caching

Returns:

Project data
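When a project is addressed by its namespace path rather than its numeric ID, the slash in the path has to be percent-encoded. A minimal sketch using the standard library follows; the helper name encode_project_id is hypothetical, and it is an assumption that the client expects the caller to do this encoding rather than doing it internally.

```python
from typing import Union
from urllib.parse import quote


def encode_project_id(project: Union[int, str]) -> str:
    """Percent-encode a project path; numeric IDs pass through unchanged."""
    # safe="" forces "/" to be encoded as %2F, which quote() leaves
    # untouched by default.
    return quote(str(project), safe="")


encoded = encode_project_id("my-group/my-project")
```

The result could then be passed as project_id, for example client.get_project(encoded).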

list_projects(params: dict | None = None, use_cache: bool = True) → list[dict[str, Any]]

List projects.

Parameters:
  • params – Query parameters (e.g., {'per_page': 100, 'page': 1})

  • use_cache – Whether to use caching

Returns:

List of projects
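Since per_page and page are ordinary query parameters, collecting every project means looping over pages. The generator below is one way to do that. It assumes each call returns a single page and that a short or empty page marks the end. iter_all_pages and the list_page callable are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterator, List


def iter_all_pages(
    list_page: Callable[[Dict[str, int]], List[Dict[str, Any]]],
    per_page: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield items page by page until a short or empty page is returned."""
    page = 1
    while True:
        items = list_page({"per_page": per_page, "page": page})
        if not items:
            return
        yield from items
        if len(items) < per_page:
            return  # A short page means there is nothing left to fetch.
        page += 1
```

With the real client this might be called as iter_all_pages(lambda p: client.list_projects(params=p)).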

get_repository_tree(project_id: str, path: str = '', ref: str = 'main', recursive: bool = False, use_cache: bool = True) → list[dict[str, Any]]

Get repository tree.

Parameters:
  • project_id – Project ID or URL-encoded path

  • path – Path inside repository

  • ref – Branch/tag name

  • recursive – Get tree recursively

  • use_cache – Whether to use caching

Returns:

List of repository tree items

get_file(project_id: str, file_path: str, ref: str = 'main', use_cache: bool = True) → dict[str, Any]

Get file content from repository.

Parameters:
  • project_id – Project ID or URL-encoded path

  • file_path – Path to file in repository

  • ref – Branch/tag name

  • use_cache – Whether to use caching

Returns:

File data including content
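The GitLab repository files endpoint returns file content base64-encoded, with an accompanying encoding field. Assuming get_file passes that payload through unchanged (not confirmed by this page), the content can be decoded as in the sketch below. decode_file_content is a hypothetical helper, and the sample dictionary is made up for illustration.

```python
import base64
from typing import Any, Dict


def decode_file_content(file_data: Dict[str, Any]) -> str:
    """Return the file text, decoding base64 when the payload says so."""
    content = file_data.get("content", "")
    if file_data.get("encoding") == "base64":
        return base64.b64decode(content).decode("utf-8")
    return content


# Made-up payload shaped like a repository files response.
sample = {
    "file_path": "README.md",
    "encoding": "base64",
    "content": base64.b64encode(b"# Thoth\n").decode("ascii"),
}
text = decode_file_content(sample)
```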

get_commits(project_id: str, ref: str = 'main', since: str | None = None, until: str | None = None, path: str | None = None, use_cache: bool = True) → list[dict[str, Any]]

Get commits for a project.

Parameters:
  • project_id – Project ID or URL-encoded path

  • ref – Branch/tag name

  • since – Only commits after this date (ISO 8601 format)

  • until – Only commits before this date (ISO 8601 format)

  • path – Only commits that include this file path

  • use_cache – Whether to use caching

Returns:

List of commits
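The since and until arguments are ISO 8601 strings, which datetime and timedelta can produce directly. A minimal sketch follows; commit_window is a hypothetical helper, and using timezone-aware UTC timestamps is a choice made here rather than a documented requirement.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def commit_window(days: int, now: Optional[datetime] = None) -> Dict[str, str]:
    """Build since/until strings covering the last `days` days."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return {"since": start.isoformat(), "until": end.isoformat()}


fixed = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
window = commit_window(7, now=fixed)
```

The values could then be passed along, for example client.get_commits("42", since=window["since"], until=window["until"]).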

get_commit(project_id: str, commit_sha: str, use_cache: bool = True) → dict[str, Any]

Get a single commit.

Parameters:
  • project_id – Project ID or URL-encoded path

  • commit_sha – Commit SHA

  • use_cache – Whether to use caching

Returns:

Commit data

get_commit_diff(project_id: str, commit_sha: str, use_cache: bool = True) → list[dict[str, Any]]

Get diff of a commit.

Parameters:
  • project_id – Project ID or URL-encoded path

  • commit_sha – Commit SHA

  • use_cache – Whether to use caching

Returns:

List of diffs

list_branches(project_id: str, use_cache: bool = True) → list[dict[str, Any]]

List branches.

Parameters:
  • project_id – Project ID or URL-encoded path

  • use_cache – Whether to use caching

Returns:

List of branches

get_branch(project_id: str, branch: str, use_cache: bool = True) → dict[str, Any]

Get branch details.

Parameters:
  • project_id – Project ID or URL-encoded path

  • branch – Branch name

  • use_cache – Whether to use caching

Returns:

Branch data

list_merge_requests(project_id: str, state: str = 'opened', params: dict | None = None, use_cache: bool = True) → list[dict[str, Any]]

List merge requests.

Parameters:
  • project_id – Project ID or URL-encoded path

  • state – State filter ('opened', 'closed', 'merged', 'all')

  • params – Additional query parameters

  • use_cache – Whether to use caching

Returns:

List of merge requests

get_merge_request(project_id: str, mr_iid: int, use_cache: bool = True) → dict[str, Any]

Get merge request details.

Parameters:
  • project_id – Project ID or URL-encoded path

  • mr_iid – Merge request IID

  • use_cache – Whether to use caching

Returns:

Merge request data

get_current_user(use_cache: bool = True) → dict[str, Any]

Get current authenticated user.

Parameters:
  • use_cache – Whether to use caching

Returns:

User data

Raises:

GitLabAPIError – If not authenticated

get_user(user_id: int, use_cache: bool = True) → dict[str, Any]

Get user details.

Parameters:
  • user_id – User ID

  • use_cache – Whether to use caching

Returns:

User data

get_rate_limit_info() → dict[str, Any]

Get current rate limit information.

Returns:

Dictionary with rate limit info
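The shape of the returned dictionary is not documented here. As a rough illustration of where such information can come from, the sketch below reads the RateLimit-* response headers that GitLab sends. The function parse_rate_limit and the keys of its result are hypothetical and may not match what get_rate_limit_info actually returns.

```python
from typing import Dict, Mapping, Optional


def parse_rate_limit(headers: Mapping[str, str]) -> Dict[str, Optional[int]]:
    """Extract rate limit counters from response headers, if present."""

    def as_int(name: str) -> Optional[int]:
        value = headers.get(name)
        # Missing or non-numeric headers are reported as None.
        return int(value) if value is not None and value.isdigit() else None

    return {
        "limit": as_int("RateLimit-Limit"),
        "remaining": as_int("RateLimit-Remaining"),
        "reset": as_int("RateLimit-Reset"),  # Unix timestamp of the reset
    }


# Made-up header values for illustration.
info = parse_rate_limit(
    {
        "RateLimit-Limit": "600",
        "RateLimit-Remaining": "598",
        "RateLimit-Reset": "1700000000",
    }
)
```

A caller could compare the remaining count against a threshold and pause until the reset time before sending further requests.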