GitLab API

thoth.ingestion.gitlab_api

GitLab API client with rate limiting, caching, and error handling.

DEFAULT_BASE_URL = 'https://gitlab.com/api/v4' module-attribute

DEFAULT_TIMEOUT = 30 module-attribute

DEFAULT_MAX_RETRIES = 3 module-attribute

DEFAULT_BACKOFF_FACTOR = 2 module-attribute

CACHE_DEFAULT_TTL = 300 module-attribute

RATE_LIMIT_MARGIN = 10 module-attribute

MSG_AUTH_REQUIRED = 'Authentication token required for this operation' module-attribute

MSG_RATE_LIMIT_EXCEEDED = 'Rate limit exceeded. Waiting {wait_time}s' module-attribute

MSG_REQUEST_FAILED = 'Request failed after {attempts} attempts: {error}' module-attribute

MSG_INVALID_RESPONSE = 'Invalid response from GitLab API: {error}' module-attribute
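The message constants that contain `{...}` placeholders read as ordinary `str.format` templates. The sketch below copies the template strings from the values listed above and fills them with made-up sample values. How the module itself formats them is not shown in this reference, so treat this as an illustration only.

```python
# Template strings copied from the module attributes documented above.
MSG_RATE_LIMIT_EXCEEDED = "Rate limit exceeded. Waiting {wait_time}s"
MSG_REQUEST_FAILED = "Request failed after {attempts} attempts: {error}"
MSG_INVALID_RESPONSE = "Invalid response from GitLab API: {error}"

# The argument values below are illustrative only.
rate_msg = MSG_RATE_LIMIT_EXCEEDED.format(wait_time=5)
failed_msg = MSG_REQUEST_FAILED.format(attempts=3, error="connection reset")
invalid_msg = MSG_INVALID_RESPONSE.format(error="expected JSON")

print(rate_msg)
print(failed_msg)
print(invalid_msg)
```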

RateLimitError

Raised when the rate limit is exceeded.

GitLabAPIError

Raised for GitLab API errors.

CacheEntry

Represents a cached API response.

data = data instance-attribute

expires_at = datetime.now(tz=UTC) + timedelta(seconds=ttl) instance-attribute

__init__(data: Any, ttl: int = CACHE_DEFAULT_TTL)

Initialize cache entry.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `data` | `Any` | Data to cache | required |
| `ttl` | `int` | Time to live in seconds | `CACHE_DEFAULT_TTL` |

is_expired() -> bool

Check whether the cache entry has expired.
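A minimal sketch of how a TTL-based entry like this can behave. `CacheEntrySketch` is a hypothetical stand-in, not the library's source. It reuses the documented `expires_at` expression and assumes that an entry counts as expired once the current UTC time reaches `expires_at`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any

CACHE_DEFAULT_TTL = 300  # seconds, as documented above


class CacheEntrySketch:
    """Hypothetical stand-in for CacheEntry."""

    def __init__(self, data: Any, ttl: int = CACHE_DEFAULT_TTL) -> None:
        self.data = data
        # Mirrors the documented expires_at expression
        # (timezone.utc is equivalent to datetime.UTC).
        self.expires_at = datetime.now(tz=timezone.utc) + timedelta(seconds=ttl)

    def is_expired(self) -> bool:
        # Assumption: expired once "now" reaches expires_at.
        return datetime.now(tz=timezone.utc) >= self.expires_at


fresh = CacheEntrySketch({"id": 1})          # default TTL of 300 seconds
stale = CacheEntrySketch({"id": 2}, ttl=-1)  # negative TTL: already expired
print(fresh.is_expired(), stale.is_expired())
```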

GitLabAPIClient

GitLab API client with rate limiting, caching, and error handling.

token = token instance-attribute

base_url = base_url instance-attribute

timeout = timeout instance-attribute

logger = logger or setup_logger(__name__) instance-attribute

session = requests.Session() instance-attribute

__init__(token: str | None = None, base_url: str = DEFAULT_BASE_URL, timeout: int = DEFAULT_TIMEOUT, max_retries: int = DEFAULT_MAX_RETRIES, backoff_factor: float = DEFAULT_BACKOFF_FACTOR, logger: logging.Logger | None = None)

Initialize GitLab API client.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `token` | `str \| None` | GitLab personal access token. If not provided, the client tries to read it from Secret Manager or the `GITLAB_TOKEN` environment variable. | `None` |
| `base_url` | `str` | Base URL for the GitLab API. If not provided, the client tries to read it from Secret Manager or the `GITLAB_BASE_URL` environment variable. | `DEFAULT_BASE_URL` |
| `timeout` | `int` | Request timeout in seconds | `DEFAULT_TIMEOUT` |
| `max_retries` | `int` | Maximum number of retries for failed requests | `DEFAULT_MAX_RETRIES` |
| `backoff_factor` | `float` | Backoff factor for exponential backoff | `DEFAULT_BACKOFF_FACTOR` |
| `logger` | `Logger \| None` | Logger instance | `None` |
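To make `max_retries` and `backoff_factor` concrete, here is one common way exponential backoff delays are computed. The formula `backoff_factor ** attempt` (with attempts counted from 0) and the helper name `backoff_delays` are assumptions for illustration. This reference does not state the exact formula the client uses.

```python
DEFAULT_MAX_RETRIES = 3      # documented default
DEFAULT_BACKOFF_FACTOR = 2   # documented default


def backoff_delays(max_retries: int = DEFAULT_MAX_RETRIES,
                   backoff_factor: float = DEFAULT_BACKOFF_FACTOR) -> list:
    """Return the wait in seconds before each retry.

    Assumption: delay = backoff_factor ** attempt, attempt counted from 0.
    """
    return [float(backoff_factor ** attempt) for attempt in range(max_retries)]


print(backoff_delays())  # [1.0, 2.0, 4.0]
```

Under this assumption, the defaults give a worst-case wait of 7 seconds across the three retries, on top of the per-request `timeout`.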

clear_cache() -> None

Clear all cached data.

get(endpoint: str, params: dict | None = None, use_cache: bool = True, cache_ttl: int = CACHE_DEFAULT_TTL) -> Any

Make a GET request.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `endpoint` | `str` | API endpoint | required |
| `params` | `dict \| None` | Query parameters | `None` |
| `use_cache` | `bool` | Whether to use caching | `True` |
| `cache_ttl` | `int` | Cache time to live in seconds | `CACHE_DEFAULT_TTL` |

Returns:

| Type | Description |
| --- | --- |
| `Any` | Response data |
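The interaction between `use_cache`, `cache_ttl`, and `clear_cache()` can be pictured with a small read-through cache. Everything in this sketch is hypothetical: the class `TTLCacheSketch`, the cache key (endpoint plus sorted query parameters), and the injected `fetch` callable are assumptions, since this reference does not describe how the client builds its cache keys.

```python
import time
from typing import Any, Callable, Optional

CACHE_DEFAULT_TTL = 300  # seconds, as documented above


class TTLCacheSketch:
    """Hypothetical read-through cache keyed by endpoint and query parameters."""

    def __init__(self, fetch: Callable[[str, dict], Any]) -> None:
        self._fetch = fetch
        self._entries = {}  # key -> (expiry timestamp, data)

    def get(self, endpoint: str, params: Optional[dict] = None,
            use_cache: bool = True, cache_ttl: int = CACHE_DEFAULT_TTL) -> Any:
        params = params or {}
        # Assumption: identical endpoint + params means identical response.
        key = (endpoint, repr(sorted(params.items())))
        now = time.monotonic()
        if use_cache and key in self._entries:
            expires_at, data = self._entries[key]
            if now < expires_at:
                return data  # cache hit, no request made
        data = self._fetch(endpoint, params)
        if use_cache:
            self._entries[key] = (now + cache_ttl, data)
        return data

    def clear_cache(self) -> None:
        self._entries.clear()


calls = []

def fake_fetch(endpoint: str, params: dict) -> dict:
    # Stub in place of a real HTTP request.
    calls.append(endpoint)
    return {"endpoint": endpoint, "params": params}


cache = TTLCacheSketch(fake_fetch)
cache.get("/projects", {"per_page": 100})
cache.get("/projects", {"per_page": 100})                   # served from cache
cache.get("/projects", {"per_page": 100}, use_cache=False)  # bypasses the cache
print(len(calls))  # 2
```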

post(endpoint: str, data: dict | None = None) -> Any

Make a POST request.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `endpoint` | `str` | API endpoint | required |
| `data` | `dict \| None` | Request body data | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Any` | Response data |

put(endpoint: str, data: dict | None = None) -> Any

Make a PUT request.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `endpoint` | `str` | API endpoint | required |
| `data` | `dict \| None` | Request body data | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Any` | Response data |

delete(endpoint: str) -> Any

Make a DELETE request.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `endpoint` | `str` | API endpoint | required |

Returns:

| Type | Description |
| --- | --- |
| `Any` | Response data |

get_project(project_id: str, use_cache: bool = True) -> dict[str, Any]

Get project details.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | Project data |
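Because `project_id` is described as a "Project ID or URL-encoded path", a namespaced path such as `my-group/my-project` presumably has to be percent-encoded by the caller before it is passed in. A small sketch using the standard library follows. The helper name `encode_project_path` and the sample path are made up, and whether the client also encodes internally is not stated here.

```python
from urllib.parse import quote


def encode_project_path(path: str) -> str:
    # safe="" forces "/" to be percent-encoded as well.
    return quote(path, safe="")


print(encode_project_path("my-group/my-project"))  # my-group%2Fmy-project
```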

list_projects(params: dict | None = None, use_cache: bool = True) -> list[dict[str, Any]]

List projects.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `params` | `dict \| None` | Query parameters (e.g., `{'per_page': 100, 'page': 1}`) | `None` |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of projects |
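The `per_page` and `page` parameters in the example above suggest that callers page through results themselves. The sketch below shows one way to do that, assuming (this is not confirmed by the reference) that each call returns a single page and that a page shorter than `per_page` marks the end. `iter_all_pages` and the stub `fake_list_projects` are hypothetical names.

```python
from typing import Any, Callable, Iterator


def iter_all_pages(list_page: Callable[[dict], list],
                   per_page: int = 100) -> Iterator[Any]:
    """Yield items from successive pages until a short or empty page arrives."""
    page = 1
    while True:
        items = list_page({"per_page": per_page, "page": page})
        yield from items
        if len(items) < per_page:
            break
        page += 1


# Stub standing in for client.list_projects; it serves 5 fake projects.
FAKE_PROJECTS = [{"id": n} for n in range(1, 6)]


def fake_list_projects(params: dict) -> list:
    start = (params["page"] - 1) * params["per_page"]
    return FAKE_PROJECTS[start:start + params["per_page"]]


all_ids = [p["id"] for p in iter_all_pages(fake_list_projects, per_page=2)]
print(all_ids)  # [1, 2, 3, 4, 5]
```

With a real client, the callable would be something like `lambda params: client.list_projects(params=params)`.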

get_repository_tree(project_id: str, path: str = '', ref: str = 'main', recursive: bool = False, use_cache: bool = True) -> list[dict[str, Any]]

Get repository tree.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `path` | `str` | Path inside repository | `''` |
| `ref` | `str` | Branch/tag name | `'main'` |
| `recursive` | `bool` | Get tree recursively | `False` |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of repository tree items |
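GitLab's repository tree endpoint returns items with `id`, `name`, `type` (`"tree"` for directories, `"blob"` for files), `path`, and `mode`. Assuming `get_repository_tree` passes those items through unchanged, a caller could pick out file paths as sketched below. The helper `blob_paths` and the sample data are hypothetical.

```python
from typing import Any


def blob_paths(tree: list, suffix: str = "") -> list:
    """Return paths of file entries ("blob"), optionally filtered by suffix."""
    return [
        item["path"]
        for item in tree
        if item.get("type") == "blob" and item["path"].endswith(suffix)
    ]


# Hand-written sample in the shape of GitLab tree items.
sample_tree: list = [
    {"id": "a1", "name": "docs", "type": "tree", "path": "docs", "mode": "040000"},
    {"id": "b2", "name": "index.md", "type": "blob", "path": "docs/index.md", "mode": "100644"},
    {"id": "c3", "name": "setup.py", "type": "blob", "path": "setup.py", "mode": "100644"},
]

print(blob_paths(sample_tree, suffix=".md"))  # ['docs/index.md']
```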

get_file(project_id: str, file_path: str, ref: str = 'main', use_cache: bool = True) -> dict[str, Any]

Get file content from repository.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `file_path` | `str` | Path to file in repository | required |
| `ref` | `str` | Branch/tag name | `'main'` |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | File data including content |
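GitLab's repository files endpoint returns the file body base64-encoded in `content`, with `encoding` set to `"base64"`. This reference does not say whether `get_file` decodes the body first, so the sketch below assumes the raw GitLab payload and decodes defensively. `decode_file_content` and the sample payload are hypothetical.

```python
import base64
from typing import Any


def decode_file_content(file_data: dict) -> str:
    """Return the file body as text, decoding base64 when the payload says so."""
    content = file_data["content"]
    if file_data.get("encoding") == "base64":
        return base64.b64decode(content).decode("utf-8")
    return content  # assumption: anything else is already plain text


# Hand-written sample in the shape of a GitLab file payload.
sample: dict = {
    "file_path": "README.md",
    "encoding": "base64",
    "content": base64.b64encode(b"# Hello\n").decode("ascii"),
}

print(decode_file_content(sample))
```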

get_commits(project_id: str, ref: str = 'main', since: str | None = None, until: str | None = None, path: str | None = None, use_cache: bool = True) -> list[dict[str, Any]]

Get commits for a project.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `ref` | `str` | Branch/tag name | `'main'` |
| `since` | `str \| None` | Only commits after this date (ISO 8601 format) | `None` |
| `until` | `str \| None` | Only commits before this date (ISO 8601 format) | `None` |
| `path` | `str \| None` | Only commits that include this file path | `None` |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of commits |
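The `since` and `until` arguments are ISO 8601 strings. One way to produce them is `datetime.isoformat()` on timezone-aware values, as in the sketch below. The helper `commit_window` and the dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def commit_window(end: datetime, days: int) -> dict:
    """Return since/until strings covering `days` days up to `end`."""
    start = end - timedelta(days=days)
    return {"since": start.isoformat(), "until": end.isoformat()}


window = commit_window(datetime(2024, 1, 8, tzinfo=timezone.utc), days=7)
print(window["since"])  # 2024-01-01T00:00:00+00:00
print(window["until"])  # 2024-01-08T00:00:00+00:00
```

The resulting dictionary could then be unpacked into a call such as `client.get_commits(project_id, **window)`.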

get_commit(project_id: str, commit_sha: str, use_cache: bool = True) -> dict[str, Any]

Get a single commit.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `commit_sha` | `str` | Commit SHA | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | Commit data |

get_commit_diff(project_id: str, commit_sha: str, use_cache: bool = True) -> list[dict[str, Any]]

Get the diff of a commit.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `commit_sha` | `str` | Commit SHA | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of diffs |
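Each entry in GitLab's commit diff response carries `old_path`, `new_path`, `diff`, and the boolean flags `new_file`, `renamed_file`, and `deleted_file`. Assuming `get_commit_diff` returns those entries unchanged, the list can be summarized as sketched below. `summarize_diffs` and the sample entries are hypothetical.

```python
def summarize_diffs(diffs: list) -> dict:
    """Group changed paths by the kind of change."""
    summary = {"added": [], "deleted": [], "renamed": [], "modified": []}
    for entry in diffs:
        if entry.get("new_file"):
            summary["added"].append(entry["new_path"])
        elif entry.get("deleted_file"):
            summary["deleted"].append(entry["old_path"])
        elif entry.get("renamed_file"):
            summary["renamed"].append((entry["old_path"], entry["new_path"]))
        else:
            summary["modified"].append(entry["new_path"])
    return summary


# Hand-written sample in the shape of GitLab diff entries.
sample_diffs = [
    {"old_path": "a.py", "new_path": "a.py", "new_file": True,
     "renamed_file": False, "deleted_file": False, "diff": "+print('hi')\n"},
    {"old_path": "old.md", "new_path": "new.md", "new_file": False,
     "renamed_file": True, "deleted_file": False, "diff": ""},
    {"old_path": "b.py", "new_path": "b.py", "new_file": False,
     "renamed_file": False, "deleted_file": False, "diff": "-x = 1\n+x = 2\n"},
]

summary = summarize_diffs(sample_diffs)
print(summary)
```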

list_branches(project_id: str, use_cache: bool = True) -> list[dict[str, Any]]

List branches.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of branches |

get_branch(project_id: str, branch: str, use_cache: bool = True) -> dict[str, Any]

Get branch details.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `branch` | `str` | Branch name | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | Branch data |

list_merge_requests(project_id: str, state: str = 'opened', params: dict | None = None, use_cache: bool = True) -> list[dict[str, Any]]

List merge requests.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `state` | `str` | State filter (`'opened'`, `'closed'`, `'merged'`, `'all'`) | `'opened'` |
| `params` | `dict \| None` | Additional query parameters | `None` |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `list[dict[str, Any]]` | List of merge requests |
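A sketch of how the `state` filter and the additional `params` might be combined into one query. The helper `build_mr_params`, the validation step, and the rule that an explicit `state` overrides a `state` key inside `params` are all assumptions. The reference does not say whether the client validates the state or how it merges the two.

```python
from typing import Optional

# The four states listed in the parameter description above.
VALID_MR_STATES = ("opened", "closed", "merged", "all")


def build_mr_params(state: str = "opened", params: Optional[dict] = None) -> dict:
    """Merge the state filter into a copy of the extra query parameters."""
    if state not in VALID_MR_STATES:
        raise ValueError(f"state must be one of {VALID_MR_STATES}, got {state!r}")
    merged = dict(params or {})  # copy so the caller's dict is not mutated
    merged["state"] = state      # assumption: the explicit argument wins
    return merged


query = build_mr_params("merged", {"per_page": 50, "order_by": "updated_at"})
print(query)
```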

get_merge_request(project_id: str, mr_iid: int, use_cache: bool = True) -> dict[str, Any]

Get merge request details.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `project_id` | `str` | Project ID or URL-encoded path | required |
| `mr_iid` | `int` | Merge request IID | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | Merge request data |

get_current_user(use_cache: bool = True) -> dict[str, Any]

Get the currently authenticated user.

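Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | User data |

Raises:

| Type | Description |
| --- | --- |
| `GitLabAPIError` | If not authenticated |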
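Since `get_current_user` raises `GitLabAPIError` when no token is available, callers may want to handle the unauthenticated case explicitly. The sketch below uses a local stand-in exception class and stub functions so that it runs without the library. Pairing the error with `MSG_AUTH_REQUIRED` is an assumption, and `current_username` is a hypothetical helper.

```python
from typing import Callable, Optional


class GitLabAPIError(Exception):
    """Local stand-in for thoth.ingestion.gitlab_api.GitLabAPIError."""


MSG_AUTH_REQUIRED = "Authentication token required for this operation"


def current_username(get_current_user: Callable[[], dict]) -> Optional[str]:
    """Return the username, or None when the client is not authenticated."""
    try:
        return get_current_user()["username"]
    except GitLabAPIError:
        return None


def authenticated_stub() -> dict:
    # Stub in place of client.get_current_user with a valid token.
    return {"id": 1, "username": "example-user"}


def anonymous_stub() -> dict:
    # Stub in place of client.get_current_user without a token.
    raise GitLabAPIError(MSG_AUTH_REQUIRED)


print(current_username(authenticated_stub))
print(current_username(anonymous_stub))
```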

get_user(user_id: int, use_cache: bool = True) -> dict[str, Any]

Get user details.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `user_id` | `int` | User ID | required |
| `use_cache` | `bool` | Whether to use caching | `True` |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | User data |

get_rate_limit_info() -> dict[str, Any]

Get current rate limit information.

Returns:

| Type | Description |
| --- | --- |
| `dict[str, Any]` | Dictionary with rate limit info |
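The keys of the returned dictionary are not listed in this reference. As an illustration of what rate limit tracking can look like, the sketch below reads GitLab's `RateLimit-Remaining` and `RateLimit-Reset` response headers and decides how long to pause. Treating `RATE_LIMIT_MARGIN` as "pause once this few requests remain" is an assumption, and `seconds_to_wait` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional

RATE_LIMIT_MARGIN = 10  # documented module attribute; its meaning here is assumed


def seconds_to_wait(headers: Mapping[str, str],
                    now: Optional[float] = None,
                    margin: int = RATE_LIMIT_MARGIN) -> float:
    """Return how long to sleep before the next request, or 0.0 to proceed."""
    remaining = headers.get("RateLimit-Remaining")
    reset = headers.get("RateLimit-Reset")  # Unix timestamp of the window reset
    if remaining is None or reset is None:
        return 0.0  # no rate limit information available
    if int(remaining) > margin:
        return 0.0  # comfortably below the limit
    now = time.time() if now is None else now
    return max(0.0, float(reset) - now)


# Illustrative header values with a fixed "now" so the result is deterministic.
wait = seconds_to_wait({"RateLimit-Remaining": "5", "RateLimit-Reset": "1060"}, now=1000)
print(wait)  # 60.0
```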