Skip to content

GitLab Source Component #190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 18, 2025
Merged

GitLab Source Component #190

merged 5 commits into from
Apr 18, 2025

Conversation

butschster
Copy link
Collaborator

@butschster butschster commented Apr 18, 2025

This PR adds a new GitLab source component that mirrors the functionality of the existing GitHub source implementation. This allows users to fetch code and documentation from both public and private GitLab repositories, supporting both gitlab.com and self-hosted GitLab instances.

Key Features

  • API Integration:

    • GitLab API v4 support
    • Authentication via personal access tokens
    • Support for both gitlab.com and self-hosted GitLab instances
    • Custom HTTP headers support
  • Repository Access:

    • Support for specific branches or tags
    • Directory structure browsing and filtering
    • File content retrieval
  • Configuration:

    • Global server configuration in settings
    • Per-source server configuration
    • Environment variable interpolation for tokens
  • Documentation:

    • Added YAML configuration examples

Summary by CodeRabbit

  • New Features

    • Added GitLab integration, allowing users to fetch, filter, and generate documentation from GitLab repositories.
    • Introduced support for configuring multiple GitLab servers, including custom tokens and headers.
    • Enhanced configuration options with advanced filtering for file paths and content within GitLab sources.
  • Documentation

    • Updated and streamlined the README for clarity, added a table of contents, and included GitLab as a supported source.
    • Added comprehensive configuration and usage examples for GitLab integration.
    • Extended the JSON schema to support GitLab source configuration and server definitions.

Implement GitLab source functionality to mirror GitHub source implementation:
- Implement GitLab server configuration registry
- Support global server configurations
- Add content filtering capabilities
- Include example configuration
@butschster butschster added the sources New source type for data retrieval label Apr 18, 2025
@butschster butschster added this to the 1.26 milestone Apr 18, 2025
@butschster butschster self-assigned this Apr 18, 2025
@butschster butschster linked an issue Apr 18, 2025 that may be closed by this pull request
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (19)
src/Lib/GitlabClient/Model/GitlabRepository.php (2)

10-26: Validate repository format in constructor

The constructor assumes the repository is provided in "group/project" format but doesn't validate this. Consider adding validation to prevent unexpected behavior with malformed inputs.

public function __construct(
    public string $repository,
    public string $branch = 'main',
) {
+    // Validate repository format
+    if (!\preg_match('/^[^\/]+\/[^\/]+$/', $repository)) {
+        throw new \InvalidArgumentException('Repository must be in format "group/project"');
+    }
    
    // Convert repository name to URL-encoded project ID for API calls
    $this->projectId = \urlencode($repository);
}

31-34: Update method documentation for clarity

The method comment "Get the full repository URL (without server URL)" is misleading since it returns the repository path, not a URL.

/**
- * Get the full repository URL (without server URL)
+ * Get the repository path (in format "group/project")
 */
public function getPath(): string
{
    return $this->repository;
}
src/Source/Gitlab/Config/ServerRegistry.php (1)

46-49: Add PHPDoc for return type.

Consider adding a PHPDoc comment to specify the return type for the all() method.

+/**
+ * @return array<string, ServerConfig>
+ */
 public function all(): array
 {
     return $this->servers;
 }
src/Source/Gitlab/GitlabFileInfo.php (2)

41-41: Consider direct closure invocation.

For better readability and slightly improved performance, consider directly invoking the closure instead of using \call_user_func.

-        return $this->fetchedContent = \call_user_func($this->content);
+        return $this->fetchedContent = ($this->content)();

54-57: Potential inconsistency in isDir() behavior.

When 'type' is missing from metadata, the getType() method falls back to 'blob' (line 51), which could incorrectly identify directories as files. Consider a more robust validation.

 public function isDir(): bool
 {
-    return $this->getType() === 'tree';
+    // GitLab API uses 'tree' for directories and 'blob' for files
+    return isset($this->metadata['type']) && $this->metadata['type'] === 'tree';
 }
src/Source/Gitlab/Config/GitlabServerParserPlugin.php (3)

64-69: Consider more specific exception handling.

The current implementation catches all throwable exceptions indiscriminately. Consider distinguishing between different types of exceptions to handle critical configuration issues differently.

-            } catch (\Throwable $e) {
+            } catch (\InvalidArgumentException $e) {
                 $this->logger?->warning("Failed to register GitLab server '{$name}'", [
                     'error' => $e->getMessage(),
                     'config' => $serverConfig,
                 ]);
+            } catch (\Throwable $e) {
+                // For more serious exceptions, we might want to log as errors
+                $this->logger?->error("Critical error registering GitLab server '{$name}'", [
+                    'error' => $e->getMessage(),
+                    'exception' => get_class($e),
+                    'trace' => $e->getTraceAsString(),
+                ]);
             }

72-76: Improve log message for partial successes.

The final log message indicates success regardless of whether any errors occurred during parsing. Consider reflecting the partial success if some servers failed to register.

-        $this->logger?->info('GitLab server configurations parsed successfully', [
-            'registeredServers' => \count($this->serverRegistry->all()),
+        $totalServers = \count($serversConfig);
+        $registeredServers = \count($this->serverRegistry->all());
+        
+        $logMethod = ($registeredServers === $totalServers) ? 'info' : 'warning';
+        $message = ($registeredServers === $totalServers)
+            ? 'GitLab server configurations parsed successfully'
+            : 'GitLab server configurations parsed with some failures';
+            
+        $this->logger?->$logMethod($message, [
+            'registeredServers' => $registeredServers,
+            'totalServers' => $totalServers,
         ]);

40-77: Clarify the intention of returning null from parse().

The method signature indicates it returns ?RegistryInterface, but it always returns null. Consider adding a clarifying comment about why null is returned.

 public function parse(array $config, string $rootPath): ?RegistryInterface
 {
     // Method implementation...
     
+    // This plugin registers servers directly into the ServerRegistry
+    // and doesn't need to return a registry object
     return null;
 }
docs/example/config/gitlab-source-example.yaml (1)

19-22: Remove trailing whitespace to keep YAMLLint green

The blank line after the headers block contains stray spaces, which YAMLlint flags as an error (see static‑analysis hint). This will break CI pipelines that run yamllint -d "{extends: default, rules: {trailing-spaces: enable}}".

20   X-Custom-Header: "custom-value"
-21␠␠
+21
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 21-21: trailing spaces

(trailing-spaces)

src/Source/Gitlab/Config/ServerConfig.php (1)

52-60: Resolve variables in header keys as well

withResolvedVariables() only resolves header values. Variables inside keys (e.g. "X-${TEAM}-Token") remain unresolved.

-$resolvedHeaders = \array_map(static fn($value) => $resolver->resolve($value), $this->headers);
+$resolvedHeaders = [];
+foreach ($this->headers as $key => $value) {
+    $resolvedHeaders[$resolver->resolve($key)] = $resolver->resolve($value);
+}
src/Source/Gitlab/GitlabSourceFactory.php (1)

74-80: Accept pre‑instantiated ServerConfig objects

If a configuration parser already converted the server node into a ServerConfig instance, the match will erroneously throw.

-case true:
-    \is_string($server) => $this->serverRegistry->get($server),
-    \is_array($server)  => ServerConfig::fromArray($server),
-    default             => throw new \RuntimeException('GitLab server must be provided'),
+case true:
+    $server instanceof ServerConfig => $server,
+    \is_string($server)            => $this->serverRegistry->get($server),
+    \is_array($server)             => ServerConfig::fromArray($server),
+    default                        => throw new \RuntimeException('GitLab server must be provided'),
src/Source/Gitlab/GitlabSourceFetcher.php (1)

110-115: Don’t trim() file contents – it can corrupt code snippets

trim() strips trailing new‑lines and whitespace that may be meaningful (e.g. “\n\n” in markdown files, blank lines required by some interpreters, etc.).
Most rendering engines ignore an extra leading/trailing newline automatically, so the explicit trim() is rarely needed.

-                    code: \trim($fileContent),
+                    code: $fileContent,
src/Source/Gitlab/GitlabFinder.php (1)

37-41: $basePath argument is accepted but never used

The public find() signature promises a $basePath, yet the value is ignored.
Either remove the parameter (if not needed) or honour it when composing $sourcePaths to avoid misleading API contracts.

src/Lib/GitlabClient/GitlabClient.php (1)

168-195: Missing retry / back‑off strategy for transient GitLab errors

A single network hiccup or 5xx from GitLab will currently abort the process.
Consider wrapping makeApiRequest() logic with a simple exponential back‑off (e.g. 3 attempts) or at least expose retry control via options.

src/Source/Gitlab/GitlabSource.php (5)

12-14: Consider expanding the class documentation.

The class documentation is minimal. Consider adding more details about the purpose of this class, its role in the GitLab integration, and how it differs from other source types. This would help developers understand the component's purpose at a glance.


32-46: Consider adding validation for the repository format.

The constructor accepts a repository parameter in "group/project" format but doesn't validate this format. Adding validation would prevent runtime errors when interacting with the GitLab API.

public function __construct(
    public readonly string $repository,
    // other parameters...
) {
+    if (!preg_match('/^[^\/]+\/[^\/]+$/', $repository)) {
+        throw new \InvalidArgumentException("Repository must be in 'group/project' format");
+    }
    parent::__construct(description: $description, tags: $tags, modifiers: $modifiers);
}

34-35: Ensure sourcePaths is always treated as an array internally.

The sourcePaths parameter is defined as string|array but later cast to array in the in() method. Consider normalizing it to an array in the constructor for consistency.

public function __construct(
    public readonly string $repository,
-    public readonly string|array $sourcePaths,
+    string|array $sourcePaths,
    public readonly string $branch = 'main',
    // other parameters...
) {
+    $this->sourcePaths = is_array($sourcePaths) ? $sourcePaths : [$sourcePaths];
    parent::__construct(description: $description, tags: $tags, modifiers: $modifiers);
}

Then update the property declaration and in() method:

+    /** @var array<string> */
+    public readonly array $sourcePaths;

// ...

public function in(): array|null
{
-    return (array) $this->sourcePaths;
+    return $this->sourcePaths;
}

37-38: Consider a more inclusive default file pattern.

The default file pattern *.* only matches files with extensions. Consider using a more inclusive pattern like * to match all files, including those without extensions (like README, LICENSE, etc.).

-    public readonly string|array $filePattern = '*.*',
+    public readonly string|array $filePattern = '*',

85-88: Add a code comment for the in() method.

Other interface methods have comments explaining their behavior, but in() doesn't. Consider adding a comment for consistency.

public function in(): array|null
{
+    // Return source paths to search within
    return (array) $this->sourcePaths;
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 189e609 and 957bb7c.

📒 Files selected for processing (15)
  • docs/example/config/gitlab-source-example.yaml (1 hunks)
  • src/Application/Bootloader/GitlabClientBootloader.php (1 hunks)
  • src/Application/Kernel.php (4 hunks)
  • src/Lib/GitlabClient/GitlabClient.php (1 hunks)
  • src/Lib/GitlabClient/GitlabClientInterface.php (1 hunks)
  • src/Lib/GitlabClient/Model/GitlabRepository.php (1 hunks)
  • src/Source/Gitlab/Config/GitlabServerParserPlugin.php (1 hunks)
  • src/Source/Gitlab/Config/ServerConfig.php (1 hunks)
  • src/Source/Gitlab/Config/ServerRegistry.php (1 hunks)
  • src/Source/Gitlab/GitlabFileInfo.php (1 hunks)
  • src/Source/Gitlab/GitlabFinder.php (1 hunks)
  • src/Source/Gitlab/GitlabSource.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceBootloader.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceFactory.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceFetcher.php (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/Source/Gitlab/Config/ServerConfig.php (2)
src/Lib/Variable/VariableResolver.php (1)
  • VariableResolver (10-39)
src/Source/Gitlab/GitlabSource.php (1)
  • __construct (32-48)
🪛 YAMLlint (1.35.1)
docs/example/config/gitlab-source-example.yaml

[error] 21-21: trailing spaces

(trailing-spaces)

🔇 Additional comments (16)
src/Application/Kernel.php (4)

14-14: LGTM: GitlabClientBootloader import added

The import for the new GitlabClientBootloader class is correctly added.


30-30: LGTM: GitlabSourceBootloader import added

The import for the new GitlabSourceBootloader class is correctly added.


57-57: LGTM: GitlabClientBootloader added to bootloaders list

The GitlabClientBootloader is properly added to the core bootloaders list.


73-73: LGTM: GitlabSourceBootloader added to Sources section

The GitlabSourceBootloader is correctly added to the Sources section of bootloaders.

src/Application/Bootloader/GitlabClientBootloader.php (1)

1-32: LGTM: GitlabClientBootloader implementation is solid

The bootloader correctly:

  • Declares dependency on HttpClientBootloader
  • Registers GitlabClientInterface as a singleton that resolves to GitlabClient
  • Follows the application's bootloader pattern

The implementation is clean and focused on its responsibility.

src/Lib/GitlabClient/GitlabClientInterface.php (2)

14-31: LGTM: Repository interaction methods well-defined

The getContents and getFileContent methods provide a clear contract for interacting with GitLab repositories. The method signatures and documentation are thorough and appropriately typed.


32-51: LGTM: Configuration methods are well-defined

The methods for setting GitLab client configuration (token, server URL, and custom headers) are clearly defined and well-documented.

src/Lib/GitlabClient/Model/GitlabRepository.php (1)

36-44: Ensure URL construction is robust for all GitLab instances

The method correctly handles trailing slashes in server URLs, which is good for compatibility with different GitLab instance configurations.

src/Source/Gitlab/GitlabSourceBootloader.php (3)

14-44: The bootloader implementation follows good practices.

The class structure is clean and follows proper bootloader pattern implementation. The separation of concerns between defineSingletons(), init(), and boot() methods is well-defined, with clear comments explaining each registration step.


16-16:

✅ Verification successful

Verify PHP version compatibility for #[\Override] attribute.

The #[\Override] attribute used here is a PHP 8.3 feature. Please ensure this is compatible with the project's minimum PHP version requirement to avoid runtime errors in environments using older PHP versions.


🏁 Script executed:

#!/bin/bash
# Check project's minimum PHP version requirement
echo "Checking composer.json for PHP version requirement:"
if [ -f "composer.json" ]; then
    grep -E '"php":\s*".*"' composer.json
fi

# Check PHP version compatibility
echo -e "\nChecking PHP version compatibility for #[\Override] attribute:"
grep -r --include="*.php" "#\[\\\\Override\]" src/

Length of output: 5153


✅ PHP 8.3 Attribute Usage Confirmed

The project’s composer.json already requires PHP ^8.3, so the native #[\Override] attribute is fully supported. No changes needed.


28-28:

❓ Verification inconclusive

Verify dependency injection for GitlabSourceFactory.

The GitlabSourceFactory is injected as a constructor dependency but is not defined as a singleton in defineSingletons(). Verify if this is intentional or if it needs to be registered as a singleton.


🏁 Script executed:

#!/bin/bash
# Look for GitlabSourceFactory definition to check if it's registered elsewhere
echo "Checking if GitlabSourceFactory is defined elsewhere:"
grep -r --include="*.php" "GitlabSourceFactory::class" src/

Length of output: 176


🏁 Script executed:

#!/bin/bash
# Locate the GitlabSourceBootloader file
echo "Finding GitlabSourceBootloader:"
grep -R "class GitlabSourceBootloader" -n src

Length of output: 222


🏁 Script executed:

#!/bin/bash
echo "Inspecting defineSingletons() in GitlabSourceBootloader:"
grep -R -n -A20 "function defineSingletons" src/Source/Gitlab/GitlabSourceBootloader.php

Length of output: 942


Confirm GitlabSourceFactory DI Registration

It looks like GitlabSourceFactory is injected in init() but isn’t listed in defineSingletons(). Please verify whether it should be explicitly registered as a singleton. If so, add it alongside the other services:

 public function defineSingletons(): array
 {
     return [
         GitlabSourceFetcher::class  => GitlabSourceFetcher::class,
         ServerRegistry::class       => ServerRegistry::class,
+        GitlabSourceFactory::class  => GitlabSourceFactory::class,
     ];
 }

Otherwise, ensure that leaving it unregistered (for transient/autowire resolution) is intentional and safe.

src/Source/Gitlab/Config/ServerRegistry.php (1)

10-54: Well-structured registry implementation.

The ServerRegistry class follows good practices with clear method names, proper type hints, error handling, and a fluent interface for method chaining. The approach for managing server configurations is clean and maintainable.

src/Source/Gitlab/GitlabFileInfo.php (1)

14-83: Well-designed file info wrapper.

The class effectively extends Symfony's SplFileInfo to provide GitLab-specific metadata access with appropriate fallbacks. The lazy loading pattern for content is a good performance optimization.

src/Source/Gitlab/Config/GitlabServerParserPlugin.php (1)

15-78: Well-structured configuration parser plugin.

The implementation effectively follows the plugin interface, has good error handling, and includes comprehensive logging. The class structure is clean and the parsing logic is clear.

src/Source/Gitlab/GitlabSource.php (2)

1-123: Overall, this is a well-structured implementation of a GitLab source.

The class properly extends the source system with GitLab support and implements all required interface methods. The use of readonly properties and strict typing indicates good PHP practices. The code follows the same patterns as other source implementations based on the PR description.


105-122:

❌ Incorrect review comment

Handle ServerConfig serialization explicitly.

The jsonSerialize() method doesn't explicitly handle the case when $server is a ServerConfig object. Ensure that ServerConfig objects are properly serialized.

Please verify that ServerConfig implements JsonSerializable or is properly handled during serialization. You can run this script to check:


🏁 Script executed:

#!/bin/bash
# Check if ServerConfig implements JsonSerializable
grep -A 2 "class ServerConfig" src/Source/Gitlab/Config/ServerConfig.php | grep -E "implements|extends" | grep "JsonSerializable"

# Also check if ServerConfig has a jsonSerialize method
grep -A 2 "function jsonSerialize" src/Source/Gitlab/Config/ServerConfig.php

Length of output: 351


No changes needed for ServerConfig serialization

ServerConfig already implements \JsonSerializable and provides its own jsonSerialize() method, so including it in the returned array is sufficient for proper JSON encoding.

Likely an incorrect or invalid review comment.

Comment on lines +22 to +25
$this->logger?->debug('Registering GitLab server', [
'name' => $name,
'config' => $config,
]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Avoid logging potentially sensitive information.

The entire $config object is being logged, which might contain sensitive information like access tokens. Consider logging only non-sensitive metadata or redacting sensitive fields.

 $this->logger?->debug('Registering GitLab server', [
     'name' => $name,
-    'config' => $config,
+    'url' => $config->getUrl(),
+    'hasToken' => $config->getToken() !== null,
 ]);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
$this->logger?->debug('Registering GitLab server', [
'name' => $name,
'config' => $config,
]);
$this->logger?->debug('Registering GitLab server', [
'name' => $name,
'url' => $config->getUrl(),
'hasToken' => $config->getToken() !== null,
]);

Comment on lines +34 to +42
#[\Override]
public function getContents(): string
{
if ($this->fetchedContent !== null) {
return $this->fetchedContent;
}

return $this->fetchedContent = \call_user_func($this->content);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add exception handling for content fetching.

The getContents() method doesn't handle potential exceptions from the content closure. Consider adding a try-catch block to handle possible network or API errors gracefully.

 public function getContents(): string
 {
     if ($this->fetchedContent !== null) {
         return $this->fetchedContent;
     }
 
-    return $this->fetchedContent = \call_user_func($this->content);
+    try {
+        return $this->fetchedContent = ($this->content)();
+    } catch (\Throwable $e) {
+        throw new \RuntimeException(
+            sprintf('Failed to fetch content for GitLab file: %s', $this->getPath()),
+            0,
+            $e
+        );
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#[\Override]
public function getContents(): string
{
if ($this->fetchedContent !== null) {
return $this->fetchedContent;
}
return $this->fetchedContent = \call_user_func($this->content);
}
#[\Override]
public function getContents(): string
{
if ($this->fetchedContent !== null) {
return $this->fetchedContent;
}
try {
return $this->fetchedContent = ($this->content)();
} catch (\Throwable $e) {
throw new \RuntimeException(
sprintf('Failed to fetch content for GitLab file: %s', $this->getPath()),
0,
$e
);
}
}

'name' => $name,
]);

$server = ServerConfig::fromArray($serverConfig);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Validate server configuration before parsing.

There's no verification that $serverConfig is an array before calling ServerConfig::fromArray(). Consider adding validation to prevent potential type errors.

+                if (!\is_array($serverConfig)) {
+                    throw new \InvalidArgumentException(
+                        sprintf('Server configuration for "%s" must be an array, %s given', 
+                               $name, gettype($serverConfig))
+                    );
+                }
                 $server = ServerConfig::fromArray($serverConfig);

Comment on lines +39 to +49
public static function fromArray(array $config): self
{
if (!isset($config['url'])) {
throw new \InvalidArgumentException('GitLab server configuration must have a "url" property');
}

return self::create(
url: (string) $config['url'],
token: isset($config['token']) ? (string) $config['token'] : null,
headers: (array) ($config['headers'] ?? []),
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Validate header key/value types early

fromArray() blindly casts headers to array. If the caller passes a nested array or non‑string values, that invalid shape silently propagates and will be sent to the HTTP client later, producing hard‑to‑trace runtime errors.

-headers: (array) ($config['headers'] ?? []),
+headers: \array_map(
+    static fn ($v) => (string) $v,
+    \array_filter(
+        (array) ($config['headers'] ?? []),
+        static fn ($k, $v) => \is_string($k) && ( \is_scalar($v) || $v === null),
+        ARRAY_FILTER_USE_BOTH
+    )
+),

This enforces array<string,string> at the boundary.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static function fromArray(array $config): self
{
if (!isset($config['url'])) {
throw new \InvalidArgumentException('GitLab server configuration must have a "url" property');
}
return self::create(
url: (string) $config['url'],
token: isset($config['token']) ? (string) $config['token'] : null,
headers: (array) ($config['headers'] ?? []),
);
public static function fromArray(array $config): self
{
if (!isset($config['url'])) {
throw new \InvalidArgumentException('GitLab server configuration must have a "url" property');
}
return self::create(
url: (string) $config['url'],
token: isset($config['token']) ? (string) $config['token'] : null,
headers: \array_map(
static fn ($v) => (string) $v,
\array_filter(
(array) ($config['headers'] ?? []),
static fn ($k, $v) => \is_string($k) && (\is_scalar($v) || $v === null),
ARRAY_FILTER_USE_BOTH
)
),
);
}

Comment on lines +35 to +38
$this->logger?->debug('Creating GitLab source', [
'path' => (string) $this->dirs->getRootPath(),
'config' => $config,
]);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Avoid leaking secrets in debug logs

Dumping the whole $config array may include token, gitlabToken, or inline server.headers. In production with DEBUG enabled, these values could surface in logs.

-    'config' => $config,
+    // Redact possible secrets before logging
+    'config' => \array_replace_recursive($config, [
+        'token' => '[redacted]',
+        'gitlabToken' => '[redacted]',
+        'server' => [
+            'token' => '[redacted]',
+            'headers' => '[redacted]',
+        ],
+    ]),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
$this->logger?->debug('Creating GitLab source', [
'path' => (string) $this->dirs->getRootPath(),
'config' => $config,
]);
$this->logger?->debug('Creating GitLab source', [
'path' => (string) $this->dirs->getRootPath(),
// Redact possible secrets before logging
'config' => \array_replace_recursive($config, [
'token' => '[redacted]',
'gitlabToken' => '[redacted]',
'server' => [
'token' => '[redacted]',
'headers' => '[redacted]',
],
]),
]);

Comment on lines +68 to +70
// Convert notPath to match Symfony Finder's naming convention
$notPath = $config['excludePatterns'] ?? $config['notPath'] ?? [];

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Normalise notPath/excludePatterns to an array

GitlabSource expects array $notPath, but a user can supply a single string. Passing a string will later cause type errors when iterating.

-$notPath = $config['excludePatterns'] ?? $config['notPath'] ?? [];
+$notPath = $config['excludePatterns'] ?? $config['notPath'] ?? [];
+if (\is_string($notPath)) {
+    $notPath = [$notPath];
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Convert notPath to match Symfony Finder's naming convention
$notPath = $config['excludePatterns'] ?? $config['notPath'] ?? [];
// Convert notPath to match Symfony Finder's naming convention
$notPath = $config['excludePatterns'] ?? $config['notPath'] ?? [];
if (\is_string($notPath)) {
$notPath = [$notPath];
}

Comment on lines +20 to +25
public function __construct(
private GitlabFinder $finder,
private ContentBuilderFactory $builderFactory = new ContentBuilderFactory(),
#[LoggerPrefix(prefix: 'gitlab-source')]
private ?LoggerInterface $logger = null,
) {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Constructor uses an object as a default parameter – this is a hard syntax error

PHP still forbids new ... in default parameter values (even in promoted‑property parameters).
Trying to load this class will throw a parse error and break the whole application bootstrap.

-        private ContentBuilderFactory $builderFactory = new ContentBuilderFactory(),
+        private ?ContentBuilderFactory $builderFactory = null,
) {
    $this->builderFactory ??= new ContentBuilderFactory();
}

Because the class is readonly, the assignment shown above is still legal (it happens inside the constructor).

Without this fix, none of the GitLab functionality will be reachable.

Comment on lines +30 to +35
public function __construct(
private readonly GitlabClientInterface $gitlabClient,
private readonly ServerRegistry $serverRegistry,
private readonly VariableResolver $variableResolver = new VariableResolver(),
private readonly FileTreeBuilder $fileTreeBuilder = new FileTreeBuilder(),
) {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Same syntax error as above – objects in default parameter values

new VariableResolver() and new FileTreeBuilder() in the parameter list will not compile.

-        private readonly VariableResolver $variableResolver = new VariableResolver(),
-        private readonly FileTreeBuilder $fileTreeBuilder = new FileTreeBuilder(),
+        private readonly ?VariableResolver $variableResolver = null,
+        private readonly ?FileTreeBuilder $fileTreeBuilder = null,
) {
    $this->variableResolver ??= new VariableResolver();
    $this->fileTreeBuilder ??= new FileTreeBuilder();
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public function __construct(
private readonly GitlabClientInterface $gitlabClient,
private readonly ServerRegistry $serverRegistry,
private readonly VariableResolver $variableResolver = new VariableResolver(),
private readonly FileTreeBuilder $fileTreeBuilder = new FileTreeBuilder(),
) {}
public function __construct(
private readonly GitlabClientInterface $gitlabClient,
private readonly ServerRegistry $serverRegistry,
private readonly ?VariableResolver $variableResolver = null,
private readonly ?FileTreeBuilder $fileTreeBuilder = null,
) {
$this->variableResolver ??= new VariableResolver();
$this->fileTreeBuilder ??= new FileTreeBuilder();
}

Comment on lines +45 to +52
// Format URL for GitLab API v4
$url = $this->buildApiUrl("/projects/{$repository->projectId}/repository/tree", [
'ref' => $repository->branch,
'path' => $path,
'recursive' => 'false',
]);

$this->logger?->debug('Making GitLab API request', [
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Inefficient tree traversal ­– use GitLab’s recursive=true and larger per_page

Multiple round‑trips are made for every sub‑directory.
GitLab’s tree endpoint supports:

?recursive=true&per_page=100

Fetching the entire tree in one request can cut latency and rate‑limit pressure dramatically.

-            'recursive' => 'false',
+            'recursive' => 'true',
+            'per_page'  => '100',

(Leave true/1 according to API docs.)

@butschster butschster added the enhancement New feature or request label Apr 18, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
json-schema.json (1)

1092-1114: Enhance gitlabServerConfig.url with URI format validation
To maintain consistency with other URL properties in this schema (e.g., urlSource.urls, mcpServerConfig.url), consider adding "format": "uri" to the url property:

 "gitlabServerConfig": {
   "type": "object",
   "required": ["url"],
   "properties": {
-    "url": {
-      "type": "string",
-      "description": "GitLab server URL"
-    },
+    "url": {
+      "type": "string",
+      "format": "uri",
+      "description": "GitLab server URL"
+    },
     "token": {
       "type": "string",
       "description": "GitLab API token for authentication (can use env var pattern ${TOKEN_NAME})"
     },
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a9e357 and fb8f02f.

📒 Files selected for processing (3)
  • context.yaml (1 hunks)
  • json-schema.json (4 hunks)
  • prompts.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • prompts.yaml
  • context.yaml
🔇 Additional comments (4)
json-schema.json (4)

191-203: Introduce settings.gitlab for GitLab server configurations
This addition mirrors the existing settings.mcp.servers pattern by adding a gitlab block under settings.properties that references the new gitlabServerConfig definition. It correctly scopes server definitions for GitLab integration.


623-623: Allow gitlab as a valid source type
Adding "gitlab" to the source.type enum ensures that the new GitLab source component can be recognized by the configuration schema.


695-706: Map gitlab source type to its schema definition
The new if/then branch correctly wires "gitlab" to #/definitions/gitlabSource, ensuring conditional validation for GitLab sources.


1010-1091: Define the gitlabSource schema
The gitlabSource definition includes required fields (repository, sourcePaths), filtering options (patterns, path/content filters), tree‑view controls, a server reference, and modifiers. It aligns well with githubSource and the broader file‑source patterns.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
prompts.yaml (2)

9-9: Remove trailing spaces
YAMLLint reports trailing spaces on these lines. Please remove the extra spaces to satisfy the linter.

Also applies to: 11-11, 16-16, 24-24, 31-32

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 9-9: trailing spaces

(trailing-spaces)


35-36: Reduce excessive blank lines
There are two consecutive blank lines here; the maximum allowed is one. Please remove the extra blank line.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 35-35: too many blank lines

(2 > 0) (empty-lines)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a9e357 and fb8f02f.

📒 Files selected for processing (3)
  • context.yaml (1 hunks)
  • json-schema.json (4 hunks)
  • prompts.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • context.yaml
🧰 Additional context used
🪛 YAMLlint (1.35.1)
prompts.yaml

[error] 9-9: trailing spaces

(trailing-spaces)


[error] 11-11: trailing spaces

(trailing-spaces)


[error] 16-16: trailing spaces

(trailing-spaces)


[error] 24-24: trailing spaces

(trailing-spaces)


[error] 31-31: trailing spaces

(trailing-spaces)


[error] 32-32: trailing spaces

(trailing-spaces)


[warning] 35-35: too many blank lines

(2 > 0) (empty-lines)

🔇 Additional comments (5)
json-schema.json (5)

191-203: Add GitLab settings block in schema
The new settings.gitlab section mirrors the existing settings.mcp.servers structure, enabling pre-defined GitLab server configurations. This aligns well with the requirements for GitLab integration.


623-631: Include gitlab in the source.type enum
The "gitlab" entry has been correctly added alongside other source types, enabling schema validation for GitLab sources.


695-706: Add conditional schema branch for gitlabSource
The if: type = "gitlab" / then: $ref = "#/definitions/gitlabSource" pattern correctly parallels the existing GitHub branch.


1010-1091: Define gitlabSource schema
The gitlabSource definition includes repository/sourcePaths, branch/tag selection, file patterns, exclusion filters, tree view options, inline or named server configs, and modifiers. It aligns with other source definitions and supports the full set of GitLab integration features.


1092-1114: Define gitlabServerConfig schema
The server config schema requires url and optionally accepts token and custom headers. It follows the established pattern for external service configurations.

@butschster butschster moved this to In review in Context Generator Apr 18, 2025
@context-hub context-hub deleted a comment from coderabbitai bot Apr 18, 2025
Copy link

coderabbitai bot commented Apr 18, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This update introduces comprehensive support for GitLab as a source type throughout the application. It adds new classes for GitLab API interaction, configuration parsing, file discovery, and content fetching. The JSON schema and documentation are updated to reflect GitLab integration, including configuration examples and schema extensions. The application kernel and bootloader are modified to register the new GitLab components. Supporting files and interfaces are created for server configuration, source modeling, and file info handling. Documentation files are updated to include GitLab examples and references, while configuration and prompt files are adjusted to accommodate the new source type.

Changes

File(s) Change Summary
README.md Improved clarity, formatting, and structure; added Table of Contents; updated supported sources to include GitLab; revised examples and removed outdated sections.
context.yaml, src/Lib/context.yaml, src/Source/context.yaml Removed variables and tools sections; added import for prompts.yaml; added document entries for GitLab Client and GitLab source implementation.
docs/example/config/gitlab-source-example.yaml Added a comprehensive YAML example for configuring GitLab sources, including server settings and multiple document scenarios.
json-schema.json Extended schema to support GitLab as a source type, adding settings, source definitions, and server configuration options.
prompts.yaml Added a new prompt for updating JSON Schema to support new features.
src/Application/Bootloader/GitlabClientBootloader.php, src/Application/Kernel.php Registered GitLab bootloaders in the application kernel and bootloader system.
src/Lib/GitlabClient/GitlabClient.php, src/Lib/GitlabClient/GitlabClientInterface.php Introduced GitLab client implementation and interface for API interactions, including methods for retrieving repository and file contents.
src/Lib/GitlabClient/Model/GitlabRepository.php Added a readonly class modeling a GitLab repository with URL encoding and utility methods.
src/Source/Gitlab/Config/GitlabServerParserPlugin.php, src/Source/Gitlab/Config/ServerConfig.php, src/Source/Gitlab/Config/ServerRegistry.php Added classes for parsing, representing, and managing GitLab server configurations.
src/Source/Gitlab/GitlabFileInfo.php Added a class to encapsulate GitLab file metadata and content retrieval.
src/Source/Gitlab/GitlabFinder.php Implemented a finder class to discover and filter files in a GitLab repository according to source configuration.
src/Source/Gitlab/GitlabSource.php Introduced a source class modeling GitLab repository sources with filtering, modifiers, and serialization.
src/Source/Gitlab/GitlabSourceBootloader.php Added a bootloader to register GitLab source fetcher, server registry, and config parser plugin.
src/Source/Gitlab/GitlabSourceFactory.php Added a factory for creating GitLabSource instances from configuration arrays, with validation and normalization.
src/Source/Gitlab/GitlabSourceFetcher.php Implemented a fetcher to retrieve, process, and format content from GitLab sources, applying modifiers and formatting.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AppKernel
    participant GitlabSourceFetcher
    participant GitlabFinder
    participant GitlabClient
    participant GitlabRepository

    User->>AppKernel: Start application
    AppKernel->>GitlabClientBootloader: Register GitLab client singleton
    AppKernel->>GitlabSourceBootloader: Register GitLab source components

    User->>GitlabSourceFetcher: Fetch content from GitLab source
    GitlabSourceFetcher->>GitlabFinder: Find files in repository
    GitlabFinder->>GitlabClient: Get repository contents (API call)
    GitlabClient->>GitlabRepository: Build API URL and fetch data
    GitlabClient-->>GitlabFinder: Return file metadata
    GitlabFinder-->>GitlabSourceFetcher: Return filtered file list

    loop For each file
        GitlabSourceFetcher->>GitlabClient: Get file content (API call)
        GitlabClient-->>GitlabSourceFetcher: Return file content
        GitlabSourceFetcher->>GitlabSourceFetcher: Apply modifiers and format
    end

    GitlabSourceFetcher-->>User: Return formatted content
Loading

Poem

🐰
GitLab hops into our code,
With YAML paths and schema mode.
New classes bloom, the docs expand,
Now GitLab sources lend a hand!
Through APIs and filters neat,
Our context grows, robust—complete!
Hooray for every merge and fetch—
The rabbit’s dream is now well etched.

Tip

⚡💬 Agentic Chat (Pro Plan, General Availability)
  • We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 189e609 and 6ccf5d4.

📒 Files selected for processing (21)
  • README.md (7 hunks)
  • context.yaml (1 hunks)
  • docs/example/config/gitlab-source-example.yaml (1 hunks)
  • json-schema.json (4 hunks)
  • prompts.yaml (1 hunks)
  • src/Application/Bootloader/GitlabClientBootloader.php (1 hunks)
  • src/Application/Kernel.php (4 hunks)
  • src/Lib/GitlabClient/GitlabClient.php (1 hunks)
  • src/Lib/GitlabClient/GitlabClientInterface.php (1 hunks)
  • src/Lib/GitlabClient/Model/GitlabRepository.php (1 hunks)
  • src/Lib/context.yaml (1 hunks)
  • src/Source/Gitlab/Config/GitlabServerParserPlugin.php (1 hunks)
  • src/Source/Gitlab/Config/ServerConfig.php (1 hunks)
  • src/Source/Gitlab/Config/ServerRegistry.php (1 hunks)
  • src/Source/Gitlab/GitlabFileInfo.php (1 hunks)
  • src/Source/Gitlab/GitlabFinder.php (1 hunks)
  • src/Source/Gitlab/GitlabSource.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceBootloader.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceFactory.php (1 hunks)
  • src/Source/Gitlab/GitlabSourceFetcher.php (1 hunks)
  • src/Source/context.yaml (1 hunks)
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@butschster butschster merged commit 2eceaaf into main Apr 18, 2025
6 of 7 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Context Generator Apr 18, 2025
@butschster butschster deleted the issue/186 branch April 18, 2025 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sources New source type for data retrieval
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

GitLab Source Implementation
1 participant