Set up your self-hosted model infrastructure

Tier: For a limited time, Ultimate. On October 17, 2024, Ultimate with GitLab Duo Enterprise.
Offering: Self-managed
Status: Beta
The availability of this feature is controlled by a feature flag. For more information, see the history.

When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external architecture, so requests and responses stay within infrastructure that you control.

To set up your self-hosted model infrastructure:

  1. Install the large language model (LLM) serving infrastructure.
  2. Configure your GitLab instance.
  3. Install the GitLab AI Gateway.

Install large language model serving infrastructure

Install one of the following GitLab-approved LLMs:

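| Model family | Model                                  | Code completion | Code generation | GitLab Duo Chat |
|--------------|----------------------------------------|-----------------|-----------------|-----------------|
| Mistral      | Codestral 22B (see setup instructions) | Yes             | Yes             | No              |
| Mistral      | Mistral 7B                             | No              | Yes             | Yes             |
| Mistral      | Mixtral 8x22B                          | No              | Yes             | Yes             |
| Mistral      | Mixtral 8x7B                           | No              | Yes             | Yes             |
| Mistral      | Mistral 7B Text                        | Yes             | No              | No              |
| Mistral      | Mixtral 8x22B Text                     | Yes             | No              | No              |
| Mistral      | Mixtral 8x7B Text                      | Yes             | No              | No              |
| Claude 3     | Claude 3.5 Sonnet                      | No              | Yes             | Yes             |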
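
To pick a model for a particular feature, the table above can be queried mechanically. The following is a minimal sketch, not a GitLab tool: the function names `approved_models` and `models_for_feature` and the feature keywords `completion`, `generation`, and `chat` are hypothetical, and the data is copied from the table of approved models.

```shell
#!/bin/sh
# Approved models copied from the table in this section.
# Columns: model | code completion | code generation | GitLab Duo Chat
approved_models() {
  cat <<'EOF'
Codestral 22B|Yes|Yes|No
Mistral 7B|No|Yes|Yes
Mixtral 8x22B|No|Yes|Yes
Mixtral 8x7B|No|Yes|Yes
Mistral 7B Text|Yes|No|No
Mixtral 8x22B Text|Yes|No|No
Mixtral 8x7B Text|Yes|No|No
Claude 3.5 Sonnet|No|Yes|Yes
EOF
}

# Print the approved models that support a feature.
# Hypothetical feature keywords: completion, generation, chat.
models_for_feature() {
  case "$1" in
    completion) col=2 ;;
    generation) col=3 ;;
    chat)       col=4 ;;
    *) echo "unknown feature: $1" >&2; return 2 ;;
  esac
  # Keep only the rows whose selected column is "Yes" and print the model name.
  approved_models | awk -F'|' -v col="$col" '$col == "Yes" { print $1 }'
}
```

For example, `models_for_feature chat` lists the four approved models that support GitLab Duo Chat.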

The following models are under evaluation, and support is limited:

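| Model family  | Model                           | Code completion | Code generation | GitLab Duo Chat |
|---------------|---------------------------------|-----------------|-----------------|-----------------|
| CodeGemma     | CodeGemma 2b                    | Yes             | No              | No              |
| CodeGemma     | CodeGemma 7b-it (Instruction)   | No              | Yes             | No              |
| CodeGemma     | CodeGemma 7b-code (Code)        | Yes             | No              | No              |
| CodeLlama     | Code-Llama 13b-code             | Yes             | No              | No              |
| CodeLlama     | Code-Llama 13b                  | No              | Yes             | No              |
| DeepSeekCoder | DeepSeek Coder 33b Instruct     | Yes             | Yes             | No              |
| DeepSeekCoder | DeepSeek Coder 33b Base         | Yes             | No              | No              |
| GPT           | GPT-3.5-Turbo                   | No              | Yes             | No              |
| GPT           | GPT-4                           | No              | Yes             | No              |
| GPT           | GPT-4 Turbo                     | No              | Yes             | No              |
| GPT           | GPT-4o                          | No              | Yes             | No              |
| GPT           | GPT-4o-mini                     | No              | Yes             | No              |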

Use a serving architecture

To host your models, you should use:

  • For non-cloud, on-premises deployments, vLLM.
  • For cloud deployments, AWS Bedrock or Azure as the cloud provider.

Configure your GitLab instance

Prerequisites:

  • Upgrade to the latest version of GitLab.

  1. The GitLab instance must be able to access the AI Gateway.

    1. Where your GitLab instance is installed, update the /etc/gitlab/gitlab.rb file.

      sudo vim /etc/gitlab/gitlab.rb
      
    2. Add and save the following environment variables.

      gitlab_rails['env'] = {
        'GITLAB_LICENSE_MODE' => 'production',
        'CUSTOMER_PORTAL_URL' => 'https://customers.gitlab.com',
        'AI_GATEWAY_URL' => '<path_to_your_ai_gateway>:<port>'
      }

    3. Run reconfigure:

      sudo gitlab-ctl reconfigure
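
After you edit the file, it can help to confirm that all three variables from step 2 are present. The following is a minimal sketch, not part of GitLab: `check_ai_gateway_env` is a hypothetical helper that assumes one `'KEY' => ...` entry per line, as in the example above. It checks key names only. It does not validate the values or confirm that the AI Gateway is reachable.

```shell
#!/bin/sh
# Hypothetical helper: report which of the variables from step 2 are
# missing from a gitlab.rb-style file. Checks key names only, not values.
check_ai_gateway_env() {
  file="$1"
  missing=0
  for key in GITLAB_LICENSE_MODE CUSTOMER_PORTAL_URL AI_GATEWAY_URL; do
    # Match uncommented lines such as:   'AI_GATEWAY_URL' => '...'
    if ! grep -Eq "^[[:space:]]*'${key}'[[:space:]]*=>" "$file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all variables present"
  fi
  # Exit status 0 when every key was found, 1 otherwise.
  return "$missing"
}

# Usage (path taken from step 1; reading it may require sudo):
#   check_ai_gateway_env /etc/gitlab/gitlab.rb
```

The function returns a non-zero status when a key is missing, so it can be used as a guard before `sudo gitlab-ctl reconfigure`.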
      

GitLab AI Gateway

Install the GitLab AI Gateway.

Enable logging

Prerequisites:

  • You must be an administrator for your self-managed instance.

To enable logging and access the logs, enable the feature flag:

Feature.enable(:expanded_ai_logging)

Disabling the feature flag stops logs from being written.
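
The `Feature.enable` call is Ruby and runs in the context of the GitLab Rails application. One way to run it from a shell is `gitlab-rails runner`. The following is a sketch under the assumption of a Linux package installation: `set_expanded_ai_logging` and the `DRY_RUN` variable are hypothetical names, and with `DRY_RUN=1` the function only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around `gitlab-rails runner` for toggling the flag.
# Assumes a Linux package installation where gitlab-rails is on the PATH.
set_expanded_ai_logging() {
  case "$1" in
    on)  ruby='Feature.enable(:expanded_ai_logging)' ;;
    off) ruby='Feature.disable(:expanded_ai_logging)' ;;
    *)   echo "usage: set_expanded_ai_logging on|off" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "sudo gitlab-rails runner \"$ruby\""
  else
    sudo gitlab-rails runner "$ruby"
  fi
}
```

Running `set_expanded_ai_logging off` disables the flag again, which stops logs from being written.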

Logs in your GitLab installation

In your instance log directory, a file called llm.log is populated.

Logs in your AI Gateway container

To specify the location of logs generated by AI Gateway, run:

docker run -e AIGW_GITLAB_URL=<your_gitlab_instance> \
 -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
 -e AIGW_LOGGING__TO_FILE="aigateway.log" \
 -v <your_file_path>:"aigateway.log" \
 <image>

If you do not specify a file name, logs are streamed to the output.

The output of the AI Gateway process can also be useful for debugging. To access it:

  • When using Docker:

    docker logs <container-id>
    
  • When using Kubernetes:

    kubectl logs <pod-name>
    

To ingest these logs into your logging solution, see your logging provider's documentation.

Logs in your inference service provider

GitLab does not manage logs generated by your inference service provider. For information on how to use these logs, see your inference service provider's documentation.

Cross-referencing logs between AI Gateway and GitLab

The property correlation_id is assigned to every request and is carried across different components that respond to a request. For more information, see the documentation on finding logs with a correlation ID.

The correlation ID is not available in your model provider's logs.
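
Because the correlation ID is carried across the components that handle a request, a plain text search is often enough to line up the GitLab and AI Gateway entries for that request. The following is a minimal sketch: `find_by_correlation_id` is a hypothetical helper, the file paths in the usage comment are placeholders, and it assumes that the ID appears verbatim in each relevant log line.

```shell
#!/bin/sh
# Hypothetical helper: print every line that contains the given correlation
# ID, prefixed with the name of the log file it came from.
find_by_correlation_id() {
  cid="$1"
  shift
  # -F: treat the ID as a fixed string, -H: always print the file name.
  grep -F -H -- "$cid" "$@"
}

# Usage (placeholder paths):
#   find_by_correlation_id <correlation_id> <path_to>/llm.log <path_to>/aigateway.log
```

Like `grep`, the function returns a non-zero status when no line matches, which makes it easy to detect a request that never reached one of the components.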

Troubleshooting

First, run the debugging scripts to verify your self-hosted model setup.

For more information on other actions to take, see the troubleshooting documentation.