LLM Configuration
Before getting started with Rubra, configure which models you want Rubra to have access to by editing the `llm-config.yaml` file.
We currently use LiteLLM as the chat completions server. This may change in the future.
The following models are currently supported:
- OpenAI
  - GPT-4-turbo (gpt-4-1106-preview)
- Anthropic
  - claude-2.1
- Local Models
  - See Local LLM Deployment for more information
  - Must be named `openai/custom` (a quick connectivity check is sketched after this list)
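
If you plan to use a local model, it helps to confirm the local server is reachable before adding it to `llm-config.yaml`. The following is a minimal sketch, assuming an LM Studio or llama.cpp server exposing an OpenAI-compatible API on `http://localhost:1234/v1` (the same port used in the example config below); adjust the URL to match your setup.

```python
# Minimal sketch: confirm a local OpenAI-compatible server is reachable.
# Assumes LM Studio or llama.cpp is serving on http://localhost:1234/v1 --
# adjust base_url if your local server uses a different host or port.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="None")

# List the models the local server exposes. In llm-config.yaml this
# deployment is addressed through the model named openai/custom.
for model in client.models.list():
    print(model.id)
```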
This is what your config file should look like:
```yaml
model_list:
  - model_name: gpt-4-1106-preview
    litellm_params:
      model: gpt-4-1106-preview
      api_key: "OPENAI_API_KEY"
      custom_llm_provider: "openai"
  - model_name: claude-2.1
    litellm_params:
      model: claude-2.1
      api_key: "CLAUDE_API_KEY"
  # the following entry is for locally running LLMs deployed with LM Studio or llama.cpp
  - model_name: custom
    litellm_params:
      model: openai/custom
      api_base: "http://host.docker.internal:1234/v1" # host.docker.internal lets the Docker container reach your local machine (localhost)
      api_key: "None"
      custom_llm_provider: "openai"

litellm_settings:
  drop_params: True
  set_verbose: True
  cache: True

# For caching
environment_variables:
  REDIS_HOST: "redis"
  REDIS_PORT: "6379"
  REDIS_PASSWORD: ""
```
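
Because `cache: True` relies on the Redis instance referenced under `environment_variables`, a quick way to rule out connection problems is to ping it directly. This is a minimal sketch, assuming Redis is reachable from your machine on `localhost:6379`; inside the Docker network the hostname is `redis`, as configured above.

```python
# Minimal sketch: confirm the Redis instance used for LiteLLM caching is up.
# "localhost" is an assumption for running this from your own machine;
# containers on the Docker network would use the hostname "redis" instead.
import redis

r = redis.Redis(host="localhost", port=6379)  # add password=... if REDIS_PASSWORD is set
print(r.ping())  # True if Redis is reachable
```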
Edit the `model_list` to include the models you want to use; at least one model must be specified for Rubra to work.
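
Once at least one model is configured, chat completions are routed through the LiteLLM server. As a rough sketch of what a request looks like from a client's point of view, the example below uses the OpenAI Python SDK against an assumed proxy address of `http://localhost:8000/v1`; the actual host, port, and API key depend on how your Rubra deployment exposes the LiteLLM service.

```python
# Hedged sketch: send a chat completion request through the LiteLLM proxy.
# The base_url and api_key below are assumptions -- substitute the address
# and key (if any) that your Rubra deployment actually uses.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # must match a model_name from llm-config.yaml
    messages=[{"role": "user", "content": "Hello from Rubra!"}],
)
print(response.choices[0].message.content)
```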
![Architecture Diagram](/img/llm-config.svg)