language_models

`LLMConfig` ¶

Bases: BaseSettings

Common configuration for all language models.

`LLMMessage` ¶

Bases: BaseModel

Class representing an entry in the msg-history sent to the LLM API. It could be one of these: - a user message - an LLM ("Assistant") response - a fn-call or tool-call-list from an OpenAI-compatible LLM API response - a result or results from executing a fn or tool-call(s)

`api_dict(model, has_system_role=True)` ¶

Convert to dictionary for API request, keeping ONLY the fields that are expected in an API call! E.g., DROP the tool_id, since it is only for use in the Assistant API, not the completion API.

Parameters:

Name	Type	Description	Default
`has_system_role`	`bool`	whether the message has a system role (if not, set to "user" role)	`True`

Returns: dict: dictionary representation of LLM message

Source code in langroid/language_models/base.py

def api_dict(self, model: str, has_system_role: bool = True) -> Dict[str, Any]:
    """
    Convert to dictionary for API request, keeping ONLY
    the fields that are expected in an API call!
    E.g., DROP the tool_id, since it is only for use in the Assistant API,
        not the completion API.

    Args:
        has_system_role: whether the message has a system role (if not,
            set to "user" role)
    Returns:
        dict: dictionary representation of LLM message
    """
    d = self.model_dump()
    files: List[FileAttachment] = d.pop("files")
    if len(files) > 0 and self.role == Role.USER:
        # In there are files, then content is an array of
        # different content-parts
        d["content"] = [
            dict(
                type="text",
                text=self.content,
            )
        ] + [f.to_dict(model) for f in self.files]

    # if there is a key k = "role" with value "system", change to "user"
    # in case has_system_role is False
    if not has_system_role and "role" in d and d["role"] == "system":
        d["role"] = "user"
        if "content" in d:
            d["content"] = "[ADDITIONAL SYSTEM MESSAGE:]\n\n" + d["content"]
    # drop None values since API doesn't accept them
    dict_no_none = {k: v for k, v in d.items() if v is not None}
    if "name" in dict_no_none and dict_no_none["name"] == "":
        # OpenAI API does not like empty name
        del dict_no_none["name"]
    if "function_call" in dict_no_none:
        # arguments must be a string
        if "arguments" in dict_no_none["function_call"]:
            dict_no_none["function_call"]["arguments"] = json.dumps(
                dict_no_none["function_call"]["arguments"]
            )
    if "tool_calls" in dict_no_none:
        # convert tool calls to API format
        for tc in dict_no_none["tool_calls"]:
            if "arguments" in tc["function"]:
                # arguments must be a string
                tc["function"]["arguments"] = json.dumps(
                    tc["function"]["arguments"]
                )
    # IMPORTANT! drop fields that are not expected in API call
    dict_no_none.pop("tool_id", None)
    dict_no_none.pop("timestamp", None)
    dict_no_none.pop("chat_document_id", None)
    return dict_no_none

`LLMFunctionCall` ¶

Bases: BaseModel

Structure of LLM response indicating it "wants" to call a function. Modeled after OpenAI spec for function_call field in ChatCompletion API.

`from_dict(message)` `staticmethod` ¶

Initialize from dictionary. Args: d: dictionary containing fields to initialize

Source code in langroid/language_models/base.py

@staticmethod
def from_dict(message: Dict[str, Any]) -> "LLMFunctionCall":
    """
    Initialize from dictionary.
    Args:
        d: dictionary containing fields to initialize
    """
    fun_call = LLMFunctionCall(name=message["name"])
    fun_args_str = message["arguments"]
    # sometimes may be malformed with invalid indents,
    # so we try to be safe by removing newlines.
    if fun_args_str is not None:
        fun_args_str = fun_args_str.replace("\n", "").strip()
        dict_or_list = parse_imperfect_json(fun_args_str)

        if not isinstance(dict_or_list, dict):
            raise ValueError(
                f"""
                    Invalid function args: {fun_args_str}
                    parsed as {dict_or_list},
                    which is not a valid dict.
                    """
            )
        fun_args = dict_or_list
    else:
        fun_args = None
    fun_call.arguments = fun_args

    return fun_call

`LLMFunctionSpec` ¶

Bases: BaseModel

Description of a function available for the LLM to use. To be used when calling the LLM chat() method with the functions parameter. Modeled after OpenAI spec for functions fields in ChatCompletion API.

`Role` ¶

Bases: str, Enum

Possible roles for a message in a chat.

`LLMTokenUsage` ¶

Bases: BaseModel

Usage of tokens by an LLM.

`LLMResponse` ¶

Bases: BaseModel

Class representing response from LLM.

`to_LLMMessage()` ¶

Convert LLM response to an LLMMessage, to be included in the message-list sent to the API. This is currently NOT used in any significant way in the library, and is only provided as a utility to construct a message list for the API when directly working with an LLM object.

In a ChatAgent, an LLM response is first converted to a ChatDocument, which is in turn converted to an LLMMessage via ChatDocument.to_LLMMessage() See ChatAgent._prep_llm_messages() and ChatAgent.llm_response_messages

Source code in langroid/language_models/base.py

def to_LLMMessage(self) -> LLMMessage:
    """Convert LLM response to an LLMMessage, to be included in the
    message-list sent to the API.
    This is currently NOT used in any significant way in the library, and is only
    provided as a utility to construct a message list for the API when directly
    working with an LLM object.

    In a `ChatAgent`, an LLM response is first converted to a ChatDocument,
    which is in turn converted to an LLMMessage via `ChatDocument.to_LLMMessage()`
    See `ChatAgent._prep_llm_messages()` and `ChatAgent.llm_response_messages`
    """
    return LLMMessage(
        role=Role.ASSISTANT,
        content=self.message,
        name=None if self.function_call is None else self.function_call.name,
        function_call=self.function_call,
        tool_calls=self.oai_tool_calls,
    )

`get_recipient_and_message()` ¶

If message or function_call of an LLM response contains an explicit recipient name, return this recipient name and message stripped of the recipient name if specified.

Two cases: (a) message contains addressing string "TO: ", or (b) message is empty and function_call/tool_call with explicit recipient

Returns:

Type	Description
`str`	name of recipient, which may be empty string if no recipient
`str`	content of message

Source code in langroid/language_models/base.py

def get_recipient_and_message(
    self,
) -> Tuple[str, str]:
    """
    If `message` or `function_call` of an LLM response contains an explicit
    recipient name, return this recipient name and `message` stripped
    of the recipient name if specified.

    Two cases:
    (a) `message` contains addressing string "TO: <name> <content>", or
    (b) `message` is empty and function_call/tool_call with explicit `recipient`


    Returns:
        (str): name of recipient, which may be empty string if no recipient
        (str): content of message

    """

    if self.function_call is not None:
        # in this case we ignore message, since all information is in function_call
        msg = ""
        args = self.function_call.arguments
        recipient = ""
        if isinstance(args, dict):
            recipient = args.get("recipient", "")
        return recipient, msg
    else:
        msg = self.message
        if self.oai_tool_calls is not None:
            # get the first tool that has a recipient field, if any
            for tc in self.oai_tool_calls:
                if tc.function is not None and tc.function.arguments is not None:
                    recipient = tc.function.arguments.get(
                        "recipient"
                    )  # type: ignore
                    if recipient is not None and recipient != "":
                        return recipient, ""

    # It's not a function or tool call, so continue looking to see
    # if a recipient is specified in the message.

    # First check if message contains "TO: <recipient> <content>"
    recipient_name, content = parse_message(msg) if msg is not None else ("", "")
    # check if there is a top level json that specifies 'recipient',
    # and retain the entire message as content.
    if recipient_name == "":
        recipient_name = top_level_json_field(msg, "recipient") if msg else ""
        content = msg
    return recipient_name, content

`OpenAIChatModel` ¶

Bases: ModelName

Enum for OpenAI Chat models

`AnthropicModel` ¶

Bases: ModelName

Enum for Anthropic models

`GeminiModel` ¶

Bases: ModelName

Enum for Gemini models

`OpenAICompletionModel` ¶

Bases: str, Enum

Enum for OpenAI Completion models

`OpenAIGPTConfig(**kwargs)` ¶

Bases: LLMConfig

Class for any LLM with an OpenAI-like API: besides the OpenAI models this includes: (a) locally-served models behind an OpenAI-compatible API (b) non-local models, using a proxy adaptor lib like litellm that provides an OpenAI-compatible API. (We could rename this class to OpenAILikeConfig, but we keep it as-is for now)

Important Note: Due to the env_prefix = "OPENAI_" defined below, all of the fields below can be set AND OVERRIDDEN via env vars,

by upper-casing the name and prefixing with OPENAI_, e.g.¶

OPENAI_MAX_OUTPUT_TOKENS=1000.¶

If any of these is defined in this way in the environment¶

(either via explicit setenv or export or via .env file + load_dotenv()),¶

the environment variable takes precedence over the value in the config.¶

Source code in langroid/language_models/openai_gpt.py

def __init__(self, **kwargs) -> None:  # type: ignore
    local_model = "api_base" in kwargs and kwargs["api_base"] is not None

    chat_model = kwargs.get("chat_model", "")
    local_prefixes = ["local/", "litellm/", "ollama/", "vllm/", "llamacpp/"]
    if any(chat_model.startswith(prefix) for prefix in local_prefixes):
        local_model = True

    warn_gpt_3_5 = (
        "chat_model" not in kwargs.keys()
        and not local_model
        and default_openai_chat_model == OpenAIChatModel.GPT3_5_TURBO
    )

    if warn_gpt_3_5:
        existing_hook = kwargs.get("run_on_first_use", noop)

        def with_warning() -> None:
            existing_hook()
            gpt_3_5_warning()

        kwargs["run_on_first_use"] = with_warning

    super().__init__(**kwargs)

`model_copy(*, update=None, deep=False)` ¶

Copy config while preserving nested model instances and subclasses.

Important: Avoid reconstructing via model_dump as that coerces nested models to their annotated base types (dropping subclass-only fields). Instead, defer to Pydantic's native model_copy, which keeps nested BaseModel instances (and their concrete subclasses) intact.

Source code in langroid/language_models/openai_gpt.py

def model_copy(
    self, *, update: Mapping[str, Any] | None = None, deep: bool = False
) -> "OpenAIGPTConfig":
    """
    Copy config while preserving nested model instances and subclasses.

    Important: Avoid reconstructing via `model_dump` as that coerces nested
    models to their annotated base types (dropping subclass-only fields).
    Instead, defer to Pydantic's native `model_copy`, which keeps nested
    `BaseModel` instances (and their concrete subclasses) intact.
    """
    # Delegate to BaseSettings/BaseModel implementation to preserve types
    return super().model_copy(update=update, deep=deep)  # type: ignore[return-value]

`create(prefix)` `classmethod` ¶

Create a config class whose params can be set via a desired prefix from the .env file or env vars. E.g., using

OllamaConfig = OpenAIGPTConfig.create("ollama")
ollama_config = OllamaConfig()

you can have a group of params prefixed by "OLLAMA_", to be used with models served via ollama. This way, you can maintain several setting-groups in your .env file, one per model type.

Source code in langroid/language_models/openai_gpt.py

@classmethod
def create(cls, prefix: str) -> Type["OpenAIGPTConfig"]:
    """Create a config class whose params can be set via a desired
    prefix from the .env file or env vars.
    E.g., using
    ```python
    OllamaConfig = OpenAIGPTConfig.create("ollama")
    ollama_config = OllamaConfig()
    ```
    you can have a group of params prefixed by "OLLAMA_", to be used
    with models served via `ollama`.
    This way, you can maintain several setting-groups in your .env file,
    one per model type.
    """

    class DynamicConfig(OpenAIGPTConfig):
        pass

    DynamicConfig.model_config = SettingsConfigDict(env_prefix=prefix.upper() + "_")
    return DynamicConfig

`OpenAIGPT(config=OpenAIGPTConfig())` ¶

Bases: LanguageModel

Class for OpenAI LLMs

Source code in langroid/language_models/openai_gpt.py

def __init__(self, config: OpenAIGPTConfig = OpenAIGPTConfig()):
    """
    Args:
        config: configuration for openai-gpt model
    """
    # copy the config to avoid modifying the original; deep to decouple
    # nested models while preserving their concrete subclasses
    config = config.model_copy(deep=True)
    super().__init__(config)
    self.config: OpenAIGPTConfig = config
    # save original model name such as `provider/model` before
    # we strip out the `provider` - we retain the original in
    # case some params are specific to a provider.
    self.chat_model_orig = self.config.chat_model

    # Run the first time the model is used
    self.run_on_first_use = cache(self.config.run_on_first_use)

    # global override of chat_model,
    # to allow quick testing with other models
    if settings.chat_model != "":
        self.config.chat_model = settings.chat_model
        self.chat_model_orig = settings.chat_model
        self.config.completion_model = settings.chat_model

    if len(parts := self.config.chat_model.split("//")) > 1:
        # there is a formatter specified, e.g.
        # "litellm/ollama/mistral//hf" or
        # "local/localhost:8000/v1//mistral-instruct-v0.2"
        formatter = parts[1]
        self.config.chat_model = parts[0]
        if formatter == "hf":
            # e.g. "litellm/ollama/mistral//hf" -> "litellm/ollama/mistral"
            formatter = find_hf_formatter(self.config.chat_model)
            if formatter != "":
                # e.g. "mistral"
                self.config.formatter = formatter
                logging.warning(
                    f"""
                    Using completions (not chat) endpoint with HuggingFace
                    chat_template for {formatter} for
                    model {self.config.chat_model}
                    """
                )
        else:
            # e.g. "local/localhost:8000/v1//mistral-instruct-v0.2"
            self.config.formatter = formatter

    if self.config.formatter is not None:
        self.config.hf_formatter = HFFormatter(
            HFPromptFormatterConfig(model_name=self.config.formatter)
        )

    self.supports_json_schema: bool = self.config.supports_json_schema or False
    self.supports_strict_tools: bool = self.config.supports_strict_tools or False

    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", DUMMY_API_KEY)
    self.api_key = config.api_key

    # if model name starts with "litellm",
    # set the actual model name by stripping the "litellm/" prefix
    # and set the litellm flag to True
    if self.config.chat_model.startswith("litellm/") or self.config.litellm:
        # e.g. litellm/ollama/mistral
        self.config.litellm = True
        self.api_base = self.config.api_base
        if self.config.chat_model.startswith("litellm/"):
            # strip the "litellm/" prefix
            # e.g. litellm/ollama/llama2 => ollama/llama2
            self.config.chat_model = self.config.chat_model.split("/", 1)[1]
    elif self.config.chat_model.startswith("local/"):
        # expect this to be of the form "local/localhost:8000/v1",
        # depending on how the model is launched locally.
        # In this case the model served locally behind an OpenAI-compatible API
        # so we can just use `openai.*` methods directly,
        # and don't need a adaptor library like litellm
        self.config.litellm = False
        self.config.seed = None  # some models raise an error when seed is set
        # Extract the api_base from the model name after the "local/" prefix
        self.api_base = self.config.chat_model.split("/", 1)[1]
        if not self.api_base.startswith("http"):
            self.api_base = "http://" + self.api_base
    elif self.config.chat_model.startswith("ollama/"):
        self.config.ollama = True

        # use api_base from config if set, else fall back on OLLAMA_BASE_URL
        self.api_base = self.config.api_base or OLLAMA_BASE_URL
        if self.api_key == OPENAI_API_KEY:
            self.api_key = OLLAMA_API_KEY
        self.config.chat_model = self.config.chat_model.replace("ollama/", "")
    elif self.config.chat_model.startswith("vllm/"):
        self.supports_json_schema = True
        self.config.chat_model = self.config.chat_model.replace("vllm/", "")
        if self.api_key == OPENAI_API_KEY:
            self.api_key = os.environ.get("VLLM_API_KEY", DUMMY_API_KEY)
        self.api_base = self.config.api_base or "http://localhost:8000/v1"
        if not self.api_base.startswith("http"):
            self.api_base = "http://" + self.api_base
        if not self.api_base.endswith("/v1"):
            self.api_base = self.api_base + "/v1"
    elif self.config.chat_model.startswith("llamacpp/"):
        self.supports_json_schema = True
        self.api_base = self.config.chat_model.split("/", 1)[1]
        if not self.api_base.startswith("http"):
            self.api_base = "http://" + self.api_base
        if self.api_key == OPENAI_API_KEY:
            self.api_key = os.environ.get("LLAMA_API_KEY", DUMMY_API_KEY)
    else:
        self.api_base = self.config.api_base
        # If api_base is unset we use OpenAI's endpoint, which supports
        # these features (with JSON schema restricted to a limited set of models)
        self.supports_strict_tools = self.api_base is None
        self.supports_json_schema = (
            self.api_base is None and self.info().has_structured_output
        )

    if settings.chat_model != "":
        # if we're overriding chat model globally, set completion model to same
        self.config.completion_model = self.config.chat_model

    if self.config.formatter is not None:
        # we want to format chats -> completions using this specific formatter
        self.config.use_completion_for_chat = True
        self.config.completion_model = self.config.chat_model

    if self.config.use_completion_for_chat:
        self.config.use_chat_for_completion = False

    self.is_groq = self.config.chat_model.startswith("groq/")
    self.is_cerebras = self.config.chat_model.startswith("cerebras/")
    self.is_gemini = self.is_gemini_model()
    self.is_deepseek = self.is_deepseek_model()
    self.is_glhf = self.config.chat_model.startswith("glhf/")
    self.is_openrouter = self.config.chat_model.startswith("openrouter/")
    self.is_langdb = self.config.chat_model.startswith("langdb/")
    self.is_portkey = self.config.chat_model.startswith("portkey/")
    self.is_litellm_proxy = self.config.chat_model.startswith("litellm-proxy/")

    if self.is_groq:
        # use groq-specific client
        self.config.chat_model = self.config.chat_model.replace("groq/", "")
        if self.api_key == OPENAI_API_KEY:
            self.api_key = os.getenv("GROQ_API_KEY", DUMMY_API_KEY)
        if self.config.use_cached_client:
            self.client = get_groq_client(api_key=self.api_key)
            self.async_client = get_async_groq_client(api_key=self.api_key)
        else:
            # Create new clients without caching
            self.client = Groq(api_key=self.api_key)
            self.async_client = AsyncGroq(api_key=self.api_key)
    elif self.is_cerebras:
        # use cerebras-specific client
        self.config.chat_model = self.config.chat_model.replace("cerebras/", "")
        if self.api_key == OPENAI_API_KEY:
            self.api_key = os.getenv("CEREBRAS_API_KEY", DUMMY_API_KEY)
        if self.config.use_cached_client:
            self.client = get_cerebras_client(api_key=self.api_key)
            # TODO there is not async client, so should we do anything here?
            self.async_client = get_async_cerebras_client(api_key=self.api_key)
        else:
            # Create new clients without caching
            self.client = Cerebras(api_key=self.api_key)
            self.async_client = AsyncCerebras(api_key=self.api_key)
    else:
        # in these cases, there's no specific client: OpenAI python client suffices
        if self.is_litellm_proxy:
            self.config.chat_model = self.config.chat_model.replace(
                "litellm-proxy/", ""
            )
            if self.api_key == OPENAI_API_KEY:
                self.api_key = self.config.litellm_proxy.api_key or self.api_key
            self.api_base = self.config.litellm_proxy.api_base or self.api_base
        elif self.is_gemini:
            self.config.chat_model = self.config.chat_model.replace("gemini/", "")
            if self.api_key == OPENAI_API_KEY:
                self.api_key = os.getenv("GEMINI_API_KEY", DUMMY_API_KEY)
            self.api_base = GEMINI_BASE_URL
        elif self.is_glhf:
            self.config.chat_model = self.config.chat_model.replace("glhf/", "")
            if self.api_key == OPENAI_API_KEY:
                self.api_key = os.getenv("GLHF_API_KEY", DUMMY_API_KEY)
            self.api_base = GLHF_BASE_URL
        elif self.is_openrouter:
            self.config.chat_model = self.config.chat_model.replace(
                "openrouter/", ""
            )
            if self.api_key == OPENAI_API_KEY:
                self.api_key = os.getenv("OPENROUTER_API_KEY", DUMMY_API_KEY)
            self.api_base = OPENROUTER_BASE_URL
        elif self.is_deepseek:
            self.config.chat_model = self.config.chat_model.replace("deepseek/", "")
            self.api_base = DEEPSEEK_BASE_URL
            if self.api_key == OPENAI_API_KEY:
                self.api_key = os.getenv("DEEPSEEK_API_KEY", DUMMY_API_KEY)
        elif self.is_langdb:
            self.config.chat_model = self.config.chat_model.replace("langdb/", "")
            self.api_base = self.config.langdb_params.base_url
            project_id = self.config.langdb_params.project_id
            if project_id:
                self.api_base += "/" + project_id + "/v1"
            if self.api_key == OPENAI_API_KEY:
                self.api_key = self.config.langdb_params.api_key or DUMMY_API_KEY

            if self.config.langdb_params:
                params = self.config.langdb_params
                if params.project_id:
                    self.config.headers["x-project-id"] = params.project_id
                if params.label:
                    self.config.headers["x-label"] = params.label
                if params.run_id:
                    self.config.headers["x-run-id"] = params.run_id
                if params.thread_id:
                    self.config.headers["x-thread-id"] = params.thread_id
        elif self.is_portkey:
            # Parse the model string and extract provider/model
            provider, model = self.config.portkey_params.parse_model_string(
                self.config.chat_model
            )
            self.config.chat_model = model
            if provider:
                self.config.portkey_params.provider = provider

            # Set Portkey base URL
            self.api_base = self.config.portkey_params.base_url + "/v1"

            # Set API key - use provider's API key from env if available
            if self.api_key == OPENAI_API_KEY:
                self.api_key = self.config.portkey_params.get_provider_api_key(
                    self.config.portkey_params.provider, DUMMY_API_KEY
                )

            # Add Portkey-specific headers
            self.config.headers.update(self.config.portkey_params.get_headers())

        # Create http_client if needed - Priority order:
        # 1. http_client_factory (most flexibility, not cacheable)
        # 2. http_client_config (cacheable, moderate flexibility)
        # 3. http_verify_ssl=False (cacheable, simple SSL bypass)
        http_client = None
        async_http_client = None
        http_client_config_used = None

        if self.config.http_client_factory is not None:
            # Use the factory to create http_client (not cacheable)
            http_client = self.config.http_client_factory()
            # Don't set async_http_client from sync client - create separately
            # This avoids type mismatch issues
            async_http_client = None
        elif self.config.http_client_config is not None:
            # Use config dict (cacheable)
            http_client_config_used = self.config.http_client_config
        elif not self.config.http_verify_ssl:
            # Simple SSL bypass (cacheable)
            http_client_config_used = {"verify": False}
            logging.warning(
                "SSL verification has been disabled. This is insecure and "
                "should only be used in trusted environments (e.g., "
                "corporate networks with self-signed certificates)."
            )

        if self.config.use_cached_client:
            self.client = get_openai_client(
                api_key=self.api_key,
                base_url=self.api_base,
                organization=self.config.organization,
                timeout=Timeout(self.config.timeout),
                default_headers=self.config.headers,
                http_client=http_client,
                http_client_config=http_client_config_used,
            )
            self.async_client = get_async_openai_client(
                api_key=self.api_key,
                base_url=self.api_base,
                organization=self.config.organization,
                timeout=Timeout(self.config.timeout),
                default_headers=self.config.headers,
                http_client=async_http_client,
                http_client_config=http_client_config_used,
            )
        else:
            # Create new clients without caching
            client_kwargs: Dict[str, Any] = dict(
                api_key=self.api_key,
                base_url=self.api_base,
                organization=self.config.organization,
                timeout=Timeout(self.config.timeout),
                default_headers=self.config.headers,
            )
            if http_client is not None:
                client_kwargs["http_client"] = http_client
            elif http_client_config_used is not None:
                # Create http_client from config for non-cached scenario
                try:
                    from httpx import Client

                    client_kwargs["http_client"] = Client(**http_client_config_used)
                except ImportError:
                    raise ValueError(
                        "httpx is required to use http_client_config. "
                        "Install it with: pip install httpx"
                    )
            self.client = OpenAI(**client_kwargs)

            async_client_kwargs: Dict[str, Any] = dict(
                api_key=self.api_key,
                base_url=self.api_base,
                organization=self.config.organization,
                timeout=Timeout(self.config.timeout),
                default_headers=self.config.headers,
            )
            if async_http_client is not None:
                async_client_kwargs["http_client"] = async_http_client
            elif http_client_config_used is not None:
                # Create async http_client from config for non-cached scenario
                try:
                    from httpx import AsyncClient

                    async_client_kwargs["http_client"] = AsyncClient(
                        **http_client_config_used
                    )
                except ImportError:
                    raise ValueError(
                        "httpx is required to use http_client_config. "
                        "Install it with: pip install httpx"
                    )
            self.async_client = AsyncOpenAI(**async_client_kwargs)

    self.cache: CacheDB | None = None
    use_cache = self.config.cache_config is not None
    if "redis" in settings.cache_type and use_cache:
        if config.cache_config is None or not isinstance(
            config.cache_config,
            RedisCacheConfig,
        ):
            # switch to fresh redis config if needed
            config.cache_config = RedisCacheConfig(
                fake="fake" in settings.cache_type
            )
        if "fake" in settings.cache_type:
            # force use of fake redis if global cache_type is "fakeredis"
            config.cache_config.fake = True
        self.cache = RedisCache(config.cache_config)
    elif settings.cache_type != "none" and use_cache:
        raise ValueError(
            f"Invalid cache type {settings.cache_type}. "
            "Valid types are redis, fakeredis, none"
        )

    self.config._validate_litellm()

`is_gemini_model()` ¶

Are we using the gemini OpenAI-compatible API?

Source code in langroid/language_models/openai_gpt.py

def is_gemini_model(self) -> bool:
    """Are we using the gemini OpenAI-compatible API?"""
    return self.chat_model_orig.startswith("gemini/")

`unsupported_params()` ¶

List of params that are not supported by the current model

Source code in langroid/language_models/openai_gpt.py

def unsupported_params(self) -> List[str]:
    """
    List of params that are not supported by the current model
    """
    unsupported = set(self.info().unsupported_params)
    for param, model_list in OpenAI_API_ParamInfo().params.items():
        if (
            self.config.chat_model not in model_list
            and self.chat_model_orig not in model_list
        ):
            unsupported.add(param)
    return list(unsupported)

`rename_params()` ¶

Map of param name -> new name for specific models. Currently main troublemaker is o1* series.

Source code in langroid/language_models/openai_gpt.py

def rename_params(self) -> Dict[str, str]:
    """
    Map of param name -> new name for specific models.
    Currently main troublemaker is o1* series.
    """
    return self.info().rename_params

`chat_context_length()` ¶

Context-length for chat-completion models/endpoints. Get it from the config if explicitly given, otherwise use model_info based on model name, and fall back to generic model_info if there's no match.

Source code in langroid/language_models/openai_gpt.py

def chat_context_length(self) -> int:
    """
    Context-length for chat-completion models/endpoints.
    Get it from the config if explicitly given,
     otherwise use model_info based on model name, and fall back to
     generic model_info if there's no match.
    """
    return self.config.chat_context_length or self.info().context_length

`completion_context_length()` ¶

Context-length for completion models/endpoints. Get it from the config if explicitly given, otherwise use model_info based on model name, and fall back to generic model_info if there's no match.

Source code in langroid/language_models/openai_gpt.py

def completion_context_length(self) -> int:
    """
    Context-length for completion models/endpoints.
    Get it from the config if explicitly given,
     otherwise use model_info based on model name, and fall back to
     generic model_info if there's no match.
    """
    return (
        self.config.completion_context_length
        or self.completion_info().context_length
    )

`chat_cost()` ¶

(Prompt, Cached, Generation) cost per 1000 tokens, for chat-completion models/endpoints. Get it from the dict, otherwise fail-over to general method

Source code in langroid/language_models/openai_gpt.py

def chat_cost(self) -> Tuple[float, float, float]:
    """
    (Prompt, Cached, Generation) cost per 1000 tokens, for chat-completion
    models/endpoints.
    Get it from the dict, otherwise fail-over to general method
    """
    info = self.info()
    cached_cost_per_million = info.cached_cost_per_million
    if not cached_cost_per_million:
        cached_cost_per_million = info.input_cost_per_million
    return (
        info.input_cost_per_million / 1000,
        cached_cost_per_million / 1000,
        info.output_cost_per_million / 1000,
    )

`set_stream(stream)` ¶

Enable or disable streaming output from API. Args: stream: enable streaming output from API Returns: previous value of stream

Source code in langroid/language_models/openai_gpt.py

def set_stream(self, stream: bool) -> bool:
    """Enable or disable streaming output from API.
    Args:
        stream: enable streaming output from API
    Returns: previous value of stream
    """
    tmp = self.config.stream
    self.config.stream = stream
    return tmp

`get_stream()` ¶

Get streaming status.

Source code in langroid/language_models/openai_gpt.py

def get_stream(self) -> bool:
    """Get streaming status."""
    return self.config.stream and settings.stream and self.info().allows_streaming

`tool_deltas_to_tools(tools)` `staticmethod` ¶

Convert accumulated tool-call deltas to OpenAIToolCall objects. Adapted from this excellent code: https://community.openai.com/t/help-for-function-calls-with-streaming/627170/2

Parameters:

Name	Type	Description	Default
`tools`	`List[Dict[str, Any]]`	list of tool deltas received from streaming API	required

Returns:

Name	Type	Description
`str`	`str`	plain text corresponding to tool calls that failed to parse
	`List[OpenAIToolCall]`	List[OpenAIToolCall]: list of OpenAIToolCall objects
	`List[Dict[str, Any]]`	List[Dict[str, Any]]: list of tool dicts (to reconstruct OpenAI API response, so it can be cached)

Source code in langroid/language_models/openai_gpt.py

@staticmethod
def tool_deltas_to_tools(
    tools: List[Dict[str, Any]],
) -> Tuple[
    str,
    List[OpenAIToolCall],
    List[Dict[str, Any]],
]:
    """
    Convert accumulated tool-call deltas to OpenAIToolCall objects.
    Adapted from this excellent code:
     https://community.openai.com/t/help-for-function-calls-with-streaming/627170/2

    Args:
        tools: list of tool deltas received from streaming API

    Returns:
        str: plain text corresponding to tool calls that failed to parse
        List[OpenAIToolCall]: list of OpenAIToolCall objects
        List[Dict[str, Any]]: list of tool dicts
            (to reconstruct OpenAI API response, so it can be cached)
    """
    # Initialize a dictionary with default values

    # idx -> dict repr of tool
    # (used to simulate OpenAIResponse object later, and also to
    # accumulate function args as strings)
    idx2tool_dict: Dict[str, Dict[str, Any]] = defaultdict(
        lambda: {
            "id": None,
            "function": {"arguments": "", "name": None},
            "type": None,
        }
    )

    for tool_delta in tools:
        if tool_delta["id"] is not None:
            idx2tool_dict[tool_delta["index"]]["id"] = tool_delta["id"]

        if tool_delta["function"]["name"] is not None:
            idx2tool_dict[tool_delta["index"]]["function"]["name"] = tool_delta[
                "function"
            ]["name"]

        idx2tool_dict[tool_delta["index"]]["function"]["arguments"] += tool_delta[
            "function"
        ]["arguments"]

        if tool_delta["type"] is not None:
            idx2tool_dict[tool_delta["index"]]["type"] = tool_delta["type"]

    # (try to) parse the fn args of each tool
    contents: List[str] = []
    good_indices = []
    id2args: Dict[str, None | Dict[str, Any]] = {}
    for idx, tool_dict in idx2tool_dict.items():
        failed_content, args_dict = OpenAIGPT._parse_function_args(
            tool_dict["function"]["arguments"]
        )
        # used to build tool_calls_list below
        id2args[tool_dict["id"]] = args_dict or None  # if {}, store as None
        if failed_content != "":
            contents.append(failed_content)
        else:
            good_indices.append(idx)

    # remove the failed tool calls
    idx2tool_dict = {
        idx: tool_dict
        for idx, tool_dict in idx2tool_dict.items()
        if idx in good_indices
    }

    # create OpenAIToolCall list
    tool_calls_list = [
        OpenAIToolCall(
            id=tool_dict["id"],
            function=LLMFunctionCall(
                name=tool_dict["function"]["name"],
                arguments=id2args.get(tool_dict["id"]),
            ),
            type=tool_dict["type"],
        )
        for tool_dict in idx2tool_dict.values()
    ]
    return "\n".join(contents), tool_calls_list, list(idx2tool_dict.values())

`OpenAICallParams` ¶

Bases: BaseModel

Various params that can be sent to an OpenAI API chat-completion call. When specified, any param here overrides the one with same name in the OpenAIGPTConfig. See OpenAI API Reference for details on the params: https://platform.openai.com/docs/api-reference/chat

`MockLM(config=MockLMConfig())` ¶

Bases: LanguageModel

Source code in langroid/language_models/mock_lm.py

def __init__(self, config: MockLMConfig = MockLMConfig()):
    super().__init__(config)
    self.config: MockLMConfig = config

`chat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None)` ¶

Mock chat function for testing

Source code in langroid/language_models/mock_lm.py

def chat(
    self,
    messages: Union[str, List[lm.LLMMessage]],
    max_tokens: int = 200,
    tools: Optional[List[OpenAIToolSpec]] = None,
    tool_choice: ToolChoiceTypes | Dict[str, str | Dict[str, str]] = "auto",
    functions: Optional[List[lm.LLMFunctionSpec]] = None,
    function_call: str | Dict[str, str] = "auto",
    response_format: Optional[OpenAIJsonSchemaSpec] = None,
) -> lm.LLMResponse:
    """
    Mock chat function for testing
    """
    last_msg = messages[-1].content if isinstance(messages, list) else messages
    return self._response(last_msg)

`achat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None)` `async` ¶

Mock chat function for testing

Source code in langroid/language_models/mock_lm.py

async def achat(
    self,
    messages: Union[str, List[lm.LLMMessage]],
    max_tokens: int = 200,
    tools: Optional[List[OpenAIToolSpec]] = None,
    tool_choice: ToolChoiceTypes | Dict[str, str | Dict[str, str]] = "auto",
    functions: Optional[List[lm.LLMFunctionSpec]] = None,
    function_call: str | Dict[str, str] = "auto",
    response_format: Optional[OpenAIJsonSchemaSpec] = None,
) -> lm.LLMResponse:
    """
    Mock chat function for testing
    """
    last_msg = messages[-1].content if isinstance(messages, list) else messages
    return await self._response_async(last_msg)

`generate(prompt, max_tokens=200)` ¶

Mock generate function for testing

Source code in langroid/language_models/mock_lm.py

def generate(self, prompt: str, max_tokens: int = 200) -> lm.LLMResponse:
    """
    Mock generate function for testing
    """
    return self._response(prompt)

`agenerate(prompt, max_tokens=200)` `async` ¶

Mock generate function for testing

Source code in langroid/language_models/mock_lm.py

async def agenerate(self, prompt: str, max_tokens: int = 200) -> LLMResponse:
    """
    Mock generate function for testing
    """
    return await self._response_async(prompt)

`MockLMConfig` ¶

Bases: LLMConfig

Mock Language Model Configuration.

Attributes:

Name	Type	Description
`response_dict`	`Dict[str, str]`	A "response rule-book", in the form of a dictionary; if last msg in dialog is x,then respond with response_dict[x]

`AzureConfig(**kwargs)` ¶

Bases: OpenAIGPTConfig

Configuration for Azure OpenAI GPT.

Attributes:

Name	Type	Description
`type`	`str`	should be `azure.`
`api_version`	`str`	can be set in the `.env` file as `AZURE_OPENAI_API_VERSION.`
`deployment_name`	`str \| None`	can be optionally set in the `.env` file as `AZURE_OPENAI_DEPLOYMENT_NAME` and should be based the custom name you chose for your deployment when you deployed a model.
`model_name`	`str`	[DEPRECATED] can be set in the `.env` file as `AZURE_OPENAI_MODEL_NAME` and should be based on the model name chosen during setup.
`chat_model`	`str`	the chat model name to use. Can be set via the env variable `AZURE_OPENAI_CHAT_MODEL`. Recommended to use this instead of `model_name`.

Source code in langroid/language_models/azure_openai.py

def __init__(self, **kwargs) -> None:  # type: ignore
    if "model_name" in kwargs and "chat_model" not in kwargs:
        kwargs["chat_model"] = kwargs["model_name"]
    super().__init__(**kwargs)

`AzureGPT(config)` ¶

Bases: OpenAIGPT

Class to access OpenAI LLMs via Azure. These env variables can be obtained from the file .azure_env. Azure OpenAI doesn't support completion

Source code in langroid/language_models/azure_openai.py

def __init__(self, config: AzureConfig):
    # This will auto-populate config values from .env file
    load_dotenv()
    super().__init__(config)
    self.config: AzureConfig = config

    if (
        self.config.azure_openai_client_provider
        or self.config.azure_openai_async_client_provider
    ):
        if not self.config.azure_openai_client_provider:
            self.client = None
            logger.warning(
                "Using user-provided Azure OpenAI client, but only async "
                "client has been provided. Synchronous calls will fail."
            )
        if not self.config.azure_openai_async_client_provider:
            self.async_client = None
            logger.warning(
                "Using user-provided Azure OpenAI client, but no async "
                "client has been provided. Asynchronous calls will fail."
            )

        if self.config.azure_openai_client_provider:
            self.client = self.config.azure_openai_client_provider()
        if self.config.azure_openai_async_client_provider:
            self.async_client = self.config.azure_openai_async_client_provider()
            self.async_client.timeout = Timeout(self.config.timeout)
    else:
        if self.config.api_key == "":
            raise ValueError(
                """
                AZURE_OPENAI_API_KEY not set in .env file,
                please set it to your Azure API key."""
            )

        if self.config.api_base == "":
            raise ValueError(
                """
                AZURE_OPENAI_API_BASE not set in .env file,
                please set it to your Azure API key."""
            )

        self.client = AzureOpenAI(
            api_key=self.config.api_key,
            azure_endpoint=self.config.api_base,
            api_version=self.config.api_version,
            azure_deployment=self.config.deployment_name,
        )
        self.async_client = AsyncAzureOpenAI(
            api_key=self.config.api_key,
            azure_endpoint=self.config.api_base,
            api_version=self.config.api_version,
            azure_deployment=self.config.deployment_name,
            timeout=Timeout(self.config.timeout),
        )

    self.supports_json_schema = (
        self.config.api_version >= azureStructuredOutputAPIMin
        and self.config.api_version in azureStructuredOutputList
    )

language_models

LLMConfig ¶

LLMMessage ¶

api_dict(model, has_system_role=True) ¶

LLMFunctionCall ¶

from_dict(message) staticmethod ¶

LLMFunctionSpec ¶

Role ¶

LLMTokenUsage ¶

LLMResponse ¶

to_LLMMessage() ¶

get_recipient_and_message() ¶

OpenAIChatModel ¶

AnthropicModel ¶

GeminiModel ¶

OpenAICompletionModel ¶

OpenAIGPTConfig(**kwargs) ¶

by upper-casing the name and prefixing with OPENAI_, e.g.¶

OPENAI_MAX_OUTPUT_TOKENS=1000.¶

If any of these is defined in this way in the environment¶

(either via explicit setenv or export or via .env file + load_dotenv()),¶

the environment variable takes precedence over the value in the config.¶

model_copy(*, update=None, deep=False) ¶

create(prefix) classmethod ¶

OpenAIGPT(config=OpenAIGPTConfig()) ¶

is_gemini_model() ¶

unsupported_params() ¶

rename_params() ¶

chat_context_length() ¶

completion_context_length() ¶

chat_cost() ¶

set_stream(stream) ¶

get_stream() ¶

tool_deltas_to_tools(tools) staticmethod ¶

OpenAICallParams ¶

MockLM(config=MockLMConfig()) ¶

chat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None) ¶

achat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None) async ¶

generate(prompt, max_tokens=200) ¶

agenerate(prompt, max_tokens=200) async ¶

MockLMConfig ¶

AzureConfig(**kwargs) ¶

AzureGPT(config) ¶

`LLMConfig` ¶

`LLMMessage` ¶

`api_dict(model, has_system_role=True)` ¶

`LLMFunctionCall` ¶

`from_dict(message)` `staticmethod` ¶

`LLMFunctionSpec` ¶

`Role` ¶

`LLMTokenUsage` ¶

`LLMResponse` ¶

`to_LLMMessage()` ¶

`get_recipient_and_message()` ¶

`OpenAIChatModel` ¶

`AnthropicModel` ¶

`GeminiModel` ¶

`OpenAICompletionModel` ¶

`OpenAIGPTConfig(**kwargs)` ¶

`model_copy(*, update=None, deep=False)` ¶

`create(prefix)` `classmethod` ¶

`OpenAIGPT(config=OpenAIGPTConfig())` ¶

`is_gemini_model()` ¶

`unsupported_params()` ¶

`rename_params()` ¶

`chat_context_length()` ¶

`completion_context_length()` ¶

`chat_cost()` ¶

`set_stream(stream)` ¶

`get_stream()` ¶

`tool_deltas_to_tools(tools)` `staticmethod` ¶

`OpenAICallParams` ¶

`MockLM(config=MockLMConfig())` ¶

`chat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None)` ¶

`achat(messages, max_tokens=200, tools=None, tool_choice='auto', functions=None, function_call='auto', response_format=None)` `async` ¶

`generate(prompt, max_tokens=200)` ¶

`agenerate(prompt, max_tokens=200)` `async` ¶

`MockLMConfig` ¶

`AzureConfig(**kwargs)` ¶

`AzureGPT(config)` ¶