API Reference
=============

This page provides detailed documentation for the ``local_llm_kit`` API.

LLMClient
---------

.. py:class:: LLMClient(model: str, **kwargs)

   The main client class for interacting with local language models.

   :param model: The name or path of the model to use
   :param model_path: Optional path to model weights
   :param context_length: Maximum context length (default: 2048)
   :param temperature: Sampling temperature (default: 0.7)
   :param top_p: Top-p sampling parameter (default: 0.9)
   :param backend: Model backend to use ('transformers' or 'llama.cpp')

Chat Completions
----------------

.. py:method:: LLMClient.chat.completions.create(**kwargs)

   Create a chat completion.

   :param model: Model to use for completion
   :param messages: List of message dictionaries
   :param temperature: Sampling temperature
   :param top_p: Top-p sampling parameter
   :param max_tokens: Maximum tokens to generate
   :param stream: Whether to stream the response
   :param functions: List of function definitions
   :param function_call: Function call behavior
   :param response_format: Specify response format (e.g., JSON)
   :return: CompletionResponse object

Memory Management
-----------------

.. py:method:: LLMClient.enable_memory(max_tokens: int = 1000)

   Enable conversation memory management.

   :param max_tokens: Maximum tokens to store in memory

.. py:method:: LLMClient.add_to_memory(messages: List[Dict])

   Add messages to conversation memory.

   :param messages: List of message dictionaries

.. py:method:: LLMClient.clear_memory()

   Clear all stored conversation memory.

Response Objects
----------------

CompletionResponse
~~~~~~~~~~~~~~~~~~

.. py:class:: CompletionResponse

   Represents a completion response.

   :param id: Response ID
   :param object: Object type
   :param created: Creation timestamp
   :param model: Model used
   :param choices: List of completion choices
   :param usage: Token usage statistics

Choice
~~~~~~

.. py:class:: Choice

   Represents a completion choice.

   :param index: Choice index
   :param message: Message content
   :param finish_reason: Reason for completion

Message
~~~~~~~

.. py:class:: Message

   Represents a chat message.

   :param role: Message role (user/assistant/system)
   :param content: Message content
   :param function_call: Optional function call

Usage
~~~~~

.. py:class:: Usage

   Token usage statistics.

   :param prompt_tokens: Number of tokens in prompt
   :param completion_tokens: Number of tokens in completion
   :param total_tokens: Total tokens used

Exceptions
----------

.. py:exception:: ModelNotFoundError

   Raised when specified model is not found.

.. py:exception:: InvalidRequestError

   Raised when request parameters are invalid.

.. py:exception:: TokenLimitError

   Raised when token limit is exceeded.

Configuration
-------------

The following environment variables can be used to configure the client:

- ``LOCAL_LLM_KIT_MODEL_PATH``: Default path to model weights
- ``LOCAL_LLM_KIT_BACKEND``: Default backend to use
- ``LOCAL_LLM_KIT_CONTEXT_LENGTH``: Default context length
- ``LOCAL_LLM_KIT_CACHE_DIR``: Directory for caching model weights
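
As a rough illustration of how these variables might be read, the sketch below collects them into a single settings object using only the standard library. ``ClientSettings`` and ``load_settings`` are hypothetical names introduced for this example and are not part of ``local_llm_kit``. The fallback of 2048 for the context length is an assumption taken from the documented ``context_length`` default of ``LLMClient``; how the library itself resolves unset variables is not specified on this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ClientSettings:
    """Hypothetical container for the documented environment variables."""

    model_path: Optional[str]
    backend: Optional[str]
    context_length: int
    cache_dir: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> ClientSettings:
    """Read the LOCAL_LLM_KIT_* variables from ``env`` (defaults to os.environ)."""
    if env is None:
        env = os.environ
    raw_length = env.get("LOCAL_LLM_KIT_CONTEXT_LENGTH")
    return ClientSettings(
        model_path=env.get("LOCAL_LLM_KIT_MODEL_PATH"),
        backend=env.get("LOCAL_LLM_KIT_BACKEND"),
        # Assumed fallback: mirrors the documented LLMClient default of 2048.
        context_length=int(raw_length) if raw_length else 2048,
        cache_dir=env.get("LOCAL_LLM_KIT_CACHE_DIR"),
    )


if __name__ == "__main__":
    # Print whatever is currently set in the process environment.
    print(load_settings())
```

Passing a plain dictionary instead of relying on ``os.environ`` keeps the helper easy to check in isolation, for example ``load_settings({"LOCAL_LLM_KIT_BACKEND": "llama.cpp"})``.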