> ## Documentation Index
> Fetch the complete documentation index at: https://dragonwingdocs.qualcomm.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Using with Langchain

The Qualcomm LLM/VLM containerized service can be used with Langchain to build agentic applications as it exposes an OpenAI API compatible interface.

## Simple LLM query (no streaming)

Let's start with a simple example that calls our API and gets a result.  In this example we are not using streaming mode, which means we won't get the result from our query until the LLM has generated all of tokens in response.

Let's start by creating a new venv and install some base packages for langchain:

```bash theme={null}
python3 -m venv venv-langchain
source venv-langchain/bin/activate
pip3 install langchain langchain-openai
```

Next, let's look at the code to call our locally hosted LLM.  Be sure to update the base URL to match your port # as well as the model name:

```python python langchain example theme={null}
import sys
from langchain_openai import ChatOpenAI

# Get phrase from command-line argument
if len(sys.argv) < 2:
    print("Usage: python langchainbasic.py '<your phrase>'")
    sys.exit(1)

phrase = sys.argv[1]

llm = ChatOpenAI(
    api_key="local-llm",                  # api key not used by our local service
    base_url="http://localhost:9001/v1",  # your server’s OpenAI path
    model="qwen3_4b_instruct_2507"                    # server’s advertised model name
)

print(llm.invoke(phrase).content)
```

Copy this code to a python file (langchainbasic.py in this example) and run using:<br />
`python langchainbasic.py "What is the capital of Texas?"`

## LLM query with streaming

While the above works fine, sometimes we want to display the tokens as they are being generated so that the user sees feedback sooner which requires us to call the LLM using streaming mode.   Let's modify our example to use this:

```python python langchain streaming example theme={null}
import sys
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Get phrase from command-line argument
if len(sys.argv) < 2:
    print("Usage: python langchainstreaming.py '<your phrase>'")
    sys.exit(1)

phrase = sys.argv[1]

llm = ChatOpenAI(
    api_key="local-llm",                  # api key not used by our local service
    base_url="http://localhost:9001/v1",  # your server’s OpenAI path
    model="qwen3_4b_instruct_2507"        # server’s advertised model name
)

for chunk in llm.stream([HumanMessage(content=phrase)]):
        print(chunk.content,end="",flush=True)
print()
```

Copy this code to a python file (langchainstreaming.py in this example) and run using:<br />
`python langchainstreaming.py "Tell me about Qualcomm in 50 words or less."`

## LLM query with tool calling

In this final example, we will call the LLM using an example of tool calling.  We will define a mock api called `get_weather()` which the LLM will use when appropriate to look up the weather for a specific location.  In the example below this function will return hardcoded values, but your real implementation could call out to a network API or some other resource to look up the actual weather.

```python python langchain tool calling example theme={null}
"""
LangChain OpenAI-Style API Client
==================================
A generic script for interacting with OpenAI-style chat completion APIs using LangChain.
Follows LangChain's recommended patterns and best practices.

Dependencies:
    langchain>=0.1.16
    langchain-openai>=0.1.0
    langchain-core>=0.1.40
"""
import os
import json
from typing import Dict, List, Any, Optional
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_core.tools import tool


# ============================================================================
# Tool Definitions
# ============================================================================
@tool
def get_weather(location: str) -> str:
    """
    Get weather information for a specific location.
    Args:
        location: The name of the location to get weather for
    Returns:
        A JSON string containing weather information
    """
    # Mock implementation - replace with actual API call in production
    weather_data = {
        "temperature": 72,
        "condition": "sunny",
        "humidity": 65,
        "location": location
    }
    return json.dumps(weather_data)

# ============================================================================
# Main Client Implementation
# ============================================================================
def create_chat_client(
    base_url: str,
    model: str,
    api_key: str = "not-needed"
) -> ChatOpenAI:
    """
    Create a ChatOpenAI client configured for an OpenAI-style API endpoint.
    Args:
        base_url: The base URL of the API endpoint (e.g., "http://localhost:9000/v1")
        model: The model name to use (e.g., "qwen3_4b_instruct_2507")
        api_key: API key (default: "not-needed" for local endpoints)
        temperature: Temperature setting for response generation
    Returns:
        Configured ChatOpenAI instance
    """
    # Set API key in environment (required by langchain even if not used)
    os.environ["OPENAI_API_KEY"] = api_key

    chat = ChatOpenAI(
        base_url=base_url,
        model=model
    )

    return chat


def process_tool_calls(
    ai_message: AIMessage,
    available_tools: Dict[str, Any]
) -> List[ToolMessage]:
    """
    Process tool calls from the AI message and execute the corresponding tools.
    Args:
        ai_message: The AI message containing tool calls
        available_tools: Dictionary mapping tool names to tool functions
    Returns:
        List of ToolMessage objects with tool execution results
    """
    tool_messages = []

    if hasattr(ai_message, 'tool_calls') and ai_message.tool_calls:
        print("\n[TOOL CALLS DETECTED]")
        for tool_call in ai_message.tool_calls:
            tool_name = tool_call['name']
            tool_args = tool_call['args']
            tool_id = tool_call['id']

            print(f"  - Tool: {tool_name}")
            print(f"    Arguments: {json.dumps(tool_args, indent=6)}")

            # Execute the tool
            if tool_name in available_tools:
                tool_func = available_tools[tool_name]
                try:
                    tool_output = tool_func.invoke(tool_args)
                    print(f"    Output: {tool_output}")

                    tool_messages.append(
                        ToolMessage(
                            tool_call_id=tool_id,
                            content=tool_output,
                        )
                    )
                except Exception as e:
                    error_msg = f"Error executing tool {tool_name}: {str(e)}"
                    print(f"    Error: {error_msg}")
                    tool_messages.append(
                        ToolMessage(
                            tool_call_id=tool_id,
                            content=error_msg,
                        )
                    )
            else:
                error_msg = f"Tool {tool_name} not found"
                print(f"    Error: {error_msg}")
                tool_messages.append(
                    ToolMessage(
                        tool_call_id=tool_id,
                        content=error_msg,
                    )
                )

    return tool_messages


def run_conversation_with_tools(
    chat_client: ChatOpenAI,
    user_query: str,
    tools: List[Any],
    stream: bool = True
) -> str:
    """
    Run a complete conversation with tool calling support.
    Args:
        chat_client: The ChatOpenAI client instance
        user_query: The user's query/message
        tools: List of tool functions to make available
        stream: Whether to stream the response (default: True)
    Returns:
        The final response from the model
    """
    # Bind tools to the chat client
    chat_with_tools = chat_client.bind_tools(tools)

    # Create tool lookup dictionary
    available_tools = {tool.name: tool for tool in tools}

    print("=" * 80)
    print("USER QUERY")
    print("=" * 80)
    print(user_query)
    print()

    # Step 1: Send initial query
    print("=" * 80)
    print("MODEL RESPONSE (Initial)")
    print("=" * 80)

    if stream:
        # Stream the response and collect chunks
        chunks = []
        for chunk in chat_with_tools.stream(user_query):
            if chunk.content:
                print(chunk.content, end="", flush=True)
            chunks.append(chunk)
        print()

        # Reconstruct full message from chunks
        if chunks:
            ai_message = chunks[0]
            for chunk in chunks[1:]:
                ai_message += chunk
        else:
            print("No response received.")
            return ""
    else:
        # Non-streaming response
        ai_message = chat_with_tools.invoke(user_query)
        if ai_message.content:
            print(ai_message.content)

    # Step 2: Check for tool calls
    tool_messages = process_tool_calls(ai_message, available_tools)

    # Step 3: If there were tool calls, send results back to model
    if tool_messages:
        print("\n" + "=" * 80)
        print("SENDING TOOL RESULTS BACK TO MODEL")
        print("=" * 80)

        # Build complete message history
        messages = [
            HumanMessage(content=user_query),
            ai_message,
            *tool_messages
        ]

        # Print the message history being sent
        print("\nMessage History:")
        for i, msg in enumerate(messages):
            print(f"\n  [{i}] {type(msg).__name__}")
            if hasattr(msg, 'content') and msg.content:
                content_preview = msg.content[:100] + "..." if len(msg.content) > 100 else msg.content
                print(f"      Content: {content_preview}")
            if hasattr(msg, 'tool_calls') and msg.tool_calls:
                print(f"      Tool Calls: {len(msg.tool_calls)} call(s)")
            if hasattr(msg, 'tool_call_id'):
                print(f"      Tool Call ID: {msg.tool_call_id}")

        print("\n" + "=" * 80)
        print("MODEL RESPONSE (Final)")
        print("=" * 80)

        # Get final response with tool results
        if stream:
            final_response = ""
            for chunk in chat_with_tools.stream(messages):
                if chunk.content:
                    print(chunk.content, end="", flush=True)
                    final_response += chunk.content
            print()
            return final_response
        else:
            final_message = chat_with_tools.invoke(messages)
            if final_message.content:
                print(final_message.content)
                return final_message.content
    else:
        # No tool calls, return the initial response
        return ai_message.content if ai_message.content else ""

    return ""

# ============================================================================
# Example Usage
# ============================================================================
def main():
    """
    Main function demonstrating the usage of the langchain client.
    """
    # Configuration
    BASE_URL = "http://localhost:9001/v1"
    MODEL = "qwen3_4b_instruct_2507"
    API_KEY = "not-needed"  # Not required for local endpoints

    print("=" * 80)
    print("LANGCHAIN OPENAI-STYLE API CLIENT")
    print("=" * 80)
    print(f"Endpoint: {BASE_URL}")
    print(f"Model: {MODEL}")
    print("=" * 80)
    print()

    try:
        # Create the chat client
        chat = create_chat_client(
            base_url=BASE_URL,
            model=MODEL,
            api_key=API_KEY
        )

        # Define available tools
        tools = [get_weather]

        # Example 1: Query that should trigger tool calling
        print("\n" + "=" * 80)
        print("EXAMPLE 1: Weather Query (Tool Calling Expected)")
        print("=" * 80)

        weather_query = "Tell me about weather in San Diego"
        run_conversation_with_tools(
            chat_client=chat,
            user_query=weather_query,
            tools=tools,
            stream=False
        )

    except Exception as e:
        print(f"\n[ERROR] An error occurred: {e}")
        import traceback
        traceback.print_exc()


if __name__ == "__main__":
    main()
```

Copy this code to a python file and run using:<br />
`python toolcalling.py`

To exit venv:

```bash theme={null}
deactivate
```

For more information on how to use Langchain with LLMs to build Agentic AI applications, please see the [Langchain documentation](https://docs.langchain.com/).
