Programming Primitives
LLM Invocations
LLM invocations in vagents are handled through the LM class, which provides a simple and consistent interface for interacting with language models. The framework supports both text-only and multimodal interactions.
Basic Text LLM Usage
Creating an LM Instance
from vagents.core import LM
# Create an LM instance with auto model selection
lm = LM(name="@auto")
# Or specify a particular model
lm = LM(
    name="meta-llama/Llama-3.2-90B-Vision-Instruct",
    base_url="http://localhost:8000",
    api_key="your-api-key-here"
)
Defining Prompt Functions
Prompt functions are regular Python functions that return a list of messages in OpenAI chat format:
def write_story(query: str) -> list:
    """Generate a story based on the given query."""
    return [{"role": "user", "content": "write a story about " + query}]
Invoking the LLM
Use the invoke method to call your prompt function with the LM:
import asyncio
async def main():
    lm = LM(name="@auto")
    description = await lm.invoke(write_story, "a brave knight and a dragon")
    print(description)

asyncio.run(main())
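A prompt function can also return more than one message, for example a system message followed by the user request. The sketch below is illustrative (write_poem is a hypothetical prompt function, not part of the framework); it assumes invoke forwards the positional argument to the prompt function exactly as in the example above:

def write_poem(topic: str) -> list:
    """Hypothetical prompt function that adds a system message before the user request."""
    return [
        {"role": "system", "content": "You are a concise poet."},
        {"role": "user", "content": "Write a short poem about " + topic},
    ]

async def poem_example():
    lm = LM(name="@auto")
    # The second argument is passed through to write_poem, just like write_story above
    poem = await lm.invoke(write_poem, "the sea")
    print(poem)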
Concurrent Invocations
For better performance when making multiple requests, you can execute them concurrently:
async def concurrent_example():
    lm = LM(name="@auto")
    # Start multiple requests concurrently
    future1 = lm(messages=[{"role": "user", "content": "Hello, how are you? in one word"}])
    future2 = lm(messages=[{"role": "user", "content": "What's the weather like? in one word"}])
    future3 = lm(messages=[{"role": "user", "content": "Tell me a joke in one word"}])
    # Wait for all to complete
    response1, response2, response3 = await asyncio.gather(future1, future2, future3)
print("Response 1:", response1) print("Response 2:", response2) print("Response 3:", response3)Multimodal LLM Usage
Multimodal LLM Usage
For multimodal interactions (text + images), use the @multimodal decorator:
from PIL import Image
from vagents.core import multimodal, LM
@multimodal(input_type="image", param=["frame"])
def narrate_frame(frame: Image.Image, *args, **kwargs) -> str:
    """Describe the contents of an image frame."""
    return "Describing frame at index."
async def multimodal_example():
    # Load an image
    frame = Image.open("path/to/image.jpg")
    # Use a vision-capable model
    model = LM(name="meta-llama/Llama-3.2-90B-Vision-Instruct")
    # Invoke with image and additional parameters
    description = await model.invoke(
        narrate_frame,
        frame=frame,
        temperature=0.7,
        max_tokens=100,
        stream=False
    )
    print(description)
Configuration
Environment Variables
You can configure the LM instance using environment variables:
VAGENTS_LM_BASE_URL: The base URL for the LM API (default: http://localhost:8000)
VAGENTS_LM_API_KEY: The API key for authentication (default: your-api-key-here)
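With these variables set, you can construct an LM without passing base_url or api_key explicitly. A minimal sketch, assuming the environment is read when the instance is created:

import os

# Point the framework at a local endpoint via environment variables
os.environ["VAGENTS_LM_BASE_URL"] = "http://localhost:8000"
os.environ["VAGENTS_LM_API_KEY"] = "your-api-key-here"

# No base_url/api_key arguments needed; the values come from the environment
lm = LM(name="@auto")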
Supported Parameters
When invoking LLMs, you can pass the following optional parameters:
temperature: Controls randomness in the output
top_p: Controls nucleus sampling
max_tokens: Maximum number of tokens to generate
stream: Whether to stream the response
stop: Stop sequences for generation
n: Number of completions to generate
presence_penalty: Penalty for new tokens based on presence
frequency_penalty: Penalty for new tokens based on frequency
Example with parameters:
result = await lm.invoke(
    write_story,
    "a magical forest",
    temperature=0.8,
    max_tokens=200,
    stop=["\n\n"]
)
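The remaining parameters follow the same keyword-argument pattern. An illustrative sketch only; the values are arbitrary:

# Illustrative only: combine sampling and penalty parameters in one call
result = await lm.invoke(
    write_story,
    "a city under the ocean",
    temperature=0.9,
    top_p=0.95,
    presence_penalty=0.5,
    frequency_penalty=0.3
)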