Programming Primitives

LLM invocations in vagents are handled through the LM class, which provides a simple and consistent interface for interacting with language models. The framework supports both text-only and multimodal interactions.

from vagents.core import LM

# Create an LM instance with auto model selection
lm = LM(name="@auto")

# Or specify a particular model and endpoint explicitly
lm = LM(
    name="meta-llama/Llama-3.2-90B-Vision-Instruct",
    base_url="http://localhost:8000",
    api_key="your-api-key-here",
)

Prompt functions are regular Python functions that return a list of messages in OpenAI chat format:

def write_story(query: str) -> list:
    """Generate a story based on the given query."""
    return [{"role": "user", "content": "write a story about " + query}]
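For example, write_story("a dragon") simply returns [{"role": "user", "content": "write a story about a dragon"}]; the function itself never contacts the model. That happens when you hand the function to the LM, as shown next.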

Use the invoke method to call your prompt function with the LM:

import asyncio

from vagents.core import LM

async def main():
    lm = LM(name="@auto")
    description = await lm.invoke(write_story, "a brave knight and a dragon")
    print(description)

asyncio.run(main())

For better performance when making multiple requests, you can execute them concurrently:

async def concurrent_example():
    lm = LM(name="@auto")

    # Start multiple requests concurrently
    future1 = lm(messages=[{"role": "user", "content": "Hello, how are you? in one word"}])
    future2 = lm(messages=[{"role": "user", "content": "What's the weather like? in one word"}])
    future3 = lm(messages=[{"role": "user", "content": "Tell me a joke in one word"}])

    # Wait for all of them to complete
    response1, response2, response3 = await asyncio.gather(future1, future2, future3)

    print("Response 1:", response1)
    print("Response 2:", response2)
    print("Response 3:", response3)
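As in the previous example, drive the coroutine from synchronous code with asyncio.run(concurrent_example()).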

For multimodal interactions (text + images), use the @multimodal decorator:

from PIL import Image
from vagents.core import multimodal, LM

@multimodal(input_type="image", param=["frame"])
def narrate_frame(frame: Image.Image, *args, **kwargs) -> str:
    """Describe the contents of an image frame."""
    return "Describing frame at index."

async def multimodal_example():
    # Load an image
    frame = Image.open("path/to/image.jpg")

    # Use a vision-capable model
    model = LM(name="meta-llama/Llama-3.2-90B-Vision-Instruct")

    # Invoke with the image and additional parameters
    description = await model.invoke(
        narrate_frame,
        frame=frame,
        temperature=0.7,
        max_tokens=100,
        stream=False,
    )
    print(description)
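Run it the same way as the earlier examples, e.g. asyncio.run(multimodal_example()).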

You can configure the LM instance using environment variables:

  • VAGENTS_LM_BASE_URL: The base URL for the LM API (default: http://localhost:8000)
  • VAGENTS_LM_API_KEY: The API key for authentication (default: your-api-key-here)
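A minimal sketch of setting these from Python rather than the shell, assuming LM falls back to the environment when base_url and api_key are not passed explicitly:

import os

# Assumption: LM reads these variables at construction time when
# base_url/api_key are not supplied as arguments.
os.environ["VAGENTS_LM_BASE_URL"] = "http://localhost:8000"
os.environ["VAGENTS_LM_API_KEY"] = "your-api-key-here"

from vagents.core import LM

lm = LM(name="@auto")  # picks up the configuration above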

When invoking LLMs, you can pass the following optional parameters:

  • temperature: Controls randomness in the output
  • top_p: Controls nucleus sampling
  • max_tokens: Maximum number of tokens to generate
  • stream: Whether to stream the response
  • stop: Stop sequences for generation
  • n: Number of completions to generate
  • presence_penalty: Penalizes tokens that have already appeared, encouraging the model to introduce new topics
  • frequency_penalty: Penalizes tokens in proportion to how often they have appeared so far

Example with parameters:

# Inside an async function, with lm and write_story defined as above
result = await lm.invoke(
    write_story,
    "a magical forest",
    temperature=0.8,
    max_tokens=200,
    stop=["\n\n"],
)
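Here stop=["\n\n"] ends generation at the first blank line, while max_tokens=200 caps the completion length even if no stop sequence appears.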