OT Image

Load images and ask vision questions via an OpenAI-compatible API.

Short alias: img

Highlights

  • Load images from file paths, URLs, or the clipboard once; reference by handle for follow-up questions
  • Ask multiple questions in a single model call; answers are returned as a list of question/answer pairs
  • Structured summaries (text, mode, type, colours) extracted and cached in meta.json
  • Clipboard shortcuts clip_ask() and clip_view() — no handle juggling needed
  • Content-hash deduplication — loading the same image twice returns the existing handle
  • Session LRU cache keeps resized model bytes in memory to avoid repeated disk reads
  • Storage at .onetool/images/ persists across sessions; purge() cleans up by age
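
The content-hash deduplication mentioned above can be pictured with a small sketch. It is not OT Image's actual implementation: the `ImageStore` class, the choice of SHA-256, and deriving the `img_<8hex>` handle from the digest are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


class ImageStore:
    """Hypothetical in-memory store that deduplicates images by content hash."""

    def __init__(self) -> None:
        # Maps the full SHA-256 digest of the image bytes to its handle.
        self._by_hash = {}

    def load(self, data: bytes, handle: Optional[str] = None) -> str:
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_hash.get(digest)
        if existing is not None:
            # Same bytes were loaded before: return the existing handle.
            return existing
        # Assumed scheme: auto-generated handles use the first 8 hex digits.
        new_handle = handle or "img_" + digest[:8]
        self._by_hash[digest] = new_handle
        return new_handle
```

Under this scheme, loading identical bytes twice yields the same handle whether or not a custom name was supplied the first time.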

Functions

Function                  Description
img.load(img, ...)        Load a single image; return a stable handle
img.load_batch(img, ...)  Load multiple images from a glob or list
img.ask(img, q, ...)      Ask one or more questions about an image
img.clip_ask(q, ...)      Shorthand: ask about the current clipboard image
img.clip_view()           Shorthand: structured summary of the current clipboard image
img.summary(img)          Extract and cache a structured summary of an image
img.list()                List all loaded images with metadata
img.delete(handle)        Delete an image and free session cache
img.purge(...)            Delete images by age or delete all

Key Parameters

Parameter  Type             Description
img        str              Image source: file path, "https://..." URL, "clip" for clipboard, or "#handle" to reference an existing handle
handle     str              Custom handle name for load() (e.g. "logo"). Omit for auto-generated img_<8hex>
q          str | list[str]  Question(s) to ask. Multiple questions are batched into one model call
max_edge   int              Max longest edge in pixels for in-memory model resize. Default: 1568
all        bool             purge(all=True) deletes all images regardless of age
minutes    int              purge(minutes=N) deletes images older than N minutes. Default: 15
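
To make the max_edge parameter concrete, here is a minimal sketch of how a longest-edge limit translates into target dimensions. The helper name `fit_to_max_edge` is hypothetical, and two behaviours are assumptions rather than documented facts: the aspect ratio is preserved, and images already within the limit are left unscaled.

```python
def fit_to_max_edge(width: int, height: int, max_edge: int = 1568) -> tuple:
    """Return (width, height) scaled so the longest edge is at most max_edge."""
    longest = max(width, height)
    if longest <= max_edge:
        # Assumption: small images are never upscaled.
        return width, height
    scale = max_edge / longest
    # Round to whole pixels and keep each side at least 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_to_max_edge(3136, 1568))  # (1568, 784)
print(fit_to_max_edge(800, 600))    # (800, 600)
```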

Requires

  • tools.ot_image.vision_model must be set for ask(), summary(), clip_ask(), and clip_view(). Set to an OpenAI-compatible model identifier (e.g. openai/gpt-4o-mini).
  • An API key: set tools.ot_image.api_key in onetool.yaml or store OPENAI_API_KEY as a secret.

Configuration

Required

Key                          Description
tools.ot_image.vision_model  Vision model for ask(), summary(), clip_ask(), and clip_view() (e.g. openai/gpt-4o-mini)

Optional

Key                                Type  Default  Description
tools.ot_image.api_key             str   ""       API key for the vision model. Falls back to OPENAI_API_KEY secret
tools.ot_image.base_url            str   ""       OpenAI-compatible base URL. Falls back to tools.ot_llm.base_url
tools.ot_image.max_edge            int   1568     Maximum longest edge (pixels) for model-upload resize
tools.ot_image.session_cache_size  int   10       In-memory LRU cache cap (number of images)

In onetool.yaml:

tools:
  ot_image:
    vision_model: openai/gpt-4o-mini
    api_key: ""              # falls back to OPENAI_API_KEY secret
    base_url: ""             # falls back to tools.ot_llm.base_url
    max_edge: 1568
    session_cache_size: 10
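
The effect of session_cache_size can be illustrated with a generic LRU cache. This is a sketch of the general technique under stated assumptions, not the tool's own code: the `SessionCache` class is hypothetical, and keying entries by handle is assumed.

```python
from collections import OrderedDict
from typing import Optional


class SessionCache:
    """Hypothetical LRU cache holding resized image bytes, keyed by handle."""

    def __init__(self, capacity: int = 10) -> None:
        self._capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, handle: str) -> Optional[bytes]:
        if handle not in self._items:
            return None
        # A hit marks the entry as most recently used.
        self._items.move_to_end(handle)
        return self._items[handle]

    def put(self, handle: str, data: bytes) -> None:
        self._items[handle] = data
        self._items.move_to_end(handle)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._items) > self._capacity:
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)
```

With a cap of 10, loading an eleventh image would evict whichever cached image was used least recently. In this sketch a cache miss is left to the caller, who would fall back to reading from disk.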

Defaults

  • If tools.ot_image is omitted, load() and list() work without config. ask(), summary(), clip_ask(), and clip_view() require vision_model to be set.
  • API key and base URL fall back automatically to OPENAI_API_KEY secret and tools.ot_llm.base_url.
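
The fallback rules above can be sketched as a small resolver. The function name `resolve_image_settings` and the plain-dict shapes for the parsed config and the secret store are assumptions for illustration; only the keys, defaults, and fallback order come from this page.

```python
from typing import Any, Mapping


def resolve_image_settings(config: Mapping[str, Any],
                           secrets: Mapping[str, str]) -> dict:
    """Apply the documented defaults and fallbacks to a parsed config."""
    tools = config.get("tools", {})
    image = tools.get("ot_image", {})
    llm = tools.get("ot_llm", {})
    return {
        # Required for vision calls; an empty string means "not configured".
        "vision_model": image.get("vision_model", ""),
        # An empty or missing api_key falls back to the OPENAI_API_KEY secret.
        "api_key": image.get("api_key") or secrets.get("OPENAI_API_KEY", ""),
        # An empty or missing base_url falls back to tools.ot_llm.base_url.
        "base_url": image.get("base_url") or llm.get("base_url", ""),
        "max_edge": image.get("max_edge", 1568),
        "session_cache_size": image.get("session_cache_size", 10),
    }
```

Treating an empty string the same as a missing key mirrors the "" defaults in the table; whether the real tool draws that distinction is not stated here.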

Examples

# Load from file and ask a question
result = img.load(img="~/screenshots/dashboard.png")
img.ask(img=result["handle"], q="What is the main metric shown?")

# Ask multiple questions in one call
img.ask(
    img="~/screenshots/dashboard.png",
    q=["What framework is shown?", "Is this dark mode?"]
)

# Load with a custom handle name
img.load(img="~/assets/logo.png", handle="logo")
img.ask(img="#logo", q="What colour is the logo?")

# Clipboard shortcuts (no load step needed)
img.clip_ask(q="Extract all text from this screenshot")
img.clip_view()

# Structured summary — cached after first call
img.summary(img="#logo")

# Load a batch from glob
img.load_batch(img="~/screenshots/*.png")

# Load a batch from a list
img.load_batch(img=["~/a.png", "~/b.png"])

# List all loaded images
img.list()

# Delete a single image
img.delete(handle="#img_a3f7b2c4")

# Purge images older than 1 hour
img.purge(minutes=60)

# Purge all images
img.purge(all=True)
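
The age-based selection behind purge() can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the helper `select_for_purge` is hypothetical, age is assumed to be measured from load time, and the flag is named `purge_all` here only to avoid shadowing Python's built-in all().

```python
from datetime import datetime, timedelta


def select_for_purge(loaded_at: dict, now: datetime,
                     minutes: int = 15, purge_all: bool = False) -> list:
    """Return the handles a purge would delete, in sorted order."""
    if purge_all:
        # Mirrors purge(all=True): age is ignored.
        return sorted(loaded_at)
    cutoff = now - timedelta(minutes=minutes)
    # Assumption: an image is purged when it was loaded before the cutoff.
    return sorted(h for h, ts in loaded_at.items() if ts < cutoff)


now = datetime(2024, 1, 1, 12, 0)
images = {
    "img_a": datetime(2024, 1, 1, 10, 30),
    "img_b": datetime(2024, 1, 1, 11, 30),
    "img_c": datetime(2024, 1, 1, 11, 55),
}
print(select_for_purge(images, now, minutes=60))  # ['img_a']
print(select_for_purge(images, now))              # ['img_a', 'img_b']
```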

See Also