OT Image

Load images and ask vision questions via an OpenAI-compatible API.

Short alias: img

Highlights

  • Load images from file paths, URLs, or the clipboard once; reference by handle for follow-up questions
  • Ask multiple questions in a single model call — answers returned as paired question/answer list
  • Structured summaries (text, mode, type, colours) extracted and cached in meta.json
  • Clipboard shortcuts clip_ask() and clip_view() — no handle juggling needed
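The question batching in the second bullet can be sketched as pairing each question with one answer parsed from a single model response. This is a hypothetical sketch for illustration; the one-answer-per-line response format and the returned dict shape are assumptions, not the tool's documented schema.

```python
# Hypothetical sketch: pair a batch of questions with answers parsed
# from one model response. Assumes one answer per response line.
def pair_answers(questions: list[str], raw_response: str) -> list[dict]:
    answers = [line.strip() for line in raw_response.splitlines() if line.strip()]
    return [{"q": q, "a": a} for q, a in zip(questions, answers)]

pairs = pair_answers(
    ["What framework is shown?", "Is this dark mode?"],
    "React\nYes",
)
# pairs == [{"q": "What framework is shown?", "a": "React"},
#           {"q": "Is this dark mode?", "a": "Yes"}]
```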

Functions

| Function | Description |
| --- | --- |
| ot_image.load(img, ...) | Load a single image; return a stable handle |
| ot_image.load_batch(img, ...) | Load multiple images from a glob or list |
| ot_image.ask(img, q, ...) | Ask one or more questions about an image |
| ot_image.clip_ask(q, ...) | Shorthand: ask about the current clipboard image |
| ot_image.clip_view() | Shorthand: structured summary of the current clipboard image |
| ot_image.summary(img) | Extract and cache a structured summary of an image |
| ot_image.list() | List all loaded images with metadata |
| ot_image.delete(handle) | Delete an image and free session cache |
| ot_image.purge(...) | Delete images by age, or delete all |

Key Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| img | str | Image source: file path, "https://..." URL, "clip" for the clipboard, or "#handle" to reference an existing handle |
| handle | str | Custom handle name for load() (e.g. "logo"). Omit for an auto-generated img_<8hex> |
| q | str \| list[str] | Question(s) to ask. Multiple questions are batched into one model call |
| max_edge | int | Maximum longest edge, in pixels, for the in-memory resize before model upload. Default: 1568 |
| all | bool | purge(all=True) deletes all images regardless of age |
| minutes | int | purge(minutes=N) deletes images older than N minutes. Default: 15 |
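The img parameter accepts four source forms. A minimal sketch of how they can be told apart (the category names here are illustrative, not part of the API):

```python
def classify_source(img: str) -> str:
    """Classify an img argument into one of the four accepted source forms."""
    if img == "clip":
        return "clipboard"
    if img.startswith("#"):
        return "handle"          # e.g. "#logo" or "#img_a3f7b2c4"
    if img.startswith(("http://", "https://")):
        return "url"
    return "path"                # anything else is treated as a file path

classify_source("#logo")   # "handle"
classify_source("~/screenshots/dashboard.png")   # "path"
```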

Requires

  • A vision model: set tools.ot_image.model or the top-level llm.model for ask(), summary(), clip_ask(), and clip_view().
  • OPENAI_API_KEY in secrets.yaml for vision model calls.
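A minimal secrets.yaml entry might look like this; the key name comes from the list above, while the value shown is a placeholder:

```yaml
# secrets.yaml — value is a placeholder, substitute your own key
OPENAI_API_KEY: sk-your-key-here
```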

Configuration

Optional

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| tools.ot_image.model | str | "" | Vision model for ask() and summary(). Falls back to llm.model |
| tools.ot_image.base_url | str | "" | OpenAI-compatible base URL. Falls back to llm.base_url |
| tools.ot_image.max_edge | int | 1568 | Maximum longest edge (pixels) for model-upload resize |
| tools.ot_image.session_cache_size | int | 10 | In-memory LRU cache cap (number of images) |

# Minimal — inherits model and base_url from top-level llm: block
llm:
  base_url: https://openrouter.ai/api/v1
  model: google/gemini-2-flash-preview

# Override just for ot_image (optional)
tools:
  ot_image:
    model: openai/gpt-4o-mini   # overrides llm.model for vision calls only
    max_edge: 1568
    session_cache_size: 10
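The max_edge setting amounts to aspect-preserving downscale arithmetic. This is a sketch of that arithmetic under the assumption that images already within the limit pass through unchanged; it is not the tool's actual implementation:

```python
def resized_dimensions(width: int, height: int, max_edge: int = 1568) -> tuple[int, int]:
    """Scale (width, height) so the longest edge is at most max_edge,
    preserving aspect ratio. Images within the limit are unchanged."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height
    scale = max_edge / longest
    return round(width * scale), round(height * scale)

resized_dimensions(3840, 2160)  # (1568, 882)
resized_dimensions(800, 600)    # (800, 600) — already within the limit
```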

Defaults

  • If tools.ot_image is omitted, load() and list() work without config. ask() and summary() require a model via tools.ot_image.model or llm.model.
  • model and base_url fall back to the top-level llm: config block. API key is always read from the OPENAI_API_KEY secret.

Examples

# Load from file and ask a question
result = ot_image.load(img="~/screenshots/dashboard.png")
ot_image.ask(img=result["handle"], q="What is the main metric shown?")

# Ask multiple questions in one call
ot_image.ask(
    img="~/screenshots/dashboard.png",
    q=["What framework is shown?", "Is this dark mode?"]
)

# Load with a custom handle name
ot_image.load(img="~/assets/logo.png", handle="logo")
ot_image.ask(img="#logo", q="What colour is the logo?")

# Clipboard shortcuts (no load step needed)
ot_image.clip_ask(q="Extract all text from this screenshot")
ot_image.clip_view()

# Structured summary — cached after first call
ot_image.summary(img="#logo")

# Load a batch from glob
ot_image.load_batch(img="~/screenshots/*.png")

# Load a batch from a list
ot_image.load_batch(img=["~/a.png", "~/b.png"])

# List all loaded images
ot_image.list()

# Delete a single image
ot_image.delete(handle="#img_a3f7b2c4")

# Purge images older than 1 hour
ot_image.purge(minutes=60)

# Purge all images
ot_image.purge(all=True)

See Also