Core types that map one-to-one to the Gemini Multimodal Live API wire format.
Structs§
- AutomaticActivityDetection
- Automatic activity detection (VAD) settings.
- Blob
- A blob of inline data (audio, image, etc.) sent to or received from Gemini.
- CitationMetadata
- Citation metadata for a response.
- CitationSource
- A single citation source.
- CodeExecutionResult
- Result of code execution.
- Content
- A content message containing a role and a sequence of parts.
- ContextWindowCompressionConfig
- Context window compression configuration for long sessions.
- ExecutableCode
- Executable code returned by the model.
- FileData
- Reference to an uploaded file.
- FunctionCall
- A function call request from the model.
- FunctionCallingConfig
- Configuration for function calling behavior.
- FunctionDeclaration
- Schema for a single function that the model can call.
- FunctionResponse
- A function call response sent back to the model.
- GenerationConfig
- Generation config sent in the setup message.
- GoogleSearch
- Google Search tool configuration (empty; presence enables the feature).
- GoogleSearchRetrieval
- Google Search retrieval tool configuration.
- GroundingMetadata
- Grounding metadata for server content with search results.
- InputAudioTranscription
- Input audio transcription configuration.
- ModalityTokenCount
- Token count breakdown by modality (text, audio, image, video).
- OutputAudioTranscription
- Output audio transcription configuration.
- PrebuiltVoiceConfig
- Prebuilt voice selection.
- ProactivityConfig
- Proactivity configuration; controls whether the model can initiate responses.
- RealtimeInputConfig
- Server-side VAD configuration for the setup message.
- SafetyRating
- Per-category safety assessment of generated content.
- SafetySetting
- Per-category safety configuration for content generation.
- SessionConfig
- Complete session configuration; the builder entrypoint.
- SessionResumptionConfig
- Session resumption configuration.
- SlidingWindow
- Sliding window configuration for context compression.
- SpeechConfig
- Speech configuration for audio output.
- ThinkingConfig
- Configuration for model thinking/reasoning (Gemini 2.5+).
- Tool
- A tool declaration sent in the setup message. Each Tool object can contain one of: function declarations, urlContext, googleSearch, codeExecution, or googleSearchRetrieval.
- ToolCodeExecution
- Code execution tool configuration (empty; presence enables the feature).
- ToolConfig
- Controls how and when the model uses tools.
- UrlContext
- URL context tool configuration (empty — presence enables the feature).
- UrlContextMetadata
- URL context metadata for content sourced from URLs.
- UsageMetadata
- Usage metadata returned by the server on messages.
- VertexConfig
- Configuration for connecting through Vertex AI.
- VoiceConfig
- Voice configuration within speech config.
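Several of the tool structs above are described as empty, with presence alone enabling the feature. A minimal sketch of that pattern follows, using only the standard library. `ToolSketch`, `EmptyToolSketch`, and the hand-written `to_json` are hypothetical stand-ins for illustration, not the crate's real types or its serialization code; the JSON keys are taken from the `Tool` description above.

```rust
/// Hypothetical marker for a tool that has no settings of its own.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct EmptyToolSketch;

/// Hypothetical stand-in for a tool entry; NOT the crate's real `Tool` type.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToolSketch {
    pub google_search: Option<EmptyToolSketch>,
    pub code_execution: Option<EmptyToolSketch>,
    pub url_context: Option<EmptyToolSketch>,
}

impl ToolSketch {
    /// Serializes by hand: a field that is `Some` appears as an empty
    /// object, and a field that is `None` is left out entirely.
    pub fn to_json(&self) -> String {
        let mut fields: Vec<&str> = Vec::new();
        if self.google_search.is_some() {
            fields.push("\"googleSearch\":{}");
        }
        if self.code_execution.is_some() {
            fields.push("\"codeExecution\":{}");
        }
        if self.url_context.is_some() {
            fields.push("\"urlContext\":{}");
        }
        // `{{` and `}}` are escaped braces in format strings.
        format!("{{{}}}", fields.join(","))
    }
}
```

With this shape, turning a feature on means setting its field to `Some(EmptyToolSketch)`; there is no boolean flag to keep in sync with the wire format.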
Enums§
- ActivityHandling
- Controls how incoming audio interacts with model output.
- ApiEndpoint
- API endpoint selector — Google AI (direct), Google AI with OAuth token, or Vertex AI.
- AudioFormat
- Audio encoding formats supported by the Gemini Live API.
- FinishReason
- Why the model stopped generating.
- FunctionCallingBehavior
- Whether tool calls block model output or run concurrently.
- FunctionCallingMode
- How the model should decide when to execute tool calls.
- FunctionResponseScheduling
- Scheduling mode for non-blocking function responses.
- GeminiModel
- Gemini models that support the Multimodal Live API.
- HarmBlockThreshold
- Blocking threshold for safety settings.
- HarmCategory
- Categories of potential harm in model output.
- HarmProbability
- Probability that content is harmful.
- MediaResolution
- Media resolution for image/video inputs.
- Modality
- Output modalities the model can produce.
- Part
- A single part of a Content message. Parts are polymorphic: discriminated by field presence, not a type tag.
- Role
- Role in a conversation.
- Sensitivity
- Voice activity detection sensitivity level.
- TurnCoverage
- Controls which input counts toward a user’s conversation turn.
- Voice
- Available voice presets for Gemini Live audio output.
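The `Part` entry above notes that parts are discriminated by field presence rather than a type tag. A rough sketch of what that means on the wire is given below, using only the standard library. `PartSketch` and its two variants are hypothetical and cover only a fraction of what the real enum may hold; the JSON keys `text`, `inlineData`, `mimeType`, and `data` are assumptions about the wire format, and the real crate may well rely on a serialization library instead of hand-written formatting.

```rust
/// Hypothetical stand-in for the crate's `Part`; only two variants shown.
#[derive(Debug, Clone, PartialEq)]
pub enum PartSketch {
    Text(String),
    InlineData { mime_type: String, data: String },
}

/// Escapes backslashes and double quotes. Enough for this sketch,
/// but not a complete JSON string escaper.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

impl PartSketch {
    /// Each variant serializes to an object whose single key identifies it.
    /// No `"type"` field is written; the key itself is the discriminator.
    pub fn to_json(&self) -> String {
        match self {
            PartSketch::Text(t) => format!("{{\"text\":\"{}\"}}", escape(t)),
            PartSketch::InlineData { mime_type, data } => format!(
                "{{\"inlineData\":{{\"mimeType\":\"{}\",\"data\":\"{}\"}}}}",
                escape(mime_type),
                escape(data)
            ),
        }
    }
}
```

A reader of such a message decides which variant it has by checking which key is present, which is why adding a variant does not require changing a shared tag field.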
Traits§
- ToolProvider
- Declares tools for a Gemini session setup message. Implement this trait to provide tools from any source (runtime ToolDispatcher, etc.).
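To illustrate the idea of providing tools from any source through a trait, here is a small sketch. Everything in it is hypothetical: `ToolProviderSketch`, its `declarations` method, `DeclSketch`, `StaticTools`, and `collect_names` are invented for illustration, and the real `ToolProvider` trait may have a different method name, signature, and return type.

```rust
/// Hypothetical stand-in for a tool declaration; carries only a name.
#[derive(Debug, Clone, PartialEq)]
pub struct DeclSketch {
    pub name: String,
}

/// Hypothetical trait shape; the real `ToolProvider` signature may differ.
pub trait ToolProviderSketch {
    fn declarations(&self) -> Vec<DeclSketch>;
}

/// A provider backed by a fixed list of function names.
pub struct StaticTools(pub Vec<&'static str>);

impl ToolProviderSketch for StaticTools {
    fn declarations(&self) -> Vec<DeclSketch> {
        self.0
            .iter()
            .map(|n| DeclSketch { name: n.to_string() })
            .collect()
    }
}

/// Setup code can accept any provider through the trait object,
/// without knowing where the declarations come from.
pub fn collect_names(provider: &dyn ToolProviderSketch) -> Vec<String> {
    provider.declarations().into_iter().map(|d| d.name).collect()
}
```

A runtime dispatcher could implement the same trait by listing whatever tools are registered at the time the session is set up.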
Type Aliases§
- ToolDeclaration
- Backward-compatible alias for Tool.