Steering Modes
Steering modes control how the SDK delivers phase instructions and per-turn context to the model during a Live session. The mode is the single most impactful configuration choice for multi-phase voice applications.
The Three Modes
ContextInjection (recommended)
The system instruction is set once at connect time and never updated. Phase instructions and per-turn modifiers are delivered as model-role context turns via send_client_content.
Live::builder()
    .model(GeminiModel::Gemini2_0FlashLive)
    .instruction("You are a restaurant reservation assistant at Sapore d'Italia.")
    .steering_mode(SteeringMode::ContextInjection)
    .phase("greeting")
        .instruction("Welcome the guest warmly and ask how you can help.")
        .done()
    .phase("booking")
        .instruction("Help the guest find an available time slot.")
        .done()
    .initial_phase("greeting")
What happens on phase transition:
- The phase instruction ("Welcome the guest...") is sent as a model-role content turn
- Per-turn modifiers (with_context, with_state, when) are also sent as model-role turns
- The system instruction ("You are a restaurant...") is never touched
When to use: Most multi-phase voice apps. The base persona stays stable across phases, and phase-specific behavior is guided through conversational context. Lower latency, no instruction re-processing spikes.
InstructionUpdate (default)
The system instruction is replaced on every phase transition. Per-turn modifiers are baked into the instruction text.
Live::builder()
    .model(GeminiModel::Gemini2_0FlashLive)
    .instruction("You are a helpful assistant.")
    .steering_mode(SteeringMode::InstructionUpdate) // this is the default
    .phase("receptionist")
        .instruction("You are a medical receptionist. Schedule appointments.")
        .done()
    .phase("triage_nurse")
        .instruction("You are a triage nurse. Assess symptom severity.")
        .done()
    .initial_phase("receptionist")
What happens on phase transition:
- The entire system instruction is replaced with the new phase's instruction
- Per-turn modifiers are appended to the instruction text
- The model re-processes its full context with the new instruction
When to use: When phases represent genuinely different personas or roles. The model needs a complete context reset to shift behavior convincingly.
Hybrid
The system instruction is replaced on phase transition (like InstructionUpdate), but per-turn modifiers are delivered as model-role context turns (like ContextInjection).
Live::builder()
    .steering_mode(SteeringMode::Hybrid)
    .phase("sales")
        .instruction("You are a sales representative.")
        .with_context(|s| format!("Customer budget: {}", s.get::<String>("budget").unwrap_or_default()))
        .done()
    .phase("support")
        .instruction("You are a technical support engineer.")
        .with_context(|s| format!("Ticket: {}", s.get::<String>("ticket_id").unwrap_or_default()))
        .done()
When to use: When you need persona shifts on transition but also want lightweight per-turn context updates within each phase. Uncommon in practice -- pick ContextInjection or InstructionUpdate unless you have a specific reason for both.
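The difference between the three modes comes down to where the phase instruction and the rendered modifiers end up on a transition. The following is a minimal, self-contained sketch of that routing. It does not use the SDK: `Mode`, `TransitionPlan`, and `plan_transition` are hypothetical names, and joining modifiers to the instruction with a newline is an assumption about how "appended" text might look.

```rust
/// Illustrative stand-in for the SDK's steering mode enum (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    ContextInjection,
    InstructionUpdate,
    Hybrid,
}

/// What a phase transition would put on the wire in this toy model.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct TransitionPlan {
    /// New system instruction, if the mode replaces it.
    pub system_instruction: Option<String>,
    /// Model-role context turns to send, in order.
    pub context_turns: Vec<String>,
}

/// Route a phase instruction and its already-rendered modifiers by mode.
pub fn plan_transition(mode: Mode, phase_instruction: &str, modifiers: &[&str]) -> TransitionPlan {
    match mode {
        // System instruction untouched; everything travels as context turns.
        Mode::ContextInjection => {
            let mut turns = vec![phase_instruction.to_string()];
            turns.extend(modifiers.iter().map(|m| m.to_string()));
            TransitionPlan { system_instruction: None, context_turns: turns }
        }
        // System instruction replaced; modifiers are appended to its text.
        Mode::InstructionUpdate => {
            let mut text = phase_instruction.to_string();
            for m in modifiers {
                text.push('\n'); // separator is an assumption
                text.push_str(m);
            }
            TransitionPlan { system_instruction: Some(text), context_turns: Vec::new() }
        }
        // System instruction replaced; modifiers still travel as context turns.
        Mode::Hybrid => TransitionPlan {
            system_instruction: Some(phase_instruction.to_string()),
            context_turns: modifiers.iter().map(|m| m.to_string()).collect(),
        },
    }
}
```

In this model only ContextInjection leaves `system_instruction` as `None`, which is why it avoids the re-processing cost described above.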
Decision Matrix
| Question | Yes | No |
|---|---|---|
| Does the model's core persona change between phases? | InstructionUpdate | ContextInjection |
| Is latency on phase transitions a concern? | ContextInjection | Either works |
| Do you need per-turn dynamic context (state summaries, conditional hints)? | ContextInjection or Hybrid | InstructionUpdate is fine |
| Are phases just different stages of the same conversation? | ContextInjection | -- |
| Are phases genuinely different agents (receptionist vs doctor)? | InstructionUpdate | -- |
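One way to read the matrix is as a two-question decision: does the persona change, and is per-turn dynamic context needed? The sketch below encodes that reading as a small function. It is a simplification, not part of the SDK: `AppTraits` and `recommend_mode` are hypothetical names, and latency is left out because the matrix treats it as a tiebreaker in favor of ContextInjection.

```rust
/// Illustrative stand-in for the SDK's steering mode enum (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    ContextInjection,
    InstructionUpdate,
    Hybrid,
}

/// Answers to the two questions this sketch considers (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct AppTraits {
    /// Do phases represent genuinely different personas or agents?
    pub persona_changes_between_phases: bool,
    /// Are state summaries or conditional hints needed on each turn?
    pub needs_per_turn_context: bool,
}

/// Collapse the decision matrix into a single recommendation.
pub fn recommend_mode(app: AppTraits) -> Mode {
    match (app.persona_changes_between_phases, app.needs_per_turn_context) {
        // Stable persona: context turns cover both phase steering and modifiers.
        (false, _) => Mode::ContextInjection,
        // Persona shift with no per-turn context: replace the instruction.
        (true, false) => Mode::InstructionUpdate,
        // Persona shift plus lightweight per-turn updates.
        (true, true) => Mode::Hybrid,
    }
}
```

A restaurant flow with stages such as greeting and booking maps to ContextInjection here, while a receptionist handing off to a triage nurse maps to InstructionUpdate.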
Anti-Patterns
Using InstructionUpdate for minor context changes
Problem: Every phase has the same persona but slightly different focus areas. Using InstructionUpdate causes unnecessary instruction re-processing latency on each transition.
// Anti-pattern: same persona, different focus -- InstructionUpdate is overkill
Live::builder()
    .steering_mode(SteeringMode::InstructionUpdate) // unnecessary latency
    .phase("gather_name")
        .instruction("You are a restaurant host. Ask for the guest's name.")
        .done()
    .phase("gather_party_size")
        .instruction("You are a restaurant host. Ask for the party size.")
        .done()
Fix: Use ContextInjection. The base persona is set once, and phase-specific focus is delivered as context turns.
// Better: stable persona, lightweight phase steering
Live::builder()
    .instruction("You are a friendly host at Sapore d'Italia.")
    .steering_mode(SteeringMode::ContextInjection)
    .phase("gather_name")
        .instruction("Ask for the guest's name for the reservation.")
        .done()
    .phase("gather_party_size")
        .instruction("Ask how many guests will be dining.")
        .done()
Using ContextInjection when personas differ radically
Problem: Phases represent genuinely different agent personas (e.g., switching from a receptionist to a clinical nurse). Context injection is too subtle -- the model may not fully shift behavior.
// Anti-pattern: radically different personas via context injection
Live::builder()
    .instruction("You work at a medical clinic.")
    .steering_mode(SteeringMode::ContextInjection) // too subtle for persona shift
    .phase("receptionist")
        .instruction("You are the front desk receptionist. Be warm and administrative.")
        .done()
    .phase("triage")
        .instruction("You are a clinical triage nurse. Be precise and medical.")
        .done()
Fix: Use InstructionUpdate so the model gets a clean persona reset.
Over-engineering with Hybrid
Problem: Using Hybrid when ContextInjection alone would suffice. Adds complexity without benefit.
// Anti-pattern: Hybrid when the persona doesn't actually change
Live::builder()
    .steering_mode(SteeringMode::Hybrid) // unnecessary complexity
    .phase("greeting").instruction("Welcome the user.").done()
    .phase("main").instruction("Help with their request.").done()
Fix: Use ContextInjection. If the persona is stable, there's no reason to replace the system instruction.
Putting volatile state in the base instruction
Problem: The base instruction (set at connect time) includes dynamic state that changes every turn. With ContextInjection, this instruction is never updated.
// Anti-pattern: dynamic content in the base instruction
Live::builder()
    .instruction(format!("You are helping {}. Their order has {} items.",
        customer_name, order_count)) // stale after the first turn
    .steering_mode(SteeringMode::ContextInjection)
Fix: Keep the base instruction static. Use with_context() modifiers for dynamic state.
// Better: static base, dynamic context via modifiers
Live::builder()
    .instruction("You are a helpful order assistant.")
    .steering_mode(SteeringMode::ContextInjection)
    .phase_defaults(|d| d.with_context(|s| {
        format!("Customer: {}. Items in order: {}.",
            s.get::<String>("customer_name").unwrap_or_default(),
            s.get::<u32>("order_count").unwrap_or(0))
    }))
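The staleness comes from evaluation time: `format!` runs once when the builder is constructed, while a modifier closure runs each time context is rendered. The sketch below shows that difference in plain Rust without the SDK. `Session`, `State`, and `order_context` are hypothetical stand-ins, and the state store is reduced to a `HashMap<String, String>` for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the SDK's state store (hypothetical).
pub type State = HashMap<String, String>;

/// Holds a base instruction fixed at construction and a context closure
/// that is re-evaluated on demand.
pub struct Session {
    base_instruction: String,
    context_fn: Box<dyn Fn(&State) -> String>,
}

impl Session {
    pub fn new(base_instruction: String, context_fn: impl Fn(&State) -> String + 'static) -> Self {
        Session { base_instruction, context_fn: Box::new(context_fn) }
    }

    /// Returns the text captured at construction time; it never changes.
    pub fn base_instruction(&self) -> &str {
        &self.base_instruction
    }

    /// Renders fresh context from the current state.
    pub fn render_context(&self, state: &State) -> String {
        (self.context_fn)(state)
    }
}

/// Example modifier: reads the order count from state when called.
pub fn order_context(state: &State) -> String {
    let count = state.get("order_count").map(String::as_str).unwrap_or("0");
    format!("Items in order: {}.", count)
}
```

If the order count changes after the session is built, `base_instruction()` keeps reporting the old value while `render_context()` picks up the new one, which is the behavior the fix relies on.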
How It Works Under the Hood
The three-lane processor evaluates steering at two points in the turn lifecycle:
TurnComplete event
|
[Step 7] Phase machine evaluates transitions
| --> if transition fires, resolved_instruction is set
|
[Steps 7d/7e/7f/12/13] Context accumulation
| --> tool advisory, repair nudge, steering modifiers,
| phase instruction, on_enter_context all push into
| a single context_buffer (Vec<Content>)
|
[Step 14] Batched context send
| --> ONE send_client_content(context_buffer, false)
| --> eliminates burst of separate WebSocket frames
|
[Step 14b] prompt_on_enter (triggers model response)
| --> send_client_content([], true) — separate frame
Batched delivery: All model-role context turns are accumulated into a single Vec<Content> and sent as one atomic WebSocket frame. This eliminates the burst of 3-5 separate send_client_content calls that could confuse the model or clash with concurrent user input.
The key insight: with ContextInjection, step 12 pushes the phase instruction into the context buffer as Content::model(instruction_text), and step 14 sends it with the rest of the batch. The model sees it as its own prior speech, which steers its behavior without the overhead of system instruction replacement.
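The accumulate-then-send pattern in steps 7d through 14b can be sketched without the SDK. In the toy model below, `Content`, `RecordingTransport`, and `flush_turn` are hypothetical stand-ins; the transport records frames instead of writing to a WebSocket, and the second argument of `send_client_content` is assumed to be a turn-complete flag, based on how the diagram uses it.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Role {
    User,
    Model,
}

/// Minimal stand-in for a content turn (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Content {
    pub role: Role,
    pub text: String,
}

impl Content {
    /// Build a model-role turn, mirroring the Content::model call in the text.
    pub fn model(text: impl Into<String>) -> Self {
        Content { role: Role::Model, text: text.into() }
    }
}

/// Records each frame as (turns, turn_complete) instead of sending it.
#[derive(Debug, Default)]
pub struct RecordingTransport {
    pub frames: Vec<(Vec<Content>, bool)>,
}

impl RecordingTransport {
    pub fn send_client_content(&mut self, turns: Vec<Content>, turn_complete: bool) {
        self.frames.push((turns, turn_complete));
    }
}

/// Step 14 and 14b: one batched context frame, then an optional prompt frame.
pub fn flush_turn(
    transport: &mut RecordingTransport,
    context_buffer: Vec<Content>,
    prompt_on_enter: bool,
) {
    if !context_buffer.is_empty() {
        // All accumulated turns go out together, without completing the turn.
        transport.send_client_content(context_buffer, false);
    }
    if prompt_on_enter {
        // Separate empty frame that asks the model to respond.
        transport.send_client_content(Vec::new(), true);
    }
}
```

Pushing a tool advisory, a steering modifier, and a phase instruction into the buffer and calling `flush_turn` with `prompt_on_enter` set produces two recorded frames in this model: one carrying all three turns, and one empty prompt frame.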
Context Delivery Timing
By default, the batched context frame is sent immediately during TurnComplete processing (ContextDelivery::Immediate). For voice apps where isolated WebSocket frames during silence can cause glitches, use ContextDelivery::Deferred:
Live::builder()
    .steering_mode(SteeringMode::ContextInjection)
    .context_delivery(ContextDelivery::Deferred)
    .phase("greeting")
        .instruction("Welcome the guest")
        .done()
    .initial_phase("greeting")
How deferred delivery works:
- During TurnComplete, context turns are pushed into a PendingContext buffer (instead of sent)
- The DeferredWriter wraps the session writer at the LiveHandle level
- When user code calls handle.send_audio(), send_text(), or send_video(), the writer drains the buffer and sends the context immediately before the user content
- The context arrives in the same burst as user input, so there are no isolated frames during silence
When context is sent immediately regardless:
If a prompt is needed (prompt_on_enter: true or a repair nudge on the first attempt), the context is sent immediately — you can't defer a prompt because the model needs to respond now.
Deferred delivery:                 Immediate delivery:
TurnComplete                       TurnComplete
|                                  |
[context → PendingContext]         [context → wire now]
|                                  |
... silence ...                    ... silence ...
|                                  |
User speaks                        User speaks
|                                  |
DeferredWriter.send_audio()        SessionHandle.send_audio()
  1. flush PendingContext            1. send audio
  2. send audio
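The flush-before-send behavior can be modeled in a few lines. The sketch below is not the SDK's DeferredWriter: the type here only borrows the name, `Frame` and `Delivery` are hypothetical, context turns are plain strings, and the wire is a `Vec` that records what would have been sent. It also models the rule that a needed prompt forces immediate delivery.

```rust
/// What would go over the wire in this toy model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Frame {
    Context(Vec<String>),
    Audio(Vec<u8>),
}

/// Illustrative stand-in for the context delivery setting (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delivery {
    Immediate,
    Deferred,
}

/// Parks context until the next user send when delivery is deferred.
pub struct DeferredWriter {
    delivery: Delivery,
    pending: Vec<String>,
    sent: Vec<Frame>,
}

impl DeferredWriter {
    pub fn new(delivery: Delivery) -> Self {
        DeferredWriter { delivery, pending: Vec::new(), sent: Vec::new() }
    }

    /// Called during TurnComplete with the batched context turns.
    pub fn on_turn_complete(&mut self, context: Vec<String>, needs_prompt: bool) {
        if context.is_empty() {
            return;
        }
        if self.delivery == Delivery::Immediate || needs_prompt {
            // A prompt cannot wait: send now, after anything parked earlier.
            let mut all = std::mem::take(&mut self.pending);
            all.extend(context);
            self.sent.push(Frame::Context(all));
        } else {
            self.pending.extend(context);
        }
    }

    /// Flush parked context first so it lands in the same burst as the audio.
    pub fn send_audio(&mut self, pcm: Vec<u8>) {
        if !self.pending.is_empty() {
            let context = std::mem::take(&mut self.pending);
            self.sent.push(Frame::Context(context));
        }
        self.sent.push(Frame::Audio(pcm));
    }

    /// Frames recorded so far, in send order.
    pub fn sent(&self) -> &[Frame] {
        &self.sent
    }
}
```

With `Delivery::Deferred` and no prompt, nothing is recorded at TurnComplete; the context frame appears directly ahead of the first audio frame. `send_text()` and `send_video()` would follow the same flush-then-send shape.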
Interaction with Other Features
| Feature | InstructionUpdate | ContextInjection | Hybrid |
|---|---|---|---|
| with_context(fn) | Appended to instruction text | Sent as model-role turn | Sent as model-role turn |
| with_state(&[keys]) | Baked into instruction | Sent as model-role turn | Sent as model-role turn |
| when(pred, text) | Baked into instruction | Sent as model-role turn | Sent as model-role turn |
| instruction_amendment | Appended to instruction | Appended to context turn | Appended to instruction |
| instruction_template | Replaces instruction | Sent as context turn | Replaces instruction |
| navigation() | Baked into instruction | Baked into instruction | Baked into instruction |
| greeting() | Works normally | Works normally | Works normally |
| prompt_on_enter | Works normally | Works normally | Works normally |
| enter_prompt | Works normally | Works normally | Works normally |
See also
- Phase System — defining phases, modifiers, and lifecycle callbacks
- Phase Transitions Deep Dive — the full turn-complete pipeline where steering fires