VM Idle Management
Use VM Idle Management to reduce compute cost without changing workflow logic.
Path: Settings -> VM Management
Default behavior
Auto-stop is enabled by default for all organizations. New orgs are created with:
- Auto-Stop Idle Agents: enabled
- Idle Timeout: 30 minutes
- Auto-Wake on Job Dispatch: enabled
This ensures cost efficiency is the baseline operating mode. Adjust or disable in Settings as needed.
Configure org-level VM policy
- Toggle Auto-Stop Idle Agents on or off.
- Set Idle Timeout from 10 to 120 minutes.
- Enable Auto-Wake on Job Dispatch if stopped agents should boot automatically when work is queued.
- Save VM settings.
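As a rough illustration of the policy these settings describe, the sketch below models the three options with the defaults and the 10 to 120 minute range stated on this page. The class and field names (VmPolicy, auto_stop_idle_agents, and so on) are hypothetical and are not taken from Mimic's API or data model.

```python
from dataclasses import dataclass

# Range and defaults come from this page; every name here is hypothetical.
MIN_IDLE_TIMEOUT_MIN = 10
MAX_IDLE_TIMEOUT_MIN = 120


@dataclass(frozen=True)
class VmPolicy:
    auto_stop_idle_agents: bool = True   # default for new orgs
    idle_timeout_minutes: int = 30       # default for new orgs
    auto_wake_on_dispatch: bool = True   # default for new orgs

    def __post_init__(self) -> None:
        # Reject timeouts outside the range the settings page allows.
        if not MIN_IDLE_TIMEOUT_MIN <= self.idle_timeout_minutes <= MAX_IDLE_TIMEOUT_MIN:
            raise ValueError(
                f"idle_timeout_minutes must be between {MIN_IDLE_TIMEOUT_MIN} "
                f"and {MAX_IDLE_TIMEOUT_MIN}, got {self.idle_timeout_minutes}"
            )
```

Calling VmPolicy() with no arguments yields the documented defaults, and a value such as idle_timeout_minutes=5 raises ValueError.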
What counts as activity
Idle logic uses lastActivityAt (work activity), not just connection heartbeat.
Activity is touched by:
- job dispatch
- step-status updates
- worker log ingestion batches via POST /api/job-logs (resolved from jobId)
- handshake and healthy reconnect events
- workspace keepalive pings via POST /api/agent/{id}/activity
- other execution-affecting agent actions
Idle stop also checks for active jobs before stopping an agent. Jobs in queued, processing, or running state block idle-stop.
Per-agent override (Always-On)
On Agents -> Agent View, use the status chip toggle:
- Auto-Stop means the agent follows the org idle policy.
- Always-On means the agent is exempt from auto-stop.
This sets idleAutoStopExempt for that agent.
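The three conditions described so far (time since lastActivityAt, active jobs, and the Always-On exemption) can be combined into one eligibility check. This is a minimal sketch under stated assumptions, not Mimic's actual monitor: the Agent class, the should_idle_stop function, and the choice of >= for the timeout comparison are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

# Job states that block idle-stop, as listed on this page.
BLOCKING_JOB_STATES = {"queued", "processing", "running"}


@dataclass
class Agent:
    last_activity_at: datetime                 # mirrors lastActivityAt
    idle_auto_stop_exempt: bool = False        # mirrors idleAutoStopExempt
    job_states: List[str] = field(default_factory=list)


def should_idle_stop(agent: Agent, idle_timeout_minutes: int, now: datetime) -> bool:
    """Return True if this hypothetical monitor would stop the agent."""
    if agent.idle_auto_stop_exempt:
        return False  # Always-On agents are never auto-stopped
    if any(state in BLOCKING_JOB_STATES for state in agent.job_states):
        return False  # an active job blocks idle-stop
    # Assumption: an agent is idle once the full timeout has elapsed.
    return now - agent.last_activity_at >= timedelta(minutes=idle_timeout_minutes)
```

Passing now as an argument, rather than reading the clock inside the function, keeps the check easy to test with fixed timestamps.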
Dispatch behavior with stopped agents
When dispatching via agent action endpoints, scheduled runs, or retries:
- If agent is stopped and auto-wake is enabled, Mimic starts it and waits until online (up to 2 minutes).
- If auto-wake is disabled, dispatch returns a conflict/error response and the run is not started.
- The same auto-wake policy applies consistently across manual API calls, scheduled runs, and retry attempts.
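The dispatch rules above can be sketched as a small decision function. The function names and the decision strings are hypothetical; the event names (wake_requested, wake_blocked, wake_ready, wake_timeout) are the ones listed in the Lifecycle events table on this page, and the 120-second limit reflects the "up to 2 minutes" wait. Only the stopped case is modeled, since other agent statuses are not described here.

```python
from typing import Optional, Tuple

WAKE_WAIT_SECONDS = 120  # "up to 2 minutes"


def plan_dispatch(agent_status: str, auto_wake_enabled: bool) -> Tuple[str, Optional[str]]:
    """Return (decision, lifecycle_event) for a dispatch to one agent."""
    if agent_status != "stopped":
        # Other statuses are outside the scope of this sketch and pass through.
        return ("dispatch", None)
    if auto_wake_enabled:
        return ("wake_then_dispatch", "wake_requested")
    # Auto-wake disabled: the run is not started.
    return ("reject_conflict", "wake_blocked")


def wake_result_event(seconds_until_online: Optional[float]) -> str:
    """Map a wake attempt to wake_ready or wake_timeout.

    None means the agent never reported online.
    """
    if seconds_until_online is not None and seconds_until_online <= WAKE_WAIT_SECONDS:
        return "wake_ready"
    return "wake_timeout"
```

Because manual API calls, scheduled runs, and retries share one policy, a single function like plan_dispatch could serve all three callers.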
From /app/agents, operators can issue Start and Stop directly from the fleet table. These actions map to POST /api/agent/{id}/action with start/stop and emit manual_start/manual_stop lifecycle events.
Resume latency: Starting a stopped agent takes 60-120 seconds. Schedule time-sensitive RCM workflows with this buffer in mind.
Lifecycle events
Every idle stop and auto-wake transition is recorded as a persistent lifecycle event:
| Event Type | When |
|---|---|
| idle_stop_attempt | Idle monitor detects an agent past its timeout |
| idle_stop_success | Agent successfully stopped |
| idle_stop_failed | Stop attempt failed (EC2 error) |
| wake_requested | Auto-wake initiated for a job/schedule |
| wake_blocked | Auto-wake disabled by org policy |
| wake_ready | Agent came online after wake |
| wake_timeout | Agent did not come online within timeout |
| wake_failed | EC2 start command failed |
| manual_start | User-initiated start via UI/API |
| manual_stop | User-initiated stop via UI/API |
| manual_reboot | User-initiated reboot via UI/API |
These events are visible in:
- Agent Events SSE stream (GET /api/agents/{id}/events) as lifecycle events
- Agent Logs endpoint (GET /api/agents/{id}/logs) merged into the unified log timeline
- Agent View UI as status badges showing “Waking from idle” state
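For readers consuming the SSE stream, the sketch below filters lifecycle frames out of raw Server-Sent Events lines. It follows the standard SSE framing (event: and data: fields, blank line as frame terminator). Two things are assumptions and are not confirmed by this page: that the frames carry the event name lifecycle, and that the data field is a JSON object. The function name iter_lifecycle_events is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_lifecycle_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded payloads of SSE frames whose event name is 'lifecycle'."""
    event_name = "message"  # SSE default when no event: field is sent
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line ends the current frame.
            if data_parts and event_name == "lifecycle":
                # Assumption: the payload is a JSON object.
                yield json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keepalive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, drop one leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
```

The function accepts any iterable of lines, so it can be fed from an HTTP client's line iterator or from a captured log when debugging.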
Agent View performance behavior
Agent View now separates critical status polling from heavy tab data fetches:
- Live transition polling uses GET /api/agents/{id}/status (lightweight payload).
- Recent runs load from GET /api/agents/{id}/runs.
- Portals, functions, and script library data load on demand when tabs are opened.
This reduces DB load during provisioning/reboot transitions while keeping operator feedback near real time.
Verification checklist
- Enable auto-stop with a low timeout (e.g., 10 minutes) in dev.
- Wait for an idle agent to be stopped, then confirm the idle_stop_success event appears.
- Dispatch a run to the stopped agent, then confirm the wake_requested and wake_ready events.
- Check the agent view UI shows the “Waking from idle” badge during resume.
- Verify the run completes successfully after wake.
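The event checks in this list can be automated once the agent's lifecycle event types have been collected (for example from the logs endpoint). The helper below is a hypothetical sketch that reports which of the three expected events did not appear in order; how the events are fetched is left out because the response shape is not documented here.

```python
from typing import List, Sequence

# Order in which the checklist expects the events to appear.
EXPECTED_ORDER = ("idle_stop_success", "wake_requested", "wake_ready")


def missing_checklist_events(event_types: Sequence[str]) -> List[str]:
    """Return expected events not found in order; an empty list means pass."""
    position = 0
    missing = []
    for expected in EXPECTED_ORDER:
        try:
            # Search only after the previous match to enforce ordering.
            position = event_types.index(expected, position) + 1
        except ValueError:
            missing.append(expected)
    return missing
```

Unrelated events such as idle_stop_attempt may appear between the expected ones without failing the check.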
Watchdog behavior for stale runs
Background monitors enforce runtime limits:
- jobs with stale heartbeats and exceeded max runtime are failed
- hung worker process is killed on the VM
- agent status is reset for subsequent dispatch
- retry pipeline can requeue according to scheduling policy
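The watchdog sequence above can be sketched as a pure function that returns the actions to take for one job. This reads the first bullet as requiring both conditions (stale heartbeat and exceeded max runtime), which is an interpretation. The function name, the action labels, and the heartbeat_grace and max_runtime parameters are hypothetical; this page does not state the real thresholds.

```python
from datetime import datetime, timedelta
from typing import List


def watchdog_actions(
    started_at: datetime,
    last_heartbeat_at: datetime,
    now: datetime,
    max_runtime: timedelta,
    heartbeat_grace: timedelta,
    retry_allowed: bool,
) -> List[str]:
    """Return the ordered recovery actions for one job, or [] if it is healthy."""
    heartbeat_stale = now - last_heartbeat_at > heartbeat_grace
    over_runtime = now - started_at > max_runtime
    # Assumption: both conditions must hold before the job is failed.
    if not (heartbeat_stale and over_runtime):
        return []
    actions = ["fail_job", "kill_worker_process", "reset_agent_status"]
    if retry_allowed:
        # Requeueing is subject to scheduling policy, modeled here as a flag.
        actions.append("requeue_for_retry")
    return actions
```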
Use this with RPA-first scripts to keep long-running automations deterministic and recoverable.
Session lock recovery (Windows)
Idle policy and session recovery solve different failure classes:
- idle policy controls compute stop/start economics
- session recovery prevents lock-screen induced mid-run termination
For guarded watchdog rollout and live patch procedures, see Windows Session Recovery.