Captures SubAgent execution events to JSONL files for offline analysis.
TraceLog attaches to SubAgent telemetry events and writes them to a JSONL file, enabling detailed debugging and performance analysis of agent executions.
Usage
The simplest way to capture a trace is with with_trace/2:
{:ok, {:ok, step}, trace_path} = TraceLog.with_trace(fn ->
SubAgent.run(agent, llm: my_llm())
end)
# Analyze the trace
events = TraceLog.Analyzer.load(trace_path)
summary = TraceLog.Analyzer.summary(events)For more control, use start/1 and stop/1:
{:ok, collector} = TraceLog.start(path: "my_trace.jsonl")
{:ok, step} = SubAgent.run(agent, llm: my_llm())
{:ok, path, errors} = TraceLog.stop(collector)Event Format
Each line in the JSONL file is a JSON object with:
{
"event": "run.start", # Event type (run|turn|llm|tool).(start|stop|exception)
"trace_id": "abc123...", # Unique trace identifier
"timestamp": "2024-01-...", # ISO 8601 timestamp
"measurements": {...}, # Telemetry measurements
"metadata": {...}, # Event-specific metadata
"duration_ms": 123 # Duration (for stop events)
}Process Isolation and Cross-Process Propagation
Traces are isolated by process. Only events from the process that called start/1
are captured. This allows multiple concurrent traces without interference.
Nested traces are supported - each with_trace call creates its own trace file,
and events are routed to the innermost active collector.
Cross-Process Tracing
When execution spans multiple processes (e.g., parallel tasks), use
join/2 to propagate trace context to child processes:
collectors = TraceLog.active_collectors()
parent_span = PtcRunner.SubAgent.Telemetry.current_span_id()
Task.async(fn ->
TraceLog.join(collectors, parent_span)
# Events from this process are now captured AND linked to parent
end)Note: The sandbox process inherits trace collectors via join/2, so tool
telemetry events (tool.start, tool.stop) emitted inside the sandbox are
captured directly by the trace handler.
See Also
PtcRunner.TraceLog.Analyzer- Load and analyze trace filesPtcRunner.TraceLog.Collector- Low-level file writingPtcRunner.TraceLog.Handler- Telemetry handler- Observability Guide - How TraceLog relates to
PtcRunner.Tracer
Summary
Functions
Returns all active collectors for the current process.
Returns all in-memory sinks active in the current process (innermost first).
Returns the collector for the current process, if any.
Joins the current process to existing trace collectors.
Records an already-built turn-event map to every active sink — every JSONL collector on the stack and every active in-memory sink.
Returns true if any turn-log sink (JSONL collector or in-memory sink) is active in the current process.
Starts trace collection for the current process.
Starts an in-memory turn-log sink and activates it for the current process.
Stops trace collection and closes the trace file.
Detaches an in-memory sink from the current process so new turn events no longer route to it. The sink process is left alive so its events stay queryable; stop it explicitly when done.
Executes a function while capturing a trace.
Writes a serialized event map to the innermost active collector in the calling process's collector stack.
Functions
@spec active_collectors() :: [pid()]
Returns all active collectors for the current process.
The list is ordered from innermost (most recent) to outermost.
@spec active_memory_sinks() :: [pid()]
Returns all in-memory sinks active in the current process (innermost first).
@spec current_collector() :: pid() | nil
Returns the collector for the current process, if any.
Examples
{:ok, _collector} = TraceLog.start()
collector = TraceLog.current_collector()
# collector is a pid
Joins the current process to existing trace collectors.
This is used for trace propagation across process boundaries. When spawning child processes (via Task.async_stream, Process.spawn, etc.), the parent's trace collectors are not automatically inherited. Call this function at the start of the child process to re-attach to the parent's trace session.
Parameters
collectors- List of collector PIDs to join (fromactive_collectors/0)parent_span_id- Optional span ID from parent process for span hierarchy
Example
# In parent process
collectors = TraceLog.active_collectors()
parent_span = PtcRunner.SubAgent.Telemetry.current_span_id()
Task.async(fn ->
TraceLog.join(collectors, parent_span)
# Now trace events from this process will be captured
# AND linked to the parent span hierarchy
SubAgent.run(agent, llm: llm)
end)Notes
- Only joins collectors that are still alive (stale PIDs are filtered out)
- Does not attach telemetry handlers (they are global and already attached)
- Safe to call multiple times or with an empty list
- When
parent_span_idis provided, sets up span hierarchy so new spans in this process have the parent span as their parent_span_id
@spec record_turn_event(map()) :: :ok
Records an already-built turn-event map to every active sink — every JSONL collector on the stack and every active in-memory sink.
Unlike write_to_active/1 (innermost collector only), this fans out to all
active collectors, matching the telemetry handler's routing: under nested
with_trace/2 scopes both the inner and outer trace files capture the turn,
so cross-session analysis works for either. Each collector stamps its own
trace_id/seq on its copy of the event.
This is the single emission point shared by both turn drivers
(PtcRunner.Session and the PtcRunner.SubAgent loop). It never raises and
is a no-op when nothing is recording. Build the event with
PtcRunner.TraceLog.TurnEvent.build/1.
@spec recording?() :: boolean()
Returns true if any turn-log sink (JSONL collector or in-memory sink) is active in the current process.
Turn drivers consult this before building a turn event so the (small) build cost is skipped entirely when nothing is recording.
Starts trace collection for the current process.
Returns a collector process that will capture all SubAgent telemetry events
from this process until stop/1 is called.
Options
:path- File path for the JSONL output. Defaults to a timestamped file.:trace_id- Custom trace identifier. Defaults to a random hex string.:trace_kind- Trace type discriminator (e.g.,"benchmark","analysis").:producer- Component that created this trace (e.g.,"demo.benchmark").:trace_label- Human-readable label (e.g., test case name).:model- LLM model identifier.:query- Input query or question.:meta- Producer-specific metadata underdata.
Examples
{:ok, collector} = TraceLog.start()
{:ok, step} = SubAgent.run(agent, llm: my_llm())
{:ok, path, errors} = TraceLog.stop(collector)
# With typed trace header
{:ok, collector} = TraceLog.start(
path: "/tmp/debug.jsonl",
trace_kind: "benchmark",
producer: "my_app",
query: "How many products?"
)
Starts an in-memory turn-log sink and activates it for the current process.
Returns the sink pid; query it with PtcRunner.TraceLog.MemorySink.events/1
(or the cross-session PtcRunner.TraceLog.Analyzer functions). Accepts the
same options as PtcRunner.TraceLog.MemorySink.start_link/1 (:max_bytes,
:name). Detach with stop_memory_sink/1.
@spec stop(pid()) :: {:ok, String.t(), non_neg_integer()}
Stops trace collection and closes the trace file.
Returns the path to the trace file and the number of write errors (if any).
Examples
{:ok, collector} = TraceLog.start()
# ... run SubAgent ...
{:ok, path, errors} = TraceLog.stop(collector)
@spec stop_memory_sink(pid()) :: :ok
Detaches an in-memory sink from the current process so new turn events no longer route to it. The sink process is left alive so its events stay queryable; stop it explicitly when done.
Executes a function while capturing a trace.
This is the recommended way to capture traces. It ensures the trace is properly started and stopped, even if the function raises an exception.
Options
Accepts all options from start/1: :path, :trace_id, :trace_kind,
:producer, :trace_label, :model, :query, :meta.
Examples
{:ok, {:ok, step}, trace_path} = TraceLog.with_trace(fn ->
SubAgent.run(agent, llm: my_llm())
end)
# With typed trace header
{:ok, {:ok, step}, path} = TraceLog.with_trace(
fn -> SubAgent.run(agent, llm: my_llm()) end,
trace_kind: "benchmark",
query: "How many products?"
)
@spec write_to_active(map()) :: :ok | :no_collector
Writes a serialized event map to the innermost active collector in the calling process's collector stack.
Used by call sites that need to emit a custom JSONL line from outside
the SubAgent telemetry path (e.g., the MCP server recording per-call
outcomes). The event map is forwarded as-is to
PtcRunner.TraceLog.Collector.write_event/2, which assigns a seq,
encodes via PtcRunner.TraceLog.Event.encode/1, and appends a JSONL
line.
Returns :ok if a collector was active and the event was queued for
writing, or :no_collector if no with_trace/2 / start/1 scope is
active in this process.
This function never raises. Errors during forwarding are swallowed (writes are async casts; encoding failures degrade gracefully).
Examples
TraceLog.with_trace(fn ->
TraceLog.write_to_active(%{
"event" => "mcp.call.stop",
"trace_id" => "abc",
"timestamp" => DateTime.utc_now() |> DateTime.to_iso8601(),
"data" => %{"reason" => "ok"}
})
end)