Telemetry reference
The canonical contract the SDK emits — span tree, full attribute table including token and cache-token usage, content events, and what Radicas computes from each field.
Everything Radicas computes binds to one canonical contract: the OpenTelemetry gen_ai
semantic conventions (OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental, which the
SDK sets for you). Frameworks that emit something else are normalized to this shape — see the
three-layer story on the index page.
The canonical span tree
invoke_agent {agent} → agent invocation
├─ generate_content {model} → LLM call (tokens × price book → estimated cost)
└─ execute_tool {tool} → tool call (status Error + error.type on failure)

One invoke_agent per agent turn; generate_content children for each model call;
execute_tool children per tool execution. transfer_to_agent spans are control-plane
routing and are excluded from tool counts.
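
The counting rule above can be illustrated with a minimal sketch. It assumes span names start with the operation name, as in the tree, and that transfer_to_agent spans appear as siblings of the tool spans; the Span class, the helper names, and the example agent, model, and tool names are all hypothetical and are not part of the SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical in-memory span; only the name and children matter here."""
    name: str  # e.g. "execute_tool lookup_order"
    children: list["Span"] = field(default_factory=list)


def operation(span: Span) -> str:
    """Return the operation part of the span name, e.g. 'execute_tool'."""
    return span.name.split(" ", 1)[0]


def count_operations(root: Span) -> dict[str, int]:
    """Count model calls and tool calls in one agent turn.

    transfer_to_agent spans are visited but never counted, which mirrors
    the rule that control-plane routing is excluded from tool counts.
    """
    counts = {"generate_content": 0, "execute_tool": 0}
    stack = [root]
    while stack:
        span = stack.pop()
        op = operation(span)
        if op in counts:
            counts[op] += 1
        stack.extend(span.children)
    return counts


# Made-up turn: two model calls, one tool call, one routing span.
turn = Span("invoke_agent support-bot", [
    Span("generate_content example-model"),
    Span("execute_tool lookup_order"),
    Span("transfer_to_agent billing-bot"),
    Span("generate_content example-model"),
])
print(count_operations(turn))  # {'generate_content': 2, 'execute_tool': 1}
```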
Resource attributes
| Attribute | Set by | Radicas uses it for |
|---|---|---|
| service.name | init(service=...) | agent identity in the fleet views |
| radicas.source | init(source=...) | origin marker (real agent vs synthetic traffic) |
| radicas.tenant_id | SDK, LOCAL lab only | lab-shim tenancy; ONLINE tenancy comes from the API key (authentication) |
Span attributes
invoke_agent
| Attribute | Radicas uses it for |
|---|---|
| gen_ai.agent.name | agent identity |
| gen_ai.conversation.id | session/conversation grouping |
| radicas.feature | feature cost allocation, stamped by the SDK (feature tagging) |
| user.id | per-user cost allocation, stamped by the SDK when radicas.user(...) is active |
| error.type | invocation failure classification |
generate_content (the cost-bearing span)
| Attribute | Radicas uses it for |
|---|---|
| gen_ai.request.model | the cost key, joined against the price book |
| gen_ai.system | provider (informational; cost keys on the model) |
| gen_ai.usage.input_tokens | input side of estimated cost |
| gen_ai.usage.output_tokens | output side of estimated cost |
| gen_ai.usage.reasoning.output_tokens | reasoning-token visibility (subset of output) |
| gen_ai.usage.cache_read.input_tokens | cache-hit pricing (see cache-token semantics below) |
| gen_ai.usage.cache_creation.input_tokens | cache-write pricing (see cache-token semantics below) |
| gen_ai.request.top_p, gen_ai.request.max_tokens | request profiling |
| gen_ai.response.finish_reasons | truncation/stop analysis |
| radicas.source_framework | which normalization adapter fired (stamped by the SDK/collector) |
execute_tool
| Attribute | Radicas uses it for |
|---|---|
| gen_ai.tool.name | tool usage profiling |
| gen_ai.tool.type | tool classification |
| error.type + span status Error | tool failure rates |
Cache-token semantics
Cached prompt tokens are priced differently from fresh input tokens, so getting their semantics right changes the cost math:
- gen_ai.usage.cache_read.input_tokens is a subset of gen_ai.usage.input_tokens (canonical semantics): input_tokens is the total input, of which cache_read.input_tokens were served from cache at the discounted rate. Radicas computes fresh = input_tokens - cache_read.input_tokens and prices the two slices separately.
- gen_ai.usage.cache_creation.input_tokens is additive: cache-write tokens are billed on top (at the cache-write rate) and are not part of input_tokens.
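
As a worked illustration of the subset and additive rules, the sketch below prices one generate_content call. The Prices class, the estimated_cost function, and every rate in it are assumptions made up for the example; they are not the Radicas price book or its implementation. Reasoning tokens are not added separately because they are already a subset of output tokens.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Prices:
    """Hypothetical rates in USD per million tokens (not real prices)."""
    input: Decimal
    cache_read: Decimal
    cache_write: Decimal
    output: Decimal


def estimated_cost(input_tokens: int, cache_read: int, cache_creation: int,
                   output_tokens: int, prices: Prices) -> Decimal:
    """Price one call under the canonical cache-token semantics."""
    if cache_read > input_tokens:
        # Under subset semantics the cached slice can never exceed the total.
        raise ValueError("cache_read exceeds input_tokens; counters are not canonical")
    fresh = input_tokens - cache_read          # uncached slice of the input
    total = (fresh * prices.input              # fresh input at the normal rate
             + cache_read * prices.cache_read  # cached input at the discounted rate
             + cache_creation * prices.cache_write  # billed on top of input_tokens
             + output_tokens * prices.output)
    return total / Decimal(1_000_000)


prices = Prices(input=Decimal("3.00"), cache_read=Decimal("0.30"),
                cache_write=Decimal("3.75"), output=Decimal("15.00"))
cost = estimated_cost(input_tokens=10_000, cache_read=8_000,
                      cache_creation=1_000, output_tokens=500, prices=prices)
print(cost)  # 0.01965 (USD, under the made-up rates above)
```

With these numbers, 2,000 fresh tokens and 8,000 cached tokens are priced as separate slices, and the 1,000 cache-write tokens are added on top rather than subtracted from input_tokens.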
Providers disagree here: notably, the Anthropic API reports exclusive counters
(input_tokens excluding cache reads). Normalizing provider-exclusive counters to the
canonical subset semantics is part of the adapter layer's job; the registry card for each
source records what it actually reports (verified against captured fixtures; sources without a
fixture are marked "expected, pending fixture" on their page in the
support matrix).
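
A normalization step of this kind might look like the following sketch. It assumes an Anthropic-style usage payload with the fields input_tokens, cache_read_input_tokens, cache_creation_input_tokens, and output_tokens, where input_tokens excludes cached tokens; the function name to_canonical_usage is hypothetical, and the actual Radicas adapter is not shown in this reference.

```python
def to_canonical_usage(exclusive: dict[str, int]) -> dict[str, int]:
    """Map provider-exclusive counters onto the canonical attributes.

    Canonical input_tokens is the total input including cache reads, so the
    cached slice is added back. Cache-creation tokens stay separate because
    they are additive in the canonical contract.
    """
    fresh = exclusive.get("input_tokens", 0)
    cache_read = exclusive.get("cache_read_input_tokens", 0)
    cache_creation = exclusive.get("cache_creation_input_tokens", 0)
    return {
        "gen_ai.usage.input_tokens": fresh + cache_read,
        "gen_ai.usage.cache_read.input_tokens": cache_read,
        "gen_ai.usage.cache_creation.input_tokens": cache_creation,
        "gen_ai.usage.output_tokens": exclusive.get("output_tokens", 0),
    }


# Made-up payload: 2,000 uncached tokens plus 8,000 served from cache.
canonical = to_canonical_usage({
    "input_tokens": 2_000,
    "cache_read_input_tokens": 8_000,
    "cache_creation_input_tokens": 1_000,
    "output_tokens": 500,
})
print(canonical["gen_ai.usage.input_tokens"])  # 10000
```

After normalization, fresh = input_tokens - cache_read.input_tokens recovers the provider's original uncached count, so the cost math is the same regardless of which convention the source used.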
Metrics and logs
- Metrics export with delta temporality for histograms (what the Radicas pipeline expects), every METRIC_EXPORT_INTERVAL_SECONDS (default 10s).
- Content events (opt-in, configuration): prompts/responses ride the logs signal as consolidated gen_ai.client.inference.operation.details events: gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions, gen_ai.tool.definitions. Kept out of the product cost views by design.
What Radicas computes from each piece
| Signal | Computation |
|---|---|
| generate_content tokens + model | estimated cost (price book), per call, in near-real-time |
| estimated cost ↔ provider invoice | reconciliation: authoritative FOCUS billing data joined span-to-line-item (T+24–72h) |
| radicas.feature, user.id | allocation: cost per feature, per user |
| span tree + statuses | agent fleet health, tool failure rates, latency profiles |
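
The allocation row can be pictured as a group-by over per-call estimated costs. The sketch below is an illustration only: it assumes each generate_content cost has already been joined with the radicas.feature and user.id attributes of its parent invoke_agent span, and the record layout, the cost_micro_usd field, the allocate function, and the "unattributed" bucket are all hypothetical.

```python
from collections import defaultdict


def allocate(calls: list[dict], key: str) -> dict[str, int]:
    """Sum estimated cost (integer micro-USD) per value of one attribute."""
    totals: dict[str, int] = defaultdict(int)
    for call in calls:
        # Calls without the attribute land in a catch-all bucket.
        totals[call.get(key, "unattributed")] += call["cost_micro_usd"]
    return dict(totals)


# Made-up per-call records; costs are integers to avoid float rounding.
calls = [
    {"radicas.feature": "search", "user.id": "u1", "cost_micro_usd": 1200},
    {"radicas.feature": "search", "user.id": "u2", "cost_micro_usd": 800},
    {"radicas.feature": "summarize", "user.id": "u1", "cost_micro_usd": 500},
    {"cost_micro_usd": 100},
]
print(allocate(calls, "radicas.feature"))
# {'search': 2000, 'summarize': 500, 'unattributed': 100}
print(allocate(calls, "user.id"))
# {'u1': 1700, 'u2': 800, 'unattributed': 100}
```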