Telemetry reference

The canonical contract the SDK emits — span tree, full attribute table including token and cache-token usage, content events, and what Radicas computes from each field.

Everything Radicas computes binds to one canonical contract: the OpenTelemetry gen_ai semantic conventions, selected with OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental (the SDK sets this for you). Frameworks that emit something else are normalized to this shape; see the three-layer story on the index page.

The canonical span tree

invoke_agent {agent}                 → agent invocation
  ├─ generate_content {model}        → LLM call     (tokens × price book → estimated cost)
  └─ execute_tool {tool}             → tool call    (status Error + error.type on failure)

Each agent turn produces one invoke_agent span, with a generate_content child for each model call and an execute_tool child for each tool execution. transfer_to_agent spans are control-plane routing and are excluded from tool counts.
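As a rough illustration of the counting rule above, the sketch below models one agent turn as a small tree and counts tool calls. The `Span` class, the agent, model, and tool names, and the placement of `transfer_to_agent` as a direct child of `invoke_agent` are all assumptions made for the example; this is not the SDK's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical stand-in for an exported span: operation name plus target."""
    name: str                       # e.g. "invoke_agent", "execute_tool"
    target: str = ""                # agent, model, or tool name
    children: list["Span"] = field(default_factory=list)


def count_tool_calls(root: Span) -> int:
    """Count execute_tool spans in the tree.

    transfer_to_agent spans never match, so routing is not counted as tool use.
    """
    own = 1 if root.name == "execute_tool" else 0
    return own + sum(count_tool_calls(child) for child in root.children)


# One agent turn: two model calls, two tool calls, one routing hop.
turn = Span("invoke_agent", "support-agent", [
    Span("generate_content", "example-model"),
    Span("execute_tool", "lookup_order"),
    Span("transfer_to_agent", "billing-agent"),  # control-plane routing
    Span("generate_content", "example-model"),
    Span("execute_tool", "issue_refund"),
])

tool_calls = count_tool_calls(turn)
llm_calls = sum(1 for child in turn.children if child.name == "generate_content")
print(f"tool calls: {tool_calls}, model calls: {llm_calls}")
```

Matching on the operation name keeps the routing span out of the count without needing a separate exclusion list.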

Resource attributes

Attribute	Set by	Radicas uses it for
`service.name`	`init(service=...)`	agent identity in the fleet views
`radicas.source`	`init(source=...)`	origin marker (real agent vs synthetic traffic)
`radicas.tenant_id`	SDK, LOCAL lab only	lab-shim tenancy — ONLINE tenancy comes from the API key (authentication)

Span attributes

`invoke_agent`

Attribute	Radicas uses it for
`gen_ai.agent.name`	agent identity
`gen_ai.conversation.id`	session/conversation grouping
`radicas.feature`	feature cost allocation — stamped by the SDK (feature tagging)
`user.id`	per-user cost allocation — stamped by the SDK when `radicas.user(...)` is active
`error.type`	invocation failure classification

`generate_content` (the cost-bearing span)

Attribute	Radicas uses it for
`gen_ai.request.model`	the cost key — joined against the price book
`gen_ai.system`	provider (informational; cost keys on the model)
`gen_ai.usage.input_tokens`	input side of estimated cost
`gen_ai.usage.output_tokens`	output side of estimated cost
`gen_ai.usage.reasoning.output_tokens`	reasoning-token visibility (subset of output)
`gen_ai.usage.cache_read.input_tokens`	cache-hit pricing — see cache-token semantics below
`gen_ai.usage.cache_creation.input_tokens`	cache-write pricing — see cache-token semantics below
`gen_ai.request.top_p`, `gen_ai.request.max_tokens`	request profiling
`gen_ai.response.finish_reasons`	truncation/stop analysis
`radicas.source_framework`	which normalization adapter fired (stamped by the SDK/collector)

`execute_tool`

Attribute	Radicas uses it for
`gen_ai.tool.name`	tool usage profiling
`gen_ai.tool.type`	tool classification
`error.type` + span status `Error`	tool failure rates
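To make the failure-rate use of these attributes concrete, here is a minimal sketch that groups `execute_tool` spans by `gen_ai.tool.name` and divides failed spans by total spans. The flattened dict shape (including the `status` key) and the sample tool names are assumptions for illustration; how Radicas aggregates internally is not described on this page.

```python
from collections import Counter

# Hypothetical flattened execute_tool spans; "status" stands in for the span status.
tool_spans = [
    {"gen_ai.tool.name": "lookup_order", "status": "Ok"},
    {"gen_ai.tool.name": "lookup_order", "status": "Error", "error.type": "timeout"},
    {"gen_ai.tool.name": "issue_refund", "status": "Ok"},
    {"gen_ai.tool.name": "lookup_order", "status": "Ok"},
]


def tool_failure_rates(spans):
    """Return {tool name: failed spans / total spans}."""
    total = Counter()
    failed = Counter()
    for span in spans:
        tool = span["gen_ai.tool.name"]
        total[tool] += 1
        if span.get("status") == "Error":
            failed[tool] += 1
    return {tool: failed[tool] / total[tool] for tool in total}


def error_types(spans):
    """Tally error.type values across failed spans."""
    return Counter(s["error.type"] for s in spans if s.get("status") == "Error")


rates = tool_failure_rates(tool_spans)
types = error_types(tool_spans)
print(rates, dict(types))
```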

Cache-token semantics

Cached prompt tokens are priced differently from fresh input tokens, so getting their semantics right changes the cost math:

gen_ai.usage.cache_read.input_tokens is a subset of gen_ai.usage.input_tokens (canonical semantics): input_tokens is the total input, of which cache_read were served from cache at the discounted rate. Radicas computes fresh = input_tokens - cache_read.input_tokens and prices the two slices separately.
gen_ai.usage.cache_creation.input_tokens is additive: cache-write tokens are billed on top (at the cache-write rate) and are not part of input_tokens.
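The two rules above can be sketched as a small cost function. The price book entry, its per-million-token rates, and the model name are invented for the example and do not reflect any real provider's pricing; only the arithmetic (subtract cache reads from input, add cache writes on top) comes from the semantics described here.

```python
from decimal import Decimal

# Hypothetical price book: USD per million tokens. Rates are illustrative only.
PRICE_BOOK = {
    "example-model": {
        "input": Decimal("3.00"),
        "cache_read": Decimal("0.30"),
        "cache_creation": Decimal("3.75"),
        "output": Decimal("15.00"),
    }
}
MILLION = Decimal(1_000_000)


def estimated_cost(attrs: dict) -> Decimal:
    """Price one generate_content span using canonical (subset) cache semantics."""
    price = PRICE_BOOK[attrs["gen_ai.request.model"]]
    input_tokens = attrs.get("gen_ai.usage.input_tokens", 0)
    cache_read = attrs.get("gen_ai.usage.cache_read.input_tokens", 0)
    cache_creation = attrs.get("gen_ai.usage.cache_creation.input_tokens", 0)
    output_tokens = attrs.get("gen_ai.usage.output_tokens", 0)

    # cache_read is a subset of input_tokens, so the fresh slice is the remainder.
    fresh = input_tokens - cache_read

    # cache_creation is additive: it is not part of input_tokens.
    # Reasoning tokens are a subset of output_tokens, so they are not added again.
    total = (
        fresh * price["input"]
        + cache_read * price["cache_read"]
        + cache_creation * price["cache_creation"]
        + output_tokens * price["output"]
    )
    return total / MILLION


span_attrs = {
    "gen_ai.request.model": "example-model",
    "gen_ai.usage.input_tokens": 10_000,
    "gen_ai.usage.cache_read.input_tokens": 8_000,
    "gen_ai.usage.cache_creation.input_tokens": 1_000,
    "gen_ai.usage.output_tokens": 500,
}
cost = estimated_cost(span_attrs)
print(cost)
```

With these made-up rates, treating all 10,000 input tokens as fresh would give a noticeably higher estimate, which is why the subset rule matters for the cost math.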

Providers disagree here. Notably, the Anthropic API reports exclusive counters: its input_tokens excludes cache reads. Normalizing provider-exclusive counters to the canonical subset semantics is part of the adapter layer's job. The registry card for each source records what it actually reports, verified against captured fixtures; sources without a fixture are marked "expected, pending fixture" on their page in the support matrix.
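A minimal sketch of that normalization step, assuming a provider usage payload with exclusive counters. The input field names follow the shape of an Anthropic-style usage block, but treat them, the function name, and the sample numbers as assumptions; the actual adapter code is not shown on this page.

```python
def normalize_exclusive_usage(usage: dict) -> dict:
    """Map exclusive provider counters onto the canonical gen_ai.usage.* attributes.

    Assumes usage["input_tokens"] excludes cache reads, so the canonical
    total input is the provider's input plus its cache reads.
    """
    cache_read = usage.get("cache_read_input_tokens", 0)
    cache_creation = usage.get("cache_creation_input_tokens", 0)
    return {
        # Canonical input_tokens is the total, with cache reads as a subset.
        "gen_ai.usage.input_tokens": usage["input_tokens"] + cache_read,
        "gen_ai.usage.cache_read.input_tokens": cache_read,
        # Cache writes stay additive and outside input_tokens.
        "gen_ai.usage.cache_creation.input_tokens": cache_creation,
        "gen_ai.usage.output_tokens": usage["output_tokens"],
    }


provider_usage = {
    "input_tokens": 2_000,                 # fresh tokens only (exclusive counter)
    "cache_read_input_tokens": 8_000,
    "cache_creation_input_tokens": 1_000,
    "output_tokens": 500,
}
canonical = normalize_exclusive_usage(provider_usage)
print(canonical)
```

After normalization, fresh = input_tokens - cache_read.input_tokens recovers the provider's original 2,000 fresh tokens, so the downstream cost math does not need to know which provider the span came from.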

Metrics and logs

Metrics are exported every METRIC_EXPORT_INTERVAL_SECONDS (default 10s), with delta temporality for histograms, which is what the Radicas pipeline expects.
Content events (opt-in, configuration): prompts and responses ride the logs signal as consolidated gen_ai.client.inference.operation.details events carrying gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions, and gen_ai.tool.definitions. They are kept out of the product cost views by design.

What Radicas computes from each piece

Signal	Computation
`generate_content` tokens + model	estimated cost (price book), per call, in near-real-time
estimated cost ↔ provider invoice	reconciliation: authoritative FOCUS billing data joined span-to-line-item (T+24–72h)
`radicas.feature`, `user.id`	allocation: cost per feature, per user
span tree + statuses	agent fleet health, tool failure rates, latency profiles
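The allocation row can be illustrated with a small roll-up. Because `radicas.feature` and `user.id` live on the `invoke_agent` span while cost comes from its `generate_content` children, the sketch sums child costs per invocation and groups by the parent's attribute. The record shape, the sample costs, and the "unattributed" bucket for missing attributes are assumptions for the example, not documented Radicas behavior.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical invocations: parent attributes plus estimated cost of each child call.
invocations = [
    {"radicas.feature": "search", "user.id": "u1",
     "call_costs": [Decimal("0.010"), Decimal("0.005")]},
    {"radicas.feature": "search", "user.id": "u2",
     "call_costs": [Decimal("0.020")]},
    # No user.id here: it is only stamped while radicas.user(...) is active.
    {"radicas.feature": "summarize",
     "call_costs": [Decimal("0.030")]},
]


def allocate(invocations, key):
    """Sum child-call cost per value of one invoke_agent attribute."""
    totals = defaultdict(Decimal)  # Decimal() is 0, so missing keys start at zero
    for inv in invocations:
        bucket = inv.get(key, "unattributed")
        totals[bucket] += sum(inv["call_costs"], Decimal(0))
    return dict(totals)


by_feature = allocate(invocations, "radicas.feature")
by_user = allocate(invocations, "user.id")
print(by_feature)
print(by_user)
```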
