Telemetry reference

The canonical contract the SDK emits — span tree, full attribute table including token and cache-token usage, content events, and what Radicas computes from each field.

Everything Radicas computes binds to one canonical contract: the OpenTelemetry gen_ai semantic conventions (OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental, which the SDK sets for you). Frameworks that emit something else are normalized to this shape — see the three-layer story on the index page.

The canonical span tree

invoke_agent {agent}                 → agent invocation
  ├─ generate_content {model}        → LLM call     (tokens × price book → estimated cost)
  └─ execute_tool {tool}             → tool call    (status Error + error.type on failure)

One invoke_agent per agent turn; generate_content children for each model call; execute_tool children per tool execution. transfer_to_agent spans are control-plane routing and are excluded from tool counts.
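To illustrate the counting rule, the sketch below tallies the spans of one agent turn by operation. The span names after the operation (support_bot, example-model, lookup_order, billing_bot) and the count_calls helper are assumptions made for this example and are not part of the SDK.

```python
from collections import Counter

# Hypothetical span names for a single agent turn. Only the operation
# names (the first word of each span name) come from the contract.
turn_spans = [
    "invoke_agent support_bot",
    "generate_content example-model",
    "execute_tool lookup_order",
    "transfer_to_agent billing_bot",
    "generate_content example-model",
]


def count_calls(span_names):
    """Count model and tool calls; transfer_to_agent is routing, not a tool."""
    ops = Counter(name.split(" ", 1)[0] for name in span_names)
    return {
        "llm_calls": ops["generate_content"],
        "tool_calls": ops["execute_tool"],
    }


print(count_calls(turn_spans))  # {'llm_calls': 2, 'tool_calls': 1}
```

The transfer_to_agent span is present in the input but contributes to neither count.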

Resource attributes

Attribute          Set by               Radicas uses it for
service.name       init(service=...)    agent identity in the fleet views
radicas.source     init(source=...)     origin marker (real agent vs synthetic traffic)
radicas.tenant_id  SDK, LOCAL lab only  lab-shim tenancy; ONLINE tenancy comes from the API key (authentication)

Span attributes

invoke_agent

Attribute               Radicas uses it for
gen_ai.agent.name       agent identity
gen_ai.conversation.id  session/conversation grouping
radicas.feature         feature cost allocation; stamped by the SDK (feature tagging)
user.id                 per-user cost allocation; stamped by the SDK when radicas.user(...) is active
error.type              invocation failure classification

generate_content (the cost-bearing span)

Attribute                                         Radicas uses it for
gen_ai.request.model                              the cost key; joined against the price book
gen_ai.system                                     provider (informational; cost keys on the model)
gen_ai.usage.input_tokens                         input side of estimated cost
gen_ai.usage.output_tokens                        output side of estimated cost
gen_ai.usage.reasoning.output_tokens              reasoning-token visibility (subset of output)
gen_ai.usage.cache_read.input_tokens              cache-hit pricing; see cache-token semantics below
gen_ai.usage.cache_creation.input_tokens          cache-write pricing; see cache-token semantics below
gen_ai.request.top_p, gen_ai.request.max_tokens   request profiling
gen_ai.response.finish_reasons                    truncation/stop analysis
radicas.source_framework                          which normalization adapter fired (stamped by the SDK/collector)

execute_tool

Attribute                      Radicas uses it for
gen_ai.tool.name               tool usage profiling
gen_ai.tool.type               tool classification
error.type + span status Error tool failure rates
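As a rough illustration of how a failure rate could be derived from these attributes, the sketch below groups execute_tool spans by gen_ai.tool.name and divides the spans with status Error by the total. The dictionary layout, the "status" key, the tool names, and the tool_failure_rates helper are assumptions for this example; this is not a description of how Radicas computes the metric internally.

```python
from collections import defaultdict

# Hypothetical execute_tool spans, flattened to dictionaries.
tool_spans = [
    {"gen_ai.tool.name": "lookup_order", "status": "Ok"},
    {"gen_ai.tool.name": "lookup_order", "status": "Error", "error.type": "timeout"},
    {"gen_ai.tool.name": "send_email", "status": "Ok"},
]


def tool_failure_rates(spans):
    """Return the fraction of failed calls per tool name."""
    calls = defaultdict(int)
    failures = defaultdict(int)
    for span in spans:
        tool = span["gen_ai.tool.name"]
        calls[tool] += 1
        if span["status"] == "Error":
            failures[tool] += 1
    return {tool: failures[tool] / calls[tool] for tool in calls}


print(tool_failure_rates(tool_spans))  # {'lookup_order': 0.5, 'send_email': 0.0}
```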

Cache-token semantics

Cached prompt tokens are priced differently from fresh input tokens, so getting their semantics right changes the cost math:

  • gen_ai.usage.cache_read.input_tokens is a subset of gen_ai.usage.input_tokens (canonical semantics): input_tokens is the total input, of which cache_read were served from cache at the discounted rate. Radicas computes fresh = input_tokens - cache_read.input_tokens and prices the two slices separately.
  • gen_ai.usage.cache_creation.input_tokens is additive: cache-write tokens are billed on top (at the cache-write rate) and are not part of input_tokens.
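The two rules above can be worked through with a small sketch. The Rates class, its per-million-token prices, and the estimate_cost helper are hypothetical and stand in for a price-book entry; only the attribute names and the fresh = input_tokens - cache_read.input_tokens split come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical price-book entry, in USD per million tokens."""
    input: float
    cache_read: float
    cache_write: float
    output: float


def estimate_cost(attrs: dict, rates: Rates) -> float:
    """Estimate the USD cost of one generate_content span (canonical semantics)."""
    total_input = attrs.get("gen_ai.usage.input_tokens", 0)
    cache_read = attrs.get("gen_ai.usage.cache_read.input_tokens", 0)
    cache_write = attrs.get("gen_ai.usage.cache_creation.input_tokens", 0)
    output = attrs.get("gen_ai.usage.output_tokens", 0)

    # cache_read is a subset of input_tokens, so subtract it to get fresh input.
    fresh = total_input - cache_read
    # cache_write is additive: billed on top, never subtracted from input.
    # Reasoning tokens are a subset of output, so they are not added again.
    per_million = (
        fresh * rates.input
        + cache_read * rates.cache_read
        + cache_write * rates.cache_write
        + output * rates.output
    )
    return per_million / 1_000_000


span_attrs = {
    "gen_ai.usage.input_tokens": 1000,
    "gen_ai.usage.cache_read.input_tokens": 600,
    "gen_ai.usage.cache_creation.input_tokens": 200,
    "gen_ai.usage.output_tokens": 100,
}
example_rates = Rates(input=2.0, cache_read=0.5, cache_write=2.5, output=8.0)
cost = estimate_cost(span_attrs, example_rates)
```

With these made-up rates, the 1000 input tokens split into 400 fresh and 600 cached, and the 200 cache-write tokens are charged in addition, for an estimate of about 0.0024 USD.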

Providers disagree here. Notably, the Anthropic API reports exclusive counters (its input_tokens excludes cache reads). Normalizing provider-exclusive counters to the canonical subset semantics is part of the adapter layer's job. The registry card for each source records what that source actually reports, verified against captured fixtures; a source without a fixture is marked "expected, pending fixture" on its page in the support matrix.
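A minimal sketch of that normalization, assuming a provider whose input counter excludes cache reads: the canonical total is recovered by adding the cache-read count back in, while the cache-write count passes through unchanged because it is additive in both schemes. The to_canonical function is hypothetical and is not the adapter shipped with the SDK.

```python
def to_canonical(exclusive_input: int, cache_read: int, cache_creation: int) -> dict:
    """Convert provider-exclusive counters to canonical subset semantics."""
    return {
        # Canonical input_tokens is the total input, including cache reads.
        "gen_ai.usage.input_tokens": exclusive_input + cache_read,
        "gen_ai.usage.cache_read.input_tokens": cache_read,
        # Cache-write tokens stay outside input_tokens in the canonical form.
        "gen_ai.usage.cache_creation.input_tokens": cache_creation,
    }


canonical = to_canonical(exclusive_input=400, cache_read=600, cache_creation=200)
print(canonical["gen_ai.usage.input_tokens"])  # 1000
```

After normalization, input_tokens minus cache_read.input_tokens gives back the provider's original exclusive count (400 here), which is the fresh slice.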

Metrics and logs

  • Metrics export with delta temporality for histograms (what the Radicas pipeline expects), every METRIC_EXPORT_INTERVAL_SECONDS (default 10s).
  • Content events (opt-in, configuration): prompts/responses ride the logs signal as consolidated gen_ai.client.inference.operation.details events — gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions, gen_ai.tool.definitions. Kept out of the product cost views by design.

What Radicas computes from each piece

Signal                               Computation
generate_content tokens + model      estimated cost (price book), per call, in near-real-time
estimated cost ↔ provider invoice    reconciliation: authoritative FOCUS billing data joined span-to-line-item (T+24–72h)
radicas.feature, user.id             allocation: cost per feature, per user
span tree + statuses                 agent fleet health, tool failure rates, latency profiles
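The allocation row can be pictured as a group-by over agent turns. In the sketch below, each record stands for one invoke_agent span together with the summed estimated cost of its generate_content children. The record layout, the cost_micro_usd field, the "unattributed" bucket, and the allocate helper are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical per-turn records: invoke_agent attributes plus the summed
# estimated cost of the turn's generate_content children, in micro-USD.
turns = [
    {"radicas.feature": "search", "user.id": "u1", "cost_micro_usd": 2400},
    {"radicas.feature": "search", "user.id": "u2", "cost_micro_usd": 1000},
    {"radicas.feature": "summarize", "user.id": "u1", "cost_micro_usd": 600},
]


def allocate(turns, key):
    """Sum estimated cost per value of the given attribute key."""
    totals = defaultdict(int)
    for turn in turns:
        # Turns missing the attribute fall into an explicit catch-all bucket.
        totals[turn.get(key, "unattributed")] += turn["cost_micro_usd"]
    return dict(totals)


print(allocate(turns, "radicas.feature"))  # {'search': 3400, 'summarize': 600}
print(allocate(turns, "user.id"))          # {'u1': 3000, 'u2': 1000}
```

Integer micro-USD is used here only to keep the sums exact in the example.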
