Builder Demo Artifact

workledger turns a run log into a work log.

This static artifact is based on the bundled agent-workflow demo. It shows the core transformation: 14 raw spans become 3 work units covering maintenance, product development, and internal automation, and one of those units is queued for review rather than forced into a confident label.

14 source spans

3 work units

1 pending review item

1.56x average compression

Before

What the trace layer gives you

LLM calls, retrievers, tools, guardrails, and human review spans
Token counts, latency, and direct cost by span
No compact answer to “what work happened?”

After

What workledger gives you

Reviewable work units with titles and project context
Cost by work category, evidence, and review state
A review queue for the one item that still needs judgment

Total Direct Cost

$0.1550

Total Blended Cost

$0.1783

Average Confidence

0.842

Compression Proof

5 spans -> 1 unit

Work Units

Three outputs that a builder can reason about

Title	Category	Blended Cost
Implement orchestration dashboard for customers	external_product_development	$0.0828
Patch customer API timeout regression	maintenance_bugfix	$0.0483
Automate release checklist workflow	internal_use_software	$0.0471
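The cost rollup behind this table can be checked with a few lines of arithmetic. The sketch below is not workledger's API: `WORK_UNITS`, `cost_by_category`, and `total_cost` are hypothetical names, and the rows are simply copied from the table above.

```python
from collections import defaultdict
from decimal import Decimal

# (title, category, cost) rows copied from the table above.
# Decimal avoids float rounding noise when summing small dollar amounts.
WORK_UNITS = [
    ("Implement orchestration dashboard for customers",
     "external_product_development", Decimal("0.0828")),
    ("Patch customer API timeout regression",
     "maintenance_bugfix", Decimal("0.0483")),
    ("Automate release checklist workflow",
     "internal_use_software", Decimal("0.0471")),
]


def cost_by_category(units):
    """Sum cost per work category (hypothetical helper)."""
    totals = defaultdict(Decimal)
    for _title, category, cost in units:
        totals[category] += cost
    return dict(totals)


def total_cost(units):
    """Sum cost across all work units (hypothetical helper)."""
    return sum((cost for _title, _category, cost in units), Decimal("0"))


print(cost_by_category(WORK_UNITS))
print(total_cost(WORK_UNITS))  # 0.1782
```

The three displayed figures sum to $0.1782, which matches the Total Blended Cost of $0.1783 to within display rounding. That suggests the table lists blended rather than direct cost.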

Review Queue

The ambiguous item stays reviewable

Automate release checklist workflow
internal_use_software -> pending review

Confidence: 0.768
Source spans: 3
Compression: 1.50x

Pending Review
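One way to picture this queue is a confidence gate: a unit whose confidence falls below some cutoff keeps its proposed category but is marked pending review. The sketch below is an assumption about how such routing could work, not workledger's actual logic. The `WorkUnit` class, the `route` function, the status strings, and the 0.80 threshold are all hypothetical; only the title, category, confidence, and span count of the queued item come from this page.

```python
from dataclasses import dataclass

# Hypothetical cutoff; this page does not state how workledger decides.
REVIEW_THRESHOLD = 0.80


@dataclass
class WorkUnit:
    title: str
    category: str
    confidence: float
    source_spans: int
    status: str = "unreviewed"


def route(unit: WorkUnit, threshold: float = REVIEW_THRESHOLD) -> WorkUnit:
    """Keep the proposed category, but flag low-confidence units for review."""
    unit.status = "pending_review" if unit.confidence < threshold else "accepted"
    return unit


queued = route(WorkUnit(
    title="Automate release checklist workflow",
    category="internal_use_software",
    confidence=0.768,
    source_spans=3,
))
print(queued.category, "->", queued.status)
```

The point of the design is that a low-confidence unit is neither dropped nor silently accepted: it keeps its proposed category and waits for a human decision.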

Maintenance Contrast

Patch customer API timeout regression

This is the kind of work observability tools can describe technically but not semantically. workledger rolls six spans into one maintenance work unit with clear cost, evidence, and trace lineage.

maintenance work

Development Contrast

Implement orchestration dashboard for customers

Five spans become one external product development unit with a higher blended cost than the maintenance unit. The question shifts from inspecting traces to reviewing the unit and its confidence.

external product development

Compression Story

What builders should remember

The best proof point in this demo is not a dashboard. It is the reduction from 14 spans to 3 reviewable units of work. That smaller unit is what makes cost attribution, evidence review, and downstream business analysis feel obvious instead of aspirational.
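The reduction described above amounts to a group-by over spans. The sketch below is illustrative only: the `work_key` field, the `group_spans` helper, and the placeholder span records are hypothetical, and how workledger actually decides which spans belong together is not described here. Only the per-unit span counts (6, 5, and 3) are taken from this page.

```python
from collections import Counter

# Span counts per work unit as stated on this page.
SPANS_PER_UNIT = {
    "Patch customer API timeout regression": 6,
    "Implement orchestration dashboard for customers": 5,
    "Automate release checklist workflow": 3,
}

# Placeholder spans: each carries a hypothetical `work_key` naming its unit.
spans = [
    {"span_id": f"span-{i}", "work_key": title}
    for i, title in enumerate(
        t for t, n in SPANS_PER_UNIT.items() for _ in range(n)
    )
]


def group_spans(spans):
    """Count how many spans fall under each work key."""
    return Counter(span["work_key"] for span in spans)


units = group_spans(spans)
print(len(spans), "spans ->", len(units), "work units")  # 14 spans -> 3 work units
```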