Builder Demo Artifact

workledger turns a run log into a work log.

This static artifact is based on the bundled agent-workflow demo. It shows the point at which the project becomes intuitive: 14 raw spans turn into 3 work units covering maintenance, product development, and internal automation, and one of those units is queued for review instead of being forced into certainty.

14 source spans
3 work units
1 pending review item
1.56x average compression

Before

What the trace layer gives you

  • LLM calls, retrievers, tools, guardrails, and human review spans
  • Token counts, latency, and direct cost by span
  • No compact answer to “what work happened?”

After

What workledger gives you

  • Reviewable work units with titles and project context
  • Cost by work category, evidence, and review state
  • A review queue for the one item that still needs judgment

Total Direct Cost: $0.1550
Total Blended Cost: $0.1783
Average Confidence: 0.842
Compression Proof: 5 spans -> 1 unit

Work Units

Three outputs that a builder can reason about

Title                                            Category                      Cost
Implement orchestration dashboard for customers  external_product_development  $0.0828
Patch customer API timeout regression            maintenance_bugfix            $0.0483
Automate release checklist workflow              internal_use_software         $0.0471
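
The table above can be sanity-checked in a few lines of Python. This is a minimal sketch, not workledger's API: the WorkUnit record and its field names are hypothetical, the Cost column is assumed to be blended cost, and the span counts (5, 6, and 3) are taken from the unit descriptions elsewhere in this artifact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkUnit:
    # Hypothetical record shape; these field names are assumptions,
    # not workledger's actual schema.
    title: str
    category: str
    cost_usd: float      # value from the Cost column (assumed blended cost)
    source_spans: int    # span count quoted for this unit in the artifact


UNITS = [
    WorkUnit("Implement orchestration dashboard for customers",
             "external_product_development", 0.0828, 5),
    WorkUnit("Patch customer API timeout regression",
             "maintenance_bugfix", 0.0483, 6),
    WorkUnit("Automate release checklist workflow",
             "internal_use_software", 0.0471, 3),
]


def cost_by_category(units):
    """Sum the cost of all work units in each category."""
    totals = defaultdict(float)
    for unit in units:
        totals[unit.category] += unit.cost_usd
    return dict(totals)


def total_cost(units):
    """Sum the cost across all work units."""
    return sum(unit.cost_usd for unit in units)


def total_spans(units):
    """Count the source spans covered by all work units."""
    return sum(unit.source_spans for unit in units)


if __name__ == "__main__":
    for category, cost in cost_by_category(UNITS).items():
        print(f"{category}: ${cost:.4f}")
    print(f"total: ${total_cost(UNITS):.4f} across {total_spans(UNITS)} spans")
```

The three rows sum to $0.1782 across 14 spans. That is $0.0001 below the reported Total Blended Cost of $0.1783, a gap that would be consistent with each row being rounded to four decimals for display.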

Review Queue

The ambiguous item stays reviewable

Automate release checklist workflow
internal_use_software -> pending review

Confidence: 0.768
Source spans: 3
Compression: 1.50x

Pending Review
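
One way to picture how a unit ends up in the queue is a confidence cutoff. The sketch below is an assumption for illustration only: this artifact does not say how workledger decides what to queue, and both the 0.80 threshold and the "accepted" label are hypothetical.

```python
REVIEW_THRESHOLD = 0.80  # hypothetical cutoff, not a documented workledger setting


def review_state(confidence: float, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return "pending review" for low-confidence units, else "accepted".

    Both the rule and the "accepted" label are illustrative assumptions.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    return "pending review" if confidence < threshold else "accepted"


if __name__ == "__main__":
    # 0.768 is the confidence shown for "Automate release checklist workflow".
    print(review_state(0.768))  # prints "pending review"
```

Under that assumed cutoff, the 0.768 unit is routed to a reviewer rather than silently committed to the internal_use_software category.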

Maintenance Contrast

Patch customer API timeout regression

This is the kind of work observability tools can describe technically but not semantically. workledger rolls six spans into one maintenance work unit with clear cost, evidence, and trace lineage.

maintenance work
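
The rollup described above can be sketched as a group-and-sum over spans. This is not workledger's implementation: the Span and WorkUnit shapes, the work_key grouping field, and the span IDs and costs in the usage example are all hypothetical placeholders, not values from the demo.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical span shape; real trace spans carry many more attributes.
    span_id: str
    kind: str        # e.g. "llm", "tool", "retriever"
    cost_usd: float
    work_key: str    # assumed grouping key that ties a span to a piece of work


@dataclass
class WorkUnit:
    work_key: str
    cost_usd: float = 0.0
    span_ids: list = field(default_factory=list)  # trace lineage


def roll_up(spans):
    """Group spans by work_key, summing cost and recording lineage."""
    units = {}
    for span in spans:
        unit = units.setdefault(span.work_key, WorkUnit(span.work_key))
        unit.cost_usd += span.cost_usd
        unit.span_ids.append(span.span_id)
    return list(units.values())


if __name__ == "__main__":
    # Placeholder spans: the IDs and the 0.01 costs are made up for illustration.
    spans = [
        Span(f"span-{i}", "llm", 0.01, "patch-api-timeout") for i in range(6)
    ]
    for unit in roll_up(spans):
        print(unit.work_key, len(unit.span_ids), f"${unit.cost_usd:.4f}")
```

Keeping the span IDs on the unit is what preserves trace lineage: a reviewer can start from the single maintenance unit and still walk back to each of the six source spans.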

Development Contrast

Implement orchestration dashboard for customers

Five spans become one external product development unit with the highest blended cost of the three ($0.0828). The useful questions become whether the unit has been reviewed and how confident the classification is, not just what the trace contains.

external product development

Compression Story

What builders should remember

The best proof point in this demo is not a dashboard. It is the reduction from 14 spans to 3 reviewable units of work. That smaller set of units is what makes cost attribution, evidence review, and downstream business analysis feel obvious instead of aspirational.