Tag Archives: llm

OTel Traces for the Win

This post was inspired by my Making Agents Work talk at Boston Code Camp #40, inspired by my OTel demo snafu at the live event.

Image stolen from Bala Subra – https://x.com/bsubra/status/2037887079804248504?s=20

Quoting from https://opentelemetry.io/ – “OpenTelemetry is an open source observability framework for cloud native software. It provides a single set of APIs, libraries, agents, and collector services to capture distributed traces and metrics from your application.”

Here we focus on Open Telemetry – or OTel for short – Traces.

The Anemic OTel Trace Antipattern

Due to an error in my demo prep, I ended up showing sparse OTel Traces – definitely not producing meaningful telemetry so observability will be subpar (or terrible). I am calling this the Anemic OTel Trace Antipattern. This antipattern comes through in the four screenshots that follow. The first screenshot shows the overall traces view listing one trace-per row. This is actually fine and normal as these are reasonable top-level traces. But drilling into any of these individual traces revealed no nesting and no context.

**Top-level OTel Traces** – shown using Aspire on local machine (note the **localhost** URL)

Click on “functions: POST api/jobs” trace – this is the detail after clicking on the one trace row

Click on “functions: GET api/jobs” trace – this is the detail after clicking on the one trace row

The Flat Trace OTel Trace Antipattern

Consider the traces below. If GetJob is triggered by an HTTP GET to the jobs endpoint, then my suggestion is they should be nested – GetJob under the corresponding HTTP GET /jobs/guid. As shown below they are flat, appearing as siblings rather than hierarchical. This is another OTel Trace Antipattern – let’s call it the Flat Trace OTel Trace Antipattern. We have this great “Trace”/”Span” nesting support, but still our signals look like old-school flat log entries. Definitely not optimal!

Properly Nested Traces and Spans

Let’s tighten up terminology. An OTel Trace represents the complete journey of a request through a system, and it is made up of one or more Spans that form a (logically nested) tree where each span is a unit of work. Within a trace it can make sense that some spans are siblings and others nested – it should mimic the actual flow through the system. The tree is reconstructed by following parent_span_id references. A trace can span multiple services (distributed tracing for the win!). Each service creates its own spans, propagating the Trace ID and parent Span ID via context propagation headers (e.g., traceparent in W3C Trace Context). Each Span in a Trace will share the same trace_id but have its own unique span_id.

So, using our vocabulary from above, the remedy for Anemic is to add more spans, and the remedy for Flat is to reuse span parents – passing them down to child processes rather than creating new spans.

With proper span structure, here is the SAME application again, except with OTel Traces and Spans more thoughtfully configured.

Now I can click on any of these and there will be spans nested within. You can tell the number and types of spans from the Spans column. The following span is from when the job was submitted: starts with an HTTP POST, stores some stuff in an Azure Storage Blob, creates a message in an Azure Storage Queue, then returns an HTTP 202 STATUS (“Accepted”) with a JobId.

Note above that movie we are requesting to assess is “best picture winner from 1988” – which is not a movie name you’ll find on IMDB. But a human will at least know what you mean. As will an LLM.

Now let’s double-click on the “RunJob” trace for the same movie – this is also around 20 seconds after the job was created since processing is asynchronous – and starts when our movie makes it to the front of the Azure Storage Queue queue:

Since we have an AI Agent, the movie request we made earlier (via HTTP POST) was asking to assess “best picture winner from 1988” and the name of the movie actually assessed was “Rain Man” as you can see. AI is working for us. For visibility in our monitoring and debugging, we added those details as properties in the OTel span. The helps us know exactly which business operation we are looking at when we review the telemetry.

Here’s one more span from the RunJob trace, this one showing some OTel Semantic Conventions for GenAI – the gen_ai.request_model and other span properties – but see also the next section for more on this:

OTel GenAI Semantic Conventions in action ☝

Finally, here’s the trace for that same movie request being retrieved after processing has completed:

OTel GenAI Semantic Conventions

OTel drives consistency across solutions and vendors by specifying semantic conventions. Specifically for GenAI they specify many identifiers (see example above – two screenshots ago). In this screenshot you can see a bunch of them in action. For more information, check out these resources:

Grabbed from Traces view in Aspire (click on a row, this appears in the right-most pane)

Source Code

Repo is here: https://github.com/CrankingAI/goat-agent

Presentation

PowerPoint Deck is here:

MakingAgentsWork-BostonCodeCamp-28-Mar-2026 Download

Connect

Connect with Bill

Connect with Boston Azure AI

Connect with Agent Framework Dev Project

Talk: Making Agents Work – Boston Code Camp #40

1 Reply

I had the opportunity (28-Mar-2026) to present at the 40th running of Boston Code Camp. Thank you to the incredible pros running these events, twice yearly, making it happen for a grateful greater-Boston tech community.

Thank You to the Speakers, Sponsors, and Organizers

Thank you to all the speakers:

Anirban Tarafder · Bala Subra · Bill Wilder · Bob German · Bryan Hogan · Chris Seferlis · Cole Flenniken · Dave Davis · Dave Finn · Dekel Cohen Sharon · Fnu Tarana · Gleb Bahmutov · Harry Kimpel · Jason Haley · Jeff Blanchard · Jesse Liberty · Jim Wilcox · John Miner · Joseph Parzel · Josh Goldberg · Juan Pablo Garcia Gonzalez · Keith Fitts · Matt Ferguson · Matthew Norberg · Michael Mintz · Pavan Kumar Kasani · Richard Crane · Sunil Kadimdiwan · Taiob Ali · Ty Augustine · Udaiappa Ramachandran · Varsham Papikian · Vijaya Vishwanath · Viswa Mohanty

And thank you sponsors:

Hosting: Microsoft · Gold: MILL5 · Silver: Pulsar Security · Progress Telerik · Triverus · Brightstar · In-kind: Sessionize

Making Agents Work

My session was Making Agents Work which highlighted some of “the boring side” of building an AI Agent – but these boring details can be super-valuable. The talk was inspired by work I did in my day job as CTO at Open Admissions. I am using an AI Agent to scale a 30 year-old methodology that can be used to help people understand themselves better and use those insights to choose a more aligned college, major, job, or other consequential life decision. Doing this with an AI Agent is a huge responsibility and, as I shared, putting together the initial agent was the easy part – being confident it is consistent, accurate, well behaved, robust if attacked or misused – but still easy to use – that was the hard and boring part!

The talk uses a different AI Agent – a simple one that accepts a movie and returns a rating summary – to illuminate some of the points. For example, it uses Agent Framework and has a fan-out/fan-in workflow in the internal agent architecture, uses Microsoft Foundry, a modern tech stack, and Azure Monitor for OTel-aligned Observability.

The full description, link to github repo, and slides follow.

But first, please find some elaboration on OTel Traces, inspired by my OTel demo snafu at the live event. That blog post is here: https://blog.codingoutloud.com/2026/03/30/otel-traces-for-the-win/

OTel Traces for the Win

Speaking of OTel… Due 100% to user error (that would be me!), the demo I had prepared to show the incredible power of OTel had a technical glitch. So I have attempted to remedy that with a blog post I’m calling OTel Traces for the Win. So please hop over there if you are interested.

Making Agents Work – the official talk description

Building more powerful AI Agents seems to be getting easier by the day. They are powered by incredible models, have access to tools, and can work in teams. But how can we have confidence in non-deterministic systems that make consequential decisions?

This talk explores four approaches for building that confidence.

1. Observability platforms – You can’t improve what you can’t see. We’ll explore tools that make the hard-to-see stuff visible.

2. Evals (evaluations) – Moving beyond LGTM (looks good to me), evals wrap agents in formal testing structures to measure accuracy, consistency, and edge case handling – both before and after your Agent goes live.

3. Safety guardrails – Content filtering, PII detection, and hallucination detection from both platform vendors and standalone models. Let’s see how they fit into your agent stack.

4. Selective determinism – Sometimes we make better AI solutions by knowing when NOT to use AI. We will discuss mixing in deterministic logic with our non-deterministic behaviors.

Concepts are platform-agnostic, but demos will use Microsoft Foundry and the Agent Framework (currently in preview). (In case you haven’t been following along, Microsoft Foundry was previously know as known as Azure AI Foundry, and before that was Azure AI Studio. And Agent Framework is the next generation of both Semantic Kernel and AutoGen.)

Target audience: Those new to building production agent systems seeking approaches beyond the “hello world” tutorials – which described me not too long ago.

Source Code

The code used to implement the Movie Rating Agent is in the GitHub repo here: https://github.com/CrankingAI/movie-trivia-agent

Presentation

The slides I presented are here:

MakingAgentsWork-BostonCodeCamp-28-Mar-2026 Download