On Friday morning at 8 AM in Odense, another speaker’s travel had fallen through, and their 90-minute Saturday slot at the conference was suddenly empty. I pitched filling it. The organizers said yes, and were apparently thrilled to have me do it.
So I had 27 hours to architect, build, and deliver a brand-new 90-minute technical talk. For a developer audience. On one of the most contentious topics in our industry right now: how to actually use AI coding agents without fooling yourself.
I walked off stage Saturday feeling good about it, but honestly unsure where it had actually landed. The attendee ratings came in a few hours later: 5.0 out of 5 on Speaker. I’m not mentioning that to brag, and I almost cut it from this post entirely. The reason it stays is that this whole post is about measurement improving craft. If I’m going to argue that speakers should share their numbers, I should probably share mine.
Here’s what’s been sitting with me since: I didn’t walk off stage thinking that was perfect. I walked off thinking that felt good, but I have no idea what actually worked and what didn’t. Imposter syndrome is a mean little gremlin that won’t shut up, and it got me wondering how many other speakers are walking off stages all over the world with the same gobbo chewing on them, never quite sure what they did right.
A detour through Brandon Sanderson
If you don’t follow fantasy literature, Brandon Sanderson is one of the bestselling living authors on earth. In January 2026, he gave a keynote called We Are the Art. The short version: he argues that AI can generate outputs that look like art, but the AI doesn’t grow from making them. Only humans grow from making things.
The art, Sanderson says, isn’t the output. It’s what happens to the person in the process of making it. The photographer becomes a better photographer by photographing. The writer becomes a better writer by writing. The AI, no matter how good its outputs get, ends each session exactly as it started.
That idea hit me hard. In my own content I teach that you throw out the intern when you’re done, but I also remind people that the skills YOU develop are the real product, the thing worth fighting for. I encourage using AI tools to make you better, not just faster.
That’s the angle I’ve been working on, and it’s what this post is actually about.
The problem most speakers quietly live with
I’ve been speaking at conferences for a few years. Most of us have very little real data on how our talks actually land.
We get applause, and we get attendee ratings, and a handful of hallway comments, weighted toward the folks who liked it enough to approach us. And a gut feeling that’s almost entirely governed by whether the demo worked.
But I found myself really wondering: what can I do better? I tell people I want to be better than the me of last week, and the old saw applies: you can’t improve what you can’t measure. Which leaves me questioning myself. How fast did I actually speak in Act 1? Did the callback to the opening metaphor land in Act 3? How many times did I say actually when I didn’t need to? Did the planned punchline make it out of my mouth, or did I swap it for something worse on the fly? Ninety minutes of signal, and most speakers walk offstage with almost none of it captured.
So, with all of that rattling around after my Friday talk, I made sure to record Saturday’s. Call it preparation.
What I built, and what it told me
Here’s what came out of an afternoon of running the right tools against that recording:
Speaking pace over time, charted in two-minute windows. I ran at 167 words per minute overall, which is energetic but controlled. Act 1, though, was cooking at 200+ WPM for six straight minutes. That’s fast enough to lose non-native listeners right when I’m introducing the central metaphor of the talk. Good to know.
Filler word frequencies with timestamps. My um and uh rate turned out to be about one per ten minutes, which is trained-speaker territory. I felt good about that for about thirty seconds, until I saw that I’d said like 60 times and actually 42 times in the same 87 minutes. Both audible. Both fixable. And I had no idea.
A pause inventory, separating dramatic pauses (over 3 seconds) from significant ones (1.5 to 3). Nine dramatic pauses across the talk. Two of them were right after punchlines, and they were 3.2 and 4.2 seconds long. The laughs landed and I didn’t step on them. I didn’t know I was doing that on purpose, but apparently I am.
A landmark tracker against the intended outline. A specific planned line, “I just fired that intern, but I stuffed him into storage, desk and all,” never showed up in the transcript. I either skipped it, mumbled it past Whisper, or the metaphor evolved live. Worth a spot-listen.
Signature phrase frequencies. The thesis word of the entire talk was discipline. I used it twice in 87 minutes. The audience felt the concept — the ratings say so — but they didn’t hear the word as a refrain. One of the most fixable coaching notes I’ve ever received on myself.
Improvisation capture. Several lines I delivered live were genuinely better than anything I’d scripted. “Maybe when the robots uprise, I’ll be spared as pet” got a laugh and telegraphed a whole attitude about how I relate to my tools. That’s in my pocket for every future delivery now.
None of that existed days ago. All of it came from a recording and a few scripts running locally on my laptop.
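If you want to build the same lens, the mechanics pass is smaller than you’d think. Here’s a minimal sketch in Python, assuming you’ve already run Whisper with word-level timestamps and exported the words to JSON. The file name, the starter filler list, and the window size are my illustrative placeholders rather than the kit’s exact config, though the pause thresholds match the ones above.

```python
# Minimal delivery-mechanics pass. Assumes a word-level JSON export like
# [{"word": "so", "start": 12.34, "end": 12.61}, ...] from a Whisper run.
# File name, filler list, and window size are illustrative placeholders.
import json
from collections import Counter

WINDOW = 120                                  # two-minute windows for the pace chart
FILLERS = {"um", "uh", "like", "actually"}    # hypothetical starter list
DRAMATIC, SIGNIFICANT = 3.0, 1.5              # pause thresholds in seconds

with open("talk_words.json") as f:            # hypothetical export path
    words = json.load(f)

# Speaking pace per two-minute window, reported as words per minute.
pace = Counter()
for w in words:
    pace[int(w["start"] // WINDOW)] += 1
for idx in sorted(pace):
    wpm = pace[idx] / (WINDOW / 60)
    print(f"{idx * 2:3d}-{idx * 2 + 2:d} min: {wpm:5.1f} WPM")

# Filler words, with timestamps so you can spot-listen later.
for w in words:
    token = w["word"].strip(" ,.?!").lower()
    if token in FILLERS:
        print(f'{w["start"]:7.1f}s  {token}')

# Pause inventory: gaps between consecutive words, classified by length.
for prev, cur in zip(words, words[1:]):
    gap = cur["start"] - prev["end"]
    if gap >= DRAMATIC:
        print(f'{prev["end"]:7.1f}s  dramatic pause ({gap:.1f}s)')
    elif gap >= SIGNIFICANT:
        print(f'{prev["end"]:7.1f}s  significant pause ({gap:.1f}s)')
```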
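The landmark and refrain checks are the same idea pointed at content instead of delivery. Here’s a sketch under the same assumptions, this time against a plain-text transcript, using difflib’s fuzzy matching as a stand-in for whatever matcher you prefer; the planned line is the example from this talk, and the 0.6 cutoff is a starting threshold, not a verdict.

```python
# Landmark tracker and signature-phrase counter. Assumes a plain-text
# transcript plus your own lists of planned lines and thesis phrases.
# The transcript path and the 0.6 cutoff are illustrative assumptions.
import difflib
import re

transcript = open("talk_transcript.txt").read().lower()   # hypothetical path
sentences = re.split(r"[.?!]+", transcript)

PLANNED_LINES = [   # your outline's landmark lines
    "i just fired that intern, but i stuffed him into storage, desk and all",
]
THESIS_PHRASES = ["discipline"]   # the refrain you intended to hammer

# For each planned line, find its closest sentence in the transcript.
for line in PLANNED_LINES:
    best = max(sentences,
               key=lambda s: difflib.SequenceMatcher(None, line, s).ratio())
    score = difflib.SequenceMatcher(None, line, best).ratio()
    status = "delivered" if score >= 0.6 else "MISSING -- spot-listen"
    print(f"{status}: {line!r} (best match {score:.2f})")

# Refrain discipline: how often did the thesis phrase actually get said?
for phrase in THESIS_PHRASES:
    count = transcript.count(phrase.lower())
    print(f"refrain {phrase!r}: said {count} time(s)")
```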
Then a friend made it better
After I put the first version of this together, Steve Endow (fellow BC MVP, fellow presenter, and honestly one of the most consistent voices in our community with his weekly podcast) saw what I’d built and handed me a ten-page synthesis of evidence-based presentation evaluation. Kirkpatrick’s training evaluation model. Cialdini’s influence principles. Kahneman’s peak-end rule. Bandura on self-efficacy. Edmondson on psychological safety. The whole stack.
He said something like “you might want to see what you’re missing,” and he was right.
What I’d built was a solid tool for one slice of the problem: delivery mechanics. It wasn’t touching the psychological design of a talk at all. Was the thesis phrase appearing often enough to work as a refrain? Were the influence principles getting deployed, or left on the table? Did the close include a specific, trackable call to action, or did it fade out like most talks do?
I added a sixth dimension to the kit to handle that layer. It scans for the linguistic signals of Cialdini’s principles, measures refrain discipline against intended thesis phrases, flags peak-end candidates, counts psychological-safety markers, and audits call-to-action clarity. It’s not a substitute for a trained speech coach — these are signal detectors, not verdicts — but it gives you concrete anchors for the kinds of coaching conversations most speakers never actually get to have.
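For the curious, the detector layer really is that humble. Here’s a sketch of its shape; the refrain measurement lives in the earlier sketch, and the cue lists below are my own illustrative stand-ins, not Cialdini’s canonical markers or the kit’s full lists.

```python
# Psychological-design signal detector: flags places worth a human look.
# Cue lists are illustrative assumptions, not the kit's actual config.
import re

CUES = {
    "social proof":   ["everyone", "most teams", "people like you"],
    "scarcity":       ["only", "limited", "before it's gone"],
    "authority":      ["research shows", "in my experience", "the data says"],
    "safety":         ["it's okay to", "no wrong answers", "we all struggle"],
    "call to action": ["try this", "record your next", "this week"],
}

transcript = open("talk_transcript.txt").read().lower()   # hypothetical path

# Count literal cue-phrase hits per principle. Signals, not verdicts.
for principle, phrases in CUES.items():
    hits = []
    for p in phrases:
        hits += [m.start() for m in re.finditer(re.escape(p), transcript)]
    print(f"{principle:>14}: {len(hits)} signal(s)")
```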
I’ll be honest about what the kit doesn’t do, too. Kirkpatrick’s full model has four levels: reaction, learning, behavior, results. My kit handles part of level 1. Levels 2 through 4 (did people actually learn, did they change behavior, did business results follow) need separate instruments that a transcript alone can’t provide. If you care about that layer, and I’d argue you should, you’ll be building beyond what this post covers.
Thanks, Steve.
The thing I keep turning over
The whole time I was working through this, I kept coming back to Sanderson.
The AI didn’t get better at evaluating speakers. I got better at evaluating myself. The AI didn’t grow from reading my transcript. I did. The measurement was cheap, but the interpretation was mine — my outline, my intent, my understanding of what each moment was supposed to do.
That feels like the division of labor that actually works. The AI handles the rote part, the counting, the pattern-matching across 14,000 words. I handle the part where I decide what to change next time, and why.
You are the art. In this case, the AI helps you see yourself more clearly while you’re making it.
Why I’m sharing the whole thing
Here’s where I got to.
I’m proud I’ve worked my way up to talks that score like this one did. Years of practice (2022 was my first on-stage talk!), real customer pressure, good mentors, and a willingness to embarrass myself (six people in a room that seats 1000+ has happened!). Worth every minute of it.
But I kept thinking: if the tools and the framework that helped me refine this are genuinely useful, keeping them to myself feels wrong. I know too many good speakers in this community who are operating entirely on vibes, and who would level up fast if they had this kind of lens on their own material. I’d much rather attend conferences full of speakers who’ve sharpened themselves this way.
So the whole kit is going into my public “Copilot Junk Drawer” repo. Scripts, agent instructions, an example config, the works. Free. Local. Private to whoever runs it. Please steal it!
Record your next talk. Run it through. Bring what you find to the talk after that. If it surfaces something useful, tell me. If it surfaces nothing, tell me that too, because I want to make this better and I can’t do it alone.
The long arc of this
I’ve been noticing something over the last six months of working with these tools. The people getting real leverage from AI aren’t the ones generating the most output; those people are mostly just producing mediocre work faster. The people getting leverage are the ones who’ve figured out how to put the AI around their expertise instead of in place of it, applying the tools at the key points in the process to empower their own wisdom and vision.
There’s a pattern to that, and it’s one I’m going to be talking about a lot more in the coming months. For now, just notice this post is an example of it: the AI didn’t review my talk for me, it helped me review my talk for myself. My judgment drove the analysis. The AI did the measurement.
That’s the part that matters if Sanderson is right about anything. The art is the transformation of the person, and the AI doesn’t do that for you. It clears away some of the friction so you can do it more often.
If you give talks at all
Record your next one. Even the bad one. Maybe especially the bad one. Run it through the kit, or through any pipeline you trust, and see what surfaces.
The art isn’t the recording. It’s not the transcript. It’s not even the talk.
It’s you, across years of them, getting better.
The kit: https://github.com/JeremyVyska/copilot-junk-drawer/tree/main/talk-eval-kit
Sanderson’s keynote, We Are the Art, is on YouTube: https://youtu.be/mb3uK-_QkOo?si=M6vyma8M8Km_Coaa. Worth the 20 minutes even if you’re not a fantasy reader.
If you try this on your own talks, I want to hear about it. My email and socials are on the about page. This only gets more useful as more people use it and tell me what breaks.

