A Joust Tool. Free. No email required.

AI Pilot Pre-Flight Checklist

Twelve questions that tell you whether your AI pilot has the operational prerequisites to actually ship.

Why a pre-flight check?

Most AI pilots fail for reasons that have almost nothing to do with the AI. The model is fine. The tools are fine. What is missing, almost every time, is one or more operational prerequisites: a workflow owner, a defined exception path, a business metric to measure against, a plan for how the team's day-to-day actually changes.

This checklist surfaces those prerequisites before you commit budget. Twelve yes/no questions across four categories. Each "no" is a specific, named gap you can either close or scope around. The score gives you a simple readiness signal. The category breakdown tells you where the gaps cluster.

It is not a substitute for a real roadmap engagement. It is the same diagnostic we run in the first 30 minutes of every Joust engagement. If the rough check says "ready," the deeper work usually confirms it. If the rough check says "not yet," the deeper work usually agrees, and the pilot would have stalled anyway.

  1. Pick one specific pilot you are considering or already running. Not your AI strategy in general; one specific workflow.
  2. Answer each question yes or no. Click the question or the checkbox to mark "yes." Be honest. A "we are working on it" is a no.
  3. Read the score and category breakdown. Each unchecked question shows specific guidance on what to fix. Save your assessment as a PDF for your team.

The twelve questions

Each "yes" earns one point. The score zone tells you whether to greenlight the pilot, plug specific gaps first, or pause and do prep work.

Category breakdown

Each category contains three questions. The breakdown shows where your readiness is strong and where the gaps cluster.
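
As a rough illustration of how a per-category tally could work, here is a minimal Python sketch. It is not the tool's actual implementation. The function name `category_breakdown` is hypothetical, and the sketch assumes the twelve answers arrive as booleans ordered in four blocks of three, one block per category, which this page does not confirm.

```python
# Category names are taken from this page; the ordering of questions
# within the answer list is an assumption made for illustration.
CATEGORIES = [
    "Workflow ownership",
    "Workflow definition",
    "Measurement",
    "Change management",
]
QUESTIONS_PER_CATEGORY = 3


def category_breakdown(answers):
    """Count the "yes" answers in each category.

    `answers` is a list of 12 booleans (True means "yes"), assumed to be
    grouped by category in the order listed in CATEGORIES.
    """
    expected = len(CATEGORIES) * QUESTIONS_PER_CATEGORY
    if len(answers) != expected:
        raise ValueError(f"expected {expected} answers, got {len(answers)}")

    breakdown = {}
    for index, name in enumerate(CATEGORIES):
        start = index * QUESTIONS_PER_CATEGORY
        block = answers[start:start + QUESTIONS_PER_CATEGORY]
        # Count answers that are exactly True, so "working on it" stays a no.
        breakdown[name] = sum(1 for answer in block if answer is True)
    return breakdown


if __name__ == "__main__":
    sample = [True, True, True, True, False, False,
              False, False, False, True, True, False]
    for name, yes_count in category_breakdown(sample).items():
        print(f"{name}: {yes_count} of {QUESTIONS_PER_CATEGORY}")
```

In this made-up sample, Measurement scores 0 of 3, so that is the cluster to fix first.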

What each category means

Workflow ownership
Someone owns the work itself, not the AI tool. Without an owner the pilot quietly stops working when the first thing breaks, and three months later nobody has noticed.
Workflow definition
The work has been mapped end-to-end, including exception paths. AI does not replace a workflow; it slots into one. If the workflow is fuzzy, the AI bolt-on will be too.
Measurement
Success ties to a number the business already reports on. Accuracy and F1 scores are diagnostics, not victory conditions. The CFO will not approve a second pilot on F1.
Change management
The team whose work changes has been part of the design, and there is budget for the workflow change, not just the tool. The redesign is usually the largest line item, and the one most often missed.

The methodology behind the checklist

This checklist comes from a single observation we make in nearly every engagement: AI pilots fail for reasons that have almost nothing to do with AI. The model is rarely the bottleneck. The bottleneck is workflow design, ownership, measurement, and operating-model fit.

We grouped the questions into four categories because the failures cluster that way. Pilots without a named owner tend to also be missing the senior sponsor and the end-to-end map; pilots without a business metric tend to also be missing the kill criteria. The category breakdown surfaces which cluster you are in, so the prep work is concrete instead of vague.

On scoring. One point per "yes," twelve maximum. We use thresholds at 4 and 8 because that is where the roadmap engagement data clusters: pilots scoring 9-12 ship at meaningfully higher rates; 5-8 ship if the specific gaps get plugged before kickoff; 0-4 almost never ship as conceived. The recommendation text is conservative on purpose. We would rather a high-readiness team double-check than a low-readiness team rush.
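
The scoring rule is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the tool's code: the function names `total_score` and `readiness_zone` are hypothetical, and the zone labels paraphrase the three outcomes this page describes (greenlight, plug specific gaps first, pause and do prep work) rather than quoting the tool's recommendation text.

```python
MAX_SCORE = 12          # one point per "yes", twelve questions
LOW_THRESHOLD = 4       # scores 0-4 fall in the lowest zone
MID_THRESHOLD = 8       # scores 5-8 fall in the middle zone


def total_score(answers):
    """Return the number of "yes" answers; anything but True is a no."""
    return sum(1 for answer in answers if answer is True)


def readiness_zone(score):
    """Map a 0-12 score to one of the three zones described above."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score must be between 0 and {MAX_SCORE}")
    if score <= LOW_THRESHOLD:
        return "pause and do prep work"
    if score <= MID_THRESHOLD:
        return "plug specific gaps first"
    return "greenlight the pilot"


if __name__ == "__main__":
    answers = [True] * 7 + [False] * 5   # hypothetical assessment
    score = total_score(answers)
    print(f"{score} of {MAX_SCORE}: {readiness_zone(score)}")
```

Both thresholds are inclusive on the lower zone: a score of 4 is still "pause" and a score of 8 is still "plug specific gaps first."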

On honesty. The single biggest failure mode of self-assessment tools is grading on a curve. A "we are working on it" is a no. A "this is implicit, everyone knows" is a no. A "the vendor handles that" is usually also a no, because the vendor handles the model, not the workflow it lives inside.

On scope. This checklist applies to one pilot at a time. A platform rollout (multiple workflows, multiple teams) needs a different conversation. If you are planning that, the checklist is still useful per workflow, but the score does not aggregate cleanly.

What we are not checking for

This checklist deliberately does not assess three things. They matter, but later in the process:

  • Model selection. Which LLM, which vendor, which deployment. This becomes a meaningful question after the workflow is designed, not before.
  • Technical integration depth. API patterns, data pipelines, observability. Important for execution. Not a precondition for the go/no-go decision.
  • Regulatory and compliance review. Required in regulated verticals. Track on a separate workstream; do not let it become the reason you do not start.

What Joust does with this checklist

We run a deeper version of this diagnostic across three to seven workflows in five weeks. Same prerequisites, sized to your operating model, with the implementation roadmap and ROI math attached. Senior-led. Fixed scope. The deliverable is a model your CFO can read and a roadmap your COO can run.

References and where this approach comes from

Joust internal sources

  • The Audit Is Not a Product (Joust, 2026), the long-form companion to this checklist. joustagency.com/blog/the-audit-is-not-a-product/
  • Workflow-First. Why Most AI Pilots Stall Before They Ship. (Joust, 2026), the methodology paper. joustagency.com/blog/workflow-first/
  • Joust AI Operations Roadmap findings, aggregated patterns across mid-market engagements (2024-2026).

External research informing the questions

  • Davenport, Thomas H. and Bean, Randy. The MIT Sloan Management Review and BCG annual AI surveys, 2017-2024. Documents the persistent gap between AI investment and value capture, and the workflow-integration cluster of root causes. sloanreview.mit.edu
  • McKinsey & Company. The state of AI (2023, 2024). Documents that the majority of mid-market AI pilots do not progress to production, with workflow ownership and metric design cited as primary blockers. mckinsey.com/capabilities/quantumblack
  • Brynjolfsson, Li, Raymond. Generative AI at Work. NBER Working Paper 31161 (2023). Highlights the importance of well-defined task structure for AI productivity gains, supporting the "workflow definition" category. nber.org/papers/w31161
  • RAND Corporation. The Root Causes of Failure for Artificial Intelligence Projects (2024). Independent root-cause analysis of failed enterprise AI projects; ownership, measurement, and change management feature prominently. rand.org

What we deliberately exclude

  • Model-quality benchmarks. Useful later; not relevant to whether the pilot can ship at all.
  • Vendor-published case studies. Survivorship bias is severe in this category.
  • Maturity models. Useful for board reporting; not actionable at the workflow level.