I recently had a call with a founder who was frustrated that the new AI models were “getting stupider”.
They were plugging in information as before, but the AI was tripping over itself, thinking in loops and generally making a pig’s ear of it all.
I asked them to show me.
Turns out they were adding "think step-by-step" to every prompt, just as they'd learned from countless prompting guides!
They weren't doing anything wrong per se. The AI landscape had simply shifted beneath their feet.
Increasingly we have so-called reasoning models.
What we're witnessing is fascinating – modern reasoning models now do much of their thinking internally, automatically breaking problems into steps before responding. Adding explicit step-by-step instructions can actually disrupt this process, like interrupting someone who's already deep in thought and asking what they are thinking about.
As we conclude this Playbook on prompting, let's explore how reasoning techniques are evolving and the new best practices that will keep you ahead of the curve.
Let's get started:
At its core, reasoning in AI is about breaking complex problems into manageable steps, considering multiple perspectives, weighing evidence, and drawing logically sound conclusions.
(Kinda the same with humans. But that’s a different discussion!)
Very crudely we have non-reasoning models and reasoning models:
Early AI models typically jumped straight to conclusions without methodically working through problems. This led to the development of techniques like Chain of Thought prompting, which served as external scaffolding to guide the AI toward more thorough analysis. GPT-3.5 and GPT-4 are examples of this earlier generation.
Modern reasoning models now have much of this analytical capability built in. They automatically break problems into steps, weigh different approaches, and check their work before responding.
This internal reasoning happens before you see the response – it’s as if the model drafts a detailed analysis, then edits it down to just the essential conclusions.
It’s pretty magical.
Understanding this evolution is crucial for effective prompting. What worked brilliantly with previous generations of models might now be redundant or even counterproductive. We need to adapt our techniques to match the capabilities of the models we're using.
Chain of Thought (CoT) prompting was one of the most significant breakthroughs in getting better performance from AI systems. Before we talk about how it's evolving, let's understand what it actually is and why it was so revolutionary.
At its simplest, Chain of Thought is a prompting technique that asks the AI to break down its thinking process into visible steps before providing an answer. Instead of jumping straight to conclusions, the model shows its reasoning – like working out a math problem on paper instead of just writing the answer.
Here's what CoT looks like in practice…
Standard prompt: "What's the average monthly payment on a £300,000 mortgage at a 4.5% interest rate over a 25-year term?"
This will (more often than not) go horribly wrong. Older models are a bit crap at this sort of mathematical/logical reasoning. You might run the prompt 10 times and get 10 different answers. No bueno.
Instead, with a Chain of Thought prompt: "What's the average monthly payment on a £300,000 mortgage at a 4.5% interest rate over a 25-year term? Let's work through this step by step."
With the CoT prompt, the model might respond with something like this:
Let's work through this step by step:
1. Convert the annual rate to a monthly rate: 4.5% ÷ 12 = 0.375% (0.00375)
2. Work out the total number of payments: 25 years × 12 = 300
3. Apply the standard repayment formula: M = P × r(1 + r)^n ÷ ((1 + r)^n - 1)
4. Plug in the numbers: M = 300,000 × 0.00375 × (1.00375)^300 ÷ ((1.00375)^300 - 1) ≈ 1,667
Therefore, the monthly payment would be approximately £1,667.
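If you want to sanity-check that figure yourself (no AI required), the standard repayment formula is easy to reproduce. Here's a minimal Python sketch – the function name is just for illustration:

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Standard repayment-mortgage formula: M = P * r(1+r)^n / ((1+r)^n - 1)."""
    r = annual_rate / 12   # monthly interest rate: 0.045 / 12 = 0.00375
    n = years * 12         # total number of monthly payments: 300
    return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

print(round(monthly_payment(300_000, 0.045, 25), 2))  # -> 1667.5
```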
Chain of Thought transformed AI performance for several critical reasons:
1. Error Reduction: By breaking problems into steps, models became much less likely to make calculation errors or logical mistakes. Each intermediate step serves as a checkpoint, keeping the reasoning on track.
2. Complex Problem Solving: CoT allowed models to tackle much more complex problems than they could previously handle. Problems that require multiple steps of reasoning became solvable.
3. Transparency: The visible reasoning gave users insight into how the AI was approaching problems, making it possible to spot where things might be going wrong.
4. Educational Value: The step-by-step approach made AI outputs more useful for learning, as users could follow the reasoning process rather than just seeing the answer. It’s sort of like showing your workings in a maths exam.
5. Confidence Assessment: Users could evaluate the soundness of the AI's reasoning, rather than having to blindly trust its conclusions. Traditional models give us an answer and we just have to hope it’s true – which is tricky, because hallucinations exist!
The impact of CoT can't be overstated – it turned models from simple text predictors into systems capable of sophisticated reasoning across mathematics, logic, planning, and more. It was (and is, in certain situations) a very clever hack to get better results from AI.
Now though we have models that do this for us: reasoning models.
Modern reasoning models now generate their own internal chain-of-thought before responding. Think of OpenAI's o-series – o1 and o3, for example. Or any AI that has a “Deep Research” function.
They think through problems step by step, check their work, and sometimes even revise their thinking – all before showing you a single word.
They are running a chain-of-thought process internally. With some bells and whistles, of course!
This changes everything about how we should prompt these models. Rather than forcing the model to show every step of its reasoning, we can now focus on guiding its attention and shaping its output.
This is an evolving field, so for now we’ll stick to best practices rather than hard-and-fast rules. As with everything in AI, it’s fluid! Working with modern reasoning models requires a new set of best practices:
1. Guide, don't babysit: The AI already “thinks” – just give it a clear job.
Example: "You're a growth-marketer. Suggest 3 paid channels for a $10k budget."
Rather than scripting every step of the thinking process (as we would with non-reasoning models), focus on clearly defining the task and letting the model's internal reasoning take care of the rest. So in the context of the RISEN framework, we can drop the S!
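If you're calling a reasoning model from code rather than a chat window, the same principle applies. Here's a minimal sketch assuming the OpenAI Python SDK and an o-series model name like "o3-mini" – swap in whatever provider and model you actually use:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# A clear role, task and output - no "think step by step" scaffolding needed.
response = client.chat.completions.create(
    model="o3-mini",  # example reasoning model; use whichever you have access to
    messages=[{
        "role": "user",
        "content": "You're a growth-marketer. Suggest 3 paid channels for a "
                   "$10k budget. Return a short bullet for each.",
    }],
)
print(response.choices[0].message.content)
```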
2. Lead with the role: Start by telling the model who it is.
Example: "You're a CFO explaining cash flow to non-finance staff..."
Role-based prompting provides context that shapes how the model approaches the problem, without micromanaging its reasoning process. This remains valid with reasoning models - we’re just telling it who to reason as. We retain the R of the RISEN framework.
3. State the output first: Spell out format and style before asking.
Example: "Return a 5-bullet checklist, each bullet < 20 words."
By clearly defining what you want the final output to look like, you can let the model handle the reasoning process while ensuring you get a result in exactly the format you need. This is the E of RISEN - still valid!
4. Prototype hot, deploy cold: Tinker with loose settings, then lock them down.
Example: Draft ideas at temperature 0.7, final runs at 0.2.
When developing prompts, use higher temperature settings to explore different approaches. Once you've found what works, lower the temperature for consistent, reliable results. This is generally good advice for all prompt engineering, but the greater variability of reasoning models makes it even more powerful.
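In code, that might look something like the sketch below (again assuming the OpenAI Python SDK; note that some reasoning models ignore or reject the temperature parameter altogether, so this mainly applies to models that expose it):

```python
from openai import OpenAI

client = OpenAI()

def run(prompt: str, temperature: float) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # example of a model that accepts a temperature setting
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    return response.choices[0].message.content

draft = run("Brainstorm 10 taglines for a budgeting app.", temperature=0.7)   # prototype hot
final = run("Pick the strongest tagline from this list and explain why:\n" + draft,
            temperature=0.2)                                                  # deploy cold
```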
5. Budget tokens like cash: Extra words = extra cost. Trim the fat.
Example: Paste the exec summary, not the 30-page report.
Since modern models handle reasoning internally, you can focus on providing just the essential information needed, rather than including verbose instructions. This is even more important than with non-reasoning models, because reasoning models come with additional costs attached.
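A practical way to keep yourself honest here is to count tokens before you hit send. A minimal sketch using the tiktoken library (the encoding name is an assumption – different models use different tokenisers):

```python
import tiktoken  # pip install tiktoken

def count_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Rough token count - enough to compare a summary against a full report."""
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(text))

exec_summary = "Q3 revenue up 12%, churn flat, hiring paused until January."
print(count_tokens(exec_summary))  # a handful of tokens, versus thousands for the 30-page report
```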
6. Build your fallback: Use cheaper models for easy jobs, premium for tough ones.
Example: Use advanced models for strategy, simpler models for spell-check.
This practice recognises that not all tasks require sophisticated reasoning – match the model to the complexity of the task. We talked about the capability cliff before. This is particularly the case with reasoning models, which tend to be more expensive.
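In code, the fallback can be as simple as a lookup table that routes each task to a model tier. A rough sketch – the model names and task labels are placeholders for whatever you actually use:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical routing table: match the model to the complexity of the task.
MODEL_BY_TASK = {
    "strategy": "o3-mini",        # reasoning model for the hard stuff
    "spell_check": "gpt-4o-mini", # cheap, fast model for the easy stuff
}

def run_task(task_type: str, prompt: str) -> str:
    model = MODEL_BY_TASK.get(task_type, "gpt-4o-mini")  # default to the cheap option
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(run_task("spell_check", "Fix any typos: 'We recieved the report yesterday.'"))
```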
While basic reasoning is now baked into many models (via chain of thought), certain complex problems still benefit from specialised approaches. These are just additional ways to nudge the AI to think “deeper”. They work with reasoning and non-reasoning models alike and are worth adding to your toolkit.
Tree of Thoughts encourages the AI to consider multiple solutions before committing to one – similar to how chess players evaluate different moves before choosing.
How it works:
For this [problem/challenge], please:
1. Generate 3 distinct approaches to solving it
2. Briefly evaluate each approach's strengths and limitations
3. Select the most promising approach and develop it into a full solution
This technique is particularly effective for:
Step-back prompting asks the AI to consider the broader context before addressing specifics – like taking a step back to see the whole forest before examining individual trees.
How it works:
Before addressing this specific question about [topic], first consider the broader context, relevant principles, and frameworks an expert would apply. Then provide a focused answer.
This approach works particularly well for:
Self-consistency involves having the AI verify its own work using different approaches – like double-checking a calculation using a different method.
How it works:
I need a reliable answer to this problem. Please:
1. Solve it using your primary method
2. Verify the solution using a different approach
3. If there are discrepancies, determine which approach is more reliable and why
This technique is valuable for:
If you are building workflows or software, you might use a less advanced (read: cheaper!) model to check the work of the more advanced model.
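Here's a rough sketch of that pattern: one model produces the answer, a cheaper one checks it. The model names are placeholders; the point is the shape of the workflow:

```python
from openai import OpenAI

client = OpenAI()

def ask(model: str, prompt: str) -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = "What's the monthly payment on a £300,000 mortgage at 4.5% over 25 years?"

# 1. Solve with the more capable (more expensive) model.
answer = ask("o3-mini", question)

# 2. Verify with a cheaper model, ideally via a different method.
review = ask("gpt-4o-mini",
             f"Question: {question}\nProposed answer: {answer}\n"
             "Check this answer using a different approach and flag any discrepancies.")

print(answer, review, sep="\n---\n")
```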
These are just supplemental ways to get our AI to solve problems for us. Honestly, these are solid human thinking methods! They just become prompting techniques because of the particular context. Remember, this is all ultimately about communicating what we want the AI to do for us!