
Setting the Scene: A High-Demand Training in Dakar
A recent training in Dakar, hosted at the Centre Africain d’Études Supérieures en Gestion (CESAG) and organised by CLEAR Francophone Africa (CLEAR-FA) in partnership with the Agence Française de Développement (AFD), brought together evaluation practitioners from across Africa to explore how artificial intelligence can be used in monitoring and evaluation. The level of interest alone said a lot. More than 2,500 applications were received for a relatively small group of participants.
The Real Shift: Speed, but Not Quality
There is clearly momentum behind AI in the evaluation space. But after several days working through practical applications, one thing became clear. The real shift is not the technology itself. It is how we choose to use it.
What stood out immediately was how quickly AI can be integrated into everyday evaluation work. Across the sessions, participants explored how AI can support structuring theories of change, drafting Terms of Reference, identifying indicators, synthesising qualitative data, and even generating first drafts of evaluation reports. In practical terms, this changes the pace of work. Tasks that would typically take days can now be completed in a fraction of the time. For teams working under pressure, particularly in rapid evaluations, this is not a marginal improvement. It is a significant shift. But speed is not the same as quality.
One of the most important lessons from the training was how convincing AI outputs can be, even when they are wrong. AI can generate statistics that were never calculated, assign percentages to qualitative findings without any basis, and produce outputs that look structured and methodologically sound while lacking any real analytical grounding. The risk is not obvious failure. It is subtle inaccuracy.
Credibility at Risk: Why Judgement Still Matters
In one exercise, AI produced a clean and confident synthesis of qualitative responses, complete with quantified findings. On the surface, it looked credible. On closer inspection, those numbers had no methodological basis. This matters because evaluation depends on trust. If the analytical process is weak, the credibility of the findings is compromised, regardless of how polished the output looks.
The takeaway was simple. AI can support analysis, but it cannot replace analytical judgement.
Context and Accessibility: Where AI Helps (and Where It Doesn’t)
Another theme that came through strongly was the importance of context. AI systems do not naturally understand local realities. Without clear guidance, they default to generic, often Western-oriented assumptions. In evaluation, this is not a small issue. Context shapes how problems are defined, how programmes are designed, and how results are interpreted.
The training introduced structured ways of engaging with AI, including the use of frameworks to clearly define context, role, and purpose. This improves the relevance of outputs, but it also reinforces a deeper point. AI does not bring contextual intelligence. The evaluator does.
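The idea of explicitly stating context, role, and purpose can be sketched as a simple prompt template. The following is a minimal illustration, not the framework taught at the training; the field names, wording, and example programme are all hypothetical, and any chat interface or API would consume the assembled text in its own way.

```python
# Illustrative sketch of a structured prompting pattern (context, role, purpose).
# The field names and template below are assumptions for illustration,
# not a standard or the specific framework used in the training.

def build_prompt(context: str, role: str, purpose: str, task: str) -> str:
    """Assemble a prompt that states context, role, and purpose explicitly,
    rather than leaving the AI to fall back on generic assumptions."""
    return (
        f"Context: {context}\n"
        f"Role: {role}\n"
        f"Purpose: {purpose}\n"
        f"Task: {task}"
    )

# Hypothetical example: an evaluator framing a request around local realities.
prompt = build_prompt(
    context="Rural health programme in Senegal; mixed-methods evaluation.",
    role="You assist an evaluator; do not invent figures or sources.",
    purpose="Produce draft indicators for review by the evaluation team.",
    task="Suggest five outcome indicators, each with a likely data source.",
)
print(prompt)
```

The point of the pattern is not the template itself but the discipline it imposes: the evaluator, not the tool, supplies the contextual intelligence, and the explicit role line ("do not invent figures or sources") reflects the hallucination risk discussed above.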
Beyond efficiency, one of the more interesting areas explored was accessibility. The training included demonstrations of AI-powered assistants that can deliver information through voice interfaces. In contexts where access to evaluation findings is often limited to lengthy reports, this opens up new possibilities. It becomes possible to imagine local officials accessing findings through a phone call, or practitioners querying lessons learned without needing to read full reports.
This is where AI starts to move beyond productivity and into something more meaningful. It has the potential to expand who can access and use evidence.
The Reality Check: Systems Matter More Than Tools
At the same time, the training reinforced something that is often overlooked. Technology is not the main constraint in evaluation systems. Across discussions, familiar challenges came up repeatedly. Weak data quality, limited access to administrative data, fragmented systems, and capacity gaps in analysis remain significant barriers.
AI does not solve these issues. In many cases, it exposes them more clearly. Advanced tools, dashboards, and automation are only as effective as the data that feeds them. Without strong data systems, even the most sophisticated tools have limited value. For organisations working in evaluation, this is an important reality check. AI is not a shortcut around system weaknesses. It depends on system strength.
What AI Changes, and What It Doesn’t
AI is changing how quickly we can work, how we interact with information, and how we can potentially expand access to evidence. What it is not changing is the core of evaluation. Evaluation still depends on sound methodology, contextual understanding, critical thinking, and professional judgement. If anything, these become more important in an AI-enabled environment.
There is a tendency to frame AI as either transformative or risky. The reality is more grounded. AI is a powerful tool, but its value depends entirely on how it is used.
The Real Takeaway: A Shift in Mindset
The most useful shift coming out of the training was not a specific platform or technique. It was a mindset: to use AI critically, deliberately, and with a clear understanding of its limitations. Because in evaluation, credibility is everything, and that still sits with the evaluator.
