Why don’t we replace the testers with AI?


The Association for Software Testing is crowd-sourcing a book, Navigating the World as a Context-Driven Tester. The book is edited by Lee Hawkins, who posts the questions on Mastodon, LinkedIn, Bluesky, Twitter, Slack, and the AST mailing list. You don’t have to be a member to give your answer to the most recent prompt, “Why don’t we replace the testers with AI?”. Mine is below.

We can’t replace testers with AI because, as of April 2024, there’s no AI capable of doing all the things that testers do.

Now, that answer misses the point of the question. Obviously, the intent is: assuming AI develops rapidly and may get to the point where it is capable of performing some / most / all of the things that testers do, are there any reasons not to replace testers with AI?

That “assuming” is doing some heavy lifting here. Unfortunately, AI discourse is riddled with hype, daydreamers and straight-up con artists. I think it’s important to try to tip the scales by highlighting the problems that still need to be solved. Let’s dive into what exactly that “assuming” entails.

We assume that the AI of the future will not hallucinate. “Hallucination” in the context of AI is a marketing term for situations where an AI system outputs statements that are not true. This is one of the most widely known problems, as it takes only a few minutes of interacting with an AI bot for it to say things that are demonstrably false. It is also a heavily researched topic, and some research suggests that “hallucinations” might be inevitable in the current breed of AIs. For all we know, no company has been able to solve this yet - OpenAI, Microsoft, Google and everyone else just add an increasing number of layers around their models, hoping that users encounter a canned response before a problematic one.

We assume that we have not yet reached the limit of what current AI technology can do. OpenAI takes the position that all the problems in the AI space right now are resource-bound. What they say is that we just need slightly larger models and slightly faster computers; what they imply is that we already have all the theory needed to build any AI we might need. If you go to any computer science or psychology department in the world, I’m not sure you will find a single person who shares that belief.

We assume that if there are any problems that are fundamentally qualitative in nature (i.e. ones that won’t be solved by a slightly faster computer), we will solve them. The tricky thing about unknown unknowns is that you don’t know about them. And the thing about breakthroughs is that you don’t know when they are going to happen, if at all. Hilbert’s problems are a good example - some of them are still unresolved after decades of work, while others were solved soon after their posting. When Hilbert announced them, it was impossible to predict which would be which.

We assume there are sustainable business models that can support AI. All current AI products are developed and run at a loss. This is not necessarily a bad thing - research takes resources before it turns into something that can generate profit, and most companies have to lose money before they can earn enough to sustain themselves (this is what “investment” means). However, right now we still don’t know what kinds of problems AI is going to be good at solving, and many AI leaders seem to be fundamentally disinterested in engaging with this question.

We assume that AI’s natural resource usage can be lowered and kept at a manageable level. AI models are very resource-intensive, especially during training. Data centers are well-known sponges for electricity and water. As I’m writing these words, we are in a streak of 10 record-warm months, and judging by the first half of April, it’s going to be at least 11. Climate change is the biggest challenge humanity faces right now. Nations are finally starting to create and enforce laws that attempt to limit natural resource usage, many companies have pledged to achieve net-zero emissions within the next decade (including Microsoft and Google), and customers are actually taking environmental costs into account in their purchasing decisions.

We assume that AI can be trained on legal material, and that AI output will not create additional legal risks for its users. Creative workers were the first to be hit by AI, and many were not happy to learn that these systems had been trained on their works without explicit consent. The New York Times is in a legal battle with OpenAI on grounds of copyright infringement. This lawsuit, and many more to come, will establish the limits, legal rules and total costs of AI usage and training. Smarter companies are holding off on wide deployment of AI tools until someone else fights these legal battles and establishes precedent. The actual creators of these tools count themselves among the smarter companies.

We assume that the general quality of the work AI tools produce is going to be acceptable. Every time I see anyone pitching an AI tool, I have a single question - if the tool is so great, why don’t you use it yourself? You have seen Microsoft talk a lot about the purported time savings of GitHub Copilot, but you have never seen Microsoft talk about using Copilot for Windows, Office or Visual Studio development. Makes you wonder why.

All of the above makes me think that “AI taking testers’ jobs” might not be a problem we are going to have anytime soon.