Chatbot Testing: Specifics and Techniques

Conversational interfaces, or chatbots, are expected to revolutionize the way we interact with companies and brands. It has even been suggested that these could replace websites and some apps since they use the most natural method of human interaction — a dialogue.

Right now, there is a lot of hype surrounding chatbots and their potential applications. Yet creating a bot that remains useful once the novelty wears off is challenging. Ensuring a level of quality that keeps users coming back also poses different obstacles from those found in conventional software.

KISS — One Robot, One Job

The underlying principle for a well-performing chatbot is to keep it simple, stupid. Don’t expect to develop a bot that can answer any question. Instead, define in detail the tasks it needs to perform flawlessly: focus first on the most frequent cases, then on possible cases, and only lastly on infrequent requests. Make sure your chatbot can steer the conversation back to its original scope, so that users neither hijack it nor abandon it because it seems useless. It is already difficult enough to foresee all scenarios and user inputs, so don’t add complexity by trying to answer every out-of-scope input.

Define Your Testing Methods

Testing a chatbot should address each of its components: input handling, the knowledge base, and the intelligence and reasoning layer. It should also take into account the infrastructure where the bot is hosted, as well as other factors such as connectivity and voice communication.

To test usability, it is useful to create a table of possible user inputs together with the chatbot’s expected answers. Include potential problems such as alternative spellings or misspellings, and check that the bot still produces the correct outcome or asks for further clarification.
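Such an input table can be sketched as a small table-driven test. The sketch below is only an illustration: `toy_bot`, `KNOWN_INTENTS` and the sample phrases are hypothetical stand-ins for the bot under test, and the fuzzy matching via Python's standard `difflib` module is just one assumed way a bot might tolerate misspellings.

```python
import difflib

# Hypothetical stand-in for the bot under test; a real suite would call
# the actual bot (for example through its API) instead of this function.
KNOWN_INTENTS = {
    "opening hours": "We are open 9:00-17:00, Monday to Friday.",
    "track order": "Please send me your order number.",
}
CLARIFICATION = "Sorry, I did not understand. Could you rephrase?"


def toy_bot(message: str) -> str:
    """Answer a known phrase, tolerating small typos; otherwise ask again."""
    text = message.lower().strip()
    match = difflib.get_close_matches(text, list(KNOWN_INTENTS), n=1, cutoff=0.7)
    return KNOWN_INTENTS[match[0]] if match else CLARIFICATION


# Input table: (user input, expected answer), including spelling variants.
CASES = [
    ("opening hours", KNOWN_INTENTS["opening hours"]),
    ("Opening Hours", KNOWN_INTENTS["opening hours"]),   # different casing
    ("openning hours", KNOWN_INTENTS["opening hours"]),  # misspelling
    ("track order", KNOWN_INTENTS["track order"]),
    ("trak order", KNOWN_INTENTS["track order"]),        # misspelling
    ("tell me a joke", CLARIFICATION),                   # out of scope
]


def run_cases(bot, cases):
    """Return (input, expected, actual) for every case the bot gets wrong."""
    failures = []
    for message, expected in cases:
        actual = bot(message)
        if actual != expected:
            failures.append((message, expected, actual))
    return failures
```

Calling `run_cases(toy_bot, CASES)` returns the failing rows, so an empty list means every input in the table produced its expected answer.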

Define Metrics

All testing should be based on clear expectations defined by metrics. Even in UX, which can be highly subjective, having clear KPIs can speed up the development process significantly. Useful parameters include the number of steps needed to perform a specific request, the percentage of returning visitors, the average time a user spends in one session, retention rates, click-through rates, and the handling of confusion.
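As a rough illustration, a few of these parameters could be computed from session logs along the following lines. The log format (`user`, `steps`, `seconds`) and the `summarize` helper are assumptions made for this sketch, not a standard.

```python
from collections import Counter
from statistics import mean

# Hypothetical session log: one entry per conversation.
SESSIONS = [
    {"user": "u1", "steps": 4, "seconds": 60},
    {"user": "u2", "steps": 6, "seconds": 120},
    {"user": "u1", "steps": 2, "seconds": 30},
    {"user": "u3", "steps": 8, "seconds": 150},
]


def summarize(sessions):
    """Compute average steps, average session time and returning-user share."""
    visits = Counter(session["user"] for session in sessions)
    returning = sum(1 for count in visits.values() if count > 1)
    return {
        "avg_steps": mean(session["steps"] for session in sessions),
        "avg_seconds": mean(session["seconds"] for session in sessions),
        # Share of distinct users who came back for more than one session.
        "returning_pct": 100 * returning / len(visits),
    }
```

Each value can then be compared against a target agreed on before testing starts, which turns the metric into a pass or fail criterion.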

These targets, which could trigger optimizations, are a mix of quality indicators for websites, metrics derived from social media, and KPIs for video platforms. Currently, there are no chatbot-specific metrics, but we can expect QA companies to develop those shortly.

Chatbot-specific Testing

General web application testing, including functional, compatibility, security and performance testing, is still required, as defined by an expert in web app testing services from A1QA. For a great chatbot, however, it’s necessary to remember that usability comes first. Here are some of the top features to test when building a new chatbot, so that it offers an engaging experience and makes users come back for more.


Most users, even when they are aware they are talking to a chatbot, tend to treat the app as a real person, both because the technology is new and because our brains are accustomed to human interaction. That gives developers the opportunity to endow the chatbot with personality and give it a voice that is appropriate to the brand it represents. Here, testing should focus on whether the bot keeps the same voice over the entire conversation. The user shouldn’t feel like they are talking with different customer representatives.

Also, there should be an upfront disclaimer about the chatbot’s abilities, limitations and preferred way of interaction for best results. If the bot is using voice, it should cope with background noise and accents; if it accepts pictures as input, it should tell users which picture specifications the underlying algorithm requires. Testing should also be performed under high-stress conditions to detect the system’s limits.


Following the defined testing methodology, the chatbot should respond as specified in the input table. Testing should detect any ambiguous commands or duplicate keywords. With the introduction of natural language processing (NLP), chatbots can do more than follow if/then rules, as they can parse text and compose their own answers. Testing should try different inputs, and variations of the same input, to gauge the system’s ability to understand.
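For a keyword-driven bot, the check for duplicate keywords can be automated. The following is a minimal sketch that assumes the bot's configuration can be exported as a mapping from intent name to keyword list; the intent names and keywords are invented for illustration.

```python
from collections import defaultdict

# Hypothetical export of a keyword-based bot configuration.
INTENT_KEYWORDS = {
    "order_status": ["order", "track", "delivery"],
    "cancel_order": ["cancel", "order"],
    "opening_hours": ["hours", "open"],
}


def find_shared_keywords(intent_keywords):
    """Map each keyword claimed by more than one intent to those intents."""
    owners = defaultdict(set)
    for intent, keywords in intent_keywords.items():
        for keyword in keywords:
            owners[keyword.lower()].add(intent)
    return {
        keyword: sorted(intents)
        for keyword, intents in owners.items()
        if len(intents) > 1
    }
```

Here the keyword "order" is claimed by two intents, so a message containing only that word would be ambiguous and is worth a dedicated test case.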

More advanced chatbots allow different types of data, not only plain text. Each of these should be adequately tested and debugged. As described in the previous point, non-text data should be first checked for quality and then for compliance with input requirements.


Compared to a website or an app, a chatbot is less intuitive to navigate, and a user who cannot go back to a previous step may become frustrated and quickly abandon the conversation. A chatbot with great UX makes the rules clear from the start by explaining to the user how to go back to an earlier point in the conversation or how to skip to the next one if they hit a dead end. Test whether your users can change their selected topic, start over, or look for something else.
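These navigation checks can be expressed as assertions against the conversation state. The sketch below models that state as a simple history stack; the `Conversation` class and its step names are hypothetical and stand in for whatever state the real bot keeps.

```python
class Conversation:
    """Hypothetical conversation state that remembers the steps visited."""

    def __init__(self, start="menu"):
        self._start = start
        self._history = [start]

    @property
    def current(self):
        return self._history[-1]

    def go_to(self, step):
        """Move forward to a new step."""
        self._history.append(step)

    def back(self):
        """Return to the previous step; stay put if already at the start."""
        if len(self._history) > 1:
            self._history.pop()
        return self.current

    def start_over(self):
        """Drop the whole history and return to the first step."""
        self._history = [self._start]
        return self.current
```

A navigation test then walks a few steps forward and asserts that "back" and "start over" land where the user would expect, including the edge case of going back from the very first step.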

Handling Errors

A chatbot needs to recognize when the user has made a mistake, and to sense the frustration that arises when users do not get the answer they are looking for. A simple rule, such as flagging more than two errors in a row, could be a good trigger indicating that the bot is not performing its intended function. Try defining different types of errors, such as no answer in the database, wrong topic, invalid response and so on.
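The "more than two errors in a row" rule can be sketched as a small counter. The error categories below mirror the examples in the text, while the `BotError` and `ErrorTracker` names and the default threshold are assumptions made for illustration.

```python
from enum import Enum


class BotError(Enum):
    NO_ANSWER = "no answer in the database"
    WRONG_TOPIC = "wrong topic"
    INVALID_RESPONSE = "invalid response"


class ErrorTracker:
    """Counts consecutive failed turns within one conversation."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.streak = 0

    def record(self, error=None):
        """Record one bot turn; pass None for a successful turn.

        Returns True once more than `threshold` errors occurred in a row.
        """
        self.streak = self.streak + 1 if error is not None else 0
        return self.streak > self.threshold
```

When `record` returns True, the bot could hand over to a human agent or offer to start over, and the logged error types show which kind of failure dominates.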

Agile and Continuous Testing

Chatbots are excellent examples of software that can be developed using the Agile approach. The minimum viable product can be enriched during each iteration with new phrases captured by the error management functions. To ensure that no bugs creep into the bot, testing should also be performed at each iteration.

While manual testing verifies the business logic in the first stages, automation can save time in later phases and help developers and QA teams get new and improved versions to market faster. Companies that have foreseen the power of these new interfaces are already investing in test automation and starting to create value through chatbot-powered conversation.