Building an IBM Watson Powered AI Chatbot

Developing a speech-to-text bot using IBM’s Supercomputer.

Philipp Langhans
Chatbots Magazine

--

Everyone is talking about chatbots these days. Telegram has them, Facebook got them recently, Slack has them and many more want them (for good reason). It was time to figure out what these chatbots are and how they can help make life easier.

We are at a point where artificial intelligence is a big thing again. Recent movies like “Ex Machina” and “Her” pictured super smart and dangerous AI-driven bots that could easily become a threat to humanity. At the same time, Stephen Hawking, Elon Musk, and Bill Gates warn about artificial intelligence, and we read about self-driving cars or about Google beating a world champion in Go with their bot. Suddenly, we no longer know how far we are from this depicted alternative future. And then along comes a new hype about chatbots. A multi-billion dollar hype. No wonder fear and expectations are quite high.

UPDATE: Telegram just announced a $1,000,000 challenge for bot developers!

Perfect timing!

Feel free to use my code and build an exciting project for the challenge!

https://telegram.org/blog/botprize

Chatbot Status Quo

Browsing through the list of available bots, however, one finds little innovation but a recognizable pattern: saturation is not specific to apps. It seems many developers missed the initial app gold rush and are now jumping on the bandwagon named “Bots are the new apps. The bot store is the new app store.”

Just as the Web had its ‘Appification’ at some point, apps are seeing a ‘Botification’ these days. It works like this: take a successful app (name), append ‘bot’ to the end, add AI to the description tags and wait for success. But it doesn’t work like this. Users don’t work like this. They are not stupid.

…death to the bots

These bots are not just useless because they provide no additional value; they also clash with the expectations we’ve developed from reading the news. Most users don’t want a command line interface when they could have a shiny, simple-to-use and UX-tested app instead (unless they are mega-nerds or masochists). Users sense when they are being fooled, and it makes them angry… and everyone knows what happens when users don’t like bots anymore…

HitchBOT, hitchhiking robot, gets beheaded in Philly

But who am I to complain a lot about bots? I’m not a journalist. I’m an engineer, so I better develop a better solution or stfu.

Building Intelligent Bots

While a “cats and titties” bot can provide fun for some time, there should be more, and we should aspire to build something more meaningful. If users want AI they should get AI! Right? But how hard is it for someone to put true artificial intelligence into a chatbot? I mean, artificial intelligence is a very scientific discipline with some of the brightest computer scientists and mathematicians working on it for almost 60 years now. The answer is: it requires 50 lines of code.

But first, let’s find a good use case. As described above, the botification of an already successful app introduces more pain than relief. So what could be a good use case in the context of chat, relevant to a broad audience and not just productivity nazis or angry support-seeking customers?

The Problem: Voice Messages

Many chat apps have a voice message feature. The reason is obvious: typing is not the most joyful task on a smartphone keyboard, and often it is just easier and faster to speak your thoughts into the chat. However, listening to all these voice messages isn’t joyful either. Sometimes you are in a quiet space or a meeting, or you just can’t stand the voice of one of your friends. And then there are times when you are searching through the chat history for one piece of information (should I buy milk, or was it yoghurt?) but you can’t find the answer among all these voice messages.

The Solution: A Speech-2-Text Bot

Speech recognition is considered an AI-hard problem. And now I had this problem and wanted a solution. While writing a speech recognition engine from scratch would have been an ambitious undertaking for my weekend project, it seemed reasonable to try to build a chatbot that has some help from… Watson.

If we measure the intelligence of a bot, we compare it either to human intelligence or to the smartest bot on the planet. The smartest bot on the planet is of course Watson. Watson was able to win Jeopardy! (must be the smartest) and it was broadcast on TV (so we know Watson). Easy. ;-)

Setting up Watson

Watson has an API. I signed up for a 30-day trial. The hardest part was actually getting the account working. A rule of thumb is that if you haven’t seen a feature in the Web interface yet, it will not work in the command-line tools. So first set up your company and region to avoid errors and click around a little in the Web UI. Then follow this guide:

The documentation is quite easy to read and understand. At some point you create a speech-to-text service through the command line. Head back to the Web interface, find the newly created service and obtain the credentials. Done.
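As a rough illustration of what those credentials are for, the sketch below builds a request against the service's `/v1/recognize` method and pulls the text out of a response. It is a minimal sketch, not the code behind this bot: `WATSON_URL` is a placeholder (use the URL shown next to your credentials), the helper names are invented for this example, and HTTP basic auth with the service username and password is assumed.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: placeholder endpoint. Replace with the URL listed with your credentials.
WATSON_URL = "https://example.invalid/speech-to-text/api"


def build_recognize_request(audio, username, password,
                            model="en-US_BroadbandModel", base_url=WATSON_URL):
    """Build (but do not send) a POST request for the recognize method."""
    query = urllib.parse.urlencode({"model": model})
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/v1/recognize?{query}",
        data=audio,
        headers={
            "Authorization": f"Basic {token}",
            # Telegram voice messages are OGG files with Opus audio.
            "Content-Type": "audio/ogg;codecs=opus",
        },
        method="POST",
    )


def extract_transcript(response):
    """Join the top alternative of every result in a parsed JSON response."""
    parts = [
        result["alternatives"][0]["transcript"].strip()
        for result in response.get("results", [])
        if result.get("alternatives")
    ]
    return " ".join(parts)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and feeding the decoded JSON body to `extract_transcript`.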

Setting up a Telegram Bot

Setting up a Telegram bot is super easy and straightforward. First, add the BotFather to your contacts. Use the /newbot command and follow the instructions. Write down your API token. Done.
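The token is all the Bot API needs: every method is an HTTPS call to a URL that embeds it. Here is a minimal sketch in Python, using only the standard library. The helper names are hypothetical and the token in the usage note is fake.

```python
import json
import urllib.request

API_ROOT = "https://api.telegram.org"


def method_url(token, method):
    """URL of a Bot API method such as getMe, getUpdates or sendMessage."""
    return f"{API_ROOT}/bot{token}/{method}"


def file_url(token, file_path):
    """Download URL for a file_path returned by the getFile method."""
    return f"{API_ROOT}/file/bot{token}/{file_path}"


def call(token, method, **params):
    """POST params as JSON to a Bot API method and return its 'result' field."""
    request = urllib.request.Request(
        method_url(token, method),
        data=json.dumps(params).encode(),
        headers={"Content-Type": "application/json"},
    )
    # urlopen raises urllib.error.HTTPError for non-2xx answers.
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    if not payload.get("ok"):
        raise RuntimeError(payload.get("description", "Telegram API error"))
    return payload["result"]
```

Calling `call("123:ABC", "getMe")` with your real token in place of the fake one is a quick way to check that the bot exists and the token works.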

An AI Bot in 50 Lines of Code

So here are the 50 lines of code that enable your bot to understand speech:

Short and hacky enough for a hackathon.
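To give an idea of the control flow in a bot like this, here is a minimal sketch of the per-update logic in Python. It is an illustration, not the code behind @speech2textbot: the function names are hypothetical, and `transcribe` stands in for whatever downloads the voice file and sends it to Watson. Only the field names (`update_id`, `message`, `voice`, `file_id`, `chat`, `message_id`) and the `sendMessage` parameters come from the Telegram Bot API.

```python
def voice_file_id(update):
    """Return the file_id of the voice note in an update, or None if there is none."""
    message = update.get("message") or {}
    voice = message.get("voice")
    return voice["file_id"] if voice else None


def handle_update(update, transcribe):
    """Return sendMessage parameters for a voice message, or None for anything else.

    transcribe is a hypothetical callable: file_id -> transcript string.
    """
    file_id = voice_file_id(update)
    if file_id is None:
        return None
    message = update["message"]
    text = transcribe(file_id) or "(no speech recognized)"
    return {
        "chat_id": message["chat"]["id"],
        "text": text,
        # Reply to the voice message so the transcript sits next to it.
        "reply_to_message_id": message["message_id"],
    }


def next_offset(updates, current=0):
    """Offset for the next getUpdates call: one past the highest update_id seen."""
    return max((u["update_id"] + 1 for u in updates), default=current)
```

A polling loop would call `getUpdates` with the current offset, pass each update through `handle_update`, send the returned parameters with `sendMessage`, and advance the offset with `next_offset`.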

Try It Out

You can try out my bot (that is, Watson) by adding @speech2textbot (https://telegram.me/speech2textbot) to your contacts or to any chat or group chat in Telegram. Supported languages are: ar-AR, en-UK, en-US, es-ES, ja-JP, pt-BR, zh-CN.

When added to a group chat it will automatically create a transcript of all voice messages. You can also forward voice messages to it if not added to the conversation. The result should look like this:

The code is open source. Feel free to use it and create your own bot that, for example, translates voice messages into different languages: Speech-To-Text -> Translation -> Text-To-Speech. All APIs are available through Watson. Share your ideas or requests in the comments.
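The chain described above is just three functions applied one after another, which can be sketched as follows. The three stages here are toy stand-ins so the example runs offline. In a real bot each one would wrap a call to the corresponding service, and none of these names come from the Watson APIs.

```python
from functools import reduce


def pipeline(*stages):
    """Compose stages left to right: each stage receives the previous stage's output."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


# Toy stand-ins (assumptions for illustration only).
def fake_speech_to_text(audio):
    """Pretend the audio bytes are already the spoken words."""
    return audio.decode("utf-8")


def fake_translate(text):
    """Translate via a tiny lookup table, leaving unknown text untouched."""
    return {"buy milk": "Milch kaufen"}.get(text, text)


def fake_text_to_speech(text):
    """Pretend to synthesize audio by encoding the text as bytes."""
    return text.encode("utf-8")


voice_translator = pipeline(fake_speech_to_text, fake_translate, fake_text_to_speech)
```

Keeping each stage a plain function means a stage can be swapped (a different target language, or no speech synthesis at all) without touching the others.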

Conclusion

  • Building intelligent bots is not too hard
  • Providing different interfaces such as speech input makes the experience with your app more comfortable and interesting
  • Having the right use case is key
  • Apps and bots have different contexts. It makes no sense to turn every app into a bot
  • As so often, don’t trust the hype. While the science behind AI is making insane progress the practical results are not as good as in the Hollywood movies. Don’t be disappointed.
  • Try before you buy: use my bot to test how good Watson really is ;-)

We’ve come a long way with apps. Give bots a chance.

About the Author

Software engineer, founder of Autobeat Player (a new music app), studied at 4 universities simultaneously, Master’s thesis in Computer Vision at MIT.

He has a blog on bitcrunch.de and is now trying out Medium.
