Social researchers examine chatbot Eli’s performance: findings and future plans

In late 2020, the UNESCO Institute for Information Technologies in Education (UNESCO IITE) launched Eli, an AI-powered educational chatbot. Thanks to Eli, hundreds of young people have received stigma-free and anonymous answers to their questions about the physiology and psychology of puberty, relationships, and sexual health.

In 2021, the number of user interactions with the chatbot exceeded 300,000, and the project attracted researchers from the Higher School of Economics Centre for Modern Childhood Studies. A group of social scientists set out to analyze user interactions with Eli to determine how the chatbot’s performance could be enhanced. Eli’s creators were happy to support the initiative.

The idea was to examine a representative data sample (consisting of anonymized user conversations with the chatbot) to identify questions that Eli was unable to answer straightaway, classified as “failed interactions,” and to find ways to reduce such failures. The initial review indicated two main reasons for failed interactions: either the required information was missing from Eli’s knowledge base, or the query was worded in a way the AI could not recognize.

The two main objectives of the study were:

  • to identify popular user questions which do not have answers in the chatbot’s knowledge base;
  • to identify the types and specifics of questions asked by users in a way the chatbot cannot recognize.

The researchers randomly selected 5500 failed interactions with the chatbot and grouped them into three categories: 0 – “noise,” i.e., a meaningless combination of words or letters; 1 – queries about something absent from the chatbot’s knowledge base; and 2 – queries on topics covered in the knowledge base but not recognized by Eli. It was found that 40% of failed interactions were “noise,” 19% concerned topics that are currently not in Eli’s knowledge base, and 41% were about covered topics but expressed in terms that the chatbot failed to recognize.
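The tallying step described above can be sketched in a few lines. This is an illustration only: the category codes follow the article, but the per-category counts are simply back-calculated from the reported shares of the 5,500-interaction sample.

```python
from collections import Counter

# Hypothetical stand-in for the hand-labelled sample: each failed
# interaction carries one category code, as in the study.
# 0 = "noise", 1 = topic missing from the knowledge base,
# 2 = topic covered but the wording was not recognized.
labels = [0] * 2200 + [1] * 1045 + [2] * 2255  # 40% / 19% / 41% of 5,500

counts = Counter(labels)
total = len(labels)
for code, name in [(0, "noise"),
                   (1, "missing topic"),
                   (2, "unrecognized wording")]:
    share = 100 * counts[code] / total
    print(f"{name}: {counts[code]} ({share:.0f}%)")
# noise: 2200 (40%)
# missing topic: 1045 (19%)
# unrecognized wording: 2255 (41%)
```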

Growth points: navigation and Eli’s background

More than half of the questions which Eli failed to answer due to lack of information concerned the chatbot’s operation and functionality (35%) and popular psychology (23%), e.g., temperament types, self-help techniques, emotional intelligence, preventing burnout and chronic fatigue, and managing one’s emotions.

As for other topics, such as romance, family planning, reproductive health, female and male physiology, and sex, the proportion of unanswered questions never exceeded 10%, which means that Eli can recognize and handle these popular questions quite well.

Graph 1. Distribution of questions for which the chatbot currently does not have answers

User queries about the chatbot’s operation mainly concerned navigation and communication, e.g., how to return to the main menu, take one step back, pause a conversation or change the topic. In addition, some users wanted to know more about the chatbot’s “identity,” history, and developers:

“What is UNESCO?”

“Hi, why do you have this name?”

“How old are you?”

“Who trains you?”

Some users want to know more about how the chatbot works: they may be concerned about the privacy of their interactions with the AI and wonder how Eli recognizes and processes queries:

“Am I talking to a live person or a chatbot?”

“How do you operate?”

“Sometimes I feel that chatbots are just people on the other side of the screen”

“Can you read a voice message?”

Eli explains all these things briefly in the welcome message, but apparently, that is not enough for the target audience.

Psychology and relationships

As for the educational component, most unanswered questions concerned general psychological knowledge, e.g., stress, fatigue, temperament, and personal boundaries. In many instances, users sought extensive information on matters such as interpersonal communication, confronting verbal aggression (including trolling and bullying), resolving conflicts with parents or in a couple, online dating, and dealing with passive aggression:

“My parents are telling me that I am selfish and arrogant, and therefore I do not have friends. Like I put myself above others, but it is not true at all! In fact, I am often afraid to say something wrong when I talk to someone.”

“How can I talk to my parents about how I feel?”

“What is the best way to deal with a bully after both my and his parents and our coach have talked to him, but his mean jokes and bullying continue?”

“My boyfriend rarely texts me, and I miss his attention.
Could you share some relationship lifehacks?”

It turns out that not only teens and young people but also their parents ask the chatbot questions, e.g., about the best way to talk to their children about sex and staying safe, and seek general parenting advice:

“When and how should I talk to my son about sex education?”

“Hello! What should I do to raise a good person?”

“I am raising a granddaughter who lives with HIV”

“What can I do when my son refuses to talk to me openly and honestly?”

It is common for users to ask questions about family planning, gender roles, or whether not wanting children is normal. Some user questions concern education and future career, school life, and preparation for exams.

By capturing and analyzing user queries, Eli’s team has developed a better understanding of the audience’s needs that will inform an update of the chatbot’s content policy.

Lost in translation: when AI cannot recognize queries covered in its knowledge base

Most queries that Eli has failed to answer, although the information is in its knowledge base, concern relationships (36.3%) and psychology (16.4%). The problem usually lies in the way users phrase such questions: they use long sentences, add unnecessary details, and repeat the same thing over and over in different ways, as if talking to a live person. In such cases, despite extensive prior training in answering user queries formulated in different ways, the chatbot struggles with the multitude and variety of keywords and fails to recognize the content.

Perhaps users resort to long and convoluted sentences when asking Eli about relationships because they are anxious about revealing their fear, vulnerability, and perceived incompetence in such matters. Another reason why some people find it difficult to formulate queries about being in love and relating to a romantic partner may be that this experience lies mainly in the sphere of emotions and rarely gets expressed in verbal constructions which AI can process.

Oxana Mikhaylova

Junior Research Fellow, Center for Modern Childhood Research

Conclusions and future plans

Based on findings from the above analysis of user interactions and advice from the researchers, Eli’s developer team will be working to further improve the chatbot’s knowledge base, architecture, and logic.

The updated version will focus more on Eli’s ability to answer questions about its own operation and history, e.g., who developed it and why, how it works, who its main audience is, and how it ensures the confidentiality of user interactions. The chatbot’s navigation will be streamlined to make it easier for users to switch between topics and levels of information so that Eli’s answers match precisely what the query is about.

Popular queries on the topics “Psychology” and “Relationships” will inform a separate new project – an interactive educational course. UNESCO IITE, in collaboration with Eli’s team and the DVOR digital media outlet, will produce a series of educational materials for the course; the most popular and challenging user queries will be examined in detail with the help of thematic tests, visuals, and mini-chatbots. This interactive course will be designed to help young people find answers to complex questions about aggression, bullying, discrimination, personal boundaries, shame, self-image, family and couple communication, and much more. Meanwhile, Eli will focus on what it can do best: give concise, informative, and unambiguous answers to questions about physiology, sexual and reproductive health, and safety.

Thanks to sociologists of the HSE Center for Modern Childhood Research, the chatbot’s team gained a new perspective on Eli’s strengths and weaknesses and received helpful advice on how to make it even better. The chatbot is currently undergoing a major upgrade. A new version of Eli will be launched this autumn, and the partner project DVOR will be posting an interactive course series on psychological, ethical, and value-related aspects of interpersonal relations on its platform before the end of 2022.