A collaboration between a College of Communication and Information PhD student and a School of Information Sciences professor has resulted in innovative research that looks to the future of AI-powered voice digital assistants and explores how they can be more diverse, equitable, and inclusive.
This research resulted in a co-authored paper and an award for the doctoral student, Jessica Barfield. Barfield received the 2021 Emerging Social Informatics Researcher Award from the Association for Information Science and Technology (ASIS&T) Special Interest Group in Social Informatics, in part for the paper, “Hey There! What Do You Look Like? User Voice Switching and Interface Mirroring in Voice-Enabled Digital Assistants (VDAs),” which she co-authored with SIS Professor Dania Bilal. Barfield submitted the full portfolio of work she completed during her first year at CCI, and that portfolio earned her the award. The paper will appear in the ASIS&T proceedings and will also be presented at the ASIS&T 2021 Annual Meeting in November.
“It means a lot to me personally because it is an emerging scholars award, and to receive this kind of validation, that everything I did in my first year was noteworthy, and to be recognized so early in my career with about three years to go in my program is really exciting,” Barfield said. “And I’m also super grateful to SIS professor, Dr. Dania Bilal, and everyone who gave me a platform to conduct research so early on; some programs don’t let students get involved in research until much later in their program.”
Their research explores why people choose to change the voice on digital assistants such as Siri, Alexa, and Google, but adds a new twist: using photographs of people of diverse backgrounds and cultures as embodied voice interfaces (EVIs). The term “voice switching behavior” was coined by Dr. Bilal, and this research is the first on the topic.
“Voice switching was an idea I had from my master’s program. I started working with Dr. Bilal the summer before I started the PhD program and we were able to fine-tune our research and add an embodiment to make the voice more tangible to users. Dr. Bilal had the idea of having the photos of diverse people, and that idea came from another [SIS] professor’s paper, Dr. Jiangen He, who did something similar,” Barfield said. “However, in this research, we used the photos to embody the voice interface, exploring user preferences and decisions to switch or not switch the voice interface.”
While there is a lot of research on voice digital assistants (VDAs), none of it focuses on users’ voice-switching behavior, Bilal said. All digital assistants provide a variety of voices in different genders, languages, and accents, but not in different ages, races, ethnicities, or cultural backgrounds, so Bilal and Barfield investigated how people behave when switching voices.
They conducted a pilot study that resulted in the aforementioned paper, which was published in and presented at the HCI International conference in July. The participants in the study published in the ASIS&T proceedings were recruited from Amazon Mechanical Turk (mTurk) and had diverse backgrounds. Their results showed that adding an image to a voice did, indeed, affect both users’ preferences when selecting embodied voice interfaces and their decisions to switch the voices on digital assistants.
“We coined a new term, interface mirroring, which means that today people want digital assistants that match their age and gender, and desire people with certain looks as well,” Bilal said.
As AI becomes more and more a part of daily life for much of the world, there is a need to ensure that it is created in a way that reflects users’ needs and wants, and that designers consider diversity, equity, and inclusion when creating personalities for these VDAs and other AI systems. Because AI devices will continue to evolve, it is likely that images will be added to VDAs and other AI systems in the future as a way to humanize and personalize the technology. Once a system is humanized, the actual humans using it also need to be considered.
“We live in a world that is global and very diverse, and people have different needs and preferences. AI is known for having an algorithmic bias in its data. In order to provide equality and inclusivity in the design, we need to make human-centered voice digital assistants that are more inclusive and universal to ensure the design is fair and equal,” Bilal said.
Bilal and Barfield said the research received high ratings from ASIS&T reviewers and they’re now continuing that research into the next phase. This phase will observe specifically how users’ behaviors in voice switching are affected by their own racial and ethnic backgrounds as well as the racial and ethnic makeup of the people featured in the photos matched with the voices.
“There are new articles every day about technology that’s coming out and trying to see what we can do to make these technologies more diverse and easy to access. It’s an exciting time,” Barfield said.