Rana el Kaliouby on teaching computers to read our emotions

Read the full transcript of our Science Focus Podcast interview with Rana el Kaliouby about making machines emotionally intelligent – listen to the full episode at the bottom of the page.

Published: August 17, 2020 at 8:10 am

Amy Barrett: So Girl Decoded was published earlier this year by Penguin Business. Can you tell me, what is your book about?

Rana el Kaliouby: So my book is a memoir. It's a juxtaposition of my personal journey intertwined with my journey building emotional intelligence into technology.

AB: What made you actually want to start writing it?

ReK: So the initial idea was to talk about emotion A.I. or artificial emotional intelligence and kind of tease apart the different applications of the technology and the ethical and moral implications of building technology like that. But very early on, I remember meeting with the publisher Penguin Random House, and the editor there said, you know, your story is really fascinating.

I grew up in the Middle East, found my way to the US by way of studying in the UK, actually. And he said, that's the story, you got to interweave your personal stories. So it ended up being this, again, kind of interwoven mix of my personal background and how I went from what I call "a nice Egyptian girl" to a CEO of a tech company. And the story of the technology as well.

AB: And what are some of the biggest challenges you'd say you faced getting to where you are today?

ReK: I think the biggest kind of challenge is that I was always kind of doing some... I'm a misfit. Right. Like, I grew up in the Middle East, but I really wanted to be a computer scientist. I left home to do my PhD, which was quite unusual at the time because my husband at the time had to stay back in Cairo for work.

And then I came to the United States, thinking I'd be an academic. But then quickly decided to be an entrepreneur. So I think it's just I've taken a very unusual path. It's not what my family expected and it's not what society expected, I guess.

And so, you know, I'm a tech entrepreneur and CEO in a very male-dominated community. And so I had to kind of figure that out, too. Like, we've raised over 53 million dollars of venture funding, but I'm often pitching – actually almost exclusively pitching – to male investors. And I run an emotion company. Right. So there's this like... it's not your usual path. And I've had to figure out how to navigate that.

AB: And you've mentioned the phrase emotional intelligence. Can you just kind of explain what that actually is?

ReK: Yeah. I often like to start with human intelligence. So when we think about human intelligence, there's your IQ, your cognitive intelligence, which is, of course, super important, but your emotional intelligence is equally important as well. And people who have higher EQs tend to – they're more likeable, they're more persuasive, they're better in their personal and professional lives.

And I believe that that's true for technology. You know, technology is becoming mainstream. Obviously, A.I. is becoming mainstream and it's taking on roles that were traditionally done by humans, like helping with your healthcare and productivity and maybe driving your car or hiring your next co-worker. Well, guess what? It's important for that technology to be automated and effective and efficient, but it also has to be human-centric. It has to understand people. So I'm all about, like, marrying the IQ and the EQ in technology.

AB: There's a lot of controversy around IQ and how it's measured and things like that. Is that the same problem with EQ?

ReK: I would say there aren't like standard measures of EQ. The definition of emotional intelligence is a person's ability to understand their own, but also others' emotional and nonverbal signals and be able to adapt in real time to this information.

There are a number of tests that measure EQ. But, you know, nothing's standardised. And honestly, I agree with you. Like, I think it's hard to come up with a test that works for everyone.

I remember when I first started my PhD at Cambridge, I got connected to Professor Simon Baron-Cohen at the University of Cambridge. He still runs the Autism Research Centre there, and he had the 'windows into the eyes' task, the empathy task. Which is basically these images that are just the eyes region, and you select what emotion that image was showing. And it's such a hard test. It's fascinating. And I tried to build an algorithm to pass that test and I've yet to do that. And, you know, 20 years on, it's really hard. But humans are kind of generally good at it. So, yeah, I don't know what's the best way to measure EQ. It's a tough one.

AB: But you've mentioned autism there. What's that relationship with EQ?

ReK: So individuals on the autism spectrum really struggle with reading the non-verbal signals of other people. And actually even connecting with their own emotions. When I got to Cambridge, I gave a presentation about like the challenges of building emotional intelligence into machines and how can I build a computer that can read all of the different facial expressions that people make. And someone in the audience said, "you know, you got to look into autism". And that was how I got connected to Simon Baron-Cohen's lab.

And the project that then brought me to M.I.T. afterwards was: I wanted to design a device like Google Glass that has a little camera, that kids on the spectrum could wear, and it would give them real-time feedback on the emotion of the person they're interacting with.

And so that was the proposal that brought me over to M.I.T., actually, and we ended up building these glasses and now it's being commercialised by a partner company of ours called Brain Power. And we're seeing that the kids are really improving when they're using these augmented glasses: they're making better eye contact and they're better able to read these nonverbal signals.

AB: And does that improvement continue without the glasses?

ReK: That's the question. So that's totally the question. So Brain Power, that company that does in fact use Google Glass and our technology, they've deployed it in about 400 different homes. And while we're seeing a lot of progress when the kids are wearing the glasses, the question is, does this generalise when they take them off? Is it a training tool or is it a prosthetic? That's still an open question.

AB: I see. And so let's go back to the actual tech that you're talking about, because it's something that can look at an image, a moving image of the face. How does it take that and turn it into some emotion that someone's feeling?

ReK: So we use an approach in artificial intelligence called machine learning, or deep learning. And basically, the idea is we feed the algorithm hundreds of thousands of examples of people smiling or smirking or frowning. And the more diverse the data, the smarter the algorithm is going to be.

And the algorithm essentially looks for all of the things that are similar between all the smiles. So maybe it's like your lip corners are pulled upwards and you can see a little bit of teeth. And all of the things that are common, you know, in an eyebrow furrow, so it's like these wrinkles between your eyebrows. And it learns.

And so the next time it sees an image or a video of somebody it's never seen before, it's able to say, 'oh, the lips are turned upwards and outwards and I see a little bit of teeth. It must be a smile.'

And you repeat this, you know, you rinse and repeat, always improving the accuracy of the algorithm, but also improving the repertoire of what the algorithm can read. So when we first started, it could only read a few expressions. It could do like a smile or a frown. And now it can read over 30 different facial expressions and mental and emotional states. So, you know, we keep improving it.
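
To make that training loop concrete, here is a minimal sketch in Python of the supervised-learning idea described above: feed a classifier labelled examples so it can recognise expressions in faces it has never seen. Everything in it is fabricated for illustration – the features, data and model are simple stand-ins, not Affectiva's actual deep-learning pipeline.

```python
# Minimal sketch of supervised expression classification.
# Features are synthetic stand-ins for real facial measurements
# (e.g. lip-corner displacement, visible-teeth area, brow furrow depth).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Fabricated training data: 1,000 faces, 3 features each.
# Smiles (label 1) tend to have raised lip corners and visible teeth;
# frowns (label 0) tend to have deeper brow furrows.
n = 1000
labels = rng.integers(0, 2, n)
lip_corner_raise = rng.normal(labels * 2.0, 1.0)        # higher for smiles
teeth_visibility = rng.normal(labels * 1.5, 1.0)        # higher for smiles
brow_furrow      = rng.normal((1 - labels) * 2.0, 1.0)  # higher for frowns
X = np.column_stack([lip_corner_raise, teeth_visibility, brow_furrow])

X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)

# The classifier learns what all the smiles have in common,
# just as described: lip corners up, a little bit of teeth.
clf = LogisticRegression().fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```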

AB: But those expressions, do they directly correlate with emotions? Because there are some people who are less expressive, more expressive, how do you deal with that?

ReK: This is actually a great question, because there is very little controversy around these building blocks, these facial expressions. So in the late 70s, Paul Ekman and his team published the Facial Action Coding System, which maps every facial muscle movement to a code.

So when you smile, it's action unit 12, which is the zygomatic muscle; the frown is action unit 4, the corrugator muscle; and so on. And to become a certified face reader, you go through 100 hours of training and then you're able to accurately say, OK, I see action unit one plus two plus seventeen. Right.

So there's very little controversy about that. Where it becomes a little tricky is then mapping that facial expression to the underlying emotional state. Right. So if I see you curling your lips upwards and, you know, there's these crow's feet wrinkles around your eyes, can I say that you're happy? Well, you're smiling, but you may or may not be happy, right. And I think that layer of inference is really complex. You have to consider the context. You have to know a little bit about the person and what they're doing. You have to maybe even consider the temporal aspect, like how it unfolds over time. Maybe other gestures or vocal intonation. So I think it's actually a really complex problem. And we're working on it, but I wouldn't say anybody's cracked it yet.
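
To illustrate the two layers she separates, here is a hypothetical Python sketch: the first mapping, from muscle movement to action unit, is the well-defined part of Ekman's coding system, while the second, from action units to an underlying emotion, is the inference layer she calls complex and uncracked. The emotion rules below are deliberately naive and purely illustrative.

```python
# Layer 1: FACS action units are well-defined muscle movements.
# (AU numbers and muscles per Ekman's Facial Action Coding System.)
ACTION_UNITS = {
    1:  "inner brow raiser (frontalis, pars medialis)",
    2:  "outer brow raiser (frontalis, pars lateralis)",
    4:  "brow lowerer (corrugator supercilii)",
    6:  "cheek raiser (orbicularis oculi)",   # produces crow's-feet wrinkles
    12: "lip corner puller (zygomaticus major)",
}

def describe(aus):
    """Decode observed action units - the uncontroversial layer."""
    return [ACTION_UNITS.get(au, f"AU{au} (unknown)") for au in aus]

def naive_emotion_guess(aus):
    """Layer 2: mapping AUs to a feeling. Deliberately naive - a real
    system must weigh context, the person, and how the expression
    unfolds over time, and nobody has fully cracked this layer."""
    observed = set(aus)
    if {6, 12} <= observed:
        return "possibly genuine enjoyment (Duchenne smile)"
    if 12 in observed:
        return "smiling - but may or may not be happy"
    if 4 in observed:
        return "brow furrow - could be concentration or displeasure"
    return "insufficient evidence"

print(describe([1, 2, 12]))
print(naive_emotion_guess([6, 12]))
```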

AB: I mean, there are so many humans that still... like, there's so much going on, I still don't always feel like I can read someone accurately. How do you go about making a machine do that, when humans can't?

ReK: Yeah. So we've got about 50 of these FACS-certified face coders, and they are trained to annotate these videos. We also tap into self-report. So we will sometimes ask people to report what they felt during a certain experience. And we use that as the ground truth.

But essentially, you know, the algorithm needs labelled examples so that it knows, "OK. That's like a red apple and that's a green apple" or "that's a smile and that's a smirk". So, yeah, the approach we currently use is to use data annotators. And their job is to, you know, watch these videos day in and day out and annotate for different expressions.

And, you know, we're also interested in cognitive states. So, states that are not necessarily what you typically think of as an emotion, but they're very important. Like, we do a lot of work in the automotive industry, and we're interested in, like, driver drowsiness or distraction. So, yeah, if you start seeing a certain blink rate or, you know, your head is bobbing, that's a very distinctive telltale sign of drowsiness. And we all know that. Right? So we're trying to build algorithms that can detect that.
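
The drowsiness example lends itself to a simple sketch. The hypothetical Python below shows the kind of rule a detector might apply on top of per-frame vision outputs; the field names and thresholds are invented, and a production system would learn these patterns from annotated driving data rather than hard-code them.

```python
from dataclasses import dataclass

@dataclass
class FrameSignals:
    """Per-frame outputs an upstream vision model might produce.
    Field names and units are illustrative, not a real API."""
    blink: bool             # eyes closed this frame?
    head_pitch_deg: float   # downward head tilt

def drowsiness_score(frames, fps=30):
    """Crude heuristic: elevated blink rate plus head bobbing."""
    seconds = len(frames) / fps
    # Count blink onsets (open -> closed transitions), not blink frames.
    blink_onsets = sum(
        1 for prev, cur in zip(frames, frames[1:])
        if cur.blink and not prev.blink
    )
    blinks_per_min = blink_onsets / seconds * 60
    head_down_frames = sum(f.head_pitch_deg > 20 for f in frames)

    score = 0.0
    if blinks_per_min > 25:      # invented threshold
        score += 0.5
    if head_down_frames > fps:   # head down for more than ~1 s total
        score += 0.5
    return score                 # 0.0 alert ... 1.0 likely drowsy

# Tiny demo with fabricated frames: frequent blinking, head nodding.
frames = [FrameSignals(blink=(i % 20 < 3),
                       head_pitch_deg=25.0 if i % 60 < 40 else 0.0)
          for i in range(300)]
print(drowsiness_score(frames))  # 1.0 on this fabricated 10-second clip
```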

AB: My goodness. So there are lots of real world applications, but it does seem to me that there could be some dangers associated with the technology as well.

ReK: Yes. So like any technology, you know, it's neutral in valence. It can be used for good and it could be abused. This technology could...

There's, you know, amazing potential for good. There are applications in mental health and autism, detecting stress and anxiety and Parkinson's, making our roads safer. Like, the list goes on and on.

But there are definitely applications where it could be abused. And for me, we think about it as the ethical development of emotion AI. But also the ethical deployment.

The biggest concern I have right now is accidentally building bias into these algorithms. Right. And the way you accidentally do that is: if your data is biased, the algorithm is biased. And if the algorithm is biased and you go deploy it at scale – because it's technology, you can very quickly deploy it all around the world – then you've now perpetuated biases that exist in society. And you've multiplied that, you know, at scale.

So that's something I'm really worried about. And at Affectiva, my company, we try to really be thoughtful about the diversity of the data and how we approach this problem to avoid bias, but also the diversity of the team, because we each have our own blind spots. And, you know, the more diverse the team, the more perspectives you have. And we've seen examples on our team where people were like, "I don't see anybody in this database that looks like me". And we're like, oh yeah, we didn't realise that. So I think it's very important that we underscore the importance of diversity when it comes to building AI.
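
One basic guard against the 'biased data in, biased algorithm out' failure she describes is to disaggregate accuracy by subgroup before deploying. A minimal, hypothetical sketch, with made-up group names and results:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, predicted, actual) tuples.
    Reports per-group accuracy so a gap is visible before deployment."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, pred, actual in records:
        totals[group] += 1
        hits[group] += (pred == actual)
    return {g: hits[g] / totals[g] for g in totals}

# Fabricated evaluation results for two demographic subgroups.
results = [("group_a", "smile", "smile"), ("group_a", "smile", "smile"),
           ("group_b", "smile", "frown"), ("group_b", "smile", "smile")]
print(accuracy_by_group(results))  # {'group_a': 1.0, 'group_b': 0.5}
```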

AB: That makes sense because, I mean, facial recognition software famously has some biases and it still struggles now with people of colour. Like, how is that going to relate to... I guess your technology must rely on some facial recognition, is that right?

ReK: So facial recognition uses the same underlying techniques that we use. So a lot of computer vision and machine learning. The difference is with facial recognition, you're interested in identity recognition. We don't really care about who you are. We just want to understand what your experience is like. How are you feeling? What's your experience like?

However, you're absolutely right. There has been a lot of evidence that face recognition systems, you know, because they're primarily trained on, you know, middle-aged white guys, may or may not generalise to women like the two of us, or women of colour. Right. You know, somebody that's brown-skinned like me. Right. And so that's a real issue.

And again, the way to combat that is really in the data and just being thoughtful about the data we use to train these algorithms.

AB: And are there any ways that computers, that AI, could become emotionally intelligent that aren't reliant on faces? You know, there are people I know who have facial deformities. How is the technology going to get around stuff like that?

ReK: Yeah. So the breakdown, as it turns out, of the way we communicate our kind of emotional and mental states: only 10 per cent of the signal is in the choice of words we use; 90 per cent is nonverbal. And it's split almost equally between your facial expressions, your gestures and also your vocal intonations. So how fast are you speaking? How much energy is in your voice? So that's an important signal, too. And we've been investing in that, you know, in this multi-modal approach. So the best result is going to be when someday we can combine all of these signals just the way we as humans do.

I would say I mean, this entire field is still very nascent. So there's a lot of work to be done, but that's what makes it exciting. Yeah. But sometimes, you know, sometimes the face is available, sometimes it's not. Sometimes the voice is available, sometimes it's not.

So the more information you have, the better.
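
A common way to combine signals like this is late fusion: score each modality separately, then merge whatever is available, degrading gracefully when a channel is missing – as she notes, sometimes the face is available and sometimes it isn't. The sketch below is a hypothetical weighted average with invented weights, not Affectiva's actual method.

```python
def fuse_modalities(scores, weights=None):
    """Late fusion over whatever modalities are present.
    `scores` maps modality name -> confidence in [0, 1] that the person
    shows some state; missing modalities are simply absent."""
    weights = weights or {"face": 0.4, "voice": 0.4, "gesture": 0.2}
    available = {m: s for m, s in scores.items() if m in weights}
    if not available:
        return None  # no signal at all
    # Renormalise over the modalities we actually have.
    total_w = sum(weights[m] for m in available)
    return sum(weights[m] * s for m, s in available.items()) / total_w

# Face unavailable (e.g. camera off): fusion uses voice + gesture only.
print(fuse_modalities({"voice": 0.8, "gesture": 0.6}))  # ~0.73
```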

AB: What prompted you to start working on this?

ReK: I would say, I mean, I grew up in a family of technologists. So both my parents were in technology. I grew up in Cairo and around the Middle East. So I've always been comfortable with it. You know, my dad would buy all these, like, the latest gadgets and video gaming consoles.

And actually, one of my earliest memories is my dad bought one of the very first, like, VHS video cameras. And he would sit me on my blue chair. It was literally like a plastic blue chair. And I would just, like, ramble. I was probably like three or four years old. And I would just, like, talk. And he would record all these videos. I call it my first practice at a TED talk.

So I grew up very comfortable with technology. But I've always... For us, technology was about bringing our family together. So it was more about the human connection than it was about the actual technology. And I would say that's been a common thread throughout my career.

But the real 'aha!' moment came when I moved from Cairo to Cambridge, UK for my PhD and realised that I was spending more time in front of my laptop than I did interacting with any other human beings. And I realised, wow, this computer has absolutely no clue how I'm feeling. And even worse, it's like the most... it's the de facto mode of communication I have with my family back home. But all of this nonverbal communication and all of these, like, rich signals were kind of lost in cyberspace. I don't know. Like, it created an illusion of a connection. Right. I felt like I was connected. But it wasn't a real substantive connection. Yes.

So that set me on this path. I started asking, like, oh, wow. Wow. Like, what if computers can recognise human emotions just the way we do? Yeah. That incident was over 20 years ago.

So I've been... It's a long journey.

AB: In terms of that kind of interaction between two people on the other end of, whether it's two computers or two phones, how would my phone being emotionally intelligent actually change the way I text someone, or how is that going to factor into my day-to-day life?

ReK: Yeah, there's a lot. I mean, if you think of social media platforms today, a lot of it is emotion blind. Like, you send out all these, like, text comments and tweets and whatnot. Yeah. Sometimes you can add a little emoji at the end to kind of clarify what your intent is. But most of the time, also, you send this message out and you have no idea how it falls on the recipient. Right? You don't know if it's hurtful, if it's interesting, if it makes somebody laugh. So we're missing all of that layer that you would get if we were together in an in-person conversation.

So I think, you know, I think social media platforms need an overhaul and just need a redesign, a human-centric redesign that considers all of these rich nonverbal cues. And I actually think that will result in more empathy. Or I hope it will result in more empathy and less of this empathy crisis, that I feel we're experiencing.

AB: But will it be that like, say, a social media site could know how I'm feeling, even though I don't know it myself?

ReK: No. So that's a great conversation to have, because I think it's going to be so important that we design this with opt-in and consent, and that people have final control on what gets visualised or what gets communicated. I think that's actually really key. So everything we've done so far has been on an opt-in and consent basis.

And I think that's really a critical part of deploying this technology. But I'll give you an example. So the book came out on the 21st of April, you know, right in the midst of this pandemic. I was supposed to do an actual book tour and meet people and present to people. And so we had to pivot to doing all of this virtually. And the format it usually takes is I'm in a book conversation with a moderator, and there's usually like hundreds, sometimes even thousands of people tuned in and watching. But I can't see them. Like, if I was with them in the same room, you would riff off of the energy of the audience. You can personalise your answer. You can't do that virtually. And I find it really unsettling. So if emotion AI was integrated into these platforms like Zoom or, you know, all these other livestream platforms, I keep envisioning like a real-time graph or some visual – that's very simple and easy to consume – that says, oh yeah, people are laughing right now, or they're bored to death, like, pick it up a little bit.

AB: Well, that sounds kind of scary. You know, I'm in a big talk and I can tell half the audience is asleep.

ReK: Yeah, I know. Maybe you don't want to do that. Sure. Right. You don't want to know.

AB: How far away are we actually from this being a reality?

ReK: So the technology's there. I mean, a lot of people don't realise, we are deployed in 90 countries around the world. The first product we brought to market was in the media analytics space. So we are able to, again, with people's consent, understand how people respond to online video content, be it an online video ad, could be a movie trailer, a TV show.

We're able to quantify anonymously – again, we don't want to know who you are – but anonymously, we aggregate everybody's responses and we're able to see, moment by moment, how did people respond? Were they engaged? Were they offended? Were they laughing? And, you know, a quarter of the Fortune 500 companies use this technology every day to assess the emotional engagement their users and consumers have with their content. So the technology is out there.

We're very focussed on the automotive industry. So we're trying to integrate this into cars to improve driver safety and detect distraction and drowsiness.

So that's, I would say, a couple of years out. But then there's a lot of applications in mental health. We even talk about like, smart home appliances.

Like imagine if you walk up to your fridge – I think we need this actually during this pandemic – so imagine if you walk up to your fridge and it says, you know, "you're going to have your third Ben and Jerry's ice cream tub. I'm not going to let you do that." And it just locks itself up.

AB: I definitely don't need that during... I mean, I need my Ben and Jerry's during this time. Other ice cream options are, of course, available.

ReK: Right. Right, right.

AB: But during this time, like, you know, we've all been so separated from people, from face to face things. So it seems like now is when we need empathy in our machines the most, right?

ReK: Absolutely. I mean, I feel I've been teleported, you know, back to that time in Cambridge when I felt so lonely and so homesick and just so disconnected despite having access to technology. I feel we've been transported back into that same moment, where we're connecting with our teams virtually, my kids are learning virtually, I'm connecting with my family virtually. And we're craving this human connection. Right.

We want to have the sense of a shared experience. And sometimes it's so hard to do that using technology. Even... you know, I mean, I'm so grateful for platforms like Zoom and other video conferencing platforms, because they have connected us, but it's not the same.

And I think coming out of this pandemic, we're gonna see a lot of innovation – technology innovations that, you know, bring these videoconferencing platforms to the next level. So, yeah, I'm excited to see what comes out of this.

AB: And you've mentioned the term affective computing. Can you just explain what that means?

ReK: Yeah, affective computing. So 'affect' is kind of a synonym for emotion. And my mentor and co-founder, Professor Rosalind Picard of M.I.T., wrote the book Affective Computing in 1996. I read it a few years later. And in it she posits that computers and technology need to be able to understand and adapt to affect.

So that could be, again, your facial emotions, your physiological signals, like your heart rate or, you know, stress levels, but also your vocal intonations and your gestures. So she was really the one who coined the term. And I read the book and it changed the trajectory of my life.

AB: We've talked about whether AI can be empathetic, but empathy, at least in humans, means we share emotions. We can recognise our emotions in someone else. Does this mean AI has to be capable of feeling its own emotions?

ReK: I do not think so. I do not think that for machines to have empathetic responses that they need to have... You know, we're simulating empathy, basically. And I think that that's important. But it doesn't mean that, you know, your toaster, your emotion-enabled toaster has to have emotions. Because then we run the risk of the toaster saying, you know what? I'm taking a break today. I'm so exhausted. Eat something else.

AB: And you were named by Forbes as one of America's top 50 women in tech. But you've mentioned that it's still a very male, very white male industry. How does that kind of make you feel? Where do you see the industry going?

ReK: I am very passionate about bringing more women into technology, but not just women – just more diversity in technology. It could be, you know, gender diversity, ethnic diversity, diversity of backgrounds – so we need to make it more accessible for non-techies – and also diversity of age.

Like, I'm very passionate about involving young people in the whole AI conversation, because they need to be part of how we design all of this, because they're going to be the ones who use it the most.

But with gender diversity, I'm very, very vocal. I'm actively involved in a number of organisations. There's one called All Raise; it's mostly U.S.-based. But we focus on bringing more funding towards female founders, but also creating more of an ecosystem of female investors, because we need that, too, right?

So, yeah, I think we have a lot of work to be done. But I think it's very needed. And it's in every aspect: it's in the founder community and the investor community, at the board level, the pipeline level – like, the young girls and women who are interested in technology. It's everywhere. We need to work all angles of it.

AB: What gives you hope the most?

ReK: Generally speaking or with regard to diversity? I'll start with diversity, actually, I do feel that the world is at this moment of reckoning where we are realising that there are just so many systemic injustices and we have an opportunity to really fix that. And so I'm excited about that. I also feel like perhaps one of the silver linings of this global pandemic is that we're all going through this at the same time. We're experiencing it in very different ways. But I do hope that this is an opportunity to rediscover and re-celebrate empathy.

AB: And then what's next for you? You've got the book out, are you writing another one?

ReK: Not yet. I didn't think I'd have another book in me, but I have a backlog of book ideas now. I'm not ready to start writing another book. But, you know, I originally thought that writing this book was going... you know, that I'd write the book, launch the book, and then that would be the end. And I'm finding that the book is the beginning. And I don't know what beginning it is, but I'm excited to embark on this new journey with the book. It's starting a lot of conversations and connections with all sorts of people all around the world. So I'm very grateful for that.

AB: And if someone wants to go and test out your emotionally intelligent AI, could they do that?

ReK: Oh, yes. We have an interactive demo on our website, so they can just go to www.Affectiva.com and give it a try. And do let us know – yeah, if you give it a try, let us know what you think. And often this sparks a lot of ideas in people's brains on where they would like the technology to be used, and I love hearing all of these creative ideations.

AB: What's the best use for you personally?

ReK: I think the biggest potential, honestly, is around mental health. So as it turns out, there are facial and vocal biomarkers of things like stress, anxiety, depression, even suicidal intent, Parkinson's. So I think there's a lot of opportunity to use this technology to understand a person's baseline. And then when they deviate from it, the technology can flag it to the individual, to a loved one, or a clinician. Lots of privacy questions, but I think there's a lot of potential there.
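
One simple way to formalise 'learn a baseline, flag the deviation' is a z-score over a person's own history. The metric, window and threshold below are invented for illustration; real clinical flagging would need validated biomarkers and, as she notes, careful privacy safeguards.

```python
import statistics

def flag_deviation(history, today, threshold=2.0):
    """Flag if today's biomarker (e.g. a daily vocal-stress score) sits
    more than `threshold` standard deviations from this person's own
    baseline. Values, window and threshold are illustrative only."""
    if len(history) < 14:           # need ~2 weeks to form a baseline
        return False
    mean = statistics.fmean(history)
    sd = statistics.stdev(history)
    if sd == 0:
        return today != mean
    return abs(today - mean) / sd > threshold

# Fabricated two-week baseline, then a sharp jump today.
baseline = [0.30, 0.28, 0.35, 0.31, 0.29, 0.33, 0.30,
            0.32, 0.27, 0.34, 0.31, 0.30, 0.29, 0.33]
print(flag_deviation(baseline, today=0.62))  # True: flag for follow-up
```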

AB: Mm hmm. Have you found that your own, kind of, EQ has increased over the years?

ReK: I think so. There's a moment in the book where I talk about – I'm divorced – I talk about how it's pretty ironic how I spent my entire career teaching technology how to read emotions, and yet I missed that my husband at the time was so unhappy.

I totally... I mean, we have no video footage of this, but I think I was listening to his words when he always said, "oh, it's OK. It's fine. You go do your PhD, you go start a company." And I just wonder what the non-verbals were. You know, were his nonverbals different? I don't know the answer to that.

So I think, yes, this journey of teaching machines how to have better EQ has taught me to both listen better and watch for these nonverbal signals better, but also, like, embrace my own emotions. I really didn't do that throughout my career. It's only in the past few years, as I started writing the book, that I realised there's real power in being authentic to oneself, but also in being vulnerable with others.

And I've totally embraced that in the book and also in the way I lead the company and my team.

This podcast was supported by brilliant.org, helping people build quantitative skills in maths, science, and computer science with fun and challenging interactive explorations.
