- What role does AI play in healthcare?
- How to create AI ethics programs?
- How to define fairness?
- How can fair synthetic data help create fair models and predictions?
- What's next for the ethical AI team and their algorithm?
Alexandra Ebert: Welcome back to the Data Democratization Podcast. I'm Alexandra Ebert, your host and MOSTLY AI's Chief Trust Officer. This is episode number 33. Today you're in for a real treat. I'll be joined by not one but two guests from Humana, one of the largest US health insurers and an organization that is not only at the forefront of deploying artificial intelligence at scale, but also a clear leader when it comes to governing the responsible, ethical, and fair use of artificial intelligence across an enterprise organization.
It's my true pleasure to welcome Laura Mariano, Humana's Lead Ethical AI Data Scientist, and Principal AI Architect Brent Sundheimer on the show. Today, the three of us will discuss the potential of artificial intelligence in healthcare, also, best practices of building up an AI ethics program at scale with a particular focus on AI fairness, and the roles synthetic healthcare data has to play for both privacy-preserving machine learning, as well as for improving fairness in artificial intelligence.
Lastly, and you won't want to miss that, Laura and Brent will tell you about our joint project on fair synthetic data to promote fairness and equity in healthcare resource allocation. Make sure you stick around until the end. With that said, let's dive right in.
Hi, Laura. Hi, Brent. It's so great to have you on the podcast today. I was already very much looking forward to our conversation. Before we jump into today's topic, which will be AI fairness in the healthcare sector, could you both briefly introduce yourself and maybe also share with our listeners, what makes you so passionate about AI fairness, AI ethics and the work you do?
Brent Sundheimer: Yes, yes. Thanks so much. Thanks for having us on here. I am Brent Sundheimer. I'm the Principal AI Architect here at Humana, and I lead our ethical AI team. I've been in AI for a number of years. What I've seen more and more is that there is huge potential for AI, both in what it can help us do and how it can help our society, but it's become increasingly clear that that same technology also has the possibility to cause major harms by magnifying disparities that exist today.
In order to actually advance AI, we really need to be very conscious about how it's applied in order to prevent those harms. It's not easy. I think that's something that makes it a very interesting problem; it also makes it very, very important. I'm very happy and proud to be able to work on this.
Alexandra: I can understand and so great to have you here today.
Laura Mariano: Yes, thanks for having us. I'm Laura Mariano, the Lead Ethical AI Data Scientist at Humana. I've also been in this data science and machine learning role for many, many years, back from before it was so popular and everyone knew what data science was. I've worked in a lot of different domains, and because of that, I've seen, as Brent said, the incredible impact and potential that AI has, but also, at the same time, the potential risk of harm that could come from it, especially when it's used in a naive way. Because of that, I've become very passionate over the years about ensuring that AI is used responsibly.
That's what led me to Humana, to work on this topic in the healthcare industry in particular, because as AI becomes more ubiquitous in every industry, it's so important that we develop these processes and frameworks to make sure that it's used in a responsible way.
Alexandra: I can only agree that's super important to get right. Maybe another question before we dive into today's topic. This won't be necessary for US listeners, obviously, as Humana is the fourth largest health insurer in the United States, but could you briefly give an introduction to Humana for our European listeners, maybe Brent?
Brent: Yes. Humana, we focus on whole-person health for our members. We have millions of members with a large number of those coming from Medicare, which in the US is an insurance program for folks over 65. Through that we're really dedicated to improving the health outcomes for the people that we serve. We do this through a growing network of primary care facilities, in-home care, programs which seek to really help those with poor health to try to get better health outcomes.
Part of how we do this is creating augmented intelligence systems, which help our clinical staff sort through these just absolutely massive amounts of data that we have in order to find the best ways to help our members.
Alexandra: That's really cool to see how health care is approached at Humana. I always love listening to podcast episodes about your proactive approach to health care. As mentioned, today I would love to focus on AI fairness in health care, and also talk a little bit about our project on fair synthetic data that we did together. Just to set the stage, how do you see the role of AI in healthcare? You just mentioned massive amounts of data, but what's the status quo? Also, where do you see future potential?
Brent: Yes. AI is still fairly nascent in the world of healthcare. It's something that has been very cautiously approached because I think everyone is very aware that, especially when it comes to their health, we want to be very risk-averse in how we approach it. There are a number of things that we can do and that we do today, that can help our members or help patients in general, by becoming an aide to the doctors and nurses that serve people.
One of the main things we can do is look through the data and use that augmented intelligence to try to find people who are at particular risk for some adverse health event. If you think about somebody with diabetes, it's a very complex disease that needs to be managed in a lot of ways, and an adverse event there might be somebody losing a toe. We can help to identify people who are at risk for these events and help to prevent that.
Another way that we're starting to see this be used more and more is to really be more of an active aide to a doctor. Not just in the big data world, but also in the world of medicine, we have more data, more tests, treatments, and technologies that can be used. That's a great thing, but it also presents a challenge.
AI is really starting to come to the forefront to say, how can we help reduce the workload and the information load on doctors to make sure that they are seeing the most relevant information, and that we are augmenting the capabilities of these medical practitioners in a very transparent way?
Alexandra: That sounds fascinating. You mentioned that AI is a more nascent technology in the healthcare space, but still, when looking at Humana, I think it becomes apparent that you're really at the digital forefront, and that AI is also used at scale within your organization. Could you give our readers or listeners a few stats about the scale of AI within Humana?
Brent: Yes, so we've been doing this for a number of years now. We have hundreds of models in production, with more than 100 data scientists who are working on this every day. We do this across a wide variety of applications, whether it be helping to advance our clinical programs, which help our members, our in-home programs, which actually go into the homes of people who maybe can't make it to a doctor, or some of the more traditional applications like marketing or finance, but we use it across the company. It is something that we are very focused on pushing forward here in the next few years.
Alexandra: Wonderful. Maybe, Laura, a question for you; you've probably answered this quite a few times already, but why is AI fairness important, in general, but also in particular for Humana?
Laura: The concept of fairness itself, for us, often comes down to a measure of how equitable the distribution of a positive or negative resource is. If an AI is involved in deciding or making a recommendation about who should receive that resource, then great care has to be taken to ensure that the AI is distributing it fairly, especially if it's operating without a human in the loop, which is not the case for the AI that Humana develops and implements. It's so different than if a human alone is making the decision, and this is especially important in the healthcare industry because that resource is healthcare, and these decisions can literally have life and death consequences.
Alexandra: Absolutely. Do you think the industry is currently far enough along that AI should already be applied in areas where there is no human in the loop? I really love what you and Brent stated: that at Humana it's always about augmenting the capabilities of doctors and having this human oversight component, to make sure that AI is helping but not making independent decisions that have an impact on something as important as healthcare.
Laura: I would say definitely not. The human should stay in the loop and I think most physicians, certainly practitioners, and even the members that we speak to all agree with that. They do not want the AI to be completely in control and in fact much prefer that if AI is going to be involved, that there's still a human in the loop making the final decision.
Alexandra: I can completely understand that. I think that's also the way to go currently, and also something that will help adoption and reaping all the benefits that AI can bring to this field, really freeing up doctors' time so that, as Brent stated, they can spend more time on the relevant information and on the work that's most important. Maybe Brent, what would you say are the biggest challenges when it comes to AI fairness?
Brent: I think the biggest challenge comes in when we look at the landscape of ethical AI, where lots of people have been thinking for decades at a very high, philosophical level about everything from "How do we make sure that robots don't come and take over the planet?" to "Should we have people in the loop?" Then on the other end of the spectrum, there are techniques and algorithms that people come up with that can help us create fair AI, but in the middle, there is still a fairly large gap.
That gap is: how do we actually put this into practice? How do we decide what is actually fair? This is hard, because once you start to break it down, there are dozens of different ways that you could actually calculate fairness. Each one has its own nuance as to why it's appropriate. For instance, do you want to measure the accuracy of the AI? Do you want to measure how many people were positively identified by it? Do you want to measure it with respect to equity, which is having the distribution of the results of the AI be dictated by the statistics of the population?
To each according to need, or equality, where we want everyone to be treated equally. The deeper you dive into it, the more complex it is, and it's something that I think is a challenge.
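To make a couple of the metrics Brent alludes to concrete, here is a minimal sketch of how per-group accuracy and per-group selection rate (the quantity behind statistical parity) could be computed for a binary classifier. The column names and toy data are illustrative assumptions, not Humana's actual tooling:

```python
import pandas as pd

def group_fairness_report(df, group_col, y_true_col, y_pred_col):
    """Compare per-group accuracy and selection rate for a binary classifier.

    The gap in selection rates across groups is often called the
    statistical (demographic) parity difference.
    """
    rows = []
    for group, g in df.groupby(group_col):
        rows.append({
            group_col: group,
            "accuracy": (g[y_true_col] == g[y_pred_col]).mean(),
            "selection_rate": g[y_pred_col].mean(),  # share predicted positive
        })
    report = pd.DataFrame(rows)
    report["parity_gap_vs_max"] = report["selection_rate"].max() - report["selection_rate"]
    return report

# Illustrative data: two groups, true outcomes, and model predictions
df = pd.DataFrame({
    "group":  ["A"] * 5 + ["B"] * 5,
    "y_true": [1, 0, 1, 0, 1, 1, 0, 0, 0, 1],
    "y_pred": [1, 0, 1, 1, 1, 0, 0, 0, 0, 1],
})
print(group_fairness_report(df, "group", "y_true", "y_pred"))
```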
Alexandra: I can understand. With all these different definitions, I think there are a few dozen mathematical fairness definitions out there, how do you actually arrive at a fairness metric to apply and decide that, in a given scenario, that's the way to go? Is it regulations? Is it just a group of people carefully considering which fairness definition would be the best one to choose? How do you do that?
Brent: It's something that I think needs to be approached in a way that's very specific to the problem that you're looking at, and specific also to the industry you're in. For Humana in particular, what we can do is start with our higher-level principles, which say first we want to do no harm. What we can do then is look at each of our systems and ask, "Is there potential for harm?" Looking through that lens, things become much more clear for us, so that we can then choose how we are going to specifically measure it.
Being in healthcare, we can also say that for most of what we may be doing, we want things to be equitable. Health equity is a very, very important philosophy for us, to the point that we have a Chief Health Equity Officer. For us, we're looking to make sure that things are equitable for our members and not going to cause harm, and that's how we do it. In other industries, you're going to have some very similar sorts of drill-downs that folks can use to get to a very specific definition.
Alexandra: Understood. Laura, Brent already mentioned doing no harm and that equity plays an important role in healthcare in general. What are the fairness concerns that you observe in healthcare that might differ when we look into other industries like financial services or I don't know, telecommunications, retail. Can you expand a little bit on that?
Laura: I think the end result of what the AI can do, positively or negatively, to impact a human's life is certainly very specific and special to healthcare, because it's health itself. Anytime you have a situation where a finite amount of resources has to be distributed, and that resource makes a difference in someone's life, you have to think about fairness, and that's not unique to healthcare. Obviously, there are many examples from the financial industry too where this is a major issue to consider.
In fact, there's a very recent example of a housing discrimination lawsuit filed against a company called Redfin which is an online real estate listing service in the US and Canada. They host real estate listings and offer things like extra services, realtor services, photos, virtual tours, and things like that. They have this algorithm that had a minimum home price requirement to be eligible for their services at all. They may say, "I'm not even going to host your house listing or I'm not going to connect you with a realtor because your house doesn't cost enough."
This is an example of a policy that disproportionately affected the poorest communities and communities of color, of course. These are people whose property values were already depressed due to historical injustices in the housing market. A simple decision by an algorithm to apply a threshold on a number resulted in a policy that amounts to digital redlining. Small decisions like that have huge impacts on people's lives in the financial industry and other industries, and similarly in the healthcare industry; the effect is different but can be equally devastating.
It all just comes back to the same concept of fairness and equitable distribution of a resource, whether that's money or your healthcare as well.
Alexandra: Makes sense. How do you do it right, and how do you spot from the beginning how a small change or decision will actually impact different groups down the line?
Laura: How do we do that? [chuckles] With great care, and by keeping an eye on the ball of fairness from the beginning, from the minute you are talking about what problem you're going to solve with AI. First of all, why am I using AI in the first place? Is this the right solution to the problem? Even to ask that question: what problem am I trying to solve with this? You should be thinking about fairness from the beginning.
Then every step along the way as well, to get to that final goal of fairness. You just can't lose sight of that as part of your end goal, even when your goal is very often to make money. Our goal is to give the best healthcare to our members, of course. Right from the beginning, you have to be thinking about that.
Alexandra: Absolutely. Out of curiosity because oftentimes it's stated that diversity in data science teams in general and organizations that deploy AI is a factor that needs improvement. What are your experiences? Have you managed or how did you approach this diversity issue to help you to be more inclusive and also collect and gather different perspectives to make sure that you look at the problem and the potential consequences from more angles?
Laura: I think in my experience, and I haven't been at Humana very long, I've already seen the very purposeful inclusion of a broad diversity of people in the conversation when you are first formulating the problem you're trying to solve. We bring in our equity, diversity, and inclusion team as early as we can. We bring in our health equity people. Brent can speak to this more clearly because he helped develop a lot of these policies, but I have seen firsthand the importance of bringing in diverse voices right from the beginning of a project.
Alexandra: Yes. Maybe Brent, if you want to add to that, how this all was created.
Brent: Yes. What we made important from the beginning is the fact that it's not just about having diverse voices in the room, but about giving them the safety to be able to speak to things which may be uncomfortable. Especially when we're working in a space where there are going to be lots of very sensitive societal factors that play a role in how we approach a problem within healthcare, and how we approach the AI, it's really important that people feel like they have a safe space to speak up and be honest about their experiences, so that we can make sure that those experiences have been thought about and incorporated into the design.
Making sure that we talk to people with different backgrounds, whether in socioeconomic status, race, or even industry, is important to be able to navigate the minefield, because you can go wrong in any number of ways with this. It is a very complex subject.
Alexandra: Absolutely. Out of curiosity, do you have these different perspectives mainly in place at the beginning of an AI project, when you're developing or even conceptualizing how it could look? Or do you also have some kind of, let's say, monitoring or intervention steps where certain aspects are reviewed? The reason I'm asking is that with the current AI Act that's drafted in the European Union as AI legislation, one of the critique points is that it doesn't really require organizations to have mechanisms in place that would allow the users of an AI, in this case potentially a healthcare practitioner, to provide feedback on their experiences, concerns, or questions. I'm just curious how you are approaching this.
Brent: It's throughout the entire process. The beginning is one of the most important times, because if you find any sort of issues or challenges with fairness in the beginning, that's the easiest place to solve them, but as you mentioned, these things change. If an AI system has been deployed for a number of years, the data is going to change, and sometimes the facts surrounding the AI are going to change. You have to monitor that, not just to make sure it's still doing well, but to make sure that it's also still performing fairly, that the assumptions you had made previously are still correct, and that the way it's implemented and used is according to how it was designed.
All of those are very important to be sure that the fairness of how it was designed still holds and is going to prevent harm, because AI doesn't stop at the boundary of the model. It continues through the people who use it and how they use it. If that shifts, we need to make sure that that doesn't change how we would want to approach fairness.
Alexandra: Makes sense. Can you maybe frame this as a takeaway for our listeners: that AI ethics, and particularly fairness, is nothing that you consider once and then tick off as a check mark, but something you actually have to think about right from the beginning and during the whole life cycle of an AI system?
Brent: Yes, absolutely.
Alexandra: Perfect. One thing I would also ask you to share, because I really liked the example from our conversation prior to this recording, is the healthcare case of an AI algorithm that, despite good intentions, exhibited biases; I think it was a 2019 Science paper. Could you tell our listeners the story of what happened here and how bias crept into the system?
Brent: Yes, this was a paper that looked at an AI system at a healthcare insurer that was not Humana. They were seeking to do something that is becoming more common in the industry, which is to help identify people who are most at risk of some adverse health event and to help them manage that chronic condition with more assistance, to give them more resources to help manage their health. They had found that with this particular algorithm, Black patients were much less likely to actually get this extra help, by a fairly large margin. It was something that opened the eyes of a lot of folks who were not aware of how easily these sorts of issues can creep into a system, because it wasn't necessarily something that was just the data.
It wasn't something that was just the implementation. It wasn't one thing, but a matter of a few things that came together that ended up causing the issue. Specifically here, overall health is something that is very hard to bring down to a single number. What they had done is say, instead of trying to measure overall health as a number, we can instead use the projected cost of a member as a proxy for how good their overall health is. Those are two numbers that are correlated, but unfortunately, here that correlation is very problematic. The reality that the researchers really surfaced is that when you use this proxy, what you're talking about is healthcare utilization and not health.
Really, it came down to the fact that Black patients on average face many more barriers to accessing that healthcare. Structural and historical biases have really been a key part in keeping people from being able to access the healthcare that they need. The Black members in particular would utilize less healthcare when they were at a very comparable health status to their White counterparts. This meant that, because the algorithm was looking at healthcare utilization, for a Black and a White member who had approximately the same level of overall health, the Black member would get much less access to those additional resources.
Alexandra: Yes. Definitely one of the stories of how things went wrong, and of why it's really important to think about fairness quite thoroughly, to see how you can tackle these situations and the injustices that we have in our society and in the data that represents it. One other thing that I wanted to talk about today is the holistic approach that Humana is taking, not only for AI fairness but with your AI ethics program in general. Brent, could you maybe walk us through your approach here and also how you achieve AI ethics at the scale that you deploy AI with?
Brent: Sure. As we've mentioned here, it's important to look at the entire life cycle. It starts with forming a program, saying what you're going to try to do, and getting involved very, very early with the business teams who have a need to use AI. Looking very, very early to help make sure that from the beginning you're designing it in a way that is going to lend itself to being fair and to being responsibly developed.
It continues throughout the program, and it's about making sure that the ethical AI team is part of the process from the beginning all the way through. From there, looking at, "All right, now that we have designed the program, how is the AI going to be designed?" and making sure that you're not falling into any pitfalls that could cause issues. Then, as the AI is being developed, what we can do is put in place some gates that give us a place where we can look very, very deep. We can look at the results. We can make sure that we are seeing this from a number of different perspectives, and that there is no bias and there are no ethical issues in an AI system after it's been developed and before it's been used. From there, making sure that we have the tools in place to monitor it, and that we are continuing to make sure that nothing changes with how the model is used and with the data itself, so that as the model is being used, we're sure that it's going to be fair as well.
Alexandra: Makes sense. When you talk about gates, do you mean those points in the life cycle where you also validate how a model was developed, in the context of fairness, before it goes into production?
Brent: Yes. Gates, I think, are very, very important because they offer an ability to really do this at scale. Our ethical AI team is not huge. We can't be involved in every single conversation, but the gates allow us to have a single point where we can go in, dig deep, and make sure that all the hard questions are being asked, so that nothing proceeds into use without that analysis being done. Really, those gates are a key way for us to make sure that we are able to do this at scale.
It also gives us a point of clarity, so that the teams know there's really a safeguard here; that's the place where we're going to have the conversation, and it can be clear from that point that they either may need to make some adjustments or that they are cleared to go ahead and put it into a production use case.
Alexandra: Understood. Maybe a question to both of you: what do you consider the bigger challenge? Is it really the tools and techniques and all the research that's currently being done on how to, for example, actually achieve fairness, or is it the human factor, with all the education and awareness of why it's important to go the extra step and think about additional things that maybe weren't on your plate before?
Brent: Yes. I think the human factor is probably the biggest challenge, because as we've been talking about here, this is a very, very complex space. Helping people to understand all of the nuance and complexity of it, and really getting them to a point where they can also start to ask these hard questions and think about the design from the perspective of fairness, is a challenge, and it's not because people aren't enthusiastic and willing, but because it's difficult. I think that's the bigger challenge. As for the tools themselves, this is still a fairly new space. The tools are still a bit nascent.
That presents its own challenge but I think that is something that we are really making some good progress on.
Alexandra: What about your experience, Laura, maybe also from previous points in your career?
Laura: Yes, I would say the human aspect is the most difficult part, though not as much at Humana. Our data scientists in general, and what Brent [unintelligible 00:34:13] has brought here, have been evangelizing the importance of this for years already, which puts Humana hugely ahead of the curve, for one thing. Many of our data scientists are already knowledgeable about this field, but that wasn't so much the case in my previous experience. I'd say that the education part of it is difficult at Humana too, because we have different business areas.
People who work in pharmacy or home healthcare or various other places might have different things to worry about from a fairness perspective when they look at the AI models they're developing, because what you really want is for each person to be their own little internal monitor as they're going through the process of developing their model. You want the data scientist to have the agency and the suspicion, for lack of a better word, of their own model, and to be constantly questioning it and looking out for fairness. Each team might have different things to consider or red flags to be looking out for compared to other teams.
You really need to educate in a broad way, but also in ways specific to the individual business areas, and that can become difficult. We have hundreds of data scientists, so getting the same word out to everyone and getting everyone to the same level of compliance, and feeling how important it is, can be a bit of a challenge. But like I said, Humana has already been doing this for a couple of years, so they're pretty advanced compared to every other place that I've worked, and, I would say, compared to most companies in general.
Alexandra: It definitely sounds like that from our conversation as well. I think, of course, you want to train people to do as much as they can themselves, but then with these validation points, gates, and so on, you also have checkpoints to make sure you're really reaching the organizational standards for fairness that you set out to achieve. That really sounds great. Besides what Brent mentioned, being involved early on, creating awareness, and providing training, do you have any other best practices for setting up a holistic AI ethics program that you could share with our listeners?
Brent: Yes, I think the biggest key to success is making sure that you have real buy-in from the most senior leadership, and that includes the C-suite and the CEO, because in any ethical AI work, you're going to have trade-offs. It'll take more time to dig deep where you need to, and you might have to pivot or make some modifications if you find some disparity in the model or some sort of potential harm. Without that support, people are going to be a lot more likely to make some compromise there. Pressure, I think, is always going to be a factor in creating AI in the industry. If you don't have that buy-in, then you don't end up getting the freedom to be able to actually make the correct choice.
We've seen some headlines with companies, who will remain nameless, that have had issues in their ethical AI programs or in their products. A lot of that is because they didn't have that buy-in; they didn't have the freedom to be able to do what needed to be done in order to make sure that the work they were doing didn't cause any disparities or harm.
Alexandra: Do you expect that to change with these emerging AI regulations that we see in Europe, China, the UK, and also in the United States, so that this becomes more of a priority topic for senior leaders and the C-suite?
Brent: Absolutely. I think very, very quickly we're approaching a point where having a very strong ethical AI program is not just going to be something that is good to do, but is almost going to become a cost of doing business, which I think is excellent. These regulations that are coming down, as long as they're all implemented in a sound way, I think are really going to help push companies to make sure that they are doing the right thing, that they are asking the hard questions, and that they are really focusing on doing what's right.
Alexandra: Agreed. I also like the direction we are heading here, that ethical AI is definitely not only a nice-to-have anymore. Laura, do you have any best practices to add, or maybe also some pitfalls that somebody should avoid when building up an AI ethics program within their organization?
Laura: I would agree strongly with what Brent said about getting buy-in from the top; that is the most important thing. You need to have agency and the ability to push the big red button if you find something that's wrong along the way, even if that slows things down or stops a model that's ready to go into production tomorrow, where the expectation was that it was going to be ready, and we find something and need it to be reevaluated. You have to be able to do that without feeling repercussions for doing it and without feeling pressure not to do it. That's the most important thing: to have the buy-in there.
The pitfalls are to think you have it all figured out, to think you have your framework all set up and that everything's good, and to stop looking for the newest technology and keeping up with the literature, because things are changing so fast; this is a very emerging field. Most companies don't have this at all, and few that have an AI product have any kind of ethical AI program at the moment. Just be willing to constantly evaluate where you are in terms of what the state of the art is and what new technologies are available. Don't get too married to the software that you're using or any particular metrics and say, "Well, I've got it all figured out, I'm good, now let's just take the hands off the wheel." That's a pitfall; make sure you're always up on the latest [unintelligible 00:41:04] what's happening in the field.
Alexandra: Yes, I think that's another important point to remember. Coming to fair synthetic data: we recently worked on a POC on fair synthetic data together. Maybe for our listeners, could you share what the fairness challenge was that you wanted to solve and why you were interested in looking into fair synthetic data?
Brent: When we approach fairness in models, we have a number of tools in our toolkit to help us create fairer models, either in the design of the model or in curing bias that may have been found in the output. But there are few tools that can help us deal with bias that's just inherent in the data. In many healthcare applications, these inherent biases are a result of societal biases being very deeply encoded in the data. One example is that here in the US, one of the best predictors of overall health is the zip code where people live, and this is something that is incredibly huge in the data but also really hard to remove. Since the relationships in the data are so complex, we don't have tools to cure the data itself of bias.
This is just something that is in the data in general. It's not specific to us. [cross talk]
Alexandra: To interrupt here: just last week I found a statistic that said that in certain parts of the US, you could really have a difference of 10 years in expected life span just from one zip code to the next, due to certain circumstances that are prevalent in the different zip code areas. I think that's really telling.
Brent: Yes, it's absolutely huge. What we wanted to do in this exploration was to see whether we could create entirely new data that was fair by design. Instead of trying to fix the data that we have, what if we started from data that was created explicitly to be fair?
Alexandra: In your words, Laura, maybe since Humana has some experience with synthetic data, how would you explain to our listeners the difference between synthetic data versus fair synthetic data?
Laura: Yes, so synthetic data is fake data, usually generated by an algorithm that was trained on a real data set to produce new samples that are indistinguishable from the real data. Very, very high-quality synthetic data should be interchangeable with real data. Training an AI algorithm using very, very good synthetic data should produce a model with almost identical performance characteristics to one that's trained on the real data from which the synthetic data was derived.
Fair synthetic data is also fake data, but during the training of the algorithm that generates these fake samples, in particular MOSTLY AI's fair synthetic data generator, it adds another constraint, a fairness constraint, that pushes the model to generate fair synthetic samples. In this case, the fairness constraint pushes the model to create samples that improve the statistical parity in the underlying characteristics of the data. Statistical parity is one of the most utilized metrics for measuring equity in a data set. It's a core one you'll see people use to describe the performance of their model, or the characteristics of their data, in terms of fairness.
If you train an AI model with perfect statistical parity with respect to some sensitive attribute, like a demographic attribute such as gender or race, it means that your model is performing perfectly equitably with respect to that sensitive attribute. It's a little bit of a confusing metric sometimes. A good way to understand it is with an example of the kind of data we're trying to generate that is inherently fair. Say you want to make a model that will make a recommendation about how much money to offer a job candidate. That's a big one in the news these days; anything HR-related that has to do with AI.
You're training on the historical data that you have: information about people that includes demographics, past salaries, education level, and occupation, as well as gender. We know that historically women have had lower salaries than men, for many reasons that have nothing to do with their abilities. If you just train your algorithm on this data naively, you will get a model that will just propagate this inequity forward.
In terms of the way it makes recommendations with respect to salary and gender, even if you leave gender out of your model completely and don't use it to train your model at all, you have all these other variables in there that are correlated with gender, so the effect is not neutralized even if you leave it out. How do you fix this? One way would be to adjust the data itself by generating a fairer version of that data, and that's what we wanted to do here with MOSTLY AI's fair synthetic data generation engine: it would correct for the inequity by making gender and salary independent of each other in the data set that you use to train the algorithm.
Then the model that was trained on this data should propagate that fairness forward and remove that inequity in the recommendations that it makes. If you have a data set with some inherent inequity between what you're trying to predict and some sensitive attributes, using a fairer version of the data should theoretically keep that from happening.
Alexandra: Makes sense. To get statistical parity, basically what I would expect is that the algorithm would predict the exact same salary if I have two candidates who have the same education level, qualifications, and so on, but just a different gender: both the female and the male candidate would get offered the same salary.
Laura: Exactly. Statistical independence is the way I think of it. Those two variables are statistically independent of each other, and then you have an equitable distribution of the recommendations.
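As a quick, toy illustration of the salary example above (all column names and numbers here are hypothetical, not from any real data set), statistical parity in a data set can be checked by comparing the rate of the favorable outcome across groups:

```python
import pandas as pd

# Hypothetical historical hiring data, purely for illustration
data = pd.DataFrame({
    "gender":     ["F", "F", "F", "F", "M", "M", "M", "M"],
    "high_offer": [0,   0,   1,   0,   1,   1,   0,   1],  # 1 = above-median salary offer
})

# Statistical parity compares the rate of the favorable outcome across groups.
rates = data.groupby("gender")["high_offer"].mean()
print(rates)                                             # F: 0.25, M: 0.75
print("parity difference:", rates.max() - rates.min())   # 0.5

# Fair synthetic data, as Laura describes it, would aim to make "high_offer"
# statistically independent of "gender", driving this difference toward zero.
```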
Alexandra: Understood. Can you walk us through the project that we did together? What was the scope, and what was the problem that you were trying to tackle?
Laura: Yes, so we started with a data set that we collected as part of our social determinants of health initiative, which is a project where we reach out to our members to ask them questions about their lives. Not necessarily their health, but other things we know are extremely correlated with health outcomes: things like food and housing insecurity, loneliness, and social isolation. We know these conditions greatly impact people's health. Our goal is to connect members who have these kinds of problems with resources that could help them, like access to local food banks or counseling services or things like that.
For this particular study, we used the responses to questions about loneliness and social isolation. We wanted to see if we could create a model to predict loneliness in our members, and we wanted the model we train to perform fairly for all of our demographic groups, of course, because the idea, theoretically, would be that we then offer the people we think are lonely or socially isolated an extra resource, potentially some counseling or something like that. The idea is we want to make sure we have a fair model so we can offer that resource equitably to our members.
We ran an experiment to see if using this fair synthetic data to train the algorithm could improve the fairness of the model's recommendations with respect to some of these sensitive attributes, compared to a model trained on real data. The experiment was very, very simple. We trained some loneliness prediction models using real data, regular synthetic data (which is just based on the real data, so not inherently fair), and then the fair synthetic data as well, using identical methods. We tested all the models on the same set of held-out real data that none of the models had seen during the training process, and then compared the performance of the models and their fairness characteristics as well.
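A rough sketch of the kind of comparison Laura describes: the same model type trained on the real, plain synthetic, and fair synthetic data sets, then scored on one shared held-out set of real data for both predictive performance and a parity gap. The file names, feature columns, and choice of a logistic regression model are assumptions for illustration, not the actual experiment code:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def selection_rate_gap(df, group_col, pred_col):
    """Max minus min share of positive predictions across groups."""
    rates = df.groupby(group_col)[pred_col].mean()
    return rates.max() - rates.min()

def evaluate(train_df, holdout_df, features, target, group_col):
    """Train on one data set, score performance and fairness on held-out real data."""
    model = LogisticRegression(max_iter=1000)
    model.fit(train_df[features], train_df[target])
    scored = holdout_df.copy()
    scored["pred"] = model.predict(holdout_df[features])
    return {
        "auc": roc_auc_score(holdout_df[target],
                             model.predict_proba(holdout_df[features])[:, 1]),
        "parity_gap": selection_rate_gap(scored, group_col, "pred"),
    }

# Assumed file names and columns, for illustration only
features, target, group_col = ["q1", "q2", "q3"], "lonely", "demographic_group"
holdout = pd.read_csv("holdout_real.csv")  # never seen during training
for name, path in [("real", "train_real.csv"),
                   ("synthetic", "train_synthetic.csv"),
                   ("fair_synthetic", "train_fair_synthetic.csv")]:
    print(name, evaluate(pd.read_csv(path), holdout, features, target, group_col))
```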
Alexandra: Makes sense. Where was the bias in the data? Where was the difference in how different demographic or gender groups responded to, for example, the question of whether they feel socially isolated? What was actually corrected, or what did you try to correct?
Laura: There were several things that impacted that. Some of our demographic data showed imbalances, and so did the responses. We also had a lot of imbalance in the response rates between different groups. We had a couple of things going on at the same time, and we were really looking overall at all the demographic information that we had, to see what the fairness characteristics of the model were compared to one trained on the real data set. We had lots of different things going on in this data set at the same time: response rates, imbalances, and the way that the questions were asked. It wasn't just one thing, necessarily. We were looking at the whole field of potential sensitive attributes, basically.
Alexandra: Understood. So various things were taken into account, including response rates; for example, if one group is not as likely to respond to the survey, you would still be able to try to correct for that and ensure that those members would get offered services if needed. Understood. What were the results or the findings of this experiment and this project that we did together?
Laura: Actually, we did find that using the fair synthetic data to train the algorithm to predict loneliness did produce fairer models. Specifically, the statistical parity metric that the synthetic data was optimized to improve, an equity metric, did improve for the models that were created with that data. That's the whole idea; that's what we wanted in the first place. That was with respect to several different kinds of demographic attributes, compared to models trained on the real data set.
We also found that models trained on the regular synthetic data, which was not adjusted for fairness, performed almost identically to those trained on the real data. That was a separate validation of the use of synthetic data in general, demonstrating that it was a good proxy for the real data. So there were two things to validate: is the synthetic data good, can you use it at all, and then does the fair synthetic data improve the metrics we really cared about?
Alexandra: Sounds great. Two in one.
Laura: Yes, exactly.
Alexandra: Perfect. What are the aspects that you liked about fair synthetic data, and are there also areas where you see room for future research, or capabilities that could be added to fair synthetic data to be of value for Humana's fairness work?
Laura: We liked that it worked really well and that it gave us the results we were hoping for, for one thing. For further research, we would hope to expand the fair data generation capabilities to incorporate different fairness metrics. Like I said, we talked about statistical parity here, but there are other metrics that are very commonly used; there's a top two or three that are used in the field. Another one we're interested in is the equal opportunity metric.
For some of our models, that's the more important one that we want to make sure we meet in terms of fairness characteristics, because every use case is different and has to be evaluated for the different kinds of harm that can come from it, and therefore for which bias metrics are most important. That's another one that is really at the top of our list of metrics we care about. It would be really interesting to see if this method can be extended to include additional fairness metrics, basically.
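For reference, the equal opportunity metric Laura mentions compares true positive rates across groups rather than raw selection rates; here is a minimal sketch with hypothetical data:

```python
import pandas as pd

def equal_opportunity_difference(df, group_col, y_true_col, y_pred_col):
    """Gap in true positive rate (recall) across groups.

    Equal opportunity asks: among people who truly need the resource
    (y_true == 1), is each group flagged by the model at the same rate?
    """
    positives = df[df[y_true_col] == 1]
    tpr = positives.groupby(group_col)[y_pred_col].mean()
    return tpr.max() - tpr.min()

# Hypothetical data, purely for illustration
df = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B"],
    "y_true": [1,   1,   0,   1,   1,   0],
    "y_pred": [1,   1,   0,   1,   0,   0],
})
print(equal_opportunity_difference(df, "group", "y_true", "y_pred"))  # 0.5
```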
Alexandra: That would indeed be interesting to find out. What, in general, are the next steps, on the one hand for fair synthetic data, but also for the fairness program within Humana? I remember you once mentioned tackling fairness at so many different stages: with the data, during model training, with the thresholds. What's up next there?
Laura: We really want to take this work and identify a candidate use case currently in development, if we can, that has a data set that could benefit from training with fair synthetic data. I also want to push the experiment a bit further to see if we can get this kind of improvement in model fairness across multiple demographic attributes simultaneously. You can imagine how complicated things get when you start getting into very multifactorial demographic group information, and how much more imbalance and inequity can show up in these very precise demographic groups.
Those are two things. We're also still on the lookout for another model currently in production that we can use this kind of data approach on as well.
Alexandra: Very exciting. I can't wait to see what the results of those will be. Since we are approaching the end of our hour, let's come to the end. One thing I would be curious about, from both you, Laura, and Brent: AI ethics and AI fairness, as we mentioned, are rapidly evolving fields, scientifically, in regulation, and within organizations and how they tackle it. What's your opinion? What can we expect to see in, let's say, the upcoming two to three years? Or, framed differently, what would you like to see?
Laura: I think, as we talked about, there's going to be more regulation coming down the line. In the US, it's going to start in the states; it already is happening at the individual state level. We're seeing some states, and even some cities, passing their own laws about how AI can and can't be used, particularly in hiring decisions. That's a big one; it's already coming up to the top of the list of important topics here. Because of this, I think we'll start to see more enterprises that have an AI business area start building out their responsible AI portfolios as a risk mitigation strategy.
Shareholders are going to start asking about this, wanting to become more aware of the issue, and start asking the question, what are we doing with ethical AI?
Alexandra: I can also see that coming. What about you, Brent? What do you expect to see or what do you want to see?
Brent: I think we're also going to be looking to revisit old assumptions and use AI in a more active way to remove systematic biases. For instance, there's a researcher, Ziad Obermeyer, who is looking at how x-rays are examined for arthritis. That analysis actually originated in the '60s and was developed primarily on White patients. The research now is indicating that Black patients have arthritis that AI can detect, but that is not represented in the actual human diagnoses. We can start to revisit old assumptions and start to use AI to improve things in ways that are going to be inherently fair.
Alexandra: That sounds great, and it really moves in a direction that's long overdue, to ensure equity and that people are treated equally. To end, I would ask you for your final remarks, or maybe for the one thing that our listeners should remember from today's episode about AI fairness, even if they forget everything else we talked about.
Laura: I would say that inequitable AI doesn't just happen. It's the result of multiple decisions made throughout the development process. Creators of AI technology have to stay vigilant and be on the lookout for potential sources of bias from the very beginning of their project, starting from the ideation phase of the project, all the way through to the deployment of the AI and how it will touch people's lives in some way. Stay vigilant, data scientists. That's my--
Alexandra: I love that. Thanks for sharing Laura, what about you, Brent?
Brent: Something that I find myself saying more and more often is just that ethical AI is effective AI. In looking at performance, it shouldn't really be viewed as a trade-off or some zero-sum game, rather as a fundamental element of quality.
Alexandra: Absolutely. Thank you so much, Laura and Brent, for being here; I loved this episode and, of course, the work that Humana is doing and what we've done together on fair synthetic data. I can't wait to see what else will come out of Humana and how you approach AI ethics, because I think you're really leading the way in so many areas, and I really admire how you're approaching this. Thank you so much. It was a pleasure having you on the podcast today.
Brent: Thanks for having us.
Laura: Thank you.