Episode 37

How to effectively audit AI with Ryan Carrier

Hosted by
Alexandra Ebert
Kicking off season three of the Data Democratization podcast, our host, Alexandra Ebert, MOSTLY AI's Chief Trust Officer, talks to Ryan Carrier, Founder and Executive Director of the NGO ForHumanity. Ryan's mission at ForHumanity is to translate emerging AI regulation into auditable criteria, helping to build an AI ecosystem that people can trust. The urgency for this is increasing every day: not only to achieve compliance with the European AI Act and the similar regulations emerging all over the world, but, more importantly, to protect humanity against the potential harms caused by AI that is deployed without the necessary safeguards and precautions. Particularly with the recent breakthroughs in generative AI and tools like ChatGPT reaching the mainstream, the risks such systems can pose are becoming apparent to everyone. AI audits are a critical part of making these tools safe and preventing harm. In this episode, you will learn:
  • why AI audits will be a critical element in trustworthy AI,
  • what the three key criteria of meaningful and effective AI audits are,
  • how mandatory AI audits could help with proactive compliance and protection against harm,
  • and which actions you should take today to get a head-start on complying with the EU's AI Act.

Transcript


[music]

[00:00:09] Alexandra Ebert: Welcome back to the Data Democratization Podcast. It's so good to be back in the studio, and I'm excited to kick off what's already our third season of the podcast. What better way to do so than with a topic that more and more of you are already starting to discuss: how to get an AI audit, what to expect during an AI audit, or how you as an organization could offer AI audits to other companies and do so effectively?

To discuss this, I'm pleased to be joined by one of the leading experts in this field, Ryan Carrier. Ryan is the founder and also the executive director of the NGO ForHumanity, which is leading the way in transforming emerging AI regulation into auditable criteria. Stay tuned to learn why AI audits will be critical when it comes to building trust in artificial intelligence and also what the three key criteria are for meaningful as well as effective AI audits.

We will also talk about mandatory AI audits and how they could help with proactive compliance and protecting your customers against harms. Lastly, you will also hear which actions you could take already today to get a head start when it comes to complying with the AI Act. With that, let's dive right in.

[music]

[00:01:36] Alexandra: Welcome to the Data Democratization Podcast, Ryan. It's so great to have you as actually the first guest of Season 3. I'm very honored to have you with us today. Before we jump into our exciting episode where we cover more about AI certifications and AI audits particularly, can you briefly introduce yourself and maybe also share a little bit about For Humanity and what makes you passionate about the work you do? This is a question that I always ask at the beginning.

[00:02:03] Ryan Carrier: Absolutely. It's my pleasure to be here. Thank you for having me on and an honor to start Season 3. Thank you for that. My name is Ryan Carrier. I am the executive director and founder of For Humanity. For Humanity is a nonprofit public charity. It's US-based, but it's really a global entity. I started it in 2016, but I started it without funding or a plan.

It was really just the mission statement. The mission statement is this, "To examine and analyze downside risk associated with AI, algorithmic, and autonomous systems, and to engage in risk mitigations wherever we can," the theory being if we can mitigate more and more risk in these tools, then we get the best possible benefit for humanity. That's where the name came from.

For the better part of four years, For Humanity was just me exploring risks to humans associated with AI from technological unemployment, rights, and freedoms in the Fourth Industrial Revolution, future of work, transhumanism, data ownership, should we own our own data, is that part of the solution? When we settled on and established independent audit of AI systems as our primary work, that's when For Humanity started to grow and become a real community.

Since March of 2020, we have grown to be more than 1,300 persons from 79 countries around the world. We crowdsource, transparently, audit rules and criteria to help to establish this ecosystem around independent audit of AI systems. We're going to talk a lot about that today. I'll pause there.

[00:03:49] Alexandra: Absolutely. 1,300 people definitely sounds like a great crowd to establish. You said you started without a plan and without funding, even though the mission obviously is something to strive for. What was the motivating factor behind that? Was there a specific event that told you, "Well, now I have to go and do that"?

[00:04:07] Ryan: It was a series of things that came to fruition. I had had a 25-year finance career before launching For Humanity. The last part of that, I worked for the World Bank, Standard & Poor's, and a Wall Street bank, but then I ran my own hedge fund. That was something that I survived but didn't thrive at.

[00:04:29] Alexandra: Surviving a hedge fund? What does that mean? [chuckles]

[00:04:32] Ryan: Yes, surviving instead of thriving. If I'm thriving in a hedge fund, I'm making tons and tons of money, to be honest, not working that hard, and I'd probably still be doing it. That's reason number one that I tell you that I only survived it. When you survive a hedge fund, it means you have to close it. In 2016, I was closing that hedge fund. We did not succeed.

We didn't make enough money to make it run. I had some time on my hands. When you close a hedge fund, it's a responsibility, it's a duty, but it's not a full-time job anymore. I was closing the fund. We had built artificial intelligence to help us manage money, so I had a great familiarity with AI. I had time on my hands and that familiarity with AI, but the most important part of the story is actually that I have two boys, and in 2016, they were 4 and 6 years old.

I don't mind sharing with you and with the audience that when I extrapolated out the direction that we were heading in, in terms of society using AI, algorithmic, and autonomous systems, I got scared, scared enough that I started a nonprofit public charity with no money and no plan. It was just, "We got to do something. What is it we have to do?"

What we eventually have settled on at For Humanity, the focus is this idea of bringing governance, oversight, and accountability to all of these tools where there hadn't been any, and even today there still are very few. That's a big part of what we're trying to do is bring some societal oversight into what these tools accomplish and how they accomplish them.

[00:06:20] Alexandra: Makes sense. Can you maybe break this down for our listeners? Audits, standards, certifications, what is what, and why is each of them important, or why do you focus on audits?

[00:06:30] Ryan: Absolutely. Standards is probably the one that people understand. Well, I'm not sure the world understands standards. I think they know what the word means, right? What these bodies are doing, and I'll name a few of them, not in any particular order, there's ISO, there's IEEE. In Europe, there's CEN-CENELEC, which is the collection of the national standards bodies.

What they will do is establish some basic rules and guidelines around how tools should operate, and those tools could be a ship floating on water, it could be a car driving, it could be an airline, or it could be these AI tools. There are people and groups who are trying to establish what these standards should be, but standards are often controlled, the process is controlled by industry.

Let's be frank, industry may or may not have the best interest of the public in mind as they establish these standards. Then you have to look at it from a completely other perspective, which is the perspective that we come from, which is we are a civil society organization. Only humans are welcome in For Humanity. The interests that we represent are really at that human level. In terms of corporate interests, we kind of say, "You can handle those someplace else." [crosstalk]

[00:07:52] Alexandra: I was briefly wondering, was it corporate applicants, or did you have already some robots requiring and requesting access to For Humanity?

[00:07:58] Ryan: No, just humans. I know you're asking that tongue-in-cheek, but what I would say is that's actually something we don't support. We don't support personhood for machines. We don't support AI that can own assets or control currencies or hold intellectual property. We always think there should be human beneficial interests for these tools. Getting back to the idea of what audits do, what we looked at was there's a 50-year track record in finance of building an enormous amount of trust in companies.

The way that works for people who don't have the financial experience, here's how I explain it to people. When we do audits of companies, we do it on behalf of the public. We do it on behalf of the world to basically say, "These are the numbers inside a company," and an auditor who is a proxy for society says, "We know what the rules are. We're going to check how they follow the rules, and we'll make sure that the reports they produce are accurate," okay?

We've done this for 50 years, not perfectly. There have been troubles and problems and fraud, and malfeasance can occur, but most of the time, 99.999% of the time, this system works extremely well. What it does is it basically takes a company and turns it into these numbers that become available to the public, become transparent, and useful to the public who might want to use them.

There are entire businesses, entire industries, entire investment strategies that rely upon these numbers often without checking them. That's the key message. It's an infrastructure of trust that I would argue is well placed that these numbers are trustworthy and that the process that produces these numbers is extremely robust and tested over 50 years.

[00:10:01] Alexandra: So sorry. Just to clarify, the way this process works is that the other parties in the ecosystem trust the work of the auditor, and if the auditor says this is correct, then they know that they can go about their business without themselves checking the numbers of, let's say, a financial services institution.

[00:10:18] Ryan: That's correct. Actually, we produce independently audited financial accounts and reports for every company on the planet that's publicly traded. It's not just a financial industry thing. It's producing these financial reports for all companies. The theory of what For Humanity does is to borrow from that ecosystem, borrow from that process, and bring it to AI, algorithmic, and autonomous systems. Why?

Because essentially, we build and bring governance, oversight, and accountability of how these tools are built and operate the same as what financial accounting brings to governance, oversight, and accountability of the numbers and the operations of a company. We've, again, done that for 50 years. We want to leverage that expertise, and what we call, it's a term I use all the time, that infrastructure of trust.

[00:11:15] Alexandra: What's the challenge with making these audits work? For those not that deeply involved in these topics, they might ask, "Well, there's the AI Act on its way and many other regulations that already are in place for artificial intelligence. Why do we need audits next to the regulation?"

[00:11:32] Ryan: The regulation itself calls for conformity assessments. How those conformity assessments will be executed remains in discussion. It could be that companies self-report their information to a notified body, and that notified body says, "Yes, they checked all the boxes." Now, from For Humanity's perspective, this is suboptimal. What that is essentially letting companies do is self-report.

It's like grading their own homework. You and I both know that most people want to do good. Most corporations want to do good. Most corporations want to follow the law until they get busy, until they get distracted, until something comes up, until they don't quite finish all the details that they need to. The thing that changes that behavior is when an independent third party comes and checks your work.

We know this from education. We don't teach ourselves, right? We go to school, and who's there? Well, a teacher or a professor. Why? Because when we do our homework, we hand it in to that teacher, and the teacher says, "Yes, you've done your work." It's a third-party checking mechanism. What it does is, it fights our human nature. We want to do the right thing, but we behave differently when we know someone is checking our work.

That's where the independent third-party auditor comes in, on behalf of society, on behalf of the public to basically say, "Yes, this company has done all its responsibilities. It's complied with all of the rules under the Act." Now, one other key piece to that is laws, guidelines, regulations, even standards are not written to be auditable. Auditable in this context means one thing: binaryness.

It's where an auditor, who is at risk too and can get in trouble if they say the wrong thing, has to basically adjudicate that you've complied or you haven't complied, and there is no gray. That's key and critical. It's taking those laws, guidelines, best practices, and standards and converting them into auditable criteria. Standards, laws, guidelines, regulations, they aren't written this way.
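To make that binaryness concrete, here is a minimal, hypothetical sketch in Python. The criterion IDs, fields, and checks are invented for illustration and are not ForHumanity's actual criteria or tooling; the point is simply that each criterion is adjudicated strictly yes or no, and certification only follows if every check passes.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    """One auditable criterion, phrased so the answer is strictly yes or no."""
    criterion_id: str
    statement: str   # e.g. "An algorithmic risk committee has been established."
    satisfied: bool  # the auditor's binary adjudication
    evidence: str    # the proof reviewed: meeting minutes, rosters, reports

def audit_passes(criteria: list[Criterion]) -> bool:
    # No gray area: a single unmet criterion means no certification.
    return all(c.satisfied for c in criteria)

# Hypothetical example with one unmet criterion.
checks = [
    Criterion("GOV-01", "An algorithmic risk committee exists and meets regularly.",
              satisfied=True, evidence="committee roster and meeting minutes"),
    Criterion("GOV-02", "Unmitigated residual risks are disclosed to affected users.",
              satisfied=False, evidence="no disclosure document provided"),
]
print("Certify:", audit_passes(checks))  # Certify: False
```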

The work that For Humanity does with our crowd, transparently, where everyone can participate in the process, is to take those laws and translate them into auditable criteria. That's the big difference between what we do and the rest of the work that's going on in terms of the EU AI Act and standards and so on. We play a very specific role, but the thing is, when I say we make these rules or we translate them into auditable rules, some people hear that and get nervous, and rightfully so, because it sounds like For Humanity has a lot of authority.

To be clear, we have no authority, and we seek no authority. What we do on behalf of the people of the world is we translate all of this into auditable rules, but because we don't have authority, we take it back to the government. We take it back to the regulators, and we basically say, "You know that law you made? Is this right? Do these auditable rules fit what you intended the law to read?"

That's where the authority for the work that we do comes from. They are the ones who would say, "Yes, this is a certification scheme where if people follow these rules and an independent third-party auditor said check, check, check, and they all happen, then they can say, 'Yes, you've complied with these criteria.'" It's the highest form of compliance with the law.

[00:15:23] Alexandra: Makes sense. If you say an independent third-party auditor, this means that it's not For Humanity who is going to then audit companies. If so, why is that important?

[00:15:33] Ryan: It would never be For Humanity. It's not what we do. To create this infrastructure of trust, to establish audit in the right way, you need three things from an auditor. Number one, auditors need to be certified and trained practitioners. We've already proven that in the financial world, everyone knows them as CPAs, Certified Public Accountants. We need certified and trained people in the space.

Number two, you need an independent third-party set of rules where the auditors aren't establishing the rules. The third thing is establishing the very nature of the relationship between the auditor and the auditee, that tension, right? The auditor is asking the question, "Have you complied with the rules?" like a test in university or school, right? Have I passed? Do I have enough knowledge for the subject that's being tested?

In this ecosystem, we need the support that basically says, "Here's this third-party independent set of rules." That's the role For Humanity plays. Once the government approves those rules, well, now we have to manage and watch the watchers. One of the roles that For Humanity plays is that we make sure that everybody abides by independence and anti-collusion.

When an audit is conducted here, it looks like an audit here, it looks like an audit here, and they look the same. That uniformity is hugely valuable. In addition, the other role that For Humanity plays is we train people on how to comply with these rules. We are making the CPAs of this world, we call them For Humanity Certified Auditors, FHCAs.

[00:17:15] Alexandra: You train the auditors, not necessarily the businesses, or would it also be helpful for somebody within a business to take the courses on For Humanity University to know how to better comply?

[00:17:28] Ryan: Yes. The course knowledge, the things that we train people on, is the knowledge of what compliance looks like. It's equally valuable for an auditor to know as it is for the company that's seeking compliance. It's the same knowledge, right? The auditor is in the business of making money. What they do is they license the criteria. That gives them the right to use the intellectual property and governs how it's used.

That's how we have that oversight to make sure they're doing the right things. Companies who choose to be compliant, they can be educated in the same way. They don't have a cost associated with this other than training their people. Those people can then help build compliance inside of the company. Look, you don't want to show up to an audit and not be ready to pass.

You want to know that you're ready. Taking the time to build that capacity inside the company, having people who are trained and know what they're doing, fantastic, right? Then the auditor comes in as that proxy for society that says, "Here I am the check." They basically say, "Yes, you've done all the right steps," or "No, maybe you have it, and there's work to be done." I hope that answers your question.

[00:18:38] Alexandra: So this is where this whole responsible AI assurance ecosystem fits in, which involves not only auditing but also separate entities helping companies get fit for the audit and prepare everything.

[00:18:54] Ryan: That's correct. We refer to it as audit and pre-audit. The rule of independence, though, is very important when we have that discussion. Independence basically says, as an auditor, you can get no other monies from your client except your audit fees. What that also means is that you cannot provide strategy, advice, guidance, software, platforms, or hardware; you can't have any other sources of revenue from the auditee, right? That's where the pre-audit services come from. If you do pre-audit, you can't do the audit. If you do the audit, you can't do the pre-audit.

[00:19:33] Alexandra: At least not for the same entity.

[00:19:35] Ryan: That was the part I was about to say, which is you can provide both services, just not for the same auditee. That is governed by For Humanity. It can also be governed by national accreditation bodies, who are the ones who would basically tell a firm, "You are qualified to issue certification." We work together with those national accreditation bodies to make sure that the entities doing this work are duly trained and licensed and doing all the right things, so that the ecosystem works best for people. In the end, that's our mission, right? At For Humanity, we want what's best for people, and this is how we make sure it happens.

[00:20:21] Alexandra: That makes sense. This also makes me think back to the few AI audit programs I've seen in the past where sometimes there was also consultancy services added to it, and this now makes me wonder if this even falls in the category of proper AI audits. What's your take on the landscape today? You mentioned that there are not that many offerings, but is there anything else you would like to add?

[00:20:42] Ryan: Most companies aren't ready for a proper AI audit. By the way, the EU AI Act won't be in place until 2025, when enforcement starts to matter. The Digital Services Act will be the year before, in 2024. GDPR is already in place, and we've built audit criteria for GDPR compliance, mostly because we don't think very many firms who are using AI and algorithmic systems with personal data, and are therefore governed by GDPR, are compliant either.

That's a big part of what's out there today: very few firms would be ready for an audit today. That means almost all of the work is pre-audit preparation, building the capacity. Once you build this the first time, if you do it right, you build it with compliance by design, and then you're going to be able to leverage that capacity, that compliance, into lots of other areas where you're going to use AI, algorithmic, and autonomous systems.

Establishing that ecosystem and that process internally, that's where people will spend their time today. Then over the next 12 to 24 months, that's when we're going to start to see the first true AI audits begin to occur.

[00:22:03] Alexandra: Understood. Since you mentioned GDPR, I would say it's an openly shared secret that compliance isn't as high as it could be. Do you think that in the future of AI, we should also put a stronger emphasis on mandatory audits in that space, just thinking of the limitations that regulators such as data protection authorities already have today when it comes to evaluating data practices within organizations? Maybe making it mandatory to get AI audits here and there is something that could help us in the end to have AI that's more rule-adhering.

[00:22:36] Ryan: That's what we advocate for. Let me explain why we advocate for mandatory audits. Number one, it would be risk-based mandatory audits. If your AI system is a funny joke generator, the audit for that doesn't need to look anything like facial recognition technology being used at border control, where it's literally life-and-death decisions that are being made, okay?

Risk-based is a key and critical word. The nature of these audits change based on the tool, okay? In terms of the mandates, however, this is what I would share with you, compliance with law in general, let's take GDPR as the example. Here's a better way to explain it. GDPR exists, and people go about their business until somebody does something wrong and they kind of stick their head up. It's like a game of whack-a-mole.

I don't know if whack-a-mole translates, but it's this idea that they do something wrong, and the government comes along, and they enforce it, right? It's a punishment. Here's the problem with that. When that punishment occurs, people have already been harmed. Their privacy is already broken. It's a reactive process. If we put in place mandatory audits, yes, it will be more expensive.

Yes, people will have to build capacity, but then what happens is we have compliance with the law proactively, before people are hurt, before people are damaged, before AIs that are considered high-risk cause harm to people in the marketplace. We would argue that a mandated approach is more proactive and will generate far better compliance than what we have today.

[00:24:23] Alexandra: Completely agree with that. One thing I'm curious about, since you mentioned that you have a risk-based approach, and we know this term also from the AI Act, where systems are classified by the severity of the impact they could have on individuals: one thing that I'm missing with this risk classification is that it focuses more on the impact on an individual and less on harms that are maybe not as severe for any one person but occur at a much larger scale, for example, social media, something like that. Is there something planned for how you address these lower-severity but larger-scale risks?

[00:25:04] Ryan: Yes. No, it's a great question. We have an entire risk management framework that's built to be compliant with the EU AI Act, and included in that is measuring risks in those two vectors: severity and likelihood. For your audience's purposes, severity might mean the harm that could be caused is, let's say, life and death, right? Then the likelihood might be, well, it's a very, very, very, very rare event that someone would die.

Well, it's still terrible, but it's not the same as an event that might bother a lot of people and be really likely to happen. That can be very harmful in and of itself. The way we measure harm is in these two vectors. Our process looks at all risk and attaches severity and likelihood to all risk and then requires mitigations associated with that. It also requires the disclosure of unmitigated risk to people in advance so that they know the riskiness of the tool that they're using.
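As an aside, here is a minimal, hypothetical sketch of how severity and likelihood can be combined into a risk score and band. The scales, the multiplication, and the thresholds are invented for illustration; they are not ForHumanity's actual framework, which would calibrate such values per use case.

```python
# Hypothetical ordinal scales for the two vectors discussed above.
SEVERITY = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "catastrophic": 5}
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "almost certain": 5}

def risk_score(severity: str, likelihood: str) -> int:
    """Combine the two vectors into a single ordinal score (illustrative only)."""
    return SEVERITY[severity] * LIKELIHOOD[likelihood]

def risk_band(score: int) -> str:
    # Invented thresholds; a real scheme would set these per use case.
    if score >= 15:
        return "high: mitigate before deployment and disclose residual risk"
    if score >= 5:
        return "medium: mitigate and monitor"
    return "low: document and review periodically"

# A catastrophic but very rare harm vs. a moderate harm that is almost certain.
for sev, lik in [("catastrophic", "rare"), ("moderate", "almost certain")]:
    score = risk_score(sev, lik)
    print(f"{sev} / {lik}: score={score}, band={risk_band(score)}")
```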

Our challenge or issue around the EU AI Act is that, well, they already know that there's a hole. That hole is as follows. They've identified prohibited risk. We totally agree with that. Every nation, every union has the right to say, "This is not for us. We rule it out," and we support that. Prohibited, out. We then agree on the next step. The next step is low risk. Are there systems, are there tools that are so low-risk that we take them off the table and we say, "Don't audit these, don't spend any resources on them," okay?

We take those off the table. Now, here's where we disagree. For Humanity covers everything that's left by saying it should all go through a risk-based audit, whereas the EU AI Act only says, "and here's what's high-risk, and we're going to define it, and then we have a process for basically putting more things in that club over time," okay? I don't like the idea of leaving it uncovered, okay?

Our preference in general would be to audit everything but then on a risk-based basis. If it impacts humans, it should go through a certain amount of audit. If it can really impact them with high severity, high likelihood, well, maybe it should be prohibited, or if you can mitigate those risks, let's go through an audit and make sure those mitigations work, that the controls are in place, that the monitoring is happening to make sure that it does what we say it does.

These are the things that the EU AI Act calls for that we would simply mandate for AIs that impact humans meaningfully and then create dedicated audit criteria that fits the risk of each of those different use cases.

[00:28:03] Alexandra: I completely get where you're coming from, but I'm now wondering what these two different stances mean in practice for the actual audit criteria. Since you mentioned that, of course, the regulators are the entity to sign off on the audit criteria, and even though For Humanity, as the name implies, is advocating for humanity, the job of the regulator sometimes is to balance societal interests with economic interests and many other interests, what would this mean, then, for the final criteria?

[00:28:34] Ryan: It means a couple of things. The first is that you have to create very tailored solutions on a use-case-by-use-case basis. For example, automated employment decision tools, so AI being used to make hiring or employment-related decisions. This is all considered to be high-risk AI under the EU AI Act. It's Annex III, number 4, A and B.

All of these categories of AEDTs, Automated Employment Decision Tools, are governed by the EU AI Act. Let's be clear, a tool that simply collects resumes is very different than a tool that collects resumes and decides for you who you should hire and not.

[00:29:20] Alexandra: Sure.

[00:29:21] Ryan: Already, right there, we see two different risks based on the use case. What we do is we have a lot of governance oversight and accountability that applies to all tools. Then we get down into the individual criteria, and we basically make sure that certain risks are mitigated on a use-case-by-use-case basis. In the case of AEDTs, we're writing audit criteria for 58 different use cases of automated employment decision tools, all of which would be submitted to the regulators to basically say, "Is this what you meant?"

They have every right to say, "Yes, yes, yes. No, don't say that. That's not a rule we want to support." We will always abide by that. We might make an argument back to them that says, "By the way, if you take this out, here's the risk to humans." They could say, "We get it. We get it. We're just not able to do it now," for whatever reason. They're in charge.

We will always approach it from that perspective, but in the end, the regulators will determine what is a shall statement, what you have to do, what is a should statement? They may say, "Look, For Humanity, what you're doing is great. It's great that you're recommending these things, but you can only recommend, not dictate." Fine. Then it's a should statement, and we recommend what a best practice is. Then those who don't put in best practices might have to justify why they didn't put in best practices.

In some cases, it might just be that it's not applicable, which is great; then it's not a risk that needs to be mitigated. That's how we can deal with different levels of compliance on a risk-based basis.

[00:31:01] Alexandra: Understood. If you say 58 different use cases only in the context of hiring, how do you pull that off? How does For Humanity collaborate?

[00:31:11] Ryan: I plan to do this for the rest of my entire life, [laughs] and that's honestly what it's going to take. That's why we're a 1,300-person group, and this time next year we'll probably be 2,500, and we're going to keep growing and adding staff, not even staff. We don't have employees. We only have volunteers, but our crowd will get bigger and bigger and bigger, and we'll have more and more capability.

By the way, we get better at this all the time. It becomes easier to do each of these use cases. Then once we have an established footing, well, then growing and adapting, and changing to best practices and new technologies becomes easier.

[00:31:48] Alexandra: So it's about prioritizing which use cases to tackle first. Is that also something that you discuss with regulators, or is it a For Humanity prioritization of which ones you really want to get sorted out as soon as possible?

[00:32:00] Ryan: Yes, we've gone where the regulators are ready to provide enforcements. The EU AI Act gives us a mountain of work to do already. The Digital Services Act tells us where we should do work already. Where governments and regulators are ready to enforce, that's where we spend our time, and then we grow out from there, especially as new technologies come along and they become a focus of the marketplace like ChatGPT and Large Language Models, LLMs.

That's an area of work where we're going to have to spend some time this year because it's becoming much more mainstream, and people all think that they should be using these tools even if maybe they're not trained or it's not appropriate for what they're trying to do. We need to be there to basically provide, or help provide, what the infrastructure can look like to properly govern, hold to account, and oversee these tools.

[00:32:58] Alexandra: Understood. As you already mentioned, the AI Act is not yet in place. Is there a reason why companies should not yet start looking into the For Humanity criteria for AI audits, or can you maybe give an overview of how likely it is that all audits will be done via the For Humanity criteria?

[00:33:19] Ryan: Oh, I wouldn't say that ForHumanity is the criteria, right? I wouldn't say that until the commission said that, okay? I do think ours is the most robust. It is the most well-developed and detail-oriented. When it comes to the Act, the Act is not passed yet. However, the vast majority of the compliance procedures and requirements are locked in. They're not going to change.

If a group said, "We're not going to begin to build capacity until it's passed," fine, I can't really argue that. There seems to be very little doubt that the law will pass. It's really just a question of when you want to start the process. I would think that having time to be able to assess how to build in this compliance and to build it slowly and carefully, compliance by design, I think that would be a better solution. I would advise people to get started. Just a question of if they're ready.

[00:34:26] Alexandra: That's true.

[00:34:27] Ryan: I'll tell you most people aren't yet. They tend to like to wait until the last minute.

[00:34:33] Alexandra: Then it will be a busy time for lawyers-

[00:34:36] Ryan: That's right.

[00:34:37] Alexandra: -pre-audit services, and many technical folks. Since you said the For Humanity criteria, in your opinion, are the most detailed ones, for somebody who hasn't yet looked into this ecosystem of future criteria, what are the other competitors when it comes to global criteria?

[00:34:55] Ryan: Yes. I won't speak too much to what other competitors are doing, mostly because we don't even think of it as a competitive thing. This isn't a money-making process or anything we're doing to win or capture market share. We think that having a single set of rules that everyone can abide by creates uniformity. Uniformity creates great value for humans.

As far as I can tell, we're working with CEN-CENELEC, the European standards body, and JTC 21, the working group tasked with creating the conformity assessments. They've retained us as a technical liaison to help advise what those conformity assessments can look like. They haven't brought anybody else yet. I guess one thing that I would say is if others were bringing a similar kind of value, then they'd probably be in the room.

Otherwise, most people just don't look to do what we're doing because if you're doing it right, there's no money in it. It's a little bit like, "Why are you spending time on it? What do you do? How do you rationalize it?" From our approach, we rationalize it because we think it mitigates risk to humans, and that's consistent with the mission. There aren't many groups built like us that are nonprofit public charities and mission-driven. I think that's why there's not a lot of "competitors" in the space.

[00:36:29] Alexandra: Perfect. Thanks for this overview. You mentioned uniformity and its importance twice now. I thought it might be valuable for the audience to share what you also shared with me in our call prior to this conversation: the story of why uniform criteria are so important, with that example from, I think it was, the financial services industry or, in general, of auditors applying their own standards and requirements. Could you maybe repeat that?

[00:36:53] Ryan: Yes, happy to. We learned this in 1973. In '73, one group would go out, and they'd do an audit, and another group would come along, and they'd basically say, "Well, look, if you follow my rules, I can reduce your taxes, or I can reduce your cost of goods sold, and I can make it beneficial for you." What happens is you create a race to the bottom of "I'll adjust my criteria in ways that benefit the client."

That's not good for anybody. It's not good for comparison. I audit one way, and this group audits here, how do I compare the profitability or the success of that company? The industry saw this, too. By the way, this isn't just my opinion. The industry said, "Wait, wait, wait. This is bad." What they did is they created generally accepted accounting principles. This was in the United States in 1973.

The world wasn't quite so global. In London, they created International Financial Reporting Standards about six months later. In most places, those two sets of audit criteria agree. Where they don't agree, it's a problem. It's a conflict, and it makes it difficult to understand companies in those spaces. That uniformity, that comparability that says, "I understand the compliance that's happened here, and I understand the compliance that's happened at Company B, so A and B, and I know the difference or the sameness of those two," that's the great value of audit, is that uniformity across jurisdictions, maximally harmonized to the regulatory environments and to the laws.

[00:38:30] Alexandra: Understood. Maybe to summarize, since we wanted to establish the criteria for effective AI audits: we could say uniformity, certified professionals, a universal set of rules that's accepted and signed off by the regulator and, importantly, doesn't leave behind any gray areas, only black and white, and then the independent role of the auditor, who doesn't provide, I don't know, consultancy services to the auditee at the same time but really works on behalf of the public and society. Did we forget anything important, or is this a good summary?

[00:39:05] Ryan: No, that's really good although I do want to pick up on one small piece of independence that'll help people understand its great value. Independence is both having the ability to fairly adjudicate compliance, so to basically say, yes, you've complied and to not have a decision be influenced because I'm worried about losing revenue streams, but there's also this thing called false assurance of compliance.

This is really important for building trust. The example I would give you is, you're my client, I'm your auditor, and I've been your auditor for years. You and I are great friends. All of our audits go perfectly. We always get a meal afterwards, okay? Now I roll in. I say, "What's going on?" You say, "Ryan, I got a new system. We have to audit it." "Great. Let's start. Have you built the algorithmic risk committee?"

You smile at me and say, "Yes, Susie, Sally, Jimmy, Joe, Bob, they're all on the committee with me." I smile right back at you because I know you're telling me the truth. I've met all those people, but as your auditor, I have to behave differently. I have to say, "You have to prove it to me. You have to show me the meeting minutes. You have to show me the roster. You have to show me how the algorithmic risk committee impacts the system."

Why? Because I don't work for you. I work for the public. Here is the key thing to building trust. If I have no upside potential and I have downside risk of false assurance of compliance, so in the example I just gave you, if I tell the world you've built the algorithmic risk committee and you have not, who gets in trouble? Is it you, or is it me? When it's me, and that's the correct answer to this question, so when I have no upside potential and I have this downside risk of false assurance of compliance, now when I tell the world that you've complied with all these rules, that you've built that algorithmic risk committee, do you know what the world believes?

It must be true because I have no reason to say otherwise. This is the nature of independence. This is the nature of creating the auditor-auditee relationship in the right way. This is how we build this infrastructure of trust, and it's just such a key part of that independence that makes everything work. I hope that helps.

[00:41:20] Alexandra: Absolutely. One part, though, I'm still trying to wrap my head around is this topic of certified professionals and no gray areas left behind. Just when I think about fairness, for example, it's a much more complicated topic with so many different nuances. Can you maybe walk us through a tangible example of how this would look in an audit process? How can I as an auditor give a binary answer to "Is this algorithm discriminatory or not?"

[00:41:48] Ryan: There's a couple of ways to answer that. Sometimes we're talking about ethical choices where it's not right or wrong, it's "What did you choose, and did you share it transparently and give people the information so they can make their own decision?" Those are actually easy to audit because we're asking, "Did you execute a process? Show us what that process was. Then, did you share it transparently?"

This is a big part of what the audit does. It doesn't say what you chose was right or wrong because, honestly, there's no right or wrong to the question. Now when we get to fairness, again, the only one who can adjudicate that you were fair or unfair is going to be judges and juries, okay? When we're asking about fair, we're setting up a set of criteria to say, "Did you go through a process that tested and checked if you were meeting standards of fairness?"

I'll give you one example. In hiring in the United States, we have a rule called the four-fifths rule. The four-fifths rule has gone to the Supreme Court twice. When we talk about a quantitative measure of fairness, it's the most tested measure in the world. It's still not a measure of perfect fairness because what it says is whatever your highest rate, so let's say the highest rate is hiring White males, okay?

That's 100%; that's your standard. Then every protected category has to be within four-fifths, so 80% of that selection rate. Let's say Black females is 60%. It doesn't actually mean that that's unfair. What it says is, "It might be unfair. You better check why, in terms of Black females, you are only hiring at a 60% rate relative to your White males. Maybe you don't have enough candidates.

"Maybe you're not advertising in the right places. Maybe you're not bringing enough people in. Maybe you are biased, and you're making some bad decisions," right? When we talk about fairness, what we're really going to start talking about first is process. Have you gone through a process to mitigate risk in statistical bias? Have you gone through a process to make sure that you don't have cognitive--

That example I just gave you, you may have cognitive bias because you may only be advertising in places where White people are reading your stuff and applying to jobs. Have you actively gone out to the Black community? Have you actively gone out to an older community versus a younger one? There are lots of different ways we're biased, but these are cognitive biases, where you have to be asking these questions like, "Well, maybe my process isn't created to be as fair as it could be."

Really, when we mitigate bias through our audit criteria, it's about going through a whole range of processes to examine bias in data, bias in architectural inputs, bias in outcomes, bias in statistics, cognitive bias, and maybe what we call physical-barrier or non-response bias, where the technology is preventing certain people from participating.

All of these are assessments that are built into our process where we're asking, "Did you do this? Show us you did it. Prove that you did it." Bias is one of those tricky things. We will never eliminate bias. It's almost a statistical certainty that we will never eliminate bias. Therefore, what our process has to be about is, are we mitigating, and are we taking more and more and more and more steps to continue to mitigate bias in hiring and in everything that we do to ensure the best possible fair outcomes? Even then, judges and juries will make the final decisions if it comes to that.
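To make the four-fifths rule Ryan walks through concrete, here is a minimal sketch that compares each group's selection rate to the highest group's rate and flags anything below 80% of it. The group names, counts, and function names are invented for illustration and mirror the 60% example above; a flag is a signal to investigate, not a verdict of unfairness.

```python
def selection_rates(hired: dict[str, int], applicants: dict[str, int]) -> dict[str, float]:
    """Selection rate per group = number hired / number of applicants."""
    return {group: hired[group] / applicants[group] for group in applicants}

def four_fifths_check(rates: dict[str, float], threshold: float = 0.8) -> dict[str, bool]:
    """True if a group's rate is at least 80% of the highest group's rate."""
    top_rate = max(rates.values())
    return {group: (rate / top_rate) >= threshold for group, rate in rates.items()}

# Hypothetical numbers mirroring the example: the highest-rate group sets the
# 100% baseline, and a group selected at 60% of that rate falls below the line.
hired = {"group_a": 50, "group_b": 18}
applicants = {"group_a": 100, "group_b": 60}

rates = selection_rates(hired, applicants)   # {'group_a': 0.5, 'group_b': 0.3}
print(four_fifths_check(rates))              # {'group_a': True, 'group_b': False}
```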

[00:45:43] Alexandra: That's true, but again, here the concern is the complexity of mitigating AI bias and the capacity of a non-technical jury to understand all the arguments and correctly judge whether decisions were taken in the best interest of mitigating certain biases or not. But I definitely get what you're saying, and at least forcing people to adhere to processes and to deal with and think about the issue is, of course, a better step than not assessing this at all.

[00:46:14] Ryan: That's correct. There's transparency requirements and disclosure requirements and, again, proving that these processes were undertaken and having an auditor say, "Yes, they did them." These are all strong steps in the right direction.

[00:46:26] Alexandra: Understood. Given all your focus on making AI systems that are for humanity and stay responsible, do you have, I don't know, your top three or top five tips for the business listeners we have here with us today on how to build AI systems that are actually beneficial for humanity? What should they take into account for specific systems but also in general as an organization approaching AI?

[00:46:53] Ryan: Great question. The first one is absolutely a standing and empowered ethics committee. These are sociotechnical systems. We are not only providing the outputs to impact humans, but the human is part of the equation. They're an input into the system, and therefore, ethics are embedded into all of these systems. Designers, developers, and data scientists are rarely trained in ethics, and even if they were, when they're making these adjudications, they would suffer from sunk-cost bias and confirmation bias.

What the For Humanity process does is it extracts these instances of ethical choice and says, "You're not qualified. We're giving them to our trained and empowered ethics officers, who will adjudicate these instances of ethical choice." This is the best segregation of duties. That's the second point, which is segregation of duties. Let experts do expert things. For example, when we assess risk, it is key and critical that we have diverse input and multi-stakeholder feedback.

Number one, a standing and empowered ethics committee. Number two, diverse input and multi-stakeholder feedback. We need a 360-degree perspective of stakeholders, which includes stakeholders at the individual level but also society. The environment also needs to be providing input to this process to say, "Look, this is a really risky system because it's polluting like crazy," which at an organizational level might just be, "Oh, we make a lot of money," without paying attention to the polluting part of it, right?

Having the full perspective of stakeholders that are sufficiently diverse will help to mitigate risks and identify negative impacts in advance. That would be the second piece, and the third piece would be to establish compliance by design. Build a process that captures this from the design level to the deployment level, to the decommissioning level right across through development and everything to essentially say, "Let's make sure we're taking all these steps as we go."

Documenting the process, mitigating bias early in data sets, but also reviewing it in real time when it's a pipeline that's bringing new data in. Having a compliance-by-design process would be step three. That gives you four lines of defense: compliance at the design level, compliance at what we call the committee level or the oversight level, compliance in internal risk management and internal audit, and then the fourth line of defense is the external independent auditor.

That will result in the maximum mitigation of risks to humans and to companies. By the way, mitigating risk to humans means companies are less at risk as well. That's a good thing for everybody. That's where we get sustainable profitability.

[00:49:39] Alexandra: Absolutely. This makes sense. Maybe to close off this episode, for everybody who's now interested in getting involved, what would you advise them to take as the first steps?

[00:49:48] Ryan: There are two ways to get involved. Both are through our website, forhumanity.center. You can get involved in the Slack community. The only requirements are that you provide an email address and agree with the code of conduct. You get 100% access to everything we do, total transparency, and you don't have to do anything.

It's an all-volunteer organization, but come in and get access and see everything that we do. That's one. The other one is if you want to get educated on this, you can join For Humanity University, which is a separate sign-up because it's outside of our website and outside of our Slack community. It's a teaching, it's our classroom if you will, and you can take courses.

The courses are free. If you want to sit certification exams, then there's a small fee associated with that, but the knowledge is available for you, and it's a great way to understand this very broad, very large ecosystem called independent audit of AI systems. It's the two easiest ways to plug into our work, and I hope to see you there.

[00:50:50] Alexandra: Excellent. I already signed up, and I'm very curious to check out the courses. Thank you so much, Ryan, for taking the time. Keep up the work you're doing. I think it's tremendously important to make AI systems auditable. It was very exciting to be walked through the different aspects of that.

[00:51:05] Ryan: Thank you so much for having me. It's been great, and it was great fun.

[music]

[00:51:09] Alexandra: Thank you.

[music]

[00:51:15] Alexandra: What a great topic to start off our new season of the Data Democratization Podcast. I hope you enjoyed learning about AI audits as much as I did, and maybe one or the other of you is even interested in getting involved and supporting the mission of ForHumanity. Before we end this episode, I'd also be super curious to hear from you and learn what your take is on this question of "Should AI audits be mandatory or voluntary?"

As always, just shoot us a message at podcast@mostly.ai, or you can also reach out to us on LinkedIn. With that said, have a great rest of your day, and stay tuned for our next episode.

Ready to try synthetic data generation?

The best way to learn about synthetic data is to experiment with synthetic data generation. Try it for free or get in touch with our sales team for a demo.