Episode 5

Enterprise strategies for privacy and AI with Punit Bhatia

Hosted by
Alexandra Ebert and Jeffrey Dobin
Punit Bhatia was the Privacy and Protection Officer at ING and has extensive experience in managing privacy and data protection in large international environments. In this episode, Punit shares his best strategic advice for anyone looking to build a privacy strategy in their organization. You will read about:
  • the three biggest data privacy challenges for enterprises and their solutions
  • who should own data privacy in an organization for the best results
  • why synthetic data is the way forward in data sharing
  • how to approach AI from a data privacy and data protection standpoint
  • what companies should expect in the data privacy space in 2021 and beyond
Subscribe to the Data Democratization Podcast on Spotify, Apple Podcasts, or wherever you get your shows! Listen to the previous episode or read the transcript about privacy and innovation.

Transcript

Jeff Dobin: Good morning, Jeff here, your data privacy guru.

Alexandra Ebert: I’m Alexandra Ebert, Chief Trust Officer at MOSTLY AI.

Jeff: We’re the hosts of the Data Democratization Podcast. Today, we’re going to run through a deep dive on strategies for privacy and of course, AI.

Alexandra: That’s right. Today we’re featuring Punit Bhatia on the show. He is the former privacy and protection officer at ING and has tons of experience establishing GDPR programs at large international firms. In his current role, he provides strategic privacy coaching and is known for his pragmatic, business-aligned advice. But as if that weren’t enough, he is also an author. His latest book is titled AI & Privacy: How to Find Balance, which is a super relevant topic given today’s challenges around data protection laws.

Jeff: Sounds like a super impressive guy. Let’s go ahead and meet him!

Alexandra: Punit, very happy to have you on the show today. You’re a privacy pro with 22 years of experience in the enterprise space and definitely somebody who knows the ins and outs of managing privacy and the challenges that large organizations, banks, and insurers face. Can you briefly introduce yourself and share a little bit about your professional background with our audience?

Punit Bhatia: Sure. Thanks for having me, Alexandra, you guys are doing a wonderful job of bringing AI and privacy and also connecting the data aspect of it in this podcast. Thank you so much for this and thanks for having me. Regarding me, yes, I’m a data privacy professional but I come from an IT background. That allows me to understand privacy from a data or a technology standpoint.

What I do is I help CPOs or DPOs or even CEOs in three things. First, it’s about having the right strategy, because these days every country is coming up with a law. If every country has a law, you cannot have a separate strategy in each country, so you need to have an overarching strategy, and that’s what I help with. Second, I help them with having a network of privacy champions. You can have a DPO, you can have a CPO and experts, but you need to have that embedded in the business, and that’s where you need to create a network of privacy champions within the business. That’s the second thing I help with.

The third thing I help them with is training, because you cannot go to a procurement officer or an HR officer and say, “Hey, this is GDPR, these are the rights,” and they say, “Yes, thank you.” You have to make it relevant and scenario-based for them: when they’re hiring a job applicant or processing a job application, what do they need to do? What is it that matters?

Of course, as you know, I’ve written a book on AI and privacy recently, apart from my other books on privacy. That’s me in short. I’ve been privileged enough to work in large enterprises, but you mentioned that already, so thanks for the warm introduction.

Alexandra: Of course, of course. I’m really looking forward to reading your book. I think it’s already available for pre-order on Amazon. For our listeners, we will, of course, mention it again at the end of the episode if you want to check it out. You talked about managing privacy. In your opinion, what are the biggest challenges for enterprises when it comes to managing privacy?

Punit: I think it first starts with ownership, as we call it, the accountability principle, who is responsible for privacy? Whose business is privacy? That’s where it all starts because typically, there are three broad ways in which we can manage privacy. I’m not talking about those who say, “Okay, it’s not my business, we don’t need to do anything,” that’s just another category, but broadly three categories.

One, those who look at privacy as a compliance function and say, okay, it sits with compliance or legal or somewhere there, and it should be managed by a privacy professional or a data protection officer. That’s one way of looking at it. The second way is, you embed privacy in the business, and the business takes responsibility for privacy. The third is that you use privacy as a business differentiator, because there are studies now that we are three years into GDPR.

Now studies are available showing that it impacts customer trust, it impacts customer loyalty, and even operational metrics. If it is impacting those metrics, the question is, where does it sit and who owns privacy? The first challenge in any organization is the ownership or the responsibility, and I believe it should lie with the business. Of course, I’m not saying the business should start deciding on reporting data breaches or not, but it should start with the business. That’s the first thing.

The second thing is interpretation. We all know GDPR is a framework law, so for people looking for black-and-white answers, yes or no, it doesn’t work. Then the third thing is making it operational. Once you have the ownership and the interpretation, you need to operationalize. I think the third is the easiest of the three. I won’t say easiest meaning it’s very, very easy, but relatively, that’s how it works: the ownership, the interpretation, and then the operationalization. Those are the three challenges which large enterprises usually face and those are the ones we need to solve.

Alexandra: Yes, absolutely, I agree. Talking about ownership, I know some business units see privacy more as, okay, that’s a legal topic, and approach it rather hands-off. Do you have any practical advice on how to achieve a mindset shift in organizations that are handling privacy that way?

Punit: That’s a very good question. The way I see it, there has to be dual responsibility. Dual responsibility meaning legal or compliance does have a part to play, because at the end of the day, it’s a law: they are responsible for defining a policy, giving you guidelines, and so on. However, the implementation or operationalization is a business responsibility. That’s where you need to define a role; maybe call it a business responsible for privacy, call it a data protection executive, call it a privacy champion. Whatever you call it, there has to be duality in those responsibilities: the legal or privacy professional taking care of the law and the interaction with the regulatory bodies, but the internal operationalization being a business responsibility. That’s what I strongly recommend, rather than building up a legal team and expecting things to happen. Yes, they will happen, but you need both.

Alexandra: Yes, I see that it’s definitely important to operationalize it and make it happen, as you said. One other thing I would be curious about: when we’re approached by prospects, we oftentimes hear that internal data sharing and getting access to relevant and exciting datasets is super cumbersome within large enterprises. Oftentimes, we hear it takes six months, eight months, or even longer. Why is that?

Punit: I would say it is cumbersome because large organizations are careful and they want to protect things, but right now that protection errs a little bit too much on the side of caution, let’s put it like that, and that’s why it’s taking time. There’s also a fear-based perception that the privacy laws say we should not transfer data, we should not share data. Well, privacy laws always say you should do the right things with data.

If you have a reason and you can justify that reason, you can. You cannot say that finance needs to process the payroll and it will take six months to give them the data; it won’t work. Or that the corporate office in the US needs to analyze whether bonuses should be given and what the expenditure is, and we don’t want to give them the data.

Those flows will continue; the only question is, if your corporate office in the US needs that data, do they need it at the individual level or at an aggregated level?

If you ask the right question, the right answer usually would be that they need aggregated information. Then there is no problem, because that aggregated or pseudonymized information is outside the purview of GDPR or other privacy laws. It’s about asking the right question and solving it in the right way. At the moment, there’s fear, and there is also caution taken to the extreme.

Alexandra: Yes, I have the same feeling. Sorry, did you just say pseudonymized data is out of the scope of GDPR? I thought it was only anonymized data that is out of scope.

Punit: No, no, no. Of course, if it came across like that, I should rephrase. Well, if the data is pseudonymized and you keep the key here and send the data there, then it’s effectively anonymized for the recipient. With pseudonymized data, don’t share the key when you give it to the US entity. They cannot break it if it’s reasonably pseudonymized. And anonymization is also a form of pseudonymization: you pseudonymize and throw away the key. That’s anonymization, or one way of anonymizing.

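To make that distinction concrete, here is a minimal sketch of keyed pseudonymization along the lines Punit describes. The field names, the truncated hash length, and the record structure are illustrative assumptions, not details from the interview: the data owner keeps the key, the recipient cannot recompute the mapping without it, and discarding the key altogether is the "throw away the key" step Punit mentions.

```python
# Illustrative sketch only: field names and key handling are assumptions.
import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # kept by the data owner, never shared

def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash; without the key,
    the recipient cannot recompute or reverse the mapping."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]

record = {"customer_id": "AT-00042", "email": "tom@example.com", "balance": 1200}
shared = {
    "customer_id": pseudonymize(record["customer_id"]),
    "email": pseudonymize(record["email"]),
    "balance": record["balance"],  # non-identifying attributes stay usable
}
print(shared)
```
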
Alexandra: Yes, understood what you’re getting at.

Punit: It’s technical, these are tongue twisters.

Alexandra: Yes. What would be your actionable advice on how enterprises could speed this process of data sharing up?

Punit: It depends on where they are sharing, so it’s not so straightforward that I can say this is how you share. It’s not only about technology; there is a technology element, but structurally it’s about where you are transferring, what the reasons are, and then you need to decide. Then also, what data, because if it’s aggregated, anonymized data, then there’s no problem.

If it’s personal data, then we have to understand the reasons and whether those reasons are valid. Also, do those locations provide adequate safeguards? If not, then we know life is a little bit challenging at the moment for large enterprises in the context of data transfers, given what happened with the European court ruling last year. Yes, last year already; time flies, that’s how it is.

Alexandra: That’s true.

Punit: It’s challenging. There’s no one-size-fits-all answer. It has to be contextualized, and context means where the data is going, what data is going, why it is going, and what the purpose of it is. If you can answer all of those things, then we can talk about safeguards and make it happen.

Alexandra: Understood, understood. You mentioned data anonymization several times. What’s considered successful anonymization, and what do organizations need to look out for to make sure that the data really is anonymous and not something that’s easy to re-identify?

Punit: Well, that’s a complicated question, but a good question though. If you truly anonymize data, then it would hardly be relevant from a business or analysis perspective. Why do you anonymize? So that you can carry out research, carry out statistics, and use it for some output, but if you truly anonymize it, it is effectively rendered useless, if I may call it like that. What we need to do is find a fine balance and choose how much risk is acceptable, how much risk appetite we have, and how much risk we are creating, and then decide. Or the other alternative is what you guys are doing: you look at the production data and create something called synthetic data. It’s another way of generating anonymized data, but it’s alternative data rather than the production data anonymized.

Those are the solutions, I would say, and anonymization is complex. You cannot have data that is completely anonymized and still useful, so you need to look at the risk of it. There are techniques available, k-anonymity, l-diversity, t-closeness, and all that, but that’s too complex, let’s not get there. In short, look for a reasonable way of anonymizing, not a complete one; weigh the risks and carry out a risk assessment or data protection impact assessment in that context on what the re-identification risk could be.

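For readers curious about the techniques Punit names only in passing, here is a minimal sketch of a k-anonymity check, assuming a pandas DataFrame and a hypothetical set of quasi-identifier columns. It only illustrates the core idea: every combination of quasi-identifier values should appear at least k times.

```python
# Minimal k-anonymity check; column names and data are made up for illustration.
import pandas as pd

def is_k_anonymous(df: pd.DataFrame, quasi_identifiers: list[str], k: int = 5) -> bool:
    """True if every combination of quasi-identifier values occurs at least k times."""
    group_sizes = df.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

records = pd.DataFrame({
    "zip": ["1010", "1010", "1010", "1020", "1020", "1020"],
    "age_bracket": ["30-39", "30-39", "30-39", "40-49", "40-49", "40-49"],
    "gender": ["F", "F", "F", "M", "M", "M"],
    "diagnosis": ["A", "B", "B", "A", "C", "A"],
})
print(is_k_anonymous(records, ["zip", "age_bracket", "gender"], k=2))  # True
```
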
Alexandra: Yes, of course, that’s always advisable. You mentioned the complexity of anonymization techniques. Some organizations develop their own anonymization tools and technologies, while others purchase something from vendors. What’s your take on that? What’s the advisable choice for an enterprise?

Punit: It’s a strategic choice and it depends on which business you are in. If you are in the business of the Department of Defense, then I would say you’re better off developing your own. But if you’re a normal large enterprise, technology is not your core business, and the risk or the sensitivity of the data is reasonable, or I would say normal under normal circumstances, then you’re better off using somebody else’s algorithms. The reason being it’s a specialist area and you can’t create everything. It goes back to the same question people used to ask: “Should you outsource, or should you keep it in-house?”

It’s the same thing with anonymization techniques or algorithms for encryption and all those things. If it’s really, really sensitive and you know you want a proprietary algorithm, go for it, but then you really need to have a strong reason and the appetite to invest, because it comes with a cost. If not, you’re better off choosing a good vendor, and most companies will fall into that second category. There are very few exceptions; by the 80/20 rule it would be 20% of companies, or even less, maybe 1% or 2%, that fit into the category where investing in your own algorithm is worth it.

Alexandra: Yes, absolutely agree. We talked about the challenges when it comes to internal data sharing. From your experience, do you have any best practice in mind? Can you share a story of a team or department that really successfully managed to reconcile data utilization and privacy protection and what were they able to achieve in this project?

Punit: I’ve worked in companies which have successfully implemented data lakes; I’ve had the good fortune of working in two or three of them and working with them. When you implement data lakes, you can implement layers of data in which the identifiability of the data is maximal or reduced based on what the need is. If you’re reporting for risk and compliance, you don’t need identifiable data, only aggregated data.

A very specific example in that context: we were building systems, and for testing those systems, especially in what we call the last stage of testing, user acceptance testing just before production, you want data that is very, very similar to the production environment. But having production data there, now in the context of GDPR, is a risk, because you outsource to multiple parties and multiple vendors have access.

What we implemented was a data masking technology. I’m talking about three, four years ago; at that time you guys were not there, or this technique of generating synthetic data from the production environment was not available. We did that, and we did it successfully, and it allowed us to use the de-identified data, or the masked data as we call it. So Tom is replaced with Harry and his date of birth is shifted by two or three, random techniques like that, and it’s very close to the production environment, but it is not the actual production data and you cannot identify people. We did that data masking quite successfully.

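As an illustration of the masking approach Punit describes ("Tom is replaced with Harry", dates of birth shifted slightly), here is a minimal sketch. The column names, the name pool, and the day-based offset range are assumptions made for the example, not details from the original project.

```python
# Illustrative data masking sketch; columns, name pool, and offsets are assumptions.
import random
import numpy as np
import pandas as pd

FAKE_NAMES = ["Harry", "Sally", "Robin", "Alex", "Jamie"]

def mask_for_testing(df: pd.DataFrame) -> pd.DataFrame:
    """Return a masked copy: real names swapped for pool values,
    dates of birth shifted by a small random number of days."""
    masked = df.copy()
    # "Tom is replaced with Harry": swap real names for values from a fixed pool.
    masked["name"] = [random.choice(FAKE_NAMES) for _ in range(len(masked))]
    # Shift each date of birth a few days so it stays plausible but no longer
    # matches the original person.
    offsets = np.array(
        [random.randint(-3, 3) for _ in range(len(masked))], dtype="timedelta64[D]"
    )
    masked["date_of_birth"] = pd.to_datetime(masked["date_of_birth"]) + offsets
    return masked

production = pd.DataFrame({
    "name": ["Tom", "Anna"],
    "date_of_birth": ["1985-06-01", "1990-12-24"],
})
print(mask_for_testing(production))
```
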
Alexandra: That’s good to hear. That’s definitely one of the big challenges that we are approached with: you really need this highly realistic data to test your environments, and with the onset of GDPR, it’s of course not easy, and often not allowed, to use production data for all of these projects. Definitely a big application area for synthetic data now.

Punit: Even last week I was talking to a payments provider, and they were thinking of analyzing all their payment data. The challenge was, again, that if you use the entire customer data, you are within the remit of GDPR. I advised them to look at generating synthetic data from that customer data and then do their statistics and analysis from a marketing perspective.

Once that’s done, they can take the findings and apply them back to that data, and it’ll be very close, because AI simulations can nowadays generate data which is very close to your production data, but it’s not production data. The most important thing is that it allows you to carry out research rather than having to work with anonymized data from which the necessary elements have been taken out.

Alexandra: Absolutely. I think that’s the main benefit of using synthetic data compared to anonymized data, because you can retain nearly all of the utility of the data, unlike with the classic anonymization approaches where, as we touched upon, you of course lose lots and lots of information, which is a downside if you want to do analytics or research. Now I’m super curious to talk with you and get your take on AI and privacy. Your book is called AI & Privacy: How to Find Balance. What’s the secret? How can large enterprises successfully navigate the complexity of AI and privacy?

Punit: I wish the secret was like the book The Secret: think, believe, imagine, and it will happen.

Alexandra: Would be nice.

Punit: Would be nice, but it’s not that simple. It’s about a few things. One is having a balanced view, because entities or corporates tend to think from one perspective, usually the financial perspective or the business perspective, while there is also the legal aspect, the privacy aspect, the human aspect, and, more importantly, the ethical aspect, because it also depends on the society you are in.

You need to look at multiple perspectives. Then the other thing is to create a balanced decision-making framework, that is, have a committee, have a team, and also representatives from multiple backgrounds who do that decision-making together. Then, of course, the other aspect is to have a set of principles, because privacy is principle-based. For AI, we talk about responsible AI, AI ethics, or ethics by design, but there’s no such structured mechanism yet. What we recommend is having a set of principles, starting with the GDPR and then followed by responsible AI, combining them and following them.

While you do that, it’s also about keeping the responsible aspect in mind. Responsible not meaning responsive design, but the responsibility aspect of it: that it’s lawful, it’s done in a fair way, and it’s done with an end in mind, an end from a societal perspective rather than from a corporate perspective. If you do that, I think you will find a reasonable balance, and that balance will allow you to eventually see through both AI and privacy, because it’s not either/or, it is both. AI is important, AI is the future. Privacy is important and that’s also going to stay.

That’s in a nutshell how you will find balance. Have a framework, have principles, have a committee, and look at it from a 360-degree perspective if we call it like that.

Alexandra: I can imagine that this is definitely important. Can you elaborate on having the end in mind from a societal perspective? What would be an example of that?

Punit: What I meant is, if you’re putting a solution in place, you should think of it as if you were the end consumer of it. Don’t look at it from the business perspective, but from the consumer side of it, and don’t judge it like, “Oh, I would love to have it.” Think in terms of a conservative person and ask: what would I think about it? Would I like that solution to be in place? Would it be respectful of my privacy?

Take, for example, you’re building a virtual assistant, say like Alexa or Google Home, or anything that Apple has. You would like that to be put in the bedroom and to silently listen to the conversations, and all that sounds fine and fancy, because you can create profiles, you can have a psychological profile and predict, and then, of course, sell more, and all those things are commercially viable. But then look at it: if you are in your bedroom, or if you are talking to your lawyer and you forgot to turn off your audio, would you like that conversation to be heard? Usually, we will say yes, it’s no problem for me, because I’m very open, but think a little bit and give it some time to sink in. If you then say “yes, or probably, I think I need to think about it,” that’s a good enough reason. If you have that slight hesitation, that slight doubt, then you need to consider: if even you, as a commercial person, have that little doubt in your mind, the actual user will have far more, and especially the skeptics will go after your company much, much more.

That’s what I mean by having the end in mind: don’t look at it only from a business perspective, because of course you have to run the initiative, but from the perspective of the end consumer, the person who will use it. Think from their perspective, or even ask yourself a question, like we usually say: would this be something which my child, my 16-year-old daughter, my mother, or my sister would like to use? If I can record their conversation, would I be happy? Those are some tough questions to ask and they’re not easy.

Alexandra: Yes, absolutely not, but I think those are the important questions to ask, and it’s definitely critical that you put yourself in the shoes of the average person and the consumers and not only think of maximizing business impact when implementing AI solutions. You mentioned AI frameworks, and there are plenty of AI frameworks emerging, also AI guidelines and ethical AI guidelines, but we don’t see too much regulation. Currently, of course, there are some discussions, but what’s your perspective on AI regulation? Should there be more regulation or not?

Punit: I mean, first things first, there is no regulation because AI is a new technology, and technology always leads while laws always lag. Regulations come a few years later. The iPhone came in 2007, and we got the GDPR in 2018. The previous privacy legislation was from 1995. There’s always a lag of a few years. Right now, there is no legislation, but there are a lot of frameworks: the US framework is a good one, the EU guidance is also a good one, there’s responsible AI, the OECD has a guideline, even China has a framework, although of course I miss the word transparency, or trust, in that one a little bit. I mean, that’s the word I’m missing.

It also creates some doubt, but at the end of the day, there are a lot of frameworks, and none of them is perfect, in my opinion; you need to take the best of them. In due course of time, we will have legislation, whether we like it or not, and that legislation will determine what is to be done and what is not to be done. But still, if people expect that will be the only legislation to follow, I think they will be disappointed, because it will coexist with the legislation we already have, like the GDPR, like the privacy directive or regulation, or even other regulations around data or security. It will have to coexist, and it’s better for companies to start adopting a framework- or principle-based approach already, so that when that legislation comes, you don’t have to go back; you can go forward and enhance your principles, security, and framework.

Alexandra: Yes, I see that. Coming back to privacy and artificial intelligence, in your opinion, what would you say are the three top challenges or obstacles that organizations have to overcome if they want to implement AI projects?

Punit: Only AI or AI and privacy?

Alexandra: Only AI projects and what the privacy challenges are if you want to do AI.

Punit: I mean, it doesn’t change much, doesn’t shift much from what I said earlier in the context of privacy challenges, because AI, of course, complicates the situation; there’s a larger business benefit and larger business interest. But it still is: who is the owner of this decision, and who decides which stakeholders will join in? Typically, people tend to say no, we don’t need privacy because it’s just a data or AI project, but you do need them. It’s the same as saying we don’t need the lawyer because there is no legislation; you do need them, not because there is a law or not, but because they will bring a different perspective to the table.

The first thing is ownership and creating the team that will decide, the decision-making framework. The second is interpretation, because since there is no law, people tend to think it’s freewheeling, while you still have your social responsibility, corporate sustainability, ethics, being a responsible company, all those things, and they have very subtle definitions. It varies from culture to culture, company to company, and society to society. That’s the soft part of it, of course, but the soft part is very hard to manage. Then the third dimension is, with those two in place, getting it operationalized, making it operational. But if you can sort out, first, who is deciding, that is, which stakeholders are at the table,

and second, the context in which you are going to decide, with a clear set of principles and a clear set of guidelines, and then have an approach, maybe ethics by design or responsible AI or a mix of everything, then you have the ingredients to set the right direction. Those are a few of the challenges.

Alexandra: Yes, that absolutely makes sense. What we hear from our prospects and clients is, of course, that with AI projects specifically, it’s also the vast amount of realistic, granular training data that they need access to, which makes it difficult within large enterprises to even get access to this data in the first place. This is also something that we hear quite frequently. But I definitely like what you’re describing; I think we can also talk about it as privacy by design, having everybody at the table from the beginning of the project, and not saying, okay, privacy is not relevant, it’s an afterthought, but really having the lawyers and the privacy people involved from the beginning, if I understood you correctly.

Punit: Absolutely. And regarding your question on people thinking data is not accessible: that’s because people go directly and start asking for the data, and when you go directly, it doesn’t work. You need to use what I call a 5A approach. You first need to create awareness of what AI is, and also awareness of what privacy is, what the legislation is, or what ethics by design and responsibility mean. Then you need to assign the responsibility to a person: think about what we are looking at and what is feasible, rather than straight away demanding that we need this data. Then conduct an assessment, align, and finally agree on what can be made available and on what conditions. When you do that, you are making an informed choice. Typically, the challenge I see is, “I need this data because we have to do this marketing analysis.” It doesn’t work like that; that sets off a lot of people or cuts corners on multiple dimensions.

Alexandra: Understood, understood. What would be your response to something I oftentimes hear, that privacy is actually hindering the adoption of AI? I assume that you disagree with this statement. What are the people who hold this opinion doing wrong?

Punit: It’s not that people are doing something wrong. I mean, with anything we have, there are always what we call the glass-half-full and the glass-half-empty people. There are those people who love the noise, and it is the job of the media to create attention or a little bit of sensation, and those people tend to be heard a little bit more. I still see a lot of articles, whether on LinkedIn or in the media, saying privacy is dead, or the pandemic killed privacy, things like that, or AI will kill privacy.

Well, it’s good for sensationalism, but effectively, if you look at it, those who are in responsible positions are generally not making those statements. Unless, of course, I’m the head of consulting somewhere and it’s my business to attract clients; then I will say, “AI will be killed by privacy,” so I can get some AI business, and around privacy, I will say, “AI will kill privacy,” and get that business. I think by and large people are being responsible, they are aware, and awareness is increasing. Of course, we are not where we should be, but we are not where we were four or five years ago. It’s a journey. It’s an evolution we are in, and we are on the right path.

Alexandra: Yes, I think so too. Talking about increasing awareness, what’s your perception of how important privacy currently is for the C-suite?

Punit: It’s contextual. There are a few companies who have started to see this as a differentiator, and there’s not only awareness, there’s also an intention to put privacy in focus. For example, take Apple, just to pick one example that’s in everybody’s eyes. They’re clearly putting their mind on privacy and trying to differentiate by saying they protect privacy better than everyone else. Then if we look at Microsoft or Cisco, they’re also consistently producing messages saying, “We protect privacy.” Google is also joining these days. Facebook is lagging, but it is also catching up. If you look at the number of job openings Facebook has posted on privacy in the last year, you will be amazed how many privacy professionals or lawyers they are hiring. Something is going on in there and something will happen.

By and large, I think those people are well aware and willing to act; that was Big Tech. Then there’s a second category, a second tier, which is the highly regulated environments. The highly regulated environments like banks, insurance, or pharma are also very keen, and awareness is very high. Then we are left with the non-regulated ones, manufacturing or other companies and industries. There, I think, we are still catching up. The awareness is a little bit less. They have hired a director of privacy or a DPO, and they believe they’ve done enough.

There, the journey is a little bit behind, but everybody has their own place and pace because of these things. Big Tech needs to make money, so they are ahead of the game. Banks, insurance, or pharma don’t have a choice because they’re so highly regulated; for them, compliance is core. Then manufacturing and the others have a little bit of leeway, and they are taking that leeway. Again, it’s on the right path, though I would have liked those in category three to be in category two already. It will happen with time.

Alexandra: Definitely. I do think that banks and insurers, on the one hand, have regulation, but on the other hand, consumer trust is just core to their business, and they simply can’t afford any privacy leaks or anything that destroys trust. Do you think that, especially in the not-so-heavily regulated enterprises, and I think the majority of GDPR fines, or at least the big GDPR fines, also went to the retail sector, this is the reason that they were not yet in this middle stage of privacy awareness?

Do you think that overall, these industries are taking it more seriously now, or do you think it will just take more time and a few years until they consider privacy really as something that has to be managed much more carefully?

Punit: I think it would be hard for me to comment on the fines part or why they happened in which way, because I don’t have enough information to comment on it. From where I sit, I think the regulators also needed some time to catch up. Those are also organizations. If GDPR was new for organizations and vendors, it was also new for regulators. If organizations needed to hire DPOs and privacy professionals, so did the regulators; if we needed to get budgets approved, so did the regulators. The story is not that different.

They were also on the same journey, and they are also catching up and coming up to speed with it. Again, just as an organization focuses on high-priority risks or high-risk items, the regulators should be doing the same thing, and I think they are. They are focused on where the maximum risk is, or sometimes a complaint comes in or some incident comes to light and, of course, that needs to be handled. I think that’s the approach. They also have limited resources, so I will not put them in the spotlight and say they’ve not done enough.

Alexandra: Yes, I understand what you’re getting at. You said that regulators also have to catch up. Does this mean you expect that more privacy fines are going to happen in the coming years?

Punit: I probably don’t even need to say that. If you look at the trend of fines from 2017 to now, every year there have been more fines, growing both in number and in size. That trend is ongoing. It would be difficult to say it won’t happen. The trend is there, and it will continue, because we all know some companies are taking it easy, not all of them. It will continue. And given that the regulators are speeding up and getting enough staff, I think a lot of guidance will also come along.

Alexandra: Yes, of course. I think it’s a combination of, on the one hand, giving guidance and explanations and showing organizations the right way forward, and on the other hand, having fines as a tool to really get people to treat this topic as something important.

Punit: Right, yes. I always compare this with driving through a red light at a crossing where there’s a camera. There are two ways of looking at it: the government is making money by putting in the camera and they will issue fines, or I need to drive safely. I think usually we look at it responsibly and say, “I need to drive safely.” It’s the same in the case of GDPR or privacy. Yes, there are cameras and yes, there will be fines, but the objective of those cameras or those fines was never to make money. It was always that people drive safely and responsibly. It’s the same here with GDPR. The only thing is, it’s not the police department or traffic department, it’s the regulator, and it’s not individuals driving here, it’s organizations.

Alexandra: Yes, I think that’s a nice analogy that really shows the intent behind these fines. If we look at 2022 and beyond, which topics should privacy and data professionals be prepared for? What are your predictions?

Punit: I would say AI and data, because thanks to the pandemic, and of course we can look at the pandemic in different ways, digitalization has increased, and increased multifold. It’s not seemingly ending soon, and even if it does, there will be variants and so on. We will stay busy, but we’re not going to go back to 2019 or 2018 levels. Digitization will stay, and if digitization stays, then two things will happen: there will be a lot of data, and there will be a lot of AI and a lot of technology to manage.

When that has to be managed, privacy professionals will also have a role to play. It’s good for everybody. If you look at it, these days the number of posts I see on LinkedIn and the number of ads I see on YouTube have increased many times. It’s because companies in the last 12 months have been investing in technology and AI and so on. That will continue, and I don’t see anything being different. Along with it, another thing which will come up is security, because just as the laws lag the technology, the technology also lags the criminals, because criminals usually are a step ahead. They find a way, and then technology beefs up and security measures are created. I think another big thing that will happen is security around the data.

Alexandra: Yes, I can see that coming. With AI and data becoming more important and, of course, privacy staying as important as ever, our listeners should definitely check out your book AI & Privacy: How to Find Balance. It’s available on Amazon. Can we find it anywhere else? [crosstalk]

Punit: Right now, it’s available on Amazon. It’s published with my co-author, Eline Chivot. We have it available on Amazon in print and as an e-book.

Alexandra: Perfect, perfect. Good to know. Before we end: at the end of our episodes, we usually like to play a short “This or That” game with our guests. Just answer what comes to your mind first. Are you ready for that?

Punit: Let’s try that.

Alexandra: Perfect, perfect. Podcast or Clubhouse?

Punit: Podcast.

Alexandra: Podcast. Why podcast? Because you have one of your own?

Punit: Yes, I have one of my own and I’m on other ones as well. The other thing is, with Clubhouse I have had some privacy concerns; I have not researched them. And I’m not a fan of being locked into one operating system. Clubhouse right now is on Apple or iOS only, so that’s why I would prefer podcasts, because they allow access to everyone rather than locking you to a platform or an app.

Alexandra: Yes, that’s a good argument. Privacy or personalized marketing?

Punit: I would say both.

Alexandra: Best to reconcile both of them, I’d guess?

Punit: Yes. The personalized marketing for those who choose and non-personalized marketing for those who do not, and then you take care of privacy.

Alexandra: Yes, of course, basically, giving people options.

Punit: Yes.

Alexandra: AI, friend or foe?

Punit: Friend or foe of whom?

Alexandra: You are to decide.

Punit: I think it’s a friend. Any technology that comes along is always a friend. The way we use it may be counterproductive, so we need to be responsible and aware of that. It’s like missiles or satellite navigation or satellites. Are they friends or foes? Normally, they are meant to be friends. If you use them for other purposes, then they can be foes. I’m more on the friend side.

Alexandra: Yes, good attitude to that. GDPR, all black and white or lots of gray areas?

Punit: That’s a tough one. For me, personally, I would say black and white, because it’s a principle-based approach. You can always find it in the black or the white. Sometimes you are in both, and yes, there are shades of gray, but for me, it’s generally black and white because it’s principle-based. There’s not so much confusion as people make of it, at least in my mind. If you are clear on what risk you can take, whether you’re willing to take the risk or not, then the answer is generally straightforward. But people don’t want to take the risk; they want the risk to be on the GDPR side, “No, it said so,” and that’s where the grays come in.

Alexandra: Understood, understood. Do you have the expectation that GDPR is going to change in the next few years? Some experts say that there are some gray areas and things that need to be clarified, or that were maybe too strict. What’s your take on that?

Punit: Like any other law, it will evolve, but it won’t evolve in the short term. I think that every five or seven years there’s a review; GDPR also has its review cycle, but it will not change fundamentally. It will only tighten or add things in the context of how the world has changed.

Alexandra: Understood, understood. Then my next question would be books or e-books, what do you prefer?

Punit: I would prefer e-books.

Alexandra: Why is that?

Punit: Books, I do have some of them with me, but then I need to carry them. With e-books, even when I’m walking or listening, if something strikes me, I can get to it; it’s the ease of accessibility, if I can call it like that.

Alexandra: That’s definitely a benefit. My last question would be data minimization or open data sharing?

Punit: Data minimization.

Alexandra: Why?

Punit: Open data sharing as a concept means we will share data and there’s no such thing as minimization; it means sharing all the data. I’m against that, because not all data is meant to be shared. The two of us would not share our bank account passwords, even though we may know each other. It’s as simple as that. That’s not open data sharing, and a password is data. However, within limits, within context, we may be open to sharing some data and sharing some insights. That’s minimized data. Always from the minimization perspective.

Alexandra: Understood, understood. In general, do you think it’s beneficial for organizations to, of course, not openly share all data, but share some data with startups and research communities, or do you think it’s better for these organizations to just use the data themselves?

Punit: I would say transparency is always welcome. You need to be transparent and you need to be collaborative. Data sharing falls in the collaborative space, because you as an organization have your own mindset, your own culture, and your own resources. If you’re collaborating with others, you have more opportunities and more insights, so definitely go ahead and share data with the right rules and the right application of privacy principles.

Alexandra: That’s definitely a good point. One thing about data sharing I’m also excited about: one of my colleagues just published a blog post about explainable AI and the role that synthetic data could play in that regard, because it helps you publicly share the data and understand the model, whether it discriminates, and how it performs, all in a privacy-preserving manner.

I think it’s also beneficial to have ways to really ensure that auditors and external people can check what algorithms are deployed and used in practice. I think there are definitely some benefits to data sharing, but I agree that, of course, not all data should be shared openly, and definitely not passwords; don’t do that, dear listeners.

Alexandra: Other than that, do you have any last remarks for our listeners?

Punit: No. I would say go ahead and explore the potential of AI while respecting privacy; that’s the right way to do it. It’s possible if you factor in, like you mentioned, explainability, ethics, responsibility, purpose limitation, and also human intervention, because we don’t want the science fiction movie situation wherein there’s a robot, he or she or whatever you want to call it, that has taken control over everything.

And we are not able to control it, because it’s AI and it’s learning from humans. We don’t want that to happen. That’s why the ability for humans to intervene or keep control over those actions is the most important thing.

Alexandra: Wonderful. Thank you so much, Punit. It is always a pleasure to talk to you. Thanks for sharing all of these insights with us today. For our listeners, don’t forget to check out Punit’s new book, which will be available soon, and also his own podcast FIT4PRIVACY. Definitely plenty of insightful conversations and talks there as well. Thank you so much.

Punit: Thank you so much for having me. Thank you.

Jeffrey: This interview should be covered in a book. Don’t you think?

Alexandra: Maybe it should be. Let’s see, what would go into the last chapter as takeaways?

Jeffrey: The three biggest privacy challenges for enterprises are?

Alexandra: Ownership, legal interpretation, and of course, making privacy operational.

Jeffrey: One practical piece of advice Punit shared was to create dual responsibility by treating privacy not only as a legal issue but also as an important business driver.

Alexandra: Exactly. Punit also recommends synthetic data for data sharing since traditional anonymization techniques oftentimes render datasets statistically useless.

Jeffrey: As for AI, companies should follow frameworks and their own principles. It’s important to take the perspective of the consumer in addition to the business.

Alexandra: That’s right. We also had predictions from Punit, who said that privacy fines are on an upward trend, but we can also expect a lot of guidance coming our way from the regulators. After all, the intention of regulations is to reinforce good behavior in data management practices.

Jeffrey: I really liked the traffic light analogy here.

Alexandra: That was nice. I would have added that having synthetic data is like having access to a helicopter in a traffic jam.

Jeffrey: Or a really cool bike.

Alexandra: Maybe.

Jeffrey: Let’s leave this one to the imagination of our listeners. If you would like to ask a question or suggest a topic for a future episode, please drop us a note or a voice message at podcast@mostly.ai

Alexandra: Wonderful. See you next time.

Alexandra: The Data Democratization Podcast was hosted by Alexandra Ebert and Jeffrey Dobin. It’s produced, edited, and engineered by Agnes Fekete and sponsored by MOSTLY AI, the world’s leading synthetic data company.

Ready to try synthetic data generation?

The best way to learn about synthetic data is to experiment with synthetic data generation. Try it for free or get in touch with our sales team for a demo.