September 11, 2023
57m 47s

Data literacy and AI for all - Data Democratization Podcast S1E44

Alexandra Ebert, MOSTLY AI's Chief Trust Officer, DataCamp's CEO Joe Cornelissen, and DataCamp's VP of Curriculum Maggie Remynse discuss the importance of data literacy and how data is not just for data scientists but for everyone.

Tune in to learn more about:

  • Everyday interactions with data
  • Governments and data regulations
  • Self-education: DataCamp for improving data literacy
  • Data access & democratization: the potential of an ecosystem where startups, researchers, SMEs, nonprofits, and the public sector harness data
  • Risks of rapid adoption: dangers of the rapid adoption of AI and data technologies
  • Essential skills for the future: three vital skills that everyone should have in the next five years
  • AI & Data Literacy Month at DataCamp

Transcript

Alexandra Ebert: Hello and welcome to the Data Democratization Podcast. I'm Alexandra Ebert, your host, and MOSTLY AI's Chief Trust Officer. I'm so excited to be back after our summer break, particularly because we are kicking things right off with a bang. Today, it's my pleasure to welcome not only one but two esteemed guests and true thought leaders in the space of data and AI literacy. DataCamp CEO, Joe Cornelissen, and their brilliant VP of Curriculum, Maggie Remynse.

As many of you know, DataCamp's mission is to democratize AI and data skills, which is essential not only for aspiring data scientists but literally for each and every one. In this episode, we are going to cover AI and data literacy in detail. What it is, why it is important, and also Maggie's and Joe's best tips and tricks, how you can successfully introduce related programs into your organization. Not only that, we also talk about the importance of data access and synthetic data to make sure that your data and AI literacy efforts stick and that you can demonstrate an impact both on the organization's bottom line as well as on individuals in their day-to-day jobs.

We also shed light on how governments can approach AI and data upscaling on a national scale. As well as on how generative AI is going to influence and hopefully democratize education at scale. There's a lot to take away for everyone in this episode, particularly if you think that data literacy isn't for you, so stay tuned. Before we dive in, definitely go and check out the vast resources DataCamp is going to share over the course of the entire month of September during their annual Data and AI Literacy Month.

Their top-notch experts sharing their knowledge during webinars, podcast episodes, and even a virtual conference on September 28th, and the best of all, it's completely free of charge. Check this out directly on DataCamp or via the link in the show notes, but now, let's welcome Maggie and Joe.

[music]

Welcome to the Data Democratization Podcast, Maggie and Joe, it's so great to have you here. I'm a huge fan of DataCamp and I was very much looking forward to have this discussion today with you. Before we dive into all the interesting topics that we have prepared, could you maybe please both introduce yourselves just briefly. Maybe also share what drives you or what makes you passionate about the work you do?

Maggie Remynse: Sure, yes. I'll go first. My name is Maggie Remynse. I'm the VP of Curriculum here at DataCamp and I've been in the ed-tech space for, gosh, several years now. Five, six years. Not as long as Joe, so he'll get to tell you all about that. I love being in the data space. It's something that I never really got to experience when I was an undergrad, and I didn't get to experience it early in my career, and then I found data. That's really what drives me: I found data organically on my own, and I really want to help people do the same thing. I want to help people find their own voice in data, and their own community and career if that's what they're looking for.

Alexandra: That sounds great. When you say you found data, how did that happen? Did it just appear around the corner, or what was the process like?

Maggie: I was in grad school and I had never taken a business class in my life. I was a science major as an undergrad, I worked at a subsidiary of Pfizer for several years out of undergrad, and I worked in healthcare. I had never done anything in finance, never done anything with data. My first class in grad school was actually a computer information systems class. I realized right away, "Wow, this is a really cool space. It's numbers." For me, business was an out-of-the-box topic as a science major, so anything that's wishy-washy was maybe not as [unintelligible 00:04:12].

Being able to see data, and to see that you can find something concrete in it, really drove me into it. Then in doing different internships and things, I realized the impact that data had in different organizations, and that you can really make such a massive impact with not as much effort as one might think. Being able to see that kind of transformation is really what sparked my interest in data initially.

Alexandra: That's already, I think, a very good takeaway, that it doesn't take as much effort as many people might imagine, people who are scared of data and say, "Okay, I haven't studied computer science, it's not for me." Very, very cool. What about you, Joe? What brought you into data? What brought you to founding DataCamp, and what drives you?

Joe Cornelissen: First of all, very excited to be here, and thanks for the invitation. I'm one of the founders and the CEO of DataCamp. I feel very fortunate to work on this because I've always been really passionate about education, about data science, and about entrepreneurship, and DataCamp is really at the intersection of all three of those. What motivates me the most is seeing the impact of the work we do. That can mean people getting new jobs in the data space, it can mean people getting promoted because of their data literacy and data and AI skills, or it can mean helping organizations flourish because they can make stepwise improvements when it comes to data and AI literacy.

Alexandra: I can imagine that this must be super rewarding, so really, really cool. We've mentioned DataCamp several times now. I assume most of our listeners have heard about it, but for those that are new to DataCamp, can you give us a brief overview of what it is all about?

Joe: Sure. Our mission is to democratize data and AI skills. I think we're on the right podcast here.

Alexandra: I think so, too.

Joe: You can go to DataCamp and build skills on almost every data or AI topic. That goes from basic data and AI literacy to Python and SQL, to advanced topics in machine learning, and so on. We really try to cover the whole spectrum. In addition to that, we allow people to create a portfolio and get certified. What's unique about our approach compared to other platforms is that it's really not about passive learning. It's about active learning. Most of the time when you're on the platform, when you're on DataCamp, you're solving challenges, you're practicing your skills.

You're attempting to complete a project. That way we've helped now over 12 million learners. We're helping more than 4,000 organizations as well upskill their workforce. The latter has become more and more of a focus in the last few years.

Alexandra: That's truly amazing to hear how many people you reach and motivate to build up these important data and AI literacy skills. Since we're going to talk about data and AI literacy quite a lot in this episode, how would you define data literacy, and how does it differ from AI literacy?

Maggie: Data literacy is really the art of learning how to communicate with data in different ways. That could be actively working with data, coding, or transforming it. It could also just be being able to talk to your colleagues about sales metrics or the KPIs relevant to your own organization. AI literacy is a very similar concept. It's being able to talk about AI skills in the same way and how they could impact your organization.

How can you use prompt engineering or prompt creation in ChatGPT to ask better questions or to come up with the answers you're looking for? It can be the smallest thing, again, just talking to ChatGPT, but it could also be building an advanced model and using AI to help finish your code or correct your code and give you the right answer at the end of the day. It's really a spectrum. No matter where you are in your data journey, there's a space in AI literacy or data literacy for you.

Alexandra: That makes sense. I think this also beautifully highlights that it's not only a topic for data scientists or for those who aim to build AI themselves. If you think just a few years into the future, potentially every role will have some touchpoints with AI and data. So I think it's good to have a general understanding of concepts like privacy, AI ethics, responsible AI, and prompt engineering, all the things that matter quite early in your contact with AI, long before you build AI yourself.

Maggie: Absolutely.

Alexandra: Maybe Joe, since you mentioned that helping organizations on their journey towards more data and AI literacy is becoming increasingly important for DataCamp, why should businesses even care? Why should they care about establishing data and AI literacy within their workforce?

Joe: Sure. It's a great question. If you go back 5 to 10 years, the challenge a lot of organizations were facing was that they were just getting ready to set up their infrastructure and were moving their data to the cloud. They were ensuring their data infrastructure in general was in place and that data availability was taken care of. I'm not saying that it's done, but I think a lot of organizations have tackled those challenges or have started to tackle them.

The big problem a lot of organizations face, now that they've done that, is that they're struggling to get their employees the right skills, and they're struggling to shift their culture to be more data-driven and more data-informed. If you don't have that, it creates all sorts of problems. The number one problem is simply communication, communication between technical and non-technical employees. The value of having data teams, data science teams, goes down tremendously if it's really tough for data teams to communicate with rank-and-file employees, and the other way around as well.

You have to ensure that people can speak the same language and that they can understand each other. If you want to make data-driven decisions at scale and make better business decisions, you need basic data and AI literacy across the organization for that to work. I think that's why it really matters, and it's a top priority for a lot of companies, from what we're hearing.

Alexandra: This absolutely makes sense. When we have guests on The Data Democratization Podcast, we often look into why so many AI projects don't make it into production. It so often comes down to expectation management: those who are not data and AI literate either have overinflated expectations or don't necessarily understand the cool things you can do with data and AI. I can totally imagine that bringing everyone onto the same level can help an organization really get value out of these machine learning and AI initiatives.

I think, or at least I hope, everybody listening will agree that building up data and AI literacy is an important point in an organization's journey towards becoming more data-driven. But how do you do it? Do you have any lessons learned or success factors from working with so many organizations?

Maggie: Yes. It's a great question, because I think the big thing is that every organization is different, and organizational structure is a huge factor in this. We see all these startups that do really well with taking advantage of AI early, and a lot of times it's because of the size of the company. They're small, they're a little more nimble, and maybe their technology is a little newer, so it's easier to adopt some of those newer technologies that are just popping up. Whereas an older company, one that has been around a little longer, sometimes has a more challenging time adopting those types of things.

Technology resources are quite different: if you're changing or implementing a new technology or tool within an organization that has 20,000 people and 1,000 different archaic systems, it can be a challenge in that sense. The big thing is having people within an organization who are super passionate and really want to drive adoption of those types of things.

Seeing people who are not just part of HR and a learning and development team is, I think, the biggest key in driving company-wide adoption of any data initiative. You want it coming from the data teams. You want it coming from the technology teams, really pushing the impact a company can truly see. If you only have buy-in from HR, it's like, "Oh look, we've heard that this is important, you should do this."

That's when the business teams are going to be like, "I'm not really sure I buy this." You really have to show the impact you're having: what is data doing for your company? How are your data-driven decisions impacting your bottom line? How is it impacting each person within the organization, and how does your job really relate to data? Those are the things that help ensure company-wide buy-in, because it's not just a top-down approach. It has to come from the bulk of the people. You have to have buy-in at all different levels for any type of initiative to succeed.

Alexandra: Understood. Basically, if HR drives the initiative, then they may not be the best equipped to demonstrate this impact and to know where to place projects so that you can see these early successes. What else is important, when you say it should be the data teams, not the HR teams, driving this and having buy-in from everybody? What else is needed to be successful, or what challenges do organizations frequently need to overcome?

Maggie: Going further into this, you always want to have people who can show the value they're bringing to the table and what that impact really is. That can be a challenge. A lot of times people look at data initiatives and say, "I don't really know what the benefit is in the long run. What are we really doing?" I think that's always been an issue, and it continues to be one for a lot of companies: showing the value they're really bringing, because it's not always a dollar sign, like, "Oh, this is our data value. We did this and now we have this much money that came back to us."

It's a little less concrete than that. You want an organization that understands that this type of growth isn't necessarily going to impact the bottom line today. Giving time to employees, and that can be a challenge, giving them the time to really educate themselves and learn, blocking off time in their calendars, giving them access to something like DataCamp where they can learn, is really important.

Those are the types of things where, over time, you're going to start to see the value kick in. You're going to see operational efficiencies in the work they're doing, whether it's creating a data model or running the same types of reports they always do. They might find efficiencies by utilizing the new data technologies, tools, and skills they're gaining by devoting time to learning.

Alexandra: Understood. I see, Joe, you want to add something? Please go ahead.

Joe: I just want to add one thing, which is that we've seen the most success in organizations where both learning and development and the data leadership support initiatives to upskill the organization when it comes to data and AI skills, or to change the culture in that area. One particularly interesting thing is that we've now seen several organizations that even have a head of data literacy as a role. It's a role that sits between L&D and the analytics and data team, and its sole purpose is to uplevel the organization in this area. That's obviously the dream scenario. A lot of companies are either not large enough or they're not-

Alexandra: Not yet there.

Joe: -supporting that journey. It's a very interesting sign to see new roles emerge, because it usually means something is really changing. I also agree with what Maggie says, you need that top-down buy-in, but quite often what we see in successful organizations is that it starts very much bottom-up. Even organizations that have a push from the top down still underestimate the latent demand that exists within their workforce. Quite often, organizations are surprised by how many people sign up for workshops, on DataCamp for example. It's not top-down, it's not bottom-up; I feel it's a combination of the two.

Alexandra: I can totally relate to that. When I think of the customers we are working with, we see this too. Some organizations use synthetic data to solve a specific problem. There's, I don't know, an AI team that wants to build a better prediction model, but they don't have granular data to access, so they start using synthetic data. It's really stunning to see those organizations where the data leadership has also recognized the potential and the need for data and builds up these internal synthetic data lakes, where suddenly every department, every intern, every leader can access the customer base in a privacy-safe form.

Then to see what type of innovation emerges from that. So many departments that are not traditionally data-driven are super interested in getting their hands on this type of data because they've never seen it before. This actually brings me to my next question, because Maggie emphasized how important it is to show impact and to make this connection: okay, this is not only an important skill to have, but how can you apply it in your business?

How would you imagine this playing out if organizations work both on data literacy and on data access, with synthetic data and other technologies, to make sure that employees can work with relevant data that belongs to the organization? That way their learning is much more practically relevant, as opposed to just learning data literacy with the toy datasets we have on Kaggle and other platforms.

Maggie: Yes, great question. One of the big things that we definitely see, and that I've experienced having worked in a large bank, is data access. One of the foundational pieces of data literacy and building a data culture is getting people to feel comfortable utilizing data and being able to find it. That's also one of the things that is under the biggest lock and key. We want to give people access to data, but it's also, "Wait, you shouldn't be seeing somebody's social security number," and their really private information isn't necessarily relevant, so what do we do?

A lot of times the answer is, we're just not going to give anybody access to anything. But I have seen that synthetic data really gives people an opportunity to access data in different ways, and in a safe environment. They're able to actually see the data and interact with it. They don't necessarily even need to know that it's not real data, but it's mirroring the information their own organization has, and this way they're able to actually explore data.

I've seen so many people in organizations, especially when I was in banking, who were working in a branch and got access to a data platform. They were able to see it, and then they started coming to our Tableau workshops, because they were like, "Hey, I saw this, I want to be able to do this. How do I do it? I don't even know what Tableau is, but can I use this? Can I try?" Giving access to data really creates a data curiosity, and that's how I got involved in data and in my own data community.

That's what I saw: as I got access to data, I learned a couple of skills, I was able to apply those skills to what I was working on, and I had fun doing it, and that's what drew me to the data community. That's why we see a lot of people getting involved in data. If they never have access to data and tools, they're not going to understand the impact. They're not going to be able to make those connections in the same way. Synthetic data definitely gives such a great opportunity to so many people who never would've had it before.

Alexandra: That absolutely makes sense: if you have never touched it, you don't even know what you're missing out on. I think in our prep call you also had this fantastic story about your husband, who was not traditionally a data-savvy individual. Maybe you can retell it for our listeners.

Maggie: For sure. I feel like if you've heard me on a podcast before, you've probably also heard this.

[laughter]

It is definitely one of my favorites. I've been in the data space for years now, and my husband is definitely not in the data space. He's in sales at an organization, and I constantly preached to him about using data and showed him how to create different dashboards, reports, and things like that to show the impact of the metrics they're really trying to track.

He started doing this on his own and began bringing it to his own team: "Hey, look, when we do more calls each week like this over the next three weeks, we end up closing more deals." Or, "Hey, look, you did more proposals, and now our bottom line, or your bonus, is growing, because when more proposals go out, you're able to translate them into deals."

For him, having these really small things, which are now his automated little reports, his team has been able to make a huge leap: "Oh my goodness, what I'm doing here, I can visually see the impact it's having on my own bottom line." They make a bonus based on how many deals they close, and really seeing the impact they're having motivates them in a different way to actually do the role they're set out to do.

Now, they understand why they have these specific targets. Instead of five proposals a week being an arbitrary number in their heads, now they're like, "Oh, I have five, because if I have five and three drop off, I still have two, and these two can translate into closed deals."

They're really able to see that. When Ken is onboarding new people, he shows this to them right away, because they ask, "How do I get to success?" He's able to show them, "Look, if you do this, this is what has happened with your peers on the team. If you're doing these types of things, it's going to lead to this type of outcome for you."

Those are just small data wins that they're able to see and really own. It impacts them in a much different way, because now they're motivated to do things that seemed boring and silly before something visual was put in front of them. That's really why I think data storytelling and dashboard visualizations have transformed data in a different way as well.

Alexandra: I really love this story. Thank you very much for also telling it on The Data Democratization Podcast. Joe, I believe you also wanted to add something, or if it's not top of mind, I would be super curious to get your input on something else. You mentioned data culture earlier, so we have establishing data literacy and having access to data. Is there anything else that, when you work with organizations, you find to be important so that, at the end of the day, organizations are really in a position to accelerate data-driven and AI innovation? Are there some other puzzle pieces that are important in this context?

Joe: I think if you break it down, there are different components. If you start with people, it's about data and AI literacy for the whole organization. When it comes to experts, it's less about upskilling people to become experts. In a lot of organizations, the biggest challenge right now, even in this market, is finding the experts: senior data engineers, senior data scientists, senior machine learning engineers.

They're still really hard to find. Hiring is definitely still a challenge for a lot of organizations. That's true at the expert level, and you can solve some of that through upskilling, but typically not 100%. Another people challenge is having people in leadership with enough of an understanding of data and AI. Some of that can be achieved through upskilling and reskilling, but ideally there's a healthy mix where enough people in leadership positions actually have a bit more experience as well.

I think because it's so new, especially everything related to AI, there's just not enough leaders out there-

Alexandra: That's true.

Joe: -to go and hire. Typically, you don't have a choice, you have to invest in upskilling your workforce.

Alexandra: How do you actually foresee this playing out in, let's say, three to five years' time? I'm just thinking of all these building blocks that are currently popping up, tools like GPT-4 that help you with certain tasks and put many more people in a position to use them. I used Workspace, which you have on DataCamp, the other day, where you now have this AI feature that codes something for you, which is really fantastic and allows you to just prompt in natural language.

Then you have the code there. This of course gives much broader access to talent that can operate these types of tools. Then we also have all these different building blocks and pre-trained systems that we can get from the large cloud providers, which I think will also develop massively in the coming years. How do you envision this playing out? Will we always need more of these experts? If we don't have enough now, how are we going to get them?

Does it have to be upskilling at the end of the day, or do you think that we can increasingly have more citizen data scientists, because the tools just make everything much easier to operate if you're not a true expert?

Joe: I think what's exciting right now is that AI itself is going to create more citizen data scientists and is essentially giving superpowers to citizen data scientists. You used to have to be able to write in SQL or write in Python. It's still an essential skill, don't get me wrong, but you're also going to be able to use natural language to query data and to kind of adjust reports and so on. We have our own product called Workspace, where we essentially want to help people transition from learning to doing and make it really easy for them to get started with data and write their own reports or write their portfolio.

I think one impact AI is going to have is giving superpowers to citizen data scientists. At the same time, I don't think that's going to be a replacement for having experts. At the expert level, the same thing will happen, and it has happened to some extent. If you look at software engineers, these new tools already make some software developers, especially the more senior ones, 30%, 40%, 50%, 60% more effective. We're just at the start of this, so imagine what that's going to look like one to two years from now.

The same thing is happening for data scientists. Where you might have a 10X developer or data scientist today, you might get 100X or 1,000X data scientists in the future. That means those people become even more valuable. I'm not so sure that's actually going to solve the recruiting problem, because these people will become even more valuable, so it could actually achieve the opposite as well.

Alexandra: Makes sense. Truly exciting times ahead. Maggie, I'd love to get your perspective on this as well; Joe and I have already talked about it a little. On the one hand, organizations want to establish AI and data literacy, but just observe what has happened in the past few months with the widespread adoption of GPT-4 and other tools like it. How is AI impacting the field of education at large? What's your take, and what are you most excited about?

Maggie: I think that AI is really impacting the education sphere in the sense that it's giving access to people who might not have had access before. You can go into GPT-4, query something, and learn something right away. You're finding things much more easily, and the accessibility is really through the roof. What I think is going to happen long term is that, in a lot of countries, especially in the US, education is super expensive. It's out of reach for a lot of individuals, so having access to things like GPT-4 or DataCamp or other kinds of tools gives an easier, much less expensive, much faster way to learn the skills they might otherwise have learned in an educational setting.

The other thing is that higher ed has traditionally always been a little bit behind. Things change so drastically. I remember even when I was a kid in school, you were using textbooks that were 20 years old. When I was in school, that was probably mostly okay. Today, it's not necessarily okay. You have to be really up to date with today's understanding of what's happening.

Otherwise, these kids are going to graduate undergrad, and yes, they're going to have some great tools and great skills, but they might not already know the things they really need in those first roles, especially in the data space. If they come to DataCamp, they're going to be able to learn a lot of those skills, because we're really working to update content in real time and give access to the newest technology skills. The flip side is that there has to be an understanding, an agreement with companies, that four-year degrees aren't necessarily the key to success.

I see a lot of jobs that say, "Oh, bachelor's degree required, master's degree required." If somebody is certified as a data scientist and has taken a ton of skill assessments in AI and machine learning, who's to say they're not an expert in those topics? I think we have to be more accepting of these non-traditional routes, because AI is really disrupting education, and we have to let it, because that's the only way we're going to grow as organizations and really as a society.

Alexandra: That makes sense. It's particularly true for data and AI roles, or data-heavy fields. I have quite a few marketing professors in my network, and they always tell me they have this real struggle over what to teach students now, because by the time the students finish their bachelor's degrees in three years' time, the tools will already have developed so impressively that they may be using something completely different. I think it's really interesting to see how this impacts education.

One other thing, since you mentioned that DataCamp strives to have the most up-to-date and relevant topics, I'm always wondering, how do you pull that off? Because whenever I observe the AI space, it's so hard to keep track of everything that's happening. How do you manage to do what you do so successfully, and how do you decide which topics are most important to have courses, webinars, and tutorials on next?

Maggie: We're lucky that we have a really diverse group that works at DataCamp who are really skilled in lots of different areas. We're able to bring in tutorials or podcasts like this or webinars or different things really quickly that showcase a lot of the hottest things in the world. And we're able to bring in the top people in those industries as well to teach us those things.

For courses, I will say it's a never-ending process of research and more research and more research and talking to people and all that stuff. At DataCamp, we're fortunate to have already been around for about 10 years and to have a really good network of instructors. Because of that, we have a great network to be able to find new great instructors all over the world who are already experts in these things. Generative AI is pretty new. LLMs have been around for a while, but the topic has definitely exploded in the last six months.

We're able to tap into those networks of people who are the most skilled in the world in those areas to come and help us teach our learners. My team, we don't pretend that we're experts in every single technology. We want to help find the best experts in the world to help us teach those technologies and those skills to everybody, because we definitely value high-quality learning and we definitely want people to feel that they can come to DataCamp and get the most up-to-date content, learning, and knowledge, across a wide variety of different modalities.

Alexandra: That's truly stunning. In July, we also did this webinar on synthetic data together, and it was just so wonderful to be in contact and in touch with your community. It felt like a synthetic data world tour because we had people joining from all the different continents. Truly amazing what you've built up here. Maybe also, Joe, what's your perspective on how AI is impacting education? There's also something I would love to hear from you again; we discussed this already in our pre-production call.

You shared something I find truly impressive, that DataCamp is also available for free for so many teachers, professors, and students, to really help us upskill at scale. Was this one of the driving factors for you, that you observed the challenges that organizations, even nations, face, and decided to take this step and provide this valuable tool for free?

Joe: Maybe let's take the last part first, and then let's talk about how AI will impact education in general. DataCamp is available for any teacher, any professor, anyone in high school who wants to use this in a classroom setting. We essentially provide it for free. We have thousands of professors from all over the world enabling a few hundred thousand students through this mechanism. I think the fact that we have such adoption even though we don't do any huge marketing push around this shows that a lot of teachers really want to bring those courses and those skills into the classroom.

Part of the challenge is that they're trying to catch up as well. They don't always have the time to become experts in all these new fields, and so we help that way. It's been a very effective program. We do it for teachers and we help non-profits as well around the world. Now, to address the first part of your question, I'm super excited by the impact AI will have on education. If you go to the essence, the most impactful, the most efficient, effective way of educating someone, there's been a ton of research around this, is usually to have one-on-one tutoring, coaching, and to then have someone do challenges and bring the learning experience as close as possible to what they'll eventually do. It's like the apprenticeship model.

The challenge with that is it's incredibly expensive in most countries. I think what AI will now enable us to do, not just us, but any education company in the world, is to create these one-on-one tutors. To create these AI tutors for all different use cases and to offer the best educational experience for every student in the world. Most importantly, to be able to do that at a very low cost.

I'm super excited about everything that will happen there. DataCamp has already integrated some smaller AI features into our product right now, and we're also building towards that AI tutor that will be your personalized guide. I think those will eventually have a massive impact on the effectiveness of education and the availability of education as well.

Alexandra: I can imagine. This is also an area I'm excited about, similar to what Maggie said earlier. Just think of what percentage of the world isn't able to write English or one of the common languages of the internet. The capabilities we now have with natural language processing are actually the key to unlocking the knowledge base we have here for the entire world. That's definitely one area I'm super excited about.

One other thing though, I'd be curious to get your perspective. I work a lot with policymakers on responsible AI topics. One thing that oftentimes comes up, particularly now with the discussions on ChatGPT and similar tools, is that it's not necessarily clear which data they were trained on. Also, with the kind of free-text responses that you get, what types of biases, what types of worldviews get into the output that's provided?

When I think back to school, for example, when we were taught critical thinking, teachers exposed us to a whole spectrum of different information, more left-wing media, right-wing media, things like that. How do you foresee this playing out? On the one hand, we have the potential to use AI for more one-to-one settings, but we haven't yet fully figured out what is actually taught, what kind of worldview, where it sits on the spectrum, what you get out. What's your perspective on that?

Joe: I think it's what's happening now, it's just going to happen at scale. If you look at education around the world right now, there's the impact of the country you grew up in. I grew up in Belgium, and Belgium has certain biases. Maggie grew up in the US, and I'm sure there are already differences in what teachers told us and what they didn't tell us. That's going to get encoded into the AI tutor. Then there are differences between school systems within countries, between Catholic schools and other types of schools, which I think are also going to get encoded.

I think you'll get different layers and the dream scenario ultimately is that you have choice at the student level around how you want to personalize your AI tutor. That's what I hope will happen. Let's put it that way.

Alexandra: When [unintelligible 00:40:11] just described them, I'm also not sure how it will play out. I think it would be super helpful to have this diversity, and have a Belgian-worldview AI tutor and a US-worldview AI tutor. But if we just think about the current costs of training these large language models, I'm wondering whether we will, for example, have a Swiss or Austrian one, Austria also not being the largest nation on earth. Will we have all this diversity, or will it monopolize in some areas? That's definitely a problem we might need to solve, but if we can solve it, I think the potential of this one-to-one tutoring is really going to have a massive impact on the education ecosystem.

Maybe in the interest of time, one other thing that I absolutely wanted to discuss with you: we've now talked more about how to establish data and AI literacy within an organization, within a business, but of course, data literacy is said to be one of the most important skills of the 21st century. We also need to have it on a global scale, on a nationwide scale. How should a government actually approach this topic? Do you have any recommendations from your experience on how governments, how nations, could go about establishing data literacy at scale?

Joe: I think the first thing I would say is that it's really important for countries to invest in this area to preserve competitiveness, but also, and this ties back to your earlier question, to preserve the values of a certain country. They need to be part of this game, and I think most governments at this point are aware of that. At the same time, if you look at the level of investment from governments, I think it's still fairly low compared to how big the shift is and how important it is geopolitically. For example, if you look at the level of investment in China in AI and AI education, it's really high, and I do think some other countries probably have to step up their game if they want to remain competitive.

We have worked with some government clients, we have worked with US government agencies as well as the government of Singapore, and it's not that dissimilar in a way from working with organizations in terms of upleveling and upskilling the government itself. Where it is different is that the government obviously can mandate that in every high school, certain skills need to be taught. I think that's a really important mechanism where we are starting to see change but it's still at the very early stages.

If you think about data science, there's been a big shift at universities. In a lot of high schools, it's still not really a priority, or not as much as it should be, in my opinion. I think with the evolution of AI, that should be something everyone in high school is taught: the risks, the opportunities. It's not yet the case.

Alexandra: That's true. Maybe Maggie, since Joe mentioned the level of investment is not necessarily where we want it to be, and you earlier mentioned that education systems so far haven't necessarily been the fastest at integrating new knowledge, new content, what would you wish governments would do now to set their nations up for a better future in the space of data and AI?

Maggie: Just like in an organization, where CEOs need to understand the impact of data and AI, I think government leaders have to do the same thing, and that's something that hasn't necessarily always been at the forefront. I think about the US and so many of our leaders in the US. They're older, we'll say it in a nice way. They're not as up to date on technology. You can watch C-SPAN all day and see some of the questions that get asked.

I think the first step has to be that they understand what data and AI, what these skills, can really do for them, but also what they can do for their constituents, the communities that they're within, and really advocate for them. That has to be the first step, in my opinion. We want to get people to a place where we're then able to adapt quickly. In a lot of countries, policymaking takes a really long time, and AI and data are fast-moving. Joe already mentioned that China's already on top of this, so if other countries really want to be on top of it, they also have to be a bit more nimble in the way they make decisions and quickly adapt to things.

It also goes further. I think about the US because I live here and I've always been here, but if I think about going into different communities, there's a huge disparity in economic wealth between communities and in the resources they're able to provide. I look at the school districts, and funding is based on the tax dollars that go into them. It's a huge disparity.

There's also a big chance that we're going to further divide those economic classes with data and AI investment if it isn't done equally across communities. Because if we're basing it on tax dollars, for instance, then some communities are really going to flourish, and others are going to fall behind and suffer, and those disparities will continue to expand. I think it's got to be a really sensitive approach, but like I said, the first step is definitely for our leaders to really understand these impacts and educate themselves further.

Alexandra: Yes, absolutely. Makes sense. One thing I'm just worried about is the speed. If we first have to build up these skills within the public sector, and at the same time everybody speaking on stage in the AI space wishes this had already been started in education 10, 15, 20 years ago, how will we actually be fast enough to transition to data and AI literacy at scale? What do you think is realistic, or where could we get in the next five years in terms of data literacy at scale? What do you envision the status quo to be then?

Maggie: I guess I'll say it's hard to envision a status quo per se compared to where we're at today. I do think that it's important for, in the short term, everybody to do what they can and to really take advantage. Obviously, it's great if the government gives us all these resources, but it's also something that anybody can take advantage of right now. Most people have a smartphone or a computer or something, or even public library access most likely. We're all able to really take advantage of learning if we really want to.

Right now, data and AI are part of our everyday lives, whether we really know it or think about it or not. I'm sure most people have an Instagram or a Facebook or a Twitter, or the new Threads that just launched a couple of days ago. We all have something, and we're all constantly looking at the different data points that come in, or the weather app on our phones. We hear all these different things.

We're already knee-deep in data. We can take advantage of those things. We don't have to wait for the government to also be onboard, but it's also something that the government can take advantage of now. Yes, I want them to understand data more, but they can also bring in people who know more about data and rely on their expertise a bit and learn while they're also updating our infrastructure, our systems, or things that are publicly available in the country right now. They don't have to wait for all that stuff to happen.

My hope is that those types of things can really be expedited, and that in five years we're living in a day and age where data and AI skills and access aren't something we're still discussing like this, asking, "How can we give people access?" It already is. It's already accessible.

People already are aware that it's there and we're able to take advantage of it.

Alexandra: Absolutely. Of course, everybody can just go on DataCamp and the internet and start taking their education into their own hands. With data access, I'm a little bit more skeptical. I'm working a lot on the policy level. While so many nations and economic unions have this vision of becoming AI leaders, in my opinion, they're not paying enough attention to this data democratization, open data, and synthetic data aspect. Because if only large organizations have access to data, then I don't necessarily see this ecosystem flourishing where startups, SMEs, researchers, nonprofits, and even the public sector can really work with data and drive an impact.

I think there's, again, quite some work that needs to be done. Maybe, Joe, building on what Maggie said, what would you say is the best way for every individual to get started and become more data literate if they haven't yet touched the data space, not waiting on the government to serve everything on a silver platter?

What would be your advice?

Joe: My advice would be to go to datacamp.com, create an account, and get started. In all seriousness, I'm very optimistic that this is like when the internet came about, or when computers came about. People adopt things really fast when they have to, when they see the benefit for themselves personally and for their organization, for their community.

I'm pretty optimistic that we'll see a very rapid upskilling of a lot of people in society. If you just look at the growth numbers of OpenAI, if that's any indication, that is the fastest adopted product in, I think, history of-- I think the data is telling us like, "This is happening faster than anything we've ever seen."

If you extrapolate that trend, I'm pretty optimistic.

That being said, I do think, especially from an organizational perspective and from a government perspective, there are real risks here. I think the speed of adoption and the speed at which people get excited about this also create massive risk, which in turn means education becomes even more important: ethical risks, all sorts of risks related to bias. I think it's important people are aware of those as they excitedly adopt these new technologies.

Alexandra: Definitely. Maybe the advice for individuals would be to go on DataCamp, sign up, and create an account, and the advice for governments and the public sector would be to join forces with DataCamp for AI and Data Literacy Month. It's still forthcoming as we record, but by the time this episode airs, DataCamp's Data and AI Literacy Month will be underway, so take at least one month to propel us forward and make AI and data literacy a focus topic in a given country. I think that would definitely be nice to see. Wonderful. In the interest of time, we need to come to an end, even though I would have loved to continue talking with you about AI and data literacy, and how to establish it at scale, for quite some time.

If you had an AI fairy that would grant you one wish and say, "Okay, these are the three skills, data and AI skills of course, that every person on the planet will have in five years' time," what would be your pick? What three skills would you choose?

Maggie: Sure, I can take this one. In general, I think having basic skills would be the biggest thing. Understanding data visualization: can you read a map or a weather map? Can you read a bar chart, pie charts, things like that? Can you understand probabilities and risks? Because I think that's the most common way for people to understand how statistics might be misleading. You hear all the time about different election results or different kinds of polls, and just the way they're worded can definitely skew things.

Then I think understanding different types of bias because, again, going back to that misleading aspect, you want to be able to form your own opinion and come to your own conclusions about things, about data and AI. Especially when you have a model that's trained on certain data, you want to be able to read between the lines sometimes and be able to say, "Ah, this might not be what I'm reading. It might not be as good as it seems," that type of thing. Being able to really determine those things on your own is going to be super important.

Alexandra: I fully agree. I also think that understanding bias, and maybe also the limitations of AI systems, is something that's super important for everybody out there, because we now have these tools in our pockets, at our fingertips, so that's quite important. Joe, since we have our hypothetical data and AI fairy still here, is there something you would add to this wishlist?

Joe: Yes. I would tell the data and AI fairy that I wish there were a lot of people who understand what's possible with data and AI, and who can imagine new experiences and new companies emerging, because I do think in the next five, 10 years, we'll see a tremendous amount of positive impact on people's lives from new companies, new products, and new features of existing products that will emerge and make people's lives better. Whether that's in education or any other sector, I think we'll see a tremendous amount of progress. The more people that can participate in that progress, the better for everyone.

Alexandra: Absolutely. Because only then will we get this diversity, this impact, and the ability to realize the potential of data and AI for good, so I'm totally on board with that as well. Maybe as a last point, at the beginning we talked about this misconception that data and AI literacy is something only for data scientists, a misconception that many people still have. For those of our listeners who are rather new to this field, who are maybe even afraid to embark on their own data and AI literacy journey, do you have some encouraging or inspiring last words for them to step over this fear and get started?

Maggie: Yes, for sure. I would start by saying you've already begun your data journey, even if you don't know it, and again, data is for everybody. It doesn't take a data scientist. I feel like I'm a perfect example of that. I'm not an expert in any coding language. I can get by in all of them, but I'm not great in any of them, and I have been able to find my way and make a really great data career despite not having the best coding background. Being in a data career doesn't mean you have to have hard data skills. It means you have to have a passion for what you're working with, who you're working for, and what you're trying to teach. I have that passion, and so if I can find my way into a data career, you can definitely find your way too.

Alexandra: Perfect. What about you, Joe? How would you inspire or encourage our not yet data-savvy listeners?

Joe: Yes, I would second that and say, if you're listening to this on a mobile phone or watching this on a computer, you have the skills, because those are all devices people didn't understand a few decades ago. I think it's similar, and everybody will reach their own level, but there is an opportunity to build a better life if you get ahead of it, if you're faster in learning these skills. It's a great time to take advantage of that opportunity.

Alexandra: Perfect, and since everybody who is able to listen to this has a phone or computer, there are no excuses and you can definitely get started on your data journey.

Joe: Exactly.

Alexandra: Right, after today. Well, Maggie, Joe, thank you so much for being with me today on the podcast. It was a pleasure to talk with you, and I'm very much looking forward to what's yet to come in the upcoming AI and Data Literacy Month on DataCamp. Last year's was really super cool to follow, and I'm very excited for what you will come up with this year.

Maggie: Yes, thank you for having us, Alexandra.

Joe: Yes, thanks a lot. This was a lot of fun.

[music]

Alexandra: Wow, what a great conversation. I hope you enjoyed listening to Maggie and Joe as much as I enjoyed talking with them. As always, if you have any questions, comments, or remarks, you can reach us either on LinkedIn or by email at podcast@mostly.ai. As mentioned in the intro, go check out all the resources DataCamp is going to share during this month. You can find the link to their Data and AI Literacy Month in the show notes or directly on DataCamp. Also, I'm excited to be part of Data and AI Literacy Month and to be a guest on the DataFramed podcast, so be sure to stay tuned for that as well. With that, thank you so much for listening today, and stay tuned for our upcoming episodes.

Ready to try synthetic data generation?

The best way to learn about synthetic data is to experiment with synthetic data generation. Try it for free or get in touch with our sales team for a demo.