[00:00:00] Hi, everyone. In this video, we'll talk about mock data, and how this generation method can be used in synthetic data set generation.
[00:00:10] We have here a data set that I provided, and it contains some information about users. It has fields like first_name, last_name, and email.
[00:00:19] Our platform automatically detects these fields as AI generation, Categorical, and we could generate them this way. What I want to show you today is mock data generation.
[00:00:27] If we go in here into the Generation method, and we pick Mock data, a drop-down opens for the Data type.
[00:00:33] Here we can first pick Person and then First name, for the first name. We can also do the same thing for the last name: Mock data, Person, Last name.
[00:00:48] We can also do this for the email address: we go in here and pick Mock data and Email.
[00:00:57] What does this actually do? In principle, when we use the AI generation method, our platform tries to learn as much as possible about the input data set.
[00:01:07] That includes all the correlations between variables, but obviously also the information about each specific variable itself, its distribution, and so forth.
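To make "learning the distribution of a variable" concrete, here is a toy sketch for a single categorical column. It is not how the platform's AI generation works internally, which the video does not describe and which also covers correlations between variables. The function names `fit_categorical` and `sample_categorical` are made up for this illustration.

```python
import random
from collections import Counter


def fit_categorical(values):
    """Estimate category probabilities from the observed column values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def sample_categorical(probs, n, seed=None):
    """Draw n new values that follow the learned probabilities."""
    rng = random.Random(seed)
    categories = list(probs)
    weights = [probs[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n)


# Hypothetical input column: "Anna" is twice as common as the other names.
observed = ["Anna", "Anna", "Ben", "Cara"]
learned = fit_categorical(observed)
synthetic = sample_categorical(learned, n=10, seed=0)
```

The sampled values can only be names that appeared in the input, and they appear at roughly the learned frequencies. That dependence on the input is exactly what mock data does not have.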
[00:01:16] With mock data, that is not the case. Mock data is just random data, some dummy data that's created based on the characteristics that we define here.
[00:01:26] In this case, it's going to be first names, and here it's going to be last names, both by the way from the English language, and then the email address is going to be a well-formed email address. It's just made-up data.
[00:01:38] It looks like proper data, but it's just made up. It's not based on any learning of the first names that we actually saw, or that the platform saw in the original data set.
[00:01:50] It's also not taking into consideration any correlations with other variables. Sometimes this can be useful if you just want something that's quick and looks like proper data, or you really don't care so much about the statistical properties.
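As a rough illustration of the idea, mock data generation can be sketched with nothing but Python's standard library. This is a minimal sketch, not the platform's implementation: the name pools, the domains, and the helper names (`mock_first_name`, `mock_last_name`, `mock_email`, `mock_rows`) are all hypothetical.

```python
import random
import string

# Hypothetical value pools, chosen only for this illustration.
FIRST_NAMES = ["Alice", "James", "Emma", "Oliver", "Grace"]
LAST_NAMES = ["Smith", "Johnson", "Brown", "Taylor", "Wilson"]
DOMAINS = ["example.com", "example.org", "example.net"]


def mock_first_name(rng):
    """Pick a random first name; the input data set is never consulted."""
    return rng.choice(FIRST_NAMES)


def mock_last_name(rng):
    """Pick a random last name, independently of the first name."""
    return rng.choice(LAST_NAMES)


def mock_email(rng):
    """Build a well-formed but made-up address, unrelated to the names."""
    local = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return f"{local}@{rng.choice(DOMAINS)}"


def mock_rows(n, seed=None):
    """Generate n rows; every column is drawn independently of the others."""
    rng = random.Random(seed)
    return [
        {
            "first_name": mock_first_name(rng),
            "last_name": mock_last_name(rng),
            "email": mock_email(rng),
        }
        for _ in range(n)
    ]


rows = mock_rows(3, seed=42)
```

Note that the generator takes no input data at all, and the email is not derived from the name columns. Each column looks plausible on its own, but no distribution or correlation from an original data set is carried over.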
[00:02:07] It's certainly very fast to create,
[00:02:10] but again, it doesn't contain any statistical information.
[00:02:14] Hope it was helpful.
[00:02:15] Thanks for watching.