CSV requirements
For your dataset to be synthesized successfully, the CSV content must be encoded in UTF-8, use commas (,) or semicolons (;) as column separators, and adhere to the following rules:

1. Header row
- The first row must contain the column names.
- Each column name in a table must be unique and may not exceed 255 characters.
- Column names cannot contain special characters such as commas, semicolons, colons, slashes, dollar signs, backslashes, single quotes, or double quotes.
2. Rows
Each row in the file must contain the same number of cells.
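
The header and row rules above can be checked before upload. The following is a minimal sketch, not an official validator: the function name `check_header_and_rows` is hypothetical, and the set of forbidden characters is an assumption, since the list above is not exhaustive.

```python
import csv
import io

# Characters ruled out for column names. The list in the rules above is
# open-ended, so this set is an assumption and may be incomplete.
FORBIDDEN_IN_NAMES = set(",;:/\\$'\"")
MAX_NAME_LENGTH = 255


def check_header_and_rows(text, delimiter=","):
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if not rows:
        return ["file is empty, so there is no header row"]

    header = rows[0]
    seen = set()
    for name in header:
        if name in seen:
            problems.append(f"duplicate column name: {name!r}")
        seen.add(name)
        if len(name) > MAX_NAME_LENGTH:
            problems.append(f"column name exceeds {MAX_NAME_LENGTH} characters: {name[:20]!r}...")
        bad = sorted(FORBIDDEN_IN_NAMES.intersection(name))
        if bad:
            problems.append(f"column name {name!r} contains {bad}")

    # Record numbers are 1-based and count the header as record 1.
    for record_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"record {record_no} has {len(row)} cells, expected {len(header)}"
            )
    return problems
```

An empty list means the sketch found nothing to report, for example `check_header_and_rows("id,name\n1,Ann\n")`. Pass `delimiter=";"` for semicolon-separated files.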
3. Alphanumeric entries (text, categories, strings)
- Entries containing separators, line breaks, or spaces at the beginning or end must be enclosed in double quotes.
  Examples: `"this is, one column"`, `"this is \n two lines"`, `" space at the beginning and end "`
- Double quotes inside an entry must be escaped by doubling them.
  Example: `"this does contain ""quoted text"""`
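
The quoting rules above can be sketched as a small helper. This is an illustration only: `quote_entry` is a hypothetical name, and treating both `,` and `;` as separators regardless of the file's actual delimiter is an assumption made to stay on the safe side.

```python
def quote_entry(value, separators=",;"):
    """Quote a text entry when the rules above require it."""
    needs_quotes = (
        # Separators, double quotes, and line breaks all force quoting.
        any(ch in value for ch in separators + '"\r\n')
        # So does whitespace at the beginning or end of the entry.
        or value != value.strip()
    )
    if not needs_quotes:
        return value
    # Escape embedded double quotes by doubling them, then wrap the entry.
    return '"' + value.replace('"', '""') + '"'


print(quote_entry("plain"))
print(quote_entry("this is, one column"))
print(quote_entry(" space at the beginning and end "))
print(quote_entry('this does contain "quoted text"'))
```

Python's built-in `csv.writer` handles separators, quotes, and line breaks with its default minimal quoting, but that mode decides based on special characters rather than surrounding spaces, which is why the sketch checks for leading and trailing whitespace explicitly.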
4. Datetime values
- must be encoded in one of the formats listed below
- must have missing values encoded as empty strings
Type | Format | Example |
---|---|---|
Date | `yyyy-MM-dd` | `2020-02-08` |
Datetime with hours | `yyyy-MM-dd HH`, `yyyy-MM-ddTHH`, `yyyy-MM-ddTHHZ` | `2020-02-08 09`, `2020-02-08T09`, `2020-02-08T09Z` |
Datetime with minutes | `yyyy-MM-dd HH:mm`, `yyyy-MM-ddTHH:mm`, `yyyy-MM-ddTHH:mmZ` | `2020-02-08 09:30`, `2020-02-08T09:30`, `2020-02-08T09:30Z` |
Datetime with seconds | `yyyy-MM-dd HH:mm:ss`, `yyyy-MM-ddTHH:mm:ss`, `yyyy-MM-ddTHH:mm:ssZ` | `2020-02-08 09:30:26`, `2020-02-08T09:30:26`, `2020-02-08T09:30:26Z` |
Datetime with milliseconds | `yyyy-MM-dd HH:mm:ss.SSS`, `yyyy-MM-ddTHH:mm:ss.SSS`, `yyyy-MM-ddTHH:mm:ss.SSSZ` | `2020-02-08 09:30:26.123`, `2020-02-08T09:30:26.123`, `2020-02-08T09:30:26.123Z` |
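
When exporting from Python, values can be rendered directly in one of the formats in the table. The sketch below makes a few assumptions: `format_temporal` is a hypothetical helper, timezone-aware values are converted to UTC before the `Z` suffix is appended (the usual ISO 8601 meaning of `Z`, which this page does not spell out), and sub-millisecond precision is truncated.

```python
from datetime import date, datetime, timedelta, timezone


def format_temporal(value):
    """Render a date or datetime in a supported format; None becomes ''."""
    if value is None:
        # Missing values are encoded as empty strings.
        return ""
    # Check datetime first, because datetime is a subclass of date.
    if isinstance(value, datetime):
        if value.tzinfo is not None:
            # Assumption: Z denotes UTC, so convert and drop the offset.
            utc = value.astimezone(timezone.utc).replace(tzinfo=None)
            return utc.isoformat(sep="T", timespec="milliseconds") + "Z"
        return value.isoformat(sep=" ", timespec="milliseconds")
    return value.isoformat()


print(format_temporal(date(2020, 2, 8)))
print(format_temporal(datetime(2020, 2, 8, 9, 30, 26, 123000)))
plus7 = timezone(timedelta(hours=7))
print(format_temporal(datetime(2020, 2, 8, 16, 30, 26, 123000, tzinfo=plus7)))
```

Converting offsets to UTC avoids writing values such as `2020-02-08 16:30+07:00`, which carry an offset without a `Z`.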
💡 The following formats are not supported:
- Any format with a week number.
  Example: `2020-W06-5` (Week 6, Day 5 of 2020)
- Any format with ordinal dates.
  Example: `2020-039` (Day 39 of 2020)
- Formats with a time zone offset that do not contain a `Z`.
  Example: `2020-02-08 09+07:00`
- Short formats that omit special characters such as `-`, `T`, `Z`, etc.
  Example: `20200208T0930`
- Formats that separate seconds and milliseconds with a comma.
  Example: `2020-02-08T09:30:26,123`
- Formats that separate seconds and milliseconds with a colon.
  Example: `2020-02-08 09:30:26:123`
- Date-only formats that have a time zone component.
  Example: `2020-02-08Z`
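
A pre-upload check for the supported and unsupported formats could look like the following sketch. It assumes that only the exact shapes listed in the format table are accepted; `is_supported_datetime` and the regular expression are illustrative, not the platform's own parser.

```python
import re
from datetime import date

# Time of day at hour, minute, second, or millisecond precision.
_TIME = r"\d{2}(?::\d{2}(?::\d{2}(?:\.\d{3})?)?)?"
# Date only, date + space + time, or date + T + time with an optional Z.
SUPPORTED = re.compile(rf"\d{{4}}-\d{{2}}-\d{{2}}(?: {_TIME}|T{_TIME}Z?)?")


def is_supported_datetime(value):
    """Return True if value has one of the supported shapes (or is empty)."""
    if value == "":
        return True  # missing values are encoded as empty strings
    if SUPPORTED.fullmatch(value) is None:
        return False
    try:
        # Reject impossible calendar dates such as month 13.
        date.fromisoformat(value[:10])
    except ValueError:
        return False
    time_part = value[11:].rstrip("Z")
    fields = re.split(r"[:.]", time_part) if time_part else []
    # Range-check hours, minutes, and seconds; milliseconds need no check.
    return all(int(f) < limit for f, limit in zip(fields, (24, 60, 60)))


for sample in ["2020-02-08T09:30Z", "2020-W06-5", "2020-02-08T09:30:26,123"]:
    print(sample, is_supported_datetime(sample))
```

The regular expression checks only the shape of the value, so the sketch adds a calendar and range check on top of it.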
5. Numerical values
- must have a `.` as decimal separator
- must not have a thousands separator
- must have missing values encoded as empty strings
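
The numeric rules can be applied when serializing values from Python. This is a minimal sketch under stated assumptions: `format_number` is a hypothetical helper, non-finite floats (NaN, infinity) are treated as missing, and scientific notation is expanded to plain decimal notation because this page does not say whether exponents are accepted.

```python
import math
from decimal import Decimal


def format_number(value):
    """Render a number with '.' as decimal separator and no grouping."""
    if value is None:
        return ""  # missing values are encoded as empty strings
    if isinstance(value, float) and not math.isfinite(value):
        return ""  # assumption: NaN and infinity count as missing
    # Going through Decimal expands exponents such as 1e-07 to 0.0000001.
    return format(Decimal(str(value)), "f")


print(format_number(1234567.89))
print(format_number(1e-07))
print(format_number(42))
```

Formatting this way is independent of the system locale, so it never emits a decimal comma or a thousands separator.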