To synthesize your dataset successfully, the CSV file must be encoded in UTF-8, use commas (,) or semicolons (;) as column separators, and adhere to the following rules:

CSV requirements


1. Header row

  • The first row must contain the column names.

  • Each column name in a table must be unique and may not exceed 255 characters.

  • Column names cannot contain special characters such as commas, semicolons, colons, slashes, dollar signs, backslashes, single quotes, or double quotes.
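The header rules above can be checked before upload. The following is a minimal sketch, not part of the product: the function name `check_header` is hypothetical, and the `FORBIDDEN` set is an assumption that covers only the characters named above, so the real list may be longer.

```python
import csv
import io

# Assumed set of forbidden characters: only those named in the rules above.
# The actual list may be longer.
FORBIDDEN = set(",;:/$\\'\"")
MAX_NAME_LENGTH = 255


def check_header(names):
    """Return a list of human-readable problems found in the column names."""
    problems = []
    seen = set()
    for name in names:
        if name in seen:
            problems.append(f"duplicate column name: {name!r}")
        seen.add(name)
        if len(name) > MAX_NAME_LENGTH:
            problems.append(f"name longer than {MAX_NAME_LENGTH} characters: {name[:20]!r}...")
        bad = sorted(FORBIDDEN.intersection(name))
        if bad:
            problems.append(f"forbidden characters {bad} in {name!r}")
    return problems


# Example: the second name contains a dollar sign, the third repeats the first.
sample = io.StringIO("id,price$,id\n1,9.99,2\n")
header = next(csv.reader(sample))
for problem in check_header(header):
    print(problem)
```

An empty result means the header passed these particular checks; it does not guarantee that the file will be accepted.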


2. Rows

  • Each row in the file must contain the same number of cells.
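One way to find rows that break this rule is to compare each row's cell count with the header's. This is a sketch under assumptions: `find_ragged_rows` is a hypothetical helper, and it relies on Python's standard `csv` module to handle quoted separators and line breaks.

```python
import csv
import io


def find_ragged_rows(text, delimiter=","):
    """Return (line_number, cell_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = len(next(reader))  # the header row defines the expected width
    ragged = []
    for row in reader:
        if len(row) != expected:
            # line_num counts physical lines read so far, so it points at the
            # last line of a record that spans several lines.
            ragged.append((reader.line_num, len(row)))
    return ragged


sample = "a,b,c\n1,2,3\n4,5\n6,7,8,9\n"
print(find_ragged_rows(sample))
```

For the sample above, the third line has too few cells and the fourth has too many, so both are reported.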


3. Alphanumeric entries (text, categories, strings)

  • Entries containing column separators, line breaks, or spaces at the beginning or end must be enclosed in double quotes.

    "this is, one column"
    "this is \n two lines"
    " space at the beginning and end "
  • Double quotes within an entry must be escaped by doubling them.

    "this does contain ""quoted text"""
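Most CSV libraries apply these quoting rules automatically. As an illustration, here is a sketch using Python's standard `csv` module; the helper name `to_csv` is hypothetical. It uses `csv.QUOTE_NONNUMERIC` for data rows so that every text entry is quoted, because the default minimal quoting does not quote entries that merely start or end with a space.

```python
import csv
import io


def to_csv(header, records):
    """Serialize a header and data rows, quoting every text entry."""
    buffer = io.StringIO()
    # Header names are written with minimal quoting; valid names contain no
    # special characters, so they come out unquoted.
    csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n").writerow(header)
    # QUOTE_NONNUMERIC quotes all string fields and leaves numbers bare.
    # Embedded double quotes are doubled (doublequote=True is the default).
    csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n").writerows(records)
    return buffer.getvalue()


records = [
    [1, "this is, one column"],
    [2, "this is \n two lines"],
    [3, " space at the beginning and end "],
    [4, 'this does contain "quoted text"'],
]
print(to_csv(["id", "comment"], records))
```

Quoting every text entry is a conservative choice: it yields more quotes than strictly needed, but no entry depends on the reader's whitespace handling.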


4. Date and time values

  • must be encoded in one of the formats below

  • must have missing values encoded as empty strings

Type                        Format                    Example

Date                        yyyy-MM-dd                2020-02-08

Datetime with hours         yyyy-MM-dd HH             2020-02-08 09
                            yyyy-MM-ddTHH             2020-02-08T09
                            yyyy-MM-ddTHHZ            2020-02-08T09Z

Datetime with minutes       yyyy-MM-dd HH:mm          2020-02-08 09:30
                            yyyy-MM-ddTHH:mm          2020-02-08T09:30
                            yyyy-MM-ddTHH:mmZ         2020-02-08T09:30Z

Datetime with seconds       yyyy-MM-dd HH:mm:ss       2020-02-08 09:30:26
                            yyyy-MM-ddTHH:mm:ss       2020-02-08T09:30:26
                            yyyy-MM-ddTHH:mm:ssZ      2020-02-08T09:30:26Z

Datetime with milliseconds  yyyy-MM-dd HH:mm:ss.SSS   2020-02-08 09:30:26.123
                            yyyy-MM-ddTHH:mm:ss.SSS   2020-02-08T09:30:26.123
                            yyyy-MM-ddTHH:mm:ss.SSSZ  2020-02-08T09:30:26.123Z

The following formats are not supported:

  • Any format with a week number.
    Example: 2020-W06-5 (Week 6, Day 5 of 2020)

  • Any format with ordinal dates.
    Example: 2020-039 (Day 39 of 2020)

  • Formats with a numeric time zone offset instead of Z.
    Example: 2020-02-08 09+07:00

  • Compact formats that omit the - and : separators.
    Example: 20200208T0930

  • Formats that separate seconds and milliseconds with a comma.
    Example: 2020-02-08T09:30:26,123

  • Formats that separate seconds and milliseconds with a colon.
    Example: 2020-02-08 09:30:26:123

  • Date-only formats that have a time zone component.
    Example: 2020-02-08Z
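The supported and unsupported formats can be summarized as a single pattern. The sketch below is an interpretation of the lists above, not the product's own parser: `is_supported` is a hypothetical name, the pattern assumes exactly three millisecond digits, and it allows a trailing Z only after the T separator, as in the listed formats. It checks the shape of a value only, so an impossible date such as 2020-13-45 would still pass.

```python
import re

# Time part: HH, optionally followed by :mm, :ss, and .SSS in that order.
_TIME = r"\d{2}(?::\d{2}(?::\d{2}(?:\.\d{3})?)?)?"

# Date, optionally followed by a space-separated time or a T-separated
# time with an optional trailing Z.
_SUPPORTED = re.compile(r"\d{4}-\d{2}-\d{2}(?: " + _TIME + r"|T" + _TIME + r"Z?)?")


def is_supported(value):
    """Return True for an empty string (missing value) or a listed format."""
    return value == "" or _SUPPORTED.fullmatch(value) is not None


for candidate in ["2020-02-08T09:30Z", "2020-02-08 09+07:00", "20200208T0930", ""]:
    print(repr(candidate), is_supported(candidate))
```

Running a check like this over every date column before upload makes it easier to spot values that need reformatting.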


5. Numerical values

  • must have a . as decimal separator

  • must not have a thousands separator

  • must have missing values encoded as empty strings
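A short sketch of how numeric columns might be checked and converted before upload. Both helper names are hypothetical. The pattern is deliberately strict: it assumes plain decimal notation, and whether other notations such as exponents are accepted is not stated above. `from_european` assumes the input uses . for thousands and , for decimals.

```python
import re

# Optional sign, digits, and an optional fractional part after a dot.
_NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")


def is_valid_number(value):
    """Return True for an empty string (missing value) or a plain decimal number."""
    return value == "" or _NUMBER.fullmatch(value) is not None


def from_european(value):
    """Convert text like '1.234,56' to '1234.56'; empty strings pass through."""
    return value.replace(".", "").replace(",", ".")


print(is_valid_number("1234.56"))                   # plain decimal
print(is_valid_number("1,234.56"))                  # contains a thousands separator
print(is_valid_number(from_european("1.234,56")))   # converted first
```

Apply `from_european` only to columns known to use that convention, because it would corrupt values that already use . as the decimal separator.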