Context processing in multi-table schemas
MOSTLY AI supports both training on and generating multi-table datasets. For such datasets, the Sequential Context Processor (SCP) is a key feature: it generates nested child tables with rich context from the surrounding tables. As a result, synthetic data that you generate with MOSTLY AI can retain the existing correlations between nested, linked tables in complex schemas.
Example of context processing
To examine how sequential context processing works, let's consider a multi-table scenario with six tables, as listed below.
- `customers` table
- `loans` table
- `payments` table
- `accounts` table
- `cards` table
- `transactions` table
The diagram below illustrates the relationships in this table schema.
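As a complement to the diagram, the relationships described on this page can be written down as a simple child-to-parent mapping. The sketch below is only an illustration: the `PARENT_OF` dictionary and the `ancestors` and `children` helpers are hypothetical names and are not part of any MOSTLY AI API.

```python
# Hypothetical encoding of the example schema: child table -> parent table.
# "customers" has no parent because it is the subject table.
PARENT_OF = {
    "loans": "customers",
    "payments": "loans",
    "accounts": "customers",
    "cards": "accounts",
    "transactions": "accounts",
}


def ancestors(table):
    """Return the parent chain of `table`, nearest ancestor first."""
    chain = []
    while table in PARENT_OF:
        table = PARENT_OF[table]
        chain.append(table)
    return chain


def children(table):
    """Return the tables whose direct parent is `table`, sorted by name."""
    return sorted(child for child, parent in PARENT_OF.items() if parent == table)
```

For example, `ancestors("payments")` walks up through `loans` to `customers`, which matches the parent and grandparent roles used in the sections that follow.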
For ease of reading, the text below refers to generating records in the context of other records. However, the same rules of context apply when training MOSTLY AI generators.
So, when you read "generated in the context of", bear in mind that "trained on in the context of" applies equally.
customers table
At the top of the hierarchy is the `customers` table. This table acts as the primary context for all child and grandchild tables that follow in the hierarchy. This is also known as a subject table.
loans table
The `loans` table is the first child table in the hierarchy. Together with the parent `customers` table, it forms a two-table scenario, where the `customers` table is the subject table and the `loans` table is a linked table.
The records in the `loans` table are generated in the context of the parent records from the `customers` table.
In addition, each record in the `loans` table is generated in the context of all same-table sibling records. Let's consider how the `loans` records are generated for the parent `Lucia Garcia`.
After the first `loan` record for `Lucia Garcia` is generated, that first record is then used as context for the second `loan` record of `Lucia Garcia`.
Afterwards, all previously generated sibling records (with parent `Lucia Garcia`) are passed as context for each new sibling to be generated.
To summarize, when MOSTLY AI generates a `loan` record, it does so in the context of:
- the parent `customer` record (`Lucia Garcia`)
- the same-table sibling `loan` records (the second `Consumer` loan is generated in the context of the first two loans, `Consumer` and `Mortgage`, of `Lucia Garcia`)
For details on how to set up a two-table scenario in MOSTLY AI, see Two-table relationships.
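The sequential use of sibling records described above can be sketched as a simple loop. This is a minimal illustration and not MOSTLY AI's implementation: `generate_siblings` is a hypothetical helper, and `fake_loan_generator` is a stand-in for a trained generator that only records how many siblings it received as context.

```python
def generate_siblings(parent, n_records, generate_record):
    """Generate n_records child records for one parent, one after another."""
    siblings = []
    for _ in range(n_records):
        # Each new record sees the parent and every sibling generated so far.
        context = {"parent": parent, "siblings": list(siblings)}
        siblings.append(generate_record(context))
    return siblings


def fake_loan_generator(context):
    """Stand-in for a trained generator (an assumption for illustration)."""
    loan_types = ["Consumer", "Mortgage", "Consumer"]
    position = len(context["siblings"])
    return {
        "type": loan_types[position % len(loan_types)],
        "siblings_seen": position,
    }


loans = generate_siblings({"name": "Lucia Garcia"}, 3, fake_loan_generator)
```

In this sketch the third loan (the second `Consumer` loan) is produced with the first two loans in its context, which mirrors the summary above.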
payments table
The `payments` table is the first grandchild table in the hierarchy. It is a child of the `loans` table and a grandchild of the `customers` table.
When MOSTLY AI generates a `payment` record, it does so in the context of:
- the grandparent `customer` record (`Lucia Garcia`)
- the parent `loan` record (`Consumer` loan)
- the parent sibling `loan` records (the `Mortgage` and the two `Consumer` loans)
- the same-table sibling `payment` records (the `payment` records that belong to the same `Consumer` loan)
The context not used is as follows:
- same-table cousin records (`payment` records that belong to other `loan` parent records)
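The difference between same-table siblings and same-table cousins comes down to whether two records share the same parent. The sketch below illustrates that distinction with made-up payment records for a single customer. The record layout, the loan labels, and the `split_by_parent` helper are assumptions for illustration only.

```python
# Hypothetical payment records, all belonging to loans of one customer.
payments = [
    {"id": 1, "loan": "Consumer A"},
    {"id": 2, "loan": "Mortgage"},
    {"id": 3, "loan": "Consumer A"},
    {"id": 4, "loan": "Consumer B"},
]


def split_by_parent(records, parent_loan):
    """Split payment records into siblings (same loan) and cousins (other loans)."""
    siblings = [r for r in records if r["loan"] == parent_loan]
    cousins = [r for r in records if r["loan"] != parent_loan]
    return siblings, cousins


# Siblings may be used as context; cousins are not.
siblings, cousins = split_by_parent(payments, "Consumer A")
```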
accounts table
The `accounts` table is the second child of the `customers` table and a sibling of the `loans` table. Just like the `loans` records, the `accounts` records are generated in the context of the `customers` records.
However, the Sequential Context Processor also includes the `loans` records as context. This means that every time an `account` record is generated, MOSTLY AI provides as context:
- the parent `customer` record (`Lucia Garcia`)
- the cross-table sibling `loan` records (the 2 `Consumer` loans and the `Mortgage` loan)
- the same-table sibling `account` records (the `Savings` account is passed as context when generating the `Checking` account)
The context not used is as follows:
- any records from the `payments` table
cards table
The `cards` table is the second grandchild table in the hierarchy. It is a child of the `accounts` table and a grandchild of the `customers` table.
When MOSTLY AI generates a `card` record, it does so in the context of:
- the grandparent `customer` record (`Lucia Garcia`)
- the parent `account` record (the `Savings` account of `Lucia Garcia`)
- all same-table parent sibling `account` records (the `Checking` account of `Lucia Garcia`)
- all cross-table parent sibling `loan` records (the 2 `Consumer` and 1 `Mortgage` records)
- all previously generated same-table sibling `card` records
The context not used is as follows:
- all cross-table cousin records from the `payments` table (`payments` records whose parent `loan` record has `Lucia Garcia` as parent)
- all same-table cousin records from the `cards` table (`cards` records that have another `account` as a parent whose parent is `Lucia Garcia`)
transactions table
The `transactions` records are generated with the richest context of all the tables.
When MOSTLY AI generates `transaction` records for an `account` (for example, the `Savings` account as shown in the diagram below), it does so in the context of:
- the grandparent `customer` record (`Lucia Garcia`)
- the parent `account` record (the `Savings` account that belongs to `Lucia Garcia`)
- all same-table parent sibling `account` records (the `Checking` account that also belongs to `Lucia Garcia`)
- all cross-table parent sibling `loan` records (the 2 `Consumer` and 1 `Mortgage` loans that belong to `Lucia Garcia`)
- all cross-table sibling `card` records (the `Debit` and `Credit` cards that also belong to the same parent `Savings` account)
- all previously generated same-table sibling `transaction` records
The context not used is as follows:
- cross-table cousin records from the `cards` table (`cards` that have another `account` as parent whose parent is `Lucia Garcia`)
- cross-table cousin records from the `payments` table (`payments` records whose parent `loan` record belongs to `Lucia Garcia`)
- same-table cousin records from the `transactions` table (`transaction` records with a different `account` parent record that belongs to `Lucia Garcia`)
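The context rules for a `transaction` record can be sketched as a function that collects records by their relationship to the parent account. This is an illustration under stated assumptions, not the Sequential Context Processor itself: the sample records, their field names, and the `transaction_context` helper are all hypothetical.

```python
# Hypothetical sample data for one customer.
customer = {"name": "Lucia Garcia"}
accounts = [
    {"name": "Savings", "customer": "Lucia Garcia"},
    {"name": "Checking", "customer": "Lucia Garcia"},
]
loans = [
    {"type": "Consumer", "customer": "Lucia Garcia"},
    {"type": "Mortgage", "customer": "Lucia Garcia"},
    {"type": "Consumer", "customer": "Lucia Garcia"},
]
cards = [
    {"type": "Debit", "account": "Savings"},
    {"type": "Credit", "account": "Savings"},
    {"type": "Debit", "account": "Checking"},  # cousin of Savings transactions
]


def transaction_context(customer, account, previous_transactions):
    """Collect the context for the next transaction of `account`."""
    owner = customer["name"]
    return {
        "grandparent": customer,
        "parent": account,
        "same_table_parent_siblings": [
            a for a in accounts
            if a["customer"] == owner and a["name"] != account["name"]
        ],
        "cross_table_parent_siblings": [
            loan for loan in loans if loan["customer"] == owner
        ],
        # Only cards of the same parent account; cards of other accounts
        # are cousins and are left out.
        "cross_table_siblings": [
            c for c in cards if c["account"] == account["name"]
        ],
        "same_table_siblings": list(previous_transactions),
    }


context = transaction_context(customer, accounts[0], previous_transactions=[])
```

The card on the `Checking` account never appears in the context for `Savings` transactions, which corresponds to the cross-table cousin rule listed above.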
Summary of context processing scenarios
The table below summarizes the types of records that can be passed as context by the Sequential Context Processor.
| Record type | TABULAR | LANGUAGE |
|---|---|---|
| parent | yes | yes |
| grandparent | yes | yes |
| same-table siblings | yes | yes |
| cross-table siblings | yes | yes |
| same-table parent siblings | yes | no |
| cross-table parent siblings | yes | no |
| same-table cousins | no | no |
| cross-table cousins | no | no |
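If you want to check these rules programmatically, for example in your own documentation or validation scripts, the summary table can be encoded as a lookup. The sketch below transcribes the table with each cell stored as a boolean (yes as `True`, otherwise `False`). The `CONTEXT_SUPPORT` name and the `context_types` helper are hypothetical and not part of MOSTLY AI.

```python
# Transcription of the summary table: record type -> support per model type.
CONTEXT_SUPPORT = {
    "parent": {"TABULAR": True, "LANGUAGE": True},
    "grandparent": {"TABULAR": True, "LANGUAGE": True},
    "same-table siblings": {"TABULAR": True, "LANGUAGE": True},
    "cross-table siblings": {"TABULAR": True, "LANGUAGE": True},
    "same-table parent siblings": {"TABULAR": True, "LANGUAGE": False},
    "cross-table parent siblings": {"TABULAR": True, "LANGUAGE": False},
    "same-table cousins": {"TABULAR": False, "LANGUAGE": False},
    "cross-table cousins": {"TABULAR": False, "LANGUAGE": False},
}


def context_types(model_type):
    """List the record types passed as context for the given model type."""
    return [
        record_type
        for record_type, support in CONTEXT_SUPPORT.items()
        if support[model_type]
    ]
```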