This guide details the supported file formats, data types, validation rules, and transformation capabilities of ContentAtlas.
The system supports the following formats. For best results, ensure your files are UTF-8 encoded.
col_0, col_1, etc.) is default.
[
  {"name": "Alice", "email": "alice@example.com"},
  {"name": "Bob", "email": "bob@example.com"}
]
The system automatically detects and validates data types during import.
| SQL Type | Logic & Validation | Example Valid Values |
|---|---|---|
| INTEGER | Whole numbers only. Decimal values (e.g., 1.0) are converted if whole, otherwise rejected. Currency symbols ($) and commas are stripped automatically. | 123, "$1,200", 1.0 |
| DECIMAL | Floating point numbers. | 10.50, "$19.99" |
| TIMESTAMP | Auto-detects formats (ISO 8601, US, etc.). Converts to UTC ISO 8601. | 2025-01-01, 01/31/2025 |
| BOOLEAN | Case-insensitive match. | true, Yes, 1, T |
| TEXT | Default for strings. Preserves formatting. | Any string |
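
The INTEGER and BOOLEAN rows can be illustrated with a minimal Python sketch. This is not ContentAtlas code: the function names `coerce_integer` and `coerce_boolean` are hypothetical, and the set of false tokens is an assumption, since the table lists only true-like examples.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

# True tokens come from the table's examples; the false tokens are an assumption.
TRUE_TOKENS = {"true", "yes", "1", "t"}
FALSE_TOKENS = {"false", "no", "0", "f"}


def coerce_integer(raw: object) -> Optional[int]:
    """Return an int, or None when the value is not a whole number."""
    # Strip currency symbols and thousands separators before parsing.
    text = str(raw).strip().replace("$", "").replace(",", "")
    try:
        number = Decimal(text)
    except InvalidOperation:
        return None
    if not number.is_finite():
        return None
    # 1.0 is whole and converts to 1; 1.5 is not whole and is rejected.
    if number != number.to_integral_value():
        return None
    return int(number)


def coerce_boolean(raw: object) -> Optional[bool]:
    """Case-insensitive token match; None when the token is unrecognized."""
    token = str(raw).strip().lower()
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    return None
```

Parsing through `Decimal` rather than `float` avoids rounding surprises when deciding whether a value such as 1.0 is whole.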
NULL for that row, and a warning is logged. The row is not rejected unless it violates a specific uniqueness constraint.
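
The NULL-and-warn behavior can be sketched for the TIMESTAMP type as follows. The function `coerce_timestamp`, the logger name, and the list of accepted formats are all hypothetical. Treating values without a time zone as UTC is also an assumption; the real detection logic may support more formats and different defaults.

```python
import logging
from datetime import datetime, timezone
from typing import Optional

logger = logging.getLogger("import_sketch")  # hypothetical logger name

# A small, assumed subset of formats: ISO 8601 dates and datetimes, plus US dates.
FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%S%z", "%m/%d/%Y")


def coerce_timestamp(raw: str) -> Optional[str]:
    """Return a UTC ISO 8601 string, or None (NULL) with a logged warning."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # Assumption: values without an offset are already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc).isoformat()
    # The value becomes NULL and the row is kept, matching the behavior described above.
    logger.warning("could not parse timestamp %r; storing NULL", raw)
    return None
```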
You can define rules in the mapping_json to clean data before it enters the database.
Splits a cell containing multiple values (e.g., "tag1, tag2") into separate columns.
{
  "type": "split_multi_value_column",
  "column": "tags",
  "delimiter": ",",
  "outputs": [
    {"index": 0, "target_column": "tag_primary"},
    {"index": 1, "target_column": "tag_secondary"}
  ]
}
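
To make the effect of this rule concrete, here is a rough Python sketch of what it might do to a single row. The helper `split_multi_value` is hypothetical. Trimming whitespace around each part and writing `None` when an index is missing are assumptions, not documented behavior.

```python
from typing import Any, Dict


def split_multi_value(row: Dict[str, Any], rule: Dict[str, Any]) -> Dict[str, Any]:
    """Apply a split_multi_value_column rule to one row and return a new row."""
    raw = row.get(rule["column"]) or ""
    # Assumption: whitespace around each part is trimmed.
    parts = [part.strip() for part in raw.split(rule["delimiter"])]
    result = dict(row)
    for output in rule["outputs"]:
        index = output["index"]
        # Assumption: a missing or empty part becomes None (NULL).
        value = parts[index] if index < len(parts) and parts[index] else None
        result[output["target_column"]] = value
    return result


rule = {
    "type": "split_multi_value_column",
    "column": "tags",
    "delimiter": ",",
    "outputs": [
        {"index": 0, "target_column": "tag_primary"},
        {"index": 1, "target_column": "tag_secondary"},
    ],
}
row = split_multi_value({"tags": "tag1, tag2"}, rule)
```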
Combines multiple columns into one.
{
  "type": "merge_columns",
  "columns": ["first_name", "last_name"],
  "target_column": "full_name",
  "separator": " "
}
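
A minimal sketch of the merge, assuming that empty or missing source values are skipped so the separator is not doubled. The helper name `merge_columns` mirrors the rule type but is otherwise hypothetical.

```python
from typing import Any, Dict


def merge_columns(row: Dict[str, Any], rule: Dict[str, Any]) -> Dict[str, Any]:
    """Apply a merge_columns rule to one row and return a new row."""
    # Assumption: None and empty strings are skipped rather than joined.
    values = [
        str(row[column]).strip()
        for column in rule["columns"]
        if row.get(column) not in (None, "")
    ]
    result = dict(row)
    result[rule["target_column"]] = rule["separator"].join(values)
    return result


rule = {
    "type": "merge_columns",
    "columns": ["first_name", "last_name"],
    "target_column": "full_name",
    "separator": " ",
}
merged = merge_columns({"first_name": "Alice", "last_name": "Smith"}, rule)
```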
Cleans data using Regular Expressions.
{
  "type": "regex_replace",
  "column": "sku",
  "pattern": "^SKU-",
  "replacement": ""
}
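
The rule above behaves roughly like the following sketch, which uses Python's `re` module. Which regex dialect ContentAtlas actually uses is not stated here, so treating the pattern as Python syntax is an assumption, and the helper `regex_replace` is hypothetical.

```python
import re
from typing import Any, Dict


def regex_replace(row: Dict[str, Any], rule: Dict[str, Any]) -> Dict[str, Any]:
    """Apply a regex_replace rule to one row and return a new row."""
    result = dict(row)
    value = result.get(rule["column"])
    # Only string values are rewritten; None and other types pass through.
    if isinstance(value, str):
        result[rule["column"]] = re.sub(rule["pattern"], rule["replacement"], value)
    return result


rule = {"type": "regex_replace", "column": "sku", "pattern": "^SKU-", "replacement": ""}
cleaned = regex_replace({"sku": "SKU-1042"}, rule)
```

Because the pattern is anchored with `^`, only a leading `SKU-` prefix is removed; the same text in the middle of a value is left alone.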
Formats phone numbers to E.164 standard (e.g., +14155552671).
{
  "type": "standardize_phone",
  "column": "phone_number",
  "default_country_code": "1"
}
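
As a rough illustration of what E.164 normalization involves, the sketch below strips punctuation and applies the default country code to 10-digit national numbers. It is deliberately naive and is not how ContentAtlas necessarily works: the function `standardize_phone` is hypothetical, and the 10-digit rule is an assumption that only suits country code 1. Production systems usually rely on a dedicated phone-number library.

```python
import re
from typing import Optional


def standardize_phone(raw: str, default_country_code: str = "1") -> Optional[str]:
    """Return an E.164-style string such as +14155552671, or None if implausible."""
    text = raw.strip()
    digits = re.sub(r"\D", "", text)  # drop spaces, dashes, parentheses, and "+"
    if not text.startswith("+") and len(digits) == 10:
        # Assumption: a bare 10-digit value is a national number.
        digits = default_country_code + digits
    # E.164 numbers have at most 15 digits and do not start with 0.
    if not digits or len(digits) > 15 or digits[0] == "0":
        return None
    return "+" + digits
```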
"duplicate_check": {
  "enabled": true,
  "unique_columns": ["email"],
  "check_file_level": true
}
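
A file-level check of this kind can be sketched as a single pass over the rows, keyed on the configured `unique_columns`. The helper `find_file_level_duplicates` is hypothetical. Comparing values exactly after trimming whitespace is an assumption, since the real matching rules (for example, case sensitivity) are not described here.

```python
from typing import Any, Dict, List, Tuple


def find_file_level_duplicates(
    rows: List[Dict[str, Any]], unique_columns: List[str]
) -> List[Tuple[int, int]]:
    """Return (duplicate_index, first_seen_index) pairs within one file."""
    first_seen: Dict[Tuple[str, ...], int] = {}
    duplicates: List[Tuple[int, int]] = []
    for index, row in enumerate(rows):
        # Assumption: values are compared exactly after trimming whitespace.
        key = tuple(str(row.get(column, "")).strip() for column in unique_columns)
        if key in first_seen:
            duplicates.append((index, first_seen[key]))
        else:
            first_seen[key] = index
    return duplicates


rows = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "email": "bob@example.com"},
    {"name": "Alice B.", "email": "alice@example.com"},
]
duplicates = find_file_level_duplicates(rows, ["email"])
```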
Use these JSON templates in your API calls (mapping_json field).
Standardizes phone numbers and deduplicates customers by email.
{
  "table_name": "customers",
  "db_schema": {
    "customer_id": "INTEGER",
    "email": "VARCHAR(255)",
    "phone": "VARCHAR(20)",
    "signup_date": "TIMESTAMP"
  },
  "mappings": {
    "customer_id": "ID",
    "email": "Email Address",
    "phone": "Contact Phone",
    "signup_date": "Joined"
  },
  "rules": {
    "column_transformations": [
      {
        "type": "standardize_phone",
        "column": "Contact Phone",
        "target_column": "phone",
        "default_country_code": "1"
      }
    ]
  },
  "duplicate_check": {
    "enabled": true,
    "unique_columns": ["email"]
  }
}
Handles currency parsing and ID deduplication.
{
  "table_name": "sales_transactions",
  "db_schema": {
    "transaction_id": "VARCHAR(50)",
    "amount": "DECIMAL",
    "product_name": "VARCHAR(100)",
    "sale_date": "TIMESTAMP"
  },
  "mappings": {
    "transaction_id": "Txn ID",
    "amount": "Total Price",
    "product_name": "Item",
    "sale_date": "Date"
  },
  "duplicate_check": {
    "enabled": true,
    "unique_columns": ["transaction_id"]
  }
}
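
Before sending a template, it can help to check that its parts agree with each other. The sketch below is one possible pre-flight check, not part of ContentAtlas. The function `check_mapping` is hypothetical, and its three checks reflect a reading of the templates above rather than documented validation rules.

```python
import json
from typing import Any, Dict, List


def check_mapping(config: Dict[str, Any]) -> List[str]:
    """Return a list of consistency problems found in a mapping_json dict."""
    problems: List[str] = []
    schema = set(config.get("db_schema", {}))
    mapped = set(config.get("mappings", {}))
    for column in sorted(schema - mapped):
        problems.append(f"db_schema column '{column}' has no mapping")
    for column in sorted(mapped - schema):
        problems.append(f"mapping target '{column}' is not in db_schema")
    duplicate_check = config.get("duplicate_check", {})
    if duplicate_check.get("enabled"):
        for column in duplicate_check.get("unique_columns", []):
            if column not in schema:
                problems.append(f"unique column '{column}' is not in db_schema")
    return problems


config = {
    "table_name": "sales_transactions",
    "db_schema": {
        "transaction_id": "VARCHAR(50)",
        "amount": "DECIMAL",
        "product_name": "VARCHAR(100)",
        "sale_date": "TIMESTAMP",
    },
    "mappings": {
        "transaction_id": "Txn ID",
        "amount": "Total Price",
        "product_name": "Item",
        "sale_date": "Date",
    },
    "duplicate_check": {"enabled": True, "unique_columns": ["transaction_id"]},
}
problems = check_mapping(config)
# Serialize for the mapping_json field once the check passes.
mapping_json = json.dumps(config)
```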