Formattings
What is a Formatting
Formattings in the Data Hub refer to the process of transforming and structuring data into different formats to ensure compatibility with various systems, applications, or reporting tools. Data often comes in various raw or unstructured forms, and formatting allows you to convert it into a consistent, standardized, or machine-readable format that meets specific requirements. This can include converting data into formats such as JSON, XML or CSV.
The key benefits of Formattings include making data interoperable across different platforms, improving readability for users and systems, and ensuring that data is structured in a way that makes it easy to process, analyze, or integrate. For example, when exporting data for use in a reporting tool, you might need to format it as CSV for easy import. Or, when interacting with an external system or API, you may need to format data as JSON to match the expected request and response structures.
In the Data Hub, formatting offers the flexibility to adjust data presentation, making it adaptable to the needs of various clients, applications, or systems. Whether you need to standardize the data to a specific format or dynamically adjust the format based on the user’s requirements, Formattings provide the necessary tools for smooth data integration and efficient communication. This ensures that the data you work with is always in the right structure, enhancing overall data accessibility and usability.
Navigation to the Formattings page
To get to the Formattings page from the dashboard, find the Data Hub Tile and select Formattings.
If the tile shown is deactivated, you do not yet have a Data Hub license. In this case, please contact sawsconnector@saws.de.
If you are not on the dashboard click the most left button in the toolbar to navigate to the dashboard page.
If you are already in the Data Hub you can use the dropdown menu in the toolbar and select the Formattings item.
Now you should be in the Formattings page which shows the Formattings table list.
Creating a Formatting
To create a new Data Collection, click on the Plus Button on the top left corner of the Formattings table.
The Editor will open where you can configure the metadata of your Formatting which consists of the following:
- Name
- Description
- Type
Further fields will depend on the type chosen.
Name
The name should be descriptive and unique.
Description
With the Description you can provide additional information and it will appear when you select the Formatting somewhere as a reference.
Type
The type field lets you choose between JSON, XML and CSV as your target output type.
Record Format
The Record Format is the most granular level of formatting. It transforms each individual record using the configured template.
Data Collections
Restrict the Record Format to specific Data Collections. Only Endpoints or Record Formatting References that point to one of the selected Data Collections can use this Record Format. If no Data Collection is selected, this Record Format can always be used. Selecting Data Collections will give you the intersection of all fields of every Data Collection as autocomplete in the Template.
For more information on how the restriction works read the Filter article.
Template
In the Template you can define your target shape of your data. You dynamically insert different data with placeholders which use a JavaScript like syntax. You can reference the current formatted record with ${record}
. For accessing a field you have to write ${record.<field>}
.
To access the id
of the record: ${record.id}
If your field name does not start with a letter or an underscore or one of the following characters are not letters, numbers, underscores or dollar signs you have to use the bracket notation.
To access the user-name
of the record: ${record['user-name']}
To access the first element in the array field roles
: ${record.roles['0']}
To access the index of the record: ${index}
To access a reference: ${refs.<ref-name>.records}
(this will always be the formatted reference)
References
Record Formats support referencing other Data Collections, allowing the inclusion of related data.
To add a reference click on the Add Reference button in the References field.
This will open another editor where you can configure your reference.
Name
The name how you can access the reference in the Template.
${record.<ref-name>}
will be the raw reference data. Here you can select individual fields like ${record.<ref-name>.<field-name>}
. If the name of your Reference is also a field in the original record, the original record field will be overwritten with the raw reference data.
${refs.<ref-name>.records}
will be the formatted reference data (via the Reference Template). Here you can not access individual fields because the formatting could be something other than JSON.
Data Collection
The Data Collection you want to reference.
Filter
The Filter you want to use to find your Data Collection record reference. You can either use your own Filter or the Unique Key. In the Placeholders you can reference the original data with ${record.<field>}
.
Template
Here you can select a Record Format for the Reference.
Multi + Offset
When setting up a Reference in a Record Format, you can use a Filter to search for data in another Data Collection. This filter may return multiple matching records.
If multiple is turned off, only a single record will be selected. You can use the offset option to choose which result from the filtered list should be returned (e.g., offset: 1 returns the second result).
If multiple is turned on, the reference will return an array of matching records instead of a single one.
In this case the offset defines where the array starts (i.e., skips the first offset records).
You can also specify a limit to control how many records to return.
Additionally, you can apply sorting to the results to define their order before offset and limit are applied.
This setup gives you fine-grained control over how referenced data is selected and structured, whether you need a single related item or a whole list of them.
Virtual Fields
With Virtual Fields you can create additional fields in your ${record}
placeholder. Fields defined in Virtual Fields will overwrite native fields of ${record}
and will overwrite added fields via References.
In Virtual Fields you can select fields of you record with $<field-name>
or references with $<ref-name>.<field-name>
.
Virtual Fields provides special syntax for calculating and manipulating data.
Arithmetic Operators
$add adds numbers or combines dates.
{ totalPrice: { $add: ["$price", "$tax"] } }
$subtract subtracts numbers or dates.
{ totalPrice: { $subtract: ["$endDate", "$startDate"] } }
$multiply multiplies numbers.
{ totalPrice: { $multiply: ["$price", "$quantity"] } }
$divide divides numbers.
{ "totalPrice": { "$divide": ["$total", "$count"] } }
String Operators
$concat joins strings together.
{ "fullName": { "$concat": ["$firstName", " ", "$lastName"] } }
$toUpper / $toLower changes the string case.
{ "upperName": { "$toUpper": "$name" } }
$substr extracts part of a string.
{ "partialTitle": { "$substr": ["$title", 0, 10] } }
Date Operators
$dateToString formats a date into a string.
{ "$dateToString": { "format": "%Y-%m-%d", "date": "$createdAt" } }
$year, $month, $dayOfMonth, $hour, etc. extracts parts of a date.
{ "$year": "$createdAt" }
Conditional Logic
$cond expects a if-then-else logic.
{ "$cond": { "if": { "$gte": ["$price", 100] }, "then": "expensive", "else": "cheap" } }
$ifNull returns the first non-null value.
{ "$ifNull": ["$nickname", "$name"] }
$switch returns the first match between multiple choices.
{ "$switch": { "branches": [ { "case": { "$eq": ["$status", "A"] }, "then": "Active" }, { "case": { "$eq": ["$status", "I"] }, "then": "Inactive" } ], "default": "Unknown" } }
Array Operators
$size gets the number of elements in an array.
{ "$size": "$tags" }
$arrayElemAt gets an item at a specific index.
{ "$arrayElemAt": ["$items", 1] }
$concatArrays merges arrays.
{ "$concatArrays": ["$array1", "$array2"] }
$in checks if a value is in an array.
{ "$in": ["$role", ["admin", "moderator"]] }
Object Operators
$mergeObjects combines multiple objects into one.
{ "$mergeObjects": ["$defaults", "$overrides"] }
$objectToArray converts an object to an array of key-value pairs.
{ "$objectToArray": "$metadata" }
$arrayToObject converts an array of key-value pairs back into an object.
{ "$arrayToObject": "$keyValueArray" }
Type Conversion
$toString, $toInt, $toDouble, $toDecimal, etc. convert values between types.
{ "$toInt": "$stringValue" }
Comparison Operators
Useful for $cond, $match, etc.
- $eq: Equal
- $ne: Not equal
- $gt: Greater than
- $gte: Greater than or equal
- $lt: Less than
- $lte: Less than or equal
{ "$gt": ["$price", 100] }
Example
Let’s say you have a products
Data Collection with the following sample record:
{ "_id": "p1", "name": "Sneaker A", "price": 120, "discount": 20, "stock": 15, "variant": "v1", "createdAt": "2023-09-01T10:00:00Z", "categories": ["shoes", "men", "sports"] }
The Goal is this output:
{ "id": "p1", "name": "Sneaker A", "finalPrice": 100, // price minus discount "isInStock": true, // stock > 0 "categoryCount": 3, // the number of categories this product belongs to "createdYear": 2023 }
In Virtual Fields:
{ "finalPrice": { $subtract: ["$price", "$discount"] }, "isInStock": { $gt: ["$stock", 0] }, "categoryCount": { $size: "$categories" }, "createdYear": { $year: "$createdAt" } }
In Template:
{ "id": "${record.id}", "name": "${record.name}", "finalPrice": "${record.finalPrice}", "isInStock": "${record.isInStock}", "categoryCount": "${record.categoryCount}", "createdYear": "${record.createdYear}" }
Document Format
The Document Format is used in Export Endpoints. It defines the overall response of the Endpoint.
Record Formatting
Each Document Format must reference a Record Format which defines how a Record is transformed.
Template
In the Template you can define the shape of you response. You can reference the formatted records with ${document.records}
.
You can access additional metadata with ${document.meta.<field-name>}
. Possible metadata fields are:
count
: the total number of records (after they got filtered)limit
: the limit set in the queryname
: the name of the Data Collectionoffset
: the offset set in the query
Import Format
The Import Format is used in Import Endpoints. It defines which Record Format is getting used as well as how the parsing options are defined.
Record Formatting
Each Import Format must reference a Record Format which defines how a Record is transformed.
Arguments
Arguments are fixed set parsing options for JSON or CSV. Every option can be overwritten if it is set in the options query.
CSV
charset defines the encoding of the files which will be used with this Import Format.
delimiter defines how columns are separated. <auto>
tries to automatically guess your delimiter.
<auto>
only works if your delimiter is one of the following: ,
, \t
, |
, ;
, RS
(ASCII code 30), US
(ASCII code 31)
The delimiter can not be one of the following: \r
, \n
, "
, \uefeff
escapeChar is used to escape the quote character within a field.
quoteChar is used to quote fields. The quoting of all fields is not mandatory. Any field which is not quoted will correctly read.
skipFirstNLines to skip first N number of lines.
dynamicTyping will convert numberic/boolean data to their type instead of remaining strings. Numeric data must conform to the definition of a decimal literal. Numerical values greater than 2^53
or less than -2^53
will not be converted to numbers to preserve precision. European-formatted numbers must have commas and dots swapped.
comments is a string that indicates a comment (for example, #
or //
). When the parser encounters a line starting with this string, it will skip the line.
JSON
path is a point separated string to the record array or a single record which should be imported.
For a single record use the path where the record is located like data.product
. If the record is located at the root leave the path option empty.
For an array of records use *
like data.products.*
. If the record array is at the root level just use *
.
Default Formattings
Document Format
Standard (Owl):
{ "count": <number-of-records>, "<data-collection-name>": <array-of-records> }
Only Data (Owl):
<array-of-records>
Record Format
No Record Formatting (Owl) will not format the record at all.
Import Format
Standard Import (Owl):
- CSV
{ "charset": "UTF-8", "delimiter": <auto>, "escapeChar": "\"", "quoteChar": "", "skipFirstNLines": 0, "dynamicTyping": false, "comments": "" }
- JSON
{ "path": "*" }