Flat File Schema Editor

This section describes the Fiorano Text Schema Editor, a tool used to design schemas for non-XML (flat file) data and save them as .tfl (Text Format Layout) files. The XML2Text and Text2XML prebuilt components use these TFL files to convert XML data to its non-XML format and non-XML data to its XML format, respectively.

If your composite component flow needs to read data that is stored as text or flat files, you can use the FileReader component to read the flat file and the Text2XML component to transform the flat-file data into its corresponding XML.

The opposite can be done by combining the XML2Text component with the FileWriter component. Before you can transform data from flat-file format into its corresponding XML or vice versa, you must define a File Schema to drive the transformation. This File Schema can be understood as the format metadata required in both cases.

The Text Schema Editor (TSE) is a tool that lets you define the format and hierarchy of non-XML data graphically. The format structure created by this editor is called the File Schema; it defines the structure of the non-XML data in terms of records and fields. This format is stored using an XML grammar in tfl (Text Format Layout) files.

The Schema defines the rules used to convert non-XML text to XML text and vice versa.

Once this schema format is defined, it can be used by:

  • Text2XML transforms flat-file data to its corresponding XML
  • XML2Text transforms XML data into its corresponding flat-file format

The following diagram shows how the FileReader component uses the transformation components to read XML and non-XML data.


Figure 1: Using File Reader and Transformation components

The non-XML data mentioned above can be delimited, positional or both. TSE also provides the test functionality in which the user can verify and test the schema formats created. In the test functionality, the user can generate sample data and can also transform sample non-XML data to XML and vice versa.

Text Format Layout Concepts

A tfl (text format layout) document is a specialized XML grammar which is used to describe the structure of non-XML structured (delimited, positional) data. In the tfl document, the structure of the data is defined as a hierarchical tree of records and fields in a given order.


Figure 2: Structure of a tfl document

The schema of the structured data is added as a child node to the Root Node of the tree. This child node is called the Schema Node. When you create a new schema in Fiorano Schema Editor, the Root Node and the Schema Node are created automatically.

  • Schema Node: In the schema structure, each opened schema file is shown as a Schema Node, which is a child of the Root Node. The Schema Node corresponds to the root tag of the output XML generated from the structured non-XML text, or of the input XML to be converted to structured non-XML text. The Schema Node can be renamed. Its properties are the default properties used during data transformation. You can add multiple record nodes to a Schema Node to represent the structure of the input/output data, but you cannot add fields to it directly.
  • Record: Record represents a collection of information. It can contain a set of fields and/or other records.
  • Field: Field represents items of information that are simple in nature, such as strings and numbers.
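
The hierarchy described above can be pictured as a small tree data structure. The following Python sketch is only an illustration of the containment rules; the class names `Field`, `Record`, and `Schema` are hypothetical and do not correspond to the actual tfl grammar or to any Fiorano API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Field:
    """A simple item of information, such as a string or a number."""
    name: str


@dataclass
class Record:
    """A collection of fields and/or nested records."""
    name: str
    children: List[Union["Record", Field]] = field(default_factory=list)


@dataclass
class Schema:
    """Schema Node: holds records only; fields cannot be added directly."""
    name: str
    records: List[Record] = field(default_factory=list)

    def add(self, node) -> None:
        if not isinstance(node, Record):
            raise TypeError("Only records can be added to the Schema Node")
        self.records.append(node)


# Hypothetical schema with one record that contains three fields
schema = Schema("EmployeeSchema")
schema.add(Record("Employee", [Field("Name"), Field("Age"), Field("Address")]))
```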

Creating Flat File Projects

Flat file projects are created from the Flat File Schemas view in the Fiorano Tools perspective. In the eStudio menu bar, navigate to Windows > Open Perspective > Other, select Fiorano Tools, and click OK. This opens the Fiorano Tools perspective, which contains the Flat File Schemas view.

In this section, a sample Employee Schema is created to illustrate the use of the Flat File Schema Editor. Assume a flat file that contains employee records (Employee Name, Age, Address), with each record on a new line and the individual fields separated by commas, as shown below.


Figure 3: Sample csv data

To define schema for this data, first create a new Flat File Project. To do this,

  1. Right-click on Flat File Schemas Node and select New File Schema Project option.


    Figure 4: New File Schema project
     
  2. This launches a wizard where project details can be configured. Provide a name for the project and click the Finish button. A new project is created with the project name as the root node and "Empty Schema" as the default schema node.

    The Load From Flat File option can also be used to create a flat file schema by loading data from a flat file. This is discussed in the section Generating Flat File Schema using sample data.

  3. Rename Empty Schema to Employee Schema by selecting the node and editing the Name property in the Properties view. Properties such as Record Delimiter and Default Field Delimiter can also be configured here. Since the employee records are separated by a new line, set Record Delimiter to \r\n.


    Figure 5: Flat file editor
     
  4. Right-click on Employee Schema and select the Add > Record option. Give the record name as Employee and click OK.

    Record will be added to the tree on the left hand side and the editor content is updated accordingly.

  5. Since an employee record is comma separated, select the parsing type as Delimited and provide comma (,) as the delimiter value.


    Figure 6: Configuring Flat File Schema elements
     
  6. Right-click on Employee Record and select Add > Fields option.

    Fields can be added one by one or comma separated values can be provided to add multiple fields to a record.

  7. Provide Field Name as Name, Age, Address and click OK. Three fields Name, Age, Address are added and the schema is updated accordingly.

Testing Flat File Schema

The generated flat file schema can be tested on the Test page.

  1. Click on the Test tab in the schema editor to open the Test page.


    Figure 7: Flat file schema test
     
  2. Sample data can be generated by clicking the Generate Sample Flat Format button or the sample can be pasted in the Flat Format section.


    Figure 8: Flat format sample data
     
  3. Click the Convert Flat Format to XML button. The flat format sample is converted to XML and displayed in the XML Format section.


    Figure 9: XML output
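
As a rough illustration of what this conversion does for the Employee schema, the following Python sketch splits flat data on the record delimiter and the field delimiter and builds an XML tree. It is not the Text2XML implementation: the sample data, the function name `flat_to_xml`, and the root tag `EmployeeSchema` (written without a space so that it is a valid XML name) are assumptions, and the XML actually produced by the tool may be laid out differently.

```python
import xml.etree.ElementTree as ET

RECORD_DELIMITER = "\r\n"  # Record Delimiter set on the Schema Node
FIELD_DELIMITER = ","      # delimiter set on the Employee record
FIELD_NAMES = ["Name", "Age", "Address"]


def flat_to_xml(text: str) -> str:
    """Convert comma-separated employee lines into an XML string."""
    root = ET.Element("EmployeeSchema")  # Schema Node becomes the root tag
    for line in text.split(RECORD_DELIMITER):
        if not line:
            continue  # skip the empty piece after a trailing delimiter
        employee = ET.SubElement(root, "Employee")
        for name, value in zip(FIELD_NAMES, line.split(FIELD_DELIMITER)):
            ET.SubElement(employee, name).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data in the same shape as the employee file
sample = "John,30,London\r\nMary,28,Paris\r\n"
xml_text = flat_to_xml(sample)
print(xml_text)
```

Each record becomes an element named after the record, and each field becomes a child element named after the field, which mirrors the default Element value of the XML Type property.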

Generating Flat File Schema using sample data

A flat file schema can be generated by configuring flat file elements manually, as described in the section Creating Flat File Projects. The same schema can also be generated by loading sample data from a flat file, as detailed in this section.

To define schema for this data, first create a new Flat File Project. To do this,

  1. Right-click on Flat File Schemas Node and select New File Schema Project option.


    Figure 10: New File Schema project
     
  2. This launches a wizard where project details can be configured.
  3. Provide the project name and select Load From Flat File option. Select the flat file and click Next.
  4. Provide the Schema Node name and the record delimiter value. Since the employee records are separated by a new line, set Record Delimiter to \r\n. Click the Load Data button to load the data from the flat file. The data is loaded and the records are displayed based on the delimiter value.


    Figure 11: Schema configuration
     
  5. Since all the data corresponds to individual employee records, duplicate rows can be removed. To remove a row, right-click and select Delete. Duplicate rows Record2 and Record3 can be removed.
  6. Rename Record1 as Employee and click the Next button.


    Figure 12: Configuring Records
     
  7. The child elements can be configured in the Schema Configuration page. Select Employee node in the left hand tree viewer. The details of the node are displayed.
  8. Provide comma (,) as the Field delimiter value.


    Figure 13: Configuring Record child elements
     
  9. Click the Configure Child Elements button. The data is parsed using the child delimiter value and is displayed in a table. The element Name, type and data type can be chosen in this page. Provide the details and click OK.


    Figure 14: Defining Record child elements
     
  10. The individual Fields are generated. Details of each node can be seen by selecting the node on the left hand side tree viewer. Click Finish to finish the configuration.


    Figure 15: Employee Schema configuration
     
  11. The schema is generated and is shown in the editor. This can be tested as described in section Testing Flat File Schema.

Sample Schemas

This tool is shipped with five samples that represent various schema types. These are broadly classified under two categories, namely Delimited File Schema samples and Positional File Schema samples.

The prebuilt schema samples are given below:

  • Delimited File Schema samples
      • CSV File Schema
      • Nested CSV File Schema
  • Positional File Schema samples
      • Positional File Schema
      • Nested Positional File Schema
      • Positional in Delimited File schema

To import the prebuilt schema samples right-click on Flat File Schemas node and select Import Sample Project option.


Figure 16: Import Sample Project

A dialog is launched listing all the available samples. Select a sample and click OK to load it in the editor.


Figure 17: Select sample projects

Points to note

  • Records can be positional or delimited. A delimited record can contain a positional record as a child, but a positional record cannot contain a delimited record as a child. Delimiters must be provided for delimited records, whereas Start and End Positions must be provided for positional records.
  • The Field properties change based on the parent record parsing types (Delimited/Positional)
  • Whenever there is an error in the generated schema, an error badge is shown on the corresponding element. Place the cursor over the element to see the error message.


Figure 18: Schema Errors

  • The order of Fields or Records can be changed. To change the order, select the parent element and select Change Order option. A dialog is displayed where the order can be altered.


Figure 19: Change elements order

  • A flat file schema project can be exported using the Export option available on the context menu of the project. Similarly a project can be imported by selecting Import Project option available on Flat File Schemas context menu.


Figure 20: Import/Export project

  • A project can be closed and loaded again later by using the Open Closed Projects option on the Flat File Schemas node.
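
To illustrate the first point above, the following Python sketch parses one positional record by slicing it at fixed start and end positions. The layout, the field names, and the position convention (zero-based start, exclusive end) are assumptions made for this example; check the editor for the convention TSE actually uses.

```python
# Hypothetical layout: (field name, start position, end position)
LAYOUT = [("Name", 0, 10), ("Age", 10, 13), ("City", 13, 23)]


def parse_positional(line: str) -> dict:
    """Slice each field out of a fixed-width line and drop right padding."""
    return {name: line[start:end].rstrip() for name, start, end in LAYOUT}


# Build a 23-character sample line padded with spaces
line = "John".ljust(10) + "30".ljust(3) + "London".ljust(10)
record = parse_positional(line)
print(record)
```

A delimited record would instead be split on its delimiter value, so no positions are needed for its fields.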

Flat File Element Properties

The properties associated with flat file nodes are described in this section.

Schema Node Properties

The Schema Nodes of all file schemas have the same set of properties. These act as global properties of the file schema and are available to all descendant records and fields.

If you change the Name value on the Properties panel, the name of the Root Node in the specification tree automatically changes to match it and vice versa. The name of the node should be a valid XML name.

The following table lists all properties associated with the Schema Node:

Property

Value

Comment Start Identifier

An identifier which indicates the start of a comment in the source file.

Comment End Identifier

An identifier which indicates the end of a comment in the source file. The data between the 'Comment Start' and 'Comment End' identifier is ignored.

 

Comment Start and Comment End Identifiers must not be identical.

Name

The name of the Root Node.

Description

The description of the specification.

Delimiter Value

Type or select a value for the delimiter. To specify a delimiter value, you must first set the Delimiter Type to Custom Delimiter. The delimiter can be multi-character.

Escape Character

Specifies the default value of the escape character for this Schema Instance. Type or select a character value for the escape character. To specify an escape character value, you must first set the Escape Type to Character.

Delimiter Type

Select one of the following options to choose a delimiter for the records/fields directly below the current record.

  • Default Field Delimiter - Indicates that the delimiter is the value of the Default Field Delimiter property, which is defined for the schema instance.
  • Custom Delimiter - Allows the user to designate a field delimiter value for the record. If you select Custom Delimiter, you must specify a delimiter value.

Escape Character Type

You can choose the escape character type from the following values:

  • Default Escape Character - Indicates that the escape character is the value of the Default Escape Character property which is defined for the schema instance.
  • Character - Allows the user to designate an escape character value. If you select Character, you must also specify an escape value.

An escape character is useful if you have a character in your field data that is also used as the delimiter character in the field's parent record. For example, if your field data is the following and you have chosen a comma as the delimiter value of the record that contains the field, TSE interprets the comma after "Fiorano" to be a delimiter, even if you intend for it to be part of the field data:
Fiorano,Software,USA

The solution is to place an escape character directly before the delimiter character that you want to include in the field data. For example, if your escape character is a backslash, place a backslash directly before the delimiter character, as in the following example:
Fiorano\,Software,USA

TSE interprets the comma after the backslash as field data rather than a delimiter character.
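
The escaping behavior described above can be sketched in Python as follows. This is a simplified illustration and not the TSE parser: the function name `split_escaped` is hypothetical, it handles only single-character delimiters, and it assumes the escape character itself is dropped from the resulting field data.

```python
def split_escaped(line: str, delimiter: str = ",", escape: str = "\\") -> list:
    """Split line on delimiter, treating an escaped delimiter as data."""
    fields, current = [], []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            current.append(line[i + 1])  # keep the escaped character as data
            i += 2
        elif ch == delimiter:
            fields.append("".join(current))  # delimiter ends the current field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


plain = split_escaped("Fiorano,Software,USA")      # three fields
escaped = split_escaped("Fiorano\\,Software,USA")  # two fields
print(plain, escaped)
```

Without the escape character the line yields three fields; with it, the first comma stays inside the first field and the line yields two.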

Escape Character

This is the escape character used to mark a field delimiter character as field data.

Delimiter Type

This is the field delimiter of this file schema. The delimiter can be multiple characters.

Record Node Properties

Every file schema is a unique entity with a unique set of records and fields. You can create a new schema by modifying an existing schema. To modify an existing schema, you need to add and/or remove records. After adding records, specify the properties associated with it. If you remove a record, its properties are also removed along with all child records and fields. In addition to adding and removing records, you can also rename them. You can edit the name of an existing record and its properties by selecting the record and editing it.

Following are some basic rules pertaining to records.

  • Every new record, which you create, is inserted as a descendant of the record that you selected.
  • The name of a record or field needs to be unique. The tool will display an exception if you specify a name that has already been assigned to an existing record or field.
  • When you delete a record, all child records and fields are also deleted.

The following table lists all the properties associated with the record node:

Property

Value

XML Type

The target XML type for the record. The tag in the resultant XML is generated depending on this value. Its value can be either Element (default) or None. If None is selected, the record is not mapped to the resultant XML.

Minimum Occurrences

The minimum number of occurrences required for the record. If the record does not occur at least the specified number of times, an exception is thrown.

Maximum Occurrences

The maximum number of occurrences allowed for the record. After this many occurrences, the parser does not attempt to match the record again and an exception is thrown.

Parsing Type

Specifies whether the data input is to be considered as Positional or Delimited.

Record Identifier Type

Type of the Identifier to be used for identifying a record.

You can choose the Record Identifier from the following values:

  • Field Value: Choose this option to identify the record based on the value of a child field. In this case, select the field in the Record Identifier Value property and specify a default value for the field that identifies the record. If the identifier field value in the record does not match the default value, a parsing error is thrown on the console.
  • Child Count: The record is identified based on its number of children. While parsing, if the child count in the record data does not match the number of children defined in the file schema, a parsing error is thrown.
  • Record Length: This option imposes a restriction on the record length, which in turn restricts the length of each field as specified by its “start position” and “end position” properties. A parsing error is thrown if the start and end positions of fields overlap or if the field value at a position is unspecified.
  • None: Record data is parsed against the record definition irrespective of whether the data satisfies the complete record definition.

Name

The name of the record. The name of the node should be a valid XML name. You cannot give a record the same name as an existing record; in particular, sibling records cannot have the same name.

Description

The description of the record.

Escape Character Type

You can choose the escape character type from the following values:

  • Default Escape Character - Indicates that the escape character is the value of the Default Escape Character property which is defined for the schema instance.
  • Character - Allows the user to designate an escape character value. If you select Character, you must also specify an escape value.

An escape character is useful if you have a character in your field data that is also used as the delimiter character in the field's parent record.
For example, if your field data is the following and you have chosen a comma as the delimiter value of the record that contains the field, TSE interprets the comma after "Fiorano" to be a delimiter, even if you intend for it to be part of the field data:
Fiorano,Software,USA

The solution is to place an escape character directly before the delimiter character that you want to include in the field data. For example, if your escape character is a backslash, place a backslash directly before the delimiter character, as in the following example:
Fiorano\,Software,USA

TSE interprets the comma after the backslash as field data rather than a delimiter character.

Delimiter Value

Type or select a value for the delimiter. To specify a delimiter value, first set the Delimiter Type to Custom Delimiter. The delimiter can be multi-character.

Delimiter Type

This is the field delimiter of this file schema. The delimiter can be multiple characters.

Escape Character

This is the escape character used to mark a field delimiter character as field data.

Field Node Properties

Depending on the type of file schema you are defining, you might need to add and/or remove fields. After adding fields to any schema, specify their properties. If you remove a field, its properties are also removed. You cannot add records or fields under a field.

When you add a field, you can immediately rename the field. You can edit the name of an existing field and its properties by selecting the field and editing it.

  • If you click Add > Field from the popup menu that appears after right-clicking the mouse, the new field is inserted as a descendant of the record that you selected.
  • You cannot give an existing field the same name as an existing record.
  • You cannot provide a new field instance the same name as an existing sibling field or record.
  • Sibling fields cannot have the same name.

Any changes to the visible properties in the table are set for the currently selected node of the schema tree, which can be a record or field or the root node.

The following parameters are associated with the field node:

Property

Value

XML Type

The type for the field. This value can either be Element (default), Attribute or None. If None is selected, then the field is NOT mapped to the resultant XML.

Data Type

Represents the data type of the field data. Set this property if you want to validate the field data against the supported data types: String (default), Integer, Numeric, Date, and Byte.

Data Format

This can be defined if the data type for the field is either Numeric or Date. For the Numeric data type, the data format follows the syntax rules of java.text.DecimalFormat. For the Date data type, the data format follows the syntax rules of java.text.SimpleDateFormat.

Minimum Length

The minimum number of characters that the field can contain.

Maximum Length

The maximum number of characters that the field can contain.

Default value

The default value for a field. The field is matched only if its value is the same as the default value. This can be used to set headers and column names.

Map If Null

Whether or not the field should be defined in the output XML if the value for the field in the source file is null/blank.
This property is redundant if the XML Type for the field is set to None in the General properties set.
If the value for the field in the source data is null/blank but Default Value is defined for the field, then the default value is set in the output XML.
This property field displays only if the structure of the parent record is delimited.

Wrap Character
 

Character used to enclose field data. This property is useful if you have a character in your field data that is also used as the delimiter value for the field's parent node.
For example, if your field data is the following and you have chosen a comma as the delimiter value of the node that contains the field, TSE Parser interprets the comma after "Fiorano" to be a delimiter, even if you intend to include it as a part of the field data:
Fiorano,Software,USA

A solution is to define a value for the Wrap Character property and then enclose the field data in the wrap character. For example, you can set the wrap character to double quotation marks for the first field and then type your field data, as in the following example:
"Fiorano, Software", USA

The comma between the double quotation marks is interpreted by TSE Parser to be field data rather than a delimiter value.
This property field displays only if the structure of the parent record is delimited.
If you have a field that uses a wrap character, there cannot be any data between the wrap character and any delimiter leading or following a wrap character.
If your field data includes characters that are also used as the wrap character, you must enclose those characters in another set of wrap characters.
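
Python's standard csv module offers a close analogy to the wrap character: its `quotechar` option plays the same role, and by default a doubled quote inside a quoted field is read as one literal quote. The sketch below only demonstrates the concept with that module; it does not show how the TSE parser is implemented, and the sample rows are made up.

```python
import csv
import io

# Two sample rows: a wrapped field containing the delimiter, and a wrapped
# field containing the wrap character itself (written as a doubled quote).
data = '"Fiorano, Software",USA\r\n"Say ""hi""",USA\r\n'

reader = csv.reader(io.StringIO(data, newline=""), delimiter=",", quotechar='"')
rows = list(reader)
print(rows)
```

In both rows the closing wrap character is followed immediately by the delimiter, which matches the rule that no data may sit between a wrap character and the adjacent delimiter.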

Padding character

This property applies to the FileWriter. If a field is shorter than the required size (the minimum length for delimited records or the field length for positional records), the FileWriter pads the field with the padding character. Padding is always added to the right of the field data.
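
Right padding of this kind can be sketched in Python with `str.ljust`. The helper name `pad_field` and the sample values are hypothetical, and the sketch assumes a single padding character.

```python
def pad_field(value: str, length: int, pad_char: str = " ") -> str:
    """Pad value on the right with pad_char until it is length characters."""
    return value.ljust(length, pad_char)


padded = pad_field("John", 10, "*")
unchanged = pad_field("Christopher", 10, "*")  # longer values are not cut
print(padded, unchanged)
```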

Valid Characters

The value for this property represents the set of valid characters for the field value. If this value is set and the field data contains any character that does not belong to this set, a parsing error is thrown.

Invalid Characters

The value for this property represents the set of invalid characters for the field value. If this value is set and the field data contains any character that belongs to this set, a parsing error is thrown.
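
The checks performed by the Valid Characters and Invalid Characters properties can be sketched as follows. The function name `check_characters` is hypothetical, and a Python `ValueError` stands in for the parsing error raised by the tool.

```python
def check_characters(value: str, valid: str = "", invalid: str = "") -> None:
    """Raise ValueError if value breaks either character rule.

    An empty string means the corresponding rule is not set.
    """
    if valid:
        bad = sorted({c for c in value if c not in valid})
        if bad:
            raise ValueError(f"characters not in the valid set: {bad}")
    if invalid:
        bad = sorted({c for c in value if c in invalid})
        if bad:
            raise ValueError(f"invalid characters found: {bad}")


check_characters("12345", valid="0123456789")  # passes silently
check_characters("John Smith", invalid=";|")   # passes silently
```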

Trim Spaces

Whether to trim spaces from the source field data before setting it in the output XML. You can trim spaces from the following positions:

  • Both (Leading and Trailing)
  • Leading
  • Trailing
  • None
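
The four options correspond to familiar string operations. In the Python sketch below, the dictionary `TRIM_MODES` is a hypothetical mapping from option name to behavior, and it assumes that only space characters (not tabs or line breaks) are trimmed.

```python
TRIM_MODES = {
    "Both": lambda s: s.strip(" "),       # leading and trailing spaces
    "Leading": lambda s: s.lstrip(" "),   # spaces before the data
    "Trailing": lambda s: s.rstrip(" "),  # spaces after the data
    "None": lambda s: s,                  # data is left untouched
}

raw = "  Fiorano  "
trimmed = {mode: trim(raw) for mode, trim in TRIM_MODES.items()}
print(trimmed)
```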

Name

The name of the field. The name of the node should be a valid XML name.

Description

The description of the field.
