The HDFSConnector component allows users to interact with the Hadoop Distributed File System (HDFS). It can be used to read files from and write files to HDFS.


The HDFSConnector component can be configured using its Custom Property Sheet (CPS) wizard.

Figure 1: Component Configuration properties in the HDFSConnector CPS


Connection Configuration

Figure 2: Connection Configuration properties


The name or address of the machine on which the Hadoop NameNode server runs.


The port on which the above server runs.


The user who owns the folder in the Hadoop cluster on which operations are performed.

Connection timeout (ms)

Connection timeout value, in milliseconds. When the component attempts to connect to the server, it waits up to the configured time for the connection to be established before treating the attempt as failed.
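The timeout behavior can be illustrated with a plain TCP connection attempt. This is a minimal sketch and not the component's implementation: `can_connect` is a hypothetical helper, and it only checks that the host and port accept a connection within the given number of milliseconds.

```python
import socket


def can_connect(host: str, port: int, timeout_ms: int) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_ms."""
    try:
        # create_connection takes its timeout in seconds, so convert from ms.
        with socket.create_connection((host, port), timeout=timeout_ms / 1000.0):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `can_connect("namenode.example.com", 8020, 5000)` would wait up to five seconds; the host name and port here are placeholders, not values taken from the component.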

Operation Configuration

Click the ellipsis button of the Operation Configuration property to configure the operations to be performed by the component.

Figure 3: Operation Configuration properties


Operation

Operation to be performed on HDFS.

Operations that are supported by the component are:

  • Write
  • Append
  • Copy From Local To HDFS
  • Copy From HDFS To Local
  • Read
  • Delete

HDFS Path

Absolute file path in HDFS for the corresponding operation. This path is mandatory for all operations.

Local Path

Absolute file path in the local file system, used by the Copy From Local To HDFS and Copy From HDFS To Local operations.

The following properties appear only for the Delete operation:

Delete Recursively

If enabled, the Delete operation deletes folders and their subfolders recursively.

Delete on Exit

If enabled, the Delete operation does not delete files immediately; they are deleted when the component is stopped.
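The rules above (the HDFS path is mandatory for every operation, the local path is used only by the copy operations, and the delete options apply only to Delete) can be summarized as a small consistency check. This is an illustrative sketch: `validate_operation` is a hypothetical helper, not part of the component, and it assumes that an absolute HDFS path starts with `/`.

```python
from typing import List, Optional

# Operation names as listed in the Operation Configuration section.
OPERATIONS = {
    "Write", "Append", "Copy From Local To HDFS",
    "Copy From HDFS To Local", "Read", "Delete",
}
COPY_OPERATIONS = {"Copy From Local To HDFS", "Copy From HDFS To Local"}


def validate_operation(operation: str, hdfs_path: str,
                       local_path: Optional[str] = None,
                       delete_recursively: bool = False,
                       delete_on_exit: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    problems = []
    if operation not in OPERATIONS:
        problems.append(f"unsupported operation: {operation}")
    # The HDFS path is mandatory for every operation (assumed to start with "/").
    if not hdfs_path or not hdfs_path.startswith("/"):
        problems.append("HDFS path must be an absolute path")
    # The local path is only used by the two copy operations.
    if operation in COPY_OPERATIONS and not local_path:
        problems.append("local path is required for copy operations")
    # The delete flags only apply to the Delete operation.
    if operation != "Delete" and (delete_recursively or delete_on_exit):
        problems.append("delete options apply only to the Delete operation")
    return problems
```

Calling `validate_operation("Delete", "/data/old", delete_recursively=True)` returns an empty list, while a copy operation without a local path reports the missing value.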

Fetch Operation from Input Headers

If this option is enabled, the schema is not used, and the operation and path details configured in the component are overwritten with values fetched from the input message headers. Otherwise, they are overwritten with values fetched from the input XML.
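The precedence described here can be sketched as a simple overlay of input-supplied values on the configured ones. This is an assumption-laden illustration rather than the component's code: `resolve_settings` and the key names are hypothetical, and the sketch assumes that a setting missing from the chosen source keeps its configured value.

```python
def resolve_settings(configured: dict, headers: dict, xml_fields: dict,
                     fetch_from_headers: bool) -> dict:
    """Overlay input-supplied values on the configured operation settings."""
    # Pick exactly one source: message headers if the option is on, else the XML.
    source = headers if fetch_from_headers else xml_fields
    resolved = dict(configured)  # copy, so the configured values stay untouched
    for key in configured:
        # Only settings present in the chosen source replace configured values.
        if key in source:
            resolved[key] = source[key]
    return resolved
```

With `configured = {"Operation": "Read", "HDFS Path": "/a"}`, a header `Operation=Delete` changes the operation only when the option is enabled; otherwise the values carried in the input XML are applied.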

Inputs and Outputs

This section gives a few sample inputs and their corresponding outputs, based on the schema shown below.



If the “Fetch Operation from Input Headers” option is enabled, the schema is not used. In this case, headers with the same names as the elements described below can be used to specify the values.

Figure 4: Input Schema

Replication: Toggles Replication on or off.
Block Size: Sets the required block size.
Overwrite: Toggles the overwrite function.
Buffer Size: Sets the requisite buffer size.
Read Mode: Determines whether the content is read as bytes or as text. Specify Byte or Text to select the mode.


The following properties are described in the Operation Configuration section above:

  • Operation
  • HDFS Path
  • Local Path
  • Delete on Exit
  • Delete Recursive


Below are a few sample inputs using the schema and their corresponding outputs.


Delete

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


If the operation completes successfully, the following message appears in the Display window (when a Display component is connected to the HDFSConnector).

Figure 5: Delete Output

Copy From Local To HDFS

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


Figure 6: Copy from Local to HDFS Output


Write

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


Figure 7: Write Output

Copy From HDFS To Local

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


Figure 8: Copy from HDFS to Local Output


Read

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


Figure 9: Read Output


Append

Sample Input

<ns1:HDFSMessage xmlns:ns1="">


Figure 10: Append Output


To run the HDFSConnector component on a Windows machine, specify the location of 'winutils.exe' through the system property 'hadoop.home.dir' in the runtime arguments of the component. To do so, navigate to Service Instance Properties > Runtime Arguments > JVM_PARAMS and add '-Dhadoop.home.dir=<path of folder where winutils.exe is present>' to the existing JVM parameters.
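As a rough illustration of the edit to JVM_PARAMS, the sketch below appends the property to an existing parameter string only if it is not already set. `add_hadoop_home` is a hypothetical helper, and the parameter values and folder shown in the usage note are made-up examples.

```python
def add_hadoop_home(jvm_params: str, winutils_folder: str) -> str:
    """Append -Dhadoop.home.dir=<folder> to a JVM parameter string if absent."""
    if "-Dhadoop.home.dir=" in jvm_params:
        # Already set: leave the existing value untouched.
        return jvm_params
    flag = f"-Dhadoop.home.dir={winutils_folder}"
    # strip() avoids a leading space when there were no parameters before.
    return f"{jvm_params} {flag}".strip()
```

For example, `add_hadoop_home("-Xms64m -Xmx512m", "C:\\hadoop")` yields a string that keeps the two existing parameters and ends with the new property. The sketch does not quote folders that contain spaces.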
