Orchestrating File Processing and Archiving Using appRules File Iteration Activities

A common requirement in IT shops is the ability to quickly process lots of files in a manageable and predictable manner.  CSV, Excel, Access, XML, JSON are some of the file types that you may be asked to process and archive.

We recently provided a project snippet to a customer that was migrating hundreds of Excel and Access files from multiple locations into a centralized MySQL database.   In this post, we will use this opportunity to discuss the project snippet and the file iteration activities in appRules.

A majority of the activities that were used in the snippet come from the appRules File System module which is shown below:

In a future post, we will discuss the activities in the appRules File System module as a group.  Note that in addition to the included activities, file and directory functions can also be performed using custom functions.  For this walkthrough, we will concentrate on the activities that were used in the project snippet.

 

Below is the project snippet:

Below is the description of the main activities used in the project to perform the iteration of the files:

  1. DisplayName: InitializeFileInfoSource

Workflow Activity: DirectoryInitializeFileInfoSource

Description: The DirectoryInitializeFileInfoSource activity creates records for each file in a folder.  The records contain information for each file and the file attributes are treated as data field values as shown below.  This feature allows you to treat the file just like other records in appRules.  In this example, the DirectoryInitializeFileInfoSource is the Source and upon initialization, it loads the first record.

Below are the properties of the DirectoryInitializeFileInfoSource activity.

Once configured, the record data fields can be accessed as shown below:

  1. DisplayName: Archive File

Workflow Activity: FileMove

Description: The FileMove activity is used for moving a file from one location to another.

MoveFrom: Instead of specifying a specific file path in this property, the value used in this case is a DataFieldValue from the current file record.

MoveTo: This is the destination for the file.  The file can be move to a specific file path or to a folder as specified.

  1. DisplayName: GetNextFileInfoRecord

Workflow Activity: DirectoryGetNextFileInfoRecord

Description: This activity gets the next file info record which contains the attributes of the specific file.  When the end of the records is reached, App.Result is set to false and iteration will end.

 

Conclusion:   

The appRules File System module includes workflow activities that can be used to iterate files for large file processing projects.  In this example, by using just three activities, we are able to iterate and archive the files without writing code.

 

To download appRules: www.appstrategy.com/appRulesTrial

Importing Very Large Delimited and Fixed-Length Files into On-Premise & SaaS Platforms Using appRules

Splitfile Activity

In this post, we will take a deep dive into the appRules SplitFile workflow activity. SplitFile is part of the appConnector.DelimitedFiles and appConnector.FixedLength file modules.

SplitFile can be used to split a file into multiple files  based on number of files or number of records that you specify.  In addition to splitting the file, the SplitFIle activity also allows you to specify a workflow to be executed once each file is created.  This feature used in conjunction with the High Performance options of the Microsoft Dynamics CRM Target activity is what allows you to compose and deploy workflows that deliver the best bulk load performance.

To illustrate the usage of the Split file activity to split and load large CSV and Fixed Length files into Microsoft Dynamics CRM, we will do a walkthrough of two sample workflows.   Note:  The process is the same for importing data into Salesforce, NetSuite, Oracle or other platforms.

In the first workflow (CreateChildCSVFiles), we will use a single SplitFile activity (along with the InitializeProject activity) to split a very large CSV file into smaller files.  After each file is created, a new instance of another workflow will be launched to process the records in the newly created child file.  Ofcourse you can add more activites to your workflow to suit your needs.

In the second workflow (ProcessChildCSVFile), we will show a sample workflow that is launched by the CreateChildSCVFiles after each child CSV file is created.  ProcessChildCSVFile is where the work is done to import the CSV records into Dynamics CRM.

 

CreateChildCSVFiles Workflow

Below is the CreateChildCSVFiles  workflow for splitting the large CSV file into multiple files.  After creating each file, the ProcessChildCSVFile  workflow is launched to process the records in it.

Figure 1 :  CreateChildCSVFiles workflow

 

 

The table below shows notes for the key properties of the workflow activities used in composing the workflow:

 

# Activity Type Notes
1 initializeProject Use the default values
2 SplitFile The SplitFile activity is used to split the file into smaller files and launch a new instance of the ProcessChildCSVFile workflow to process the records.

 

Configure the properties of this activity as follows:

 

  • Click the button to configure the SplitFileDefinition property to define the values required for the process. See figure 2 below for details.

 

3 CloseProject Closes the workflow, logs statistics and frees resources
 

Figure 2 : appRules SplitFile Activity Configuration window

Configure the values by following the instructions on the window

 

ProcessChildCSVFile Workflow

Below is the ProcessChildCSVFile workflow for processing the records in the files created by the main workflow after splitting the large CSV file into multiple files.  After creating each file, the ProcessChildCSVFile  workflow is launched to process the records in it.   By using this approach to launch and run mutiple processes,  the job is completed much faster.

 


Figure 3 :  ProcessChildCSVFile Workflow

The workflow must be configured to receive the workflow arguments to be passed to it by the master workflow CreateChildCSVFiles – (FilePath, FileNumber and RecordCount) .

Figure 4: Workflow arguments for ProcessChildCSVFile

The table below shows notes for the key properties of the workflow activities used in composing the ProcessChildCSVFile workflow:

 

# Activity Type Notes
1 initializeProject Use the default values
2 DynamicsCrmTarget

(InitializeDynamicsCrmTarget)

This activity is used to define a Target to be used for saving the CSV file records to the Microsoft Dynamics CRM data source.  You can add multiple Targets if your CSV file or requirements span more than one entity in Dynamics CRM.   Configure the properties of this activity as follows:

 

  • Click the button to configure the DataSource property to select the option for connecting to Dynamics CRM Online.

 

  • After configuring the DataSource property, select the PrimaryEntity from the dropdown list (account, contact, etc.).

 

  • Click the button to configure the HighPerformanceOption property. The default value for this property is set to Disabled. To utilize high performance option, you must select one of the other options to enable it.

 

3 CsvSourceFile

(InitializeDelimitedTargetFile)

This activity is used to define the Source CSV file to be imported into Dynamics CRM.  Configure the properties of this activity as follows:

 

  • Click the button to configure the DataFileDefinition property. This specifies the location of the file. Since the file name is passed as a workflow argument , select WorkflowArgument and select  the argument name (FileName) from the list.

 

  • Set the AutoLoad property to FirstRecord to read the first record from the file once the activity is initialized.

 

  • You can accept the default values for other properties or change them if necessary. These include Encoding, FieldEscapeChar, FieldQuotation etc.

 

4 While Use a While activity to control the flow of the workflow.

 

  • Configure the Condition by using the App.Results flag to check the last action performed by the activity before the While activity.

 

  • Add a Sequence activity to the Body of the While activity. This Sequence will contain activities that will read, map and save the CSV records in the Dynamics CRM entity.

 

 

5 MapAndSetCrmFields

(MapAndSetDynamicsCrmFields)

Use this activity to Map the CSV fields into the Dynamics CRM entity attributes

 

  • Select the CsvSourceFile activity as the Source.
  • Select the DynamicsCrmTarget activity as the Target
  • Click on the button to configure the FieldMaps property

 

 

6 InsertDynamicsCrmRecord

(InsertDynamicsCrmRecord)

Inserts the entity record into Dynamics CRM.

 

  • Select the DynamicsCrmTarget activity as the Target

 

7 GetNextCsvFileRecord Reads the next record from the CSV file
8 CloseDynamicsCrmTarget Closes the connection to Dynamics CRM
9 CloseCSVSourceFile Closes the CSV file
10 CloseProject Closes the workflow, logs statistics and frees resources
 

 

 

Conclusion

The appRules SplitFile activity is a simple and yet very powerful tool that can be used to tackle one of the more time consuming processes in data migration and integration.  Used in conjunction with other facilities in appRules, it offers unlimited options for processing CSV and Fixed Length files containing millions of records in a timely and cost-effective manner.

 

You can download appRules Studio including this sample project at: http://www.appstrategy.com/appRulesTrial

 

 

How to Implement a File Watcher Solution Using the appRules WatchFileSystem Activity

Many integration projects require monitoring of a directory on the file system for changes.  When a new file is added to a location or changes made to the location, the file system notification is utilized to process the file or to perform some other actions.

The system of watching a location on the file system is known as a “File Watcher System” or “File Drop System”.  appRules supports the quick implementation of a File Watcher Solution by utilizing the WatchFileSystem activity of the appConnector File System module.  You can specify the directory to monitor and when a file event occurs, you can directly launch a job or run a function to process the file or perform other actions.

Continue reading “How to Implement a File Watcher Solution Using the appRules WatchFileSystem Activity”

Data Integration 101: Anatomy of a Simple ETL Project Using appRules

In this Data Integration 101 post, we will do a quick tour of a simple ETL project using appRules.

The project which is shown below is used to perform bulk export of data from a SQL database to a CSV file.  No code was used to accomplish the task and it utilizes only three main activities.

The main activities are described below:

# Activity Description
1 InitializeSqlServerCeSource This activity is the initialization for the SQL Source.  You can select an activity from any SQL module – SQL Server, Oracle etc.  Configure the properties of this activity from any SQL module to select the records to export.
2 InitializeDelimitedTargetFile Configure this activity from the Delimited File module to define the Target CSV file and its attributes.
3 MapAndSaveRecords Use this activity to map data fields from the Source to the Target.  The activity also saves the records.

 

Conclusion

This example showcases the no-code ETL capabilities of appRules.  The same process can be used to export data from Accounting, CRM, ERP and other modules – without writing code.

 

You can download appRules with sample data integration projects at: http://www.appstrategy.com/appRulesTrial