InfoSphere Expert QualityStage v11.7.x

Course Overview

This course teaches how to build QualityStage parallel jobs that investigate, standardize, match, and consolidate data records. Students gain experience by building an application that combines customer data from different source systems into a single master customer record. The course also covers the QualityStage data cleansing process, in which students transform an unstructured data source into a format suitable for loading into an existing data target. Students also write jobs that cleanse the source data, building custom rule sets to standardize it.

Prerequisites

  • Knowledge of SQL
  • Experience with DataStage or another ETL tool is an added advantage

Audience

  • QualityStage Developers
  • Data Analysts responsible for data quality using QualityStage
  • Data Cleansing Developers
  • Data Quality Developers needing to customize QualityStage rule sets 
  • Data Quality Developers who want to create Match Specifications

Topics

Day 1:
• List the common data quality contaminants
• Describe QualityStage architecture
• Describe each of the following processes:
      -Investigation
      -Standardization
      -Match
      -Survivorship
• Build and run DataStage/QualityStage jobs and review the results

Day 2:
• Build Investigate jobs
      -Character Investigations
      -Word Investigations
• Rule Sets and Rule Set files
      -Classification file
      -Dictionary file
      -Pattern Action file
      -Override files
• Build jobs using the Standardize stage
• Interpret standardization results

Day 3:
• Investigate unhandled data and patterns
• Standardization Rules Designer
• Optimizing Standardization Results
• Creation of a Custom Rule Set
• Match Frequency
• Match Specification
• Block and Match criteria

Day 4:
• Build a QualityStage job to identify matching records
• Refining Match Criteria
• Two-Source (Reference Match) Implementation
• Build a Survive job to consolidate matched records into a single master record

Day 5:
• QualityStage in real time
• Additional time for labs

Start Date: 04/15/2024

Duration: 3 Days (5 hours a day)

Time: 8 AM EAT

Delivery Method: Virtual

Language: English

Course ID: NFCP4D-IBM Knowledge CatalogAD-EAT09

Price: $2500

For group training, contact us.

For further details and inquiries about training programs, please get in touch with us.