CP4D v4.x DataStage for Developers

Course Overview

This course is designed for DataStage beginners as well as for those who have worked with the DataStage thick client. It begins by introducing the new DataStage interface, the DataStage Flow. The course covers commonly used stages, cloud connectors, and advanced connectors available in DataStage on IBM Cloud Pak for Data (CP4D).

The course includes lab exercises for the units.


Audience

DataStage beginners, or DataStage users who have worked with the thick client and want to use DataStage on Cloud Pak for Data.


Prerequisites

  • Understanding of databases and ETL
  • Basic familiarity with cloud services


Day 1 

  1. Unit 1: Overview of Cloud Pak for Data
    • Introduction to CP4D
      • Login
      • Home Page
      • Profile and Settings
      • Services Catalog
    • CP4D Architecture
  2. Unit 2: Transforming Data using DataStage on CP4D
    • Create DataStage Project
    • Create DataStage Parallel Job
    • Save and Compile DataStage Job
    • Run and Monitor DataStage Job
    • Configuration Files for DataStage on CP4D
    • Import and Export DataStage Job
    • Add Environment Variables
    • Job Run Settings
    • Asset Browser
    • Export and Import Project
    • Clone a Job
  3. Unit 3: Processing Stages
    1. Merge Stage
    2. Join Stage
    3. Sort Stage
    4. Transformer Stage
    5. Remove Duplicates Stage

Day 2 

  1. Unit 4: Amazon S3 Connector
  2. Unit 5: Amazon Redshift Connector
  3. Unit 6: Microsoft Azure Connector

Day 3

  1. Unit 7: Google BigQuery Connector
  2. Unit 8: Snowflake Connector
  3. Unit 9: Checksum Stage

Day 4

  1. Unit 10: Change Capture and Change Apply
  2. Unit 11: Dataset
  3. Unit 12: HTTP Connector
  4. Unit 13: ODBC Connector
  5. Unit 14: Compress
  6. Unit 15: Expand
  7. Unit 16: Decode
  8. Unit 17: Encode

Day 5

  1. Administration:
    1. Cluster administration
      • Scaling services
      • Add nodes to your Cloud Pak for Data cluster
      • Back up and restore your deployment
        1. Backup and restore service list
        2. Backing up and restoring an entire deployment
        3. Backing up and restoring volumes
        4. Migrating Cloud Pak for Data metadata and clusters
    2. Cloud Pak for Data platform administration
      • Managing users
        • Connecting to your identity provider
        • Predefined roles and permissions
        • Managing roles and user groups
      • Importing JDBC drivers for data sources
      • Monitoring the platform
      • Managing storage volumes
      • Gathering diagnostic information
      • Customizing the platform

Start Date: 07/08/2024

Duration: 5 days (5 hours a day)

Time: 8 AM EET

Delivery Method: Virtual

Language: English

Course ID: NFCP4D-DS4.0-EET09

Price: $2500

For group training, contact us.

For further details and inquiries about training programs, please get in touch with us.