FREE Open Class - Building an Enterprise Big Data Lake

Event Information


Date and Time




80 Bloor Street West

Suite 500

Toronto, ON M5S 2V1


Event description


Enterprises in Canada have started to build their data lakes
“We have to keep our data for 7 years for compliance reasons, but we’d love to store and analyze decades of data - without breaking the machine and the bank.”

Concerns like this one are raised in A Big Data Cheat Sheet: What Marketers Want to Know. As a result, demand for data engineers is skyrocketing at many enterprises. They want talented, experienced data engineers who can solve their big data problems, and they will accommodate employees' needs as much as they can.

Want to join this urgently needed talent pool? We are here to help.

Topics of the class:

1. Overview of Hadoop Ecosystem

1.1 Hadoop Distributed File System (HDFS)

1.2 MapReduce

1.3 YARN

1.4 Hive

1.5 Pig

1.6 Spark

1.7 Introduction to Popular Data Ingestion and Egestion Technologies

1.7.1 NiFi

1.7.2 Flume

1.7.3 Sqoop

1.7.4 Logstash

1.7.5 Storm

1.7.6 Spark Streaming and SQL

1.7.7 InfoSphere DataStage

1.7.8 Talend

1.7.9 Facebook Scribe

1.7.10 Apache Apex

1.8 Data Lake Concept

1.9 Current state of the art of big data in the financial industry

1.10 Tying it all together in one architecture diagram: the data lake


Short Intro

  • This 12-session part-time big data course not only builds strong foundational knowledge of big data and its ecosystem, but also gives you the hands-on practical experience to build an end-to-end real-time analytics big data application that you can confidently add to your data science project portfolio.

Learning Outcome

  • Gaining a solid understanding of the Big Data ecosystem and various real-world use cases

  • Launching and setting up Hadoop clusters in various environments: AWS EMR, Hortonworks, Cloudera, VM

  • Performing ETL and querying large datasets with Apache Hive/Pig as well as SQL-on-Hadoop tools such as Presto, Impala and Phoenix

  • Ingesting data into NoSQL databases such as MongoDB, DynamoDB, Cassandra and HBase
  • Building big data ETL pipelines with Spark
  • Data manipulation for machine learning with Spark SQL, DataFrame and Dataset
  • Developing machine learning models with Spark ML/MLlib
  • Deploying machine learning models for real-time analytics with Apache Kafka and Spark Streaming

Course Details

Most students who are just getting started with big data feel overwhelmed by the sheer number of tools one needs to learn. Luckily, with a well-defined course outline, an experienced instructor and our helpful teaching assistants, WeCloudData is here to point you in the right direction, save you time by focusing on the knowledge points that matter, and stimulate your interest with hands-on assignments and projects so that you can maximize your learning outcome.


Prerequisites

  • Basic Python programming skills

  • Understanding of SQL and relational databases
  • Strong curiosity and a passion for learning and applying big data technologies in real life

How this course is delivered

  • This classroom-based course is delivered with 40% lecture, 20% labs and 40% project
  • You will meet the instructors and teaching assistants in person and learn alongside your peers
  • You will work on a hands-on project that helps you build your big data portfolio, which is often the key to a successful job placement
  • You will work in groups with your peers on data challenges
  • Use cases and best practice discussions will be delivered via Slack

Assistance you will get from us

  • Our teaching assistants will help address your questions throughout the learning period
  • One-on-one chats with our instructors and mentors
  • Resume help and suggestions upon completion of the course
  • Job referrals

Certification & Next Step

  • You will receive a course completion certificate after successfully completing the course project as well as assignments
  • After the completion of this course, you can continue to take the Data Science Capstone Project course to become a WCD Certified Data Scientist