BIG DIVE Module 4

Deep Dive into Data Engineering

From November 30 to December 4 (Online)

Have you ever wondered how to scale your data-driven project to architectures that can handle bigger data sets and higher computational effort?

Are you working with storage technologies and want to know more about the full data science and machine learning pipelines?

By attending this course you will learn how to handle, store, and query large data sets, implement scalable APIs, and create effective pipelines for Machine Learning projects. Through a practical approach you’ll be exposed to proven strategies and technologies, taught by experts in the field and illustrated with industry case studies.

Scientific Approach, Performance Optimization, Predictive Algorithms, Data Exploration & Preparation, Communication Value, Deploying Data Pipelines, Maths & Stats, Visual Representation, Coding

About Data Engineering

Preparing, handling and storing large amounts of data play a key role in data-driven processes. Unfortunately, organizations often confuse the roles of data engineer and data scientist.

Data scientists have strong maths and stats knowledge, and their programming skills are focused on using data science tools. They aim at tuning algorithms and models in search of accuracy and precision rather than designing and managing data flows over computation infrastructures.

Data engineers usually have a stronger background in computer science and storage/back-end technologies. They are in charge of effectively preparing and storing data, and of creating and deploying data pipelines in production environments.

According to a recent estimate, a good data team has two or three data engineers for every data scientist. The market demand for data engineers is therefore expected to grow rapidly in the coming months and years.

Is this course for me?

The course is perfect for people with solid coding and system administration knowledge who want to re-skill as Data Engineers.

The ideal candidates are therefore System Administrators, Cloud Engineers, DevOps Engineers, and Backend or Full-stack Software Developers.

At the same time, this is a unique opportunity for Data Scientists to touch and explore the entire data life-cycle, adding to their toolbelt:

  • ability to manage large (for real) data volumes
  • a full understanding of data pipelines
  • ability to move algorithms and models into production environments

 

Do you fall into one of these categories, or do you consider this course a perfect fit and a great opportunity for your professional growth?

Apply and let our Selection team evaluate your profile.

Syllabus and teachers

The course covers:


  • Intro to Kinesis DataStream and real-time data services with AWS
  • Data stream persistence with Kinesis Firehose
  • Handle a Data Lake on AWS in Serverless mode
  • Data Lake queries with AWS Athena
  • Athena in depth: formats for Big Data (Parquet)
  • Data Transformation with AWS Glue
  • Serverless Data Transformation: Fargate and Step Functions
  • Data Lake Visualization with AWS QuickSight
  • Hands-on lab with scripts and via AWS Web Console

This session will also include:


  • Lectures by experts
    • Big Data on AWS and Scale your data architecture by Guido Maria Nebiolo, Senior Cloud Architect at Storm Reply
    • Open Source and Data Virtualization by Rosario Antoci, IT Infrastructure Specialist at HPE CDS
    • Big Data: let’s Spark! by Alessio Attolini, Big Data and Analytics Developer at Value Partners Digital Technology
  • Hands-on sessions and group exercises to put the lessons learnt into practice

Resident teachers and coordinators:


Walter Dal Mut
Corley

Gabriele Mittica
Corley / Cloud Conf

Andrea Beccaris
Full Stack Developer and Network Engineer at TOP-IX

Christian Racca
BIG DIVE Program Manager

Stefania Delprete
Data Scientist at TOP-IX

Are you interested in attending more than one BIG DIVE training module and becoming a 360° Data Expert?
Take a look at the other modules and take advantage of the discount package!

From Zero to Data Science with Python (beginner class) – September 2020
Machine and Deep Learning Intensive (advanced class) – October 2020
Communicating and Visualizing Data (beginner to intermediate class) – November 2020

Application process

Here’s a timeline of what will happen:

February 17: Registration opens
May 31: Early-bird offer expires
November 27: Registration closes
From Nov 30 to Dec 4: BIG DIVE Deep Dive into Data Engineering

The application process starts with a self-evaluation of the prerequisites (mostly related to your programming skills and Maths background) needed to access and fully enjoy the course. Optional skills are taken into consideration to create a balanced classroom. You can download a preview of the questions and requirements of the official application form here.

In the form, you can tell us more about yourself, your previous experience and why we should choose you. We strongly encourage you to make a short video to stand out among the other candidates!

You will be asked to fill out a questionnaire to test the technical skills required to follow the course smoothly.

After we receive the applications, our team starts the screening. Candidates might be contacted by the organizers and asked to provide more information about their skills or to attend an interview (in person or using a remote audio-video communication tool). The selection process continues until registration officially closes, progressively building a class of a minimum of 8 and a maximum of 20 Divers.

Applicants selected before the official end of registration are asked to pay a deposit (40% of the total fee due, depending on the profile). If the deposit is not paid (the deadline is one week after the request), the candidate loses priority in the selection queue. If a selected candidate withdraws, a new Diver is selected. The deadline for requesting a deposit refund is fifteen days before the course begins (we do not refund unused portions of the training).

If you purchase more than one module, the deposit amount, payment deadlines and refund options will be discussed and communicated privately.

All updates about selection, exclusion and deposit requests are sent to the email address provided in the application form.

Logistics and technical information

From Monday to Friday, from 9:30 am to 4:30 pm. Additional time is reserved for special lectures, exercises and “homework”.

One day of absence is allowed out of a total of five training days.

Venue: Online via WebEx!

Technical prerequisites: a full list of technical prerequisites will be communicated in advance.

Training language: English.

Organizer and partners

BIG DIVE 2020 is organized by:

In collaboration with:

AXANT    ISI Foundation & ISI Global Science Foundation   TODO Creative Agency

With the patronage of:

Dipartimenti di informatica Università di Torino

Check the frequently asked questions

A copy of your University ID card or any certificate proving you are a student at the time of application. You can send it to info.bigdive@top-ix.org after you have filled out the online form.

The full list of technical prerequisites will be communicated shortly.

As BIG DIVE will be taught in English, proper conversation and writing skills are required.

Please write to us at info.bigdive@top-ix.org for any clarification.

Your video is strongly recommended for selection purposes; letters of recommendation are not mandatory.

We encourage you to add them to your application so that we can get to know you and your motivation better and evaluate your profile more accurately.

Yes, at the end of this course you’ll receive a certificate of attendance if you take part in more than 85% of the lessons and activities.

As BIG DIVE 2020 is divided into four modules, you’ll receive one certificate for each module you attend.

Yes, you can download a courtesy application form as a PDF from this link.

It includes the full list of questions and requirements.

Yes, you can download a PDF of the BIG DIVE 2020 pamphlet from this link.

It can be useful to show to different departments in your organization.