“Data Visualization with D3.js” by TODO
Data Visualization with D3.js
Teacher: Fabio Franchino — TODO
Immersive lecture on the key elements and concepts behind data visualization.
Duration: 2 days.
“Real Time Ingestion and Analysis of data with MongoDB and Python” by AXANT
Real Time Ingestion and Analysis of data with MongoDB and Python
Teacher: Alessandro Molina – AXANT
Companies and software products generate more and more data; in the IoT world in particular, records are saved at a throughput of thousands per second.
That calls for solutions that can scale writes and clean up and analyse thousands of records per second in real time. MongoDB is widely used in those environments as what is commonly called a “speed layer”, performing fast analytics over the most recent data and adapting or cleaning up incoming records.
This session aims to show how MongoDB can be used as the primary storage for your data, scaling to thousands of records and thousands of writes per second, while also acting as a real-time analysis and visualization channel thanks to change streams and as a flexible analytics tool thanks to the aggregation pipeline and MapReduce.
Duration: 1 day.
#mongodb #realtime #scaling #mapreduce
“Data Analysis with Spark Streaming” by Agile Lab
Data Analysis with Spark Streaming
Teacher: Vito Ressa — AgileLab
Big Data analysis is a hot trend, and one of its major roles is to give new value to enterprise data. However, data and information lose value as they age, so in many contexts it is important to analyse incoming data flows in near real time. Apache Spark is a major player in the big data landscape, and its Streaming module aims to solve the main challenges of real-time data processing at scale in distributed environments.
This session aims to show the potential of streaming data analysis and how to leverage Apache Spark with Structured Streaming to extract value from it, without having to handle the common problems of stream processing at scale that Apache Spark already solves.
Duration: 2 days.
#bigdata #dataengineering #dataframework #apachespark
“Excursus: Agent-based modelling and synthetic populations” by GCF
Excursus: Agent-based modelling and synthetic populations
Teachers: Sarah Wolf, Gesine Steudle – Global Climate Forum
To understand possible transitions of complex systems (e.g. societies, markets, systems of socio-technical co-evolution), pure data analysis might not be sufficient, because such transitions often imply substantial shifts that can hardly be described by purely statistical extrapolation of data. Modelling activities can therefore be a useful complement to data analysis.
This workshop introduces an agent-based model, built on synthetic populations, for the global challenge of how to make mobility more sustainable. It illustrates the methodological approach of agent-based modelling, discusses how the process of model development can be accompanied by stakeholder dialogues, explores the interaction between such an agent-based model and the relevant data science tools, and provides some hands-on exercises.
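To give a flavour of the approach, here is a minimal sketch of an agent-based model over a synthetic population. It is not the model used in the workshop: the `Agent` attributes, the threshold rule and all parameter values are assumptions chosen purely for illustration. Each synthetic person switches to a sustainable mobility mode once the share of the population already using one reaches a personal threshold.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """One synthetic person; both attributes are illustrative assumptions."""
    threshold: float   # share of sustainable users needed before this agent switches
    sustainable: bool  # True if the agent currently uses a sustainable mode


def make_population(size, initial_share, seed=0):
    """Build a synthetic population with random thresholds and initial choices."""
    rng = random.Random(seed)
    return [
        Agent(threshold=rng.random(), sustainable=rng.random() < initial_share)
        for _ in range(size)
    ]


def sustainable_share(population):
    """Fraction of agents currently using a sustainable mode."""
    return sum(agent.sustainable for agent in population) / len(population)


def step(population):
    """Synchronous update: every agent reacts to the share observed before the step."""
    share = sustainable_share(population)
    for agent in population:
        if not agent.sustainable and share >= agent.threshold:
            agent.sustainable = True
    return sustainable_share(population)


def run(size=1000, initial_share=0.1, steps=20, seed=0):
    """Simulate the population and return the sustainable share after each step."""
    population = make_population(size, initial_share, seed)
    history = [sustainable_share(population)]
    for _ in range(steps):
        history.append(step(population))
    return history


if __name__ == "__main__":
    for t, share in enumerate(run()):
        print(f"step {t:2d}: sustainable share = {share:.2f}")
```

Even a toy rule like this produces a transition driven by interaction between agents rather than by extrapolating past data, which is the kind of dynamics the workshop sets out to explore.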
Duration: 2 days.
Prerequisites: basic knowledge of Python
#datascience #complexsystems #agentbased #mobility #sustainability
“Data Citizenship and NetScience: technology for data-culture” by HER
Data Citizenship and NetScience: technology for data-culture
Teachers: Salvatore Iaconesi, Oriana Persico – Human Ecosystems Relazioni
We constantly generate data, whether we realize it or not, whether we want it or not, and a very limited number of subjects has access to all of this data. This is a very serious condition, with enormous implications for our fundamental rights and freedoms, and for our opportunities to prosper, create, express, relate and live a just, inclusive, constructive life.
In this session we explore technologies for cultural acceleration through data: Human Ecosystems to create large scale, participatory data collection processes; Ubiquitous Commons for distributed, blockchain supported data-rights and evolved data-ownership patterns; Generative Open Data as accessibility layer for shared data commons.
This is a hands-on session in which profound theoretical concepts emerge from the technological architectures themselves and from the ways in which we will use them. It will focus mainly on Network Science and on how we can use it to gain a better understanding of the city’s Relational Ecosystem of people, organizations, network-connected objects, sensors and more.
We will see how to use the platforms and explore a practical case study: Bologna’s TDays, the limited-traffic weekends in the historical center of Bologna. Together we will figure out possible ways to transform them into a data-driven, inclusive, engaging opportunity for participatory citizenship, using the platforms, social networks, art and design.
Duration: 1 day.
#networkscience #socialscience #territory #city #citizenship
“Machine Learning and Deep Learning for Computer Vision” by ISI
Machine Learning and Deep Learning for Computer Vision
Teachers: Andrè Panisson, Alan Perotti — ISI Foundation
This in-depth part of the course lets you build an appealing and diversified Machine Learning portfolio. It starts with an introduction to Machine Learning and its application with Scikit-learn, and continues with lectures on Neural Networks and backpropagation, where you’ll start exploring Computer Vision techniques on a dataset of images.
The course then covers Deep Learning methods. You’ll be challenged to use TensorFlow and Keras on real image classification cases (such as distracted drivers, healthcare or plant diseases). The workshop ends with lessons on Transfer Learning and one last project: building your own data set by scraping Google images and practicing everything you have learned.
Duration: 4 days.
Prerequisites: Python, Pandas, Statistics; exposure to Machine Learning is welcome.
#machinelearning #deeplearning #neuralnetworks #scikitlearn #tensorflow
“Voice Recognition models in DeepSpeech and Common Voice” by Mozilla
Voice Recognition models in DeepSpeech and Common Voice
Teacher: Alexandre Lissy — Mozilla
DeepSpeech is an open source Speech-To-Text engine that uses a model trained with machine learning techniques, based on Baidu’s Deep Speech research paper.
You will learn how the model works and how it was implemented using TensorFlow. The workshop covers how we went from a proof-of-concept hack to a model that we are trying to make usable in production, and how we leverage the distributed training system. We’ll explore how the inference-specific model is built, the code around it that makes it run on several devices, and the TensorFlow tooling we explored to try to speed things up.
We also present the Common Voice project, which aims to collect an open dataset for machine learning, and more specifically for voice-targeted machine learning.
You’ll learn how to contribute to both projects: how to train your own model for DeepSpeech, how to use DeepSpeech as a “black box”, how to hack on DeepSpeech, and how to contribute to Common Voice.
Duration: Half day.
Prerequisites: Python, shell; exposure to C++ is welcome.
#machinelearning #deepspeech #voicerecognition #tensorflow
“From local to glocal using community data” by FBK
From local to glocal using community data
Teacher: Maurizio Napolitano — Fondazione Bruno Kessler
The workshop starts with an introduction to the GIS world, the geospatial protocols and the available geodata resources.
It continues with a dive into the OpenStreetMap ecosystem, where we explore how it can serve as a great tool for data scientists. After examples of analysis on real cases, you’ll be challenged to build your own geospatial project, supervised by the expert Maurizio Napolitano.
Duration: 1 day.
Prerequisites: Python; previous experience with OpenStreetMap is welcome.
#geospatial #map #opendata #osm