Building Big Data Pipelines with PySpark & MongoDB & Bokeh
Build an intelligent data pipeline using Apache Spark and MongoDB big data technologies.
Introduction to the course
Python Libraries Installation
Apache Spark Installation
NoSQL Booster Installation
Integration of PySpark with Jupyter Notebook
Data Loading in MongoDB
Data Pre-Processing and Preparation
Building the Machine Learning Model
Prediction Dataset Creation
Map Plot Creation
Maximum and Average Magnitude Plot
Grid Plot for Web Browser Visualization
Visual Studio Code Installation
Building the Spark ETL Pipeline Script
Building the Spark ML Pipeline Script
Dashboard Server Configuration
PySpark Data Pipeline
How to create data processing pipelines using PySpark.
Machine learning with geospatial data using the Spark MLlib library.
Data analysis using PySpark, MongoDB and Bokeh inside a Jupyter notebook.
How to manipulate, clean and transform data using PySpark dataframes.
Basic geo mapping.
How to create dashboards.
How to create a lightweight server to serve Bokeh dashboards.
Master students and PhD candidates
Researchers and Academics
Professionals and Companies
After you successfully finish the course, you can claim your Certificate of Completion at no extra cost. You can add it to your CV, LinkedIn profile, etc.
We know how hard it is to acquire new skills. All our courses are self-paced.
Even after you finish the course and get your certificate, you will still have access to the course contents. Every time an instructor makes an update, you will be notified and can watch it for free.
Data engineer and business intelligence