How to expense a Modal Course
Before you enroll

Remember to confirm your employer’s expense policy and/or get approval from your manager before purchasing a Modal Course! Please refer to any internal guidelines for the most accurate instructions for getting reimbursed.

Your receipt

After you enroll in a Modal Course, you will receive a receipt via email. Double-check what your company requires, but typically you can attach this receipt in your company’s expense software or forward it directly to your accounting team for reimbursement. Please reach out to support@modal.io if you need help adding extra required information to your receipt or if you have any questions.

Your certificate

Some employers require a certificate of completion before reimbursing an educational expense like this one. Upon completion of your course, you will receive your certificate via email, and will also be able to download your certificate from your Modal Dashboard. Your certificate will include your name, the course name, and the month/year you earned your certificate.


Big Data with Spark and Python

Learn the fundamentals of big data with Spark and Python! Process and clean data and build a machine learning pipeline.
Who is this for?

Data Scientists interested in building a foundational skill set for working with big data.

Any prerequisites?
  • A developed understanding of Python syntax, data structures, Pandas DataFrames, NumPy, and Scikit-Learn.
  • Familiarity with Supervised Machine Learning topics, such as the components of, and process for building, a regression ML model.
  • Statistics, especially probability, and experience applying these concepts using Python.
  • Linear algebra concepts like matrices and vector spaces.
What will I be able to do after this Course?
  • Connect to Spark, import data as RDDs and DataFrames, and perform basic transformations and actions on RDDs.
  • Use PySpark and Spark SQL to clean, manipulate, and analyze DataFrames.
  • Use MLlib to preprocess data, and create, train, and evaluate a machine learning model.

Course Overview

Sprint 1: Welcome to Spark
Meet Distributed Discounts, a membership-based, bulk retail company! You’ll help DD format their trove of data in a way that can be used by Spark for analysis, and create both RDDs and DataFrames to perform basic actions and transformations on the data.
Sprint 2: Data Cleaning and Analysis with PySpark and SQL
Now that Distributed Discounts' data has been imported, you’ll answer key business questions about their customers' demographics, behavior, and purchases using Spark SQL.
Sprint 3: Machine Learning with Spark
Create a machine learning model to predict whether new customers will purchase another item from Distributed Discounts.

What’s in a Modal course?

1:1 Coaching
Receive personal guidance, instruction, and motivation from real, practicing industry experts.
Real-world simulations
Practical coursework blends simulated and real-world projects, ensuring you are building job-ready skills.
Integrated code editor
An in-browser coding environment mitigates challenges while enabling pair programming and inline feedback.
Structure & flexibility
Engage with content when your schedule allows. Our assignments and deadlines help you stay on track and our coaches keep you accountable.
Individual guidance
Courses for a variety of career goals, skill needs, and company objectives, ensuring learning is both relevant and productive.
Capstone projects
Challenging and satisfying capstone projects allow you to demonstrate the skills you’ve learned, while reinforcing collaboration and business skills.

Meet our coaches

Linda Liu
Director, Data Science & Analytics

Working with the learners makes it an incredibly rewarding journey. The shared excitement and collaborative growth highlight the entire fulfilling coaching experience!

Nataliia Maksimova
Director, Business Intelligence

It's incredibly inspiring to introduce people to the fascinating world of data. Sharing my passion for data and showing that it's not just dry numbers but a creative field where you can grow and innovate is deeply rewarding.

Udit Mehrotra
Senior Data Scientist

Seeing the growth in my learners is not only heartening but also assuring because I know I had a significant role to play in shaping their journey.

Interested in buying multiple seats for your team?
Contact us