Data Engineering

Undergraduate course, Brigham Young University, School of Technology, 2018

I designed and remotely taught a project-based, 500-level Data Engineering class for the IT department, open to seniors and graduate students with the appropriate IT/CS prerequisites. At the time, remotely taught classes with in-person attendance were new to the department, so I built the curriculum, class structure, and instruction methods from scratch to meet the needs of the students. I also secured appropriate hardware for the students' class projects.

Topics

  • Databases and engines
  • Data pipelines
  • Hot vs. warm vs. cold data
  • Data mining and manipulation
  • Data quality and lineage
  • Data performance and measurement
  • Distributed databases
  • Data science and modeling
  • System reliability

Projects

  • Research and report
    • Each student is assigned a topic to research and then teach to the rest of the class
  • Project
    • Students must design and build a functional data engineering system over the course of the semester, including:
      • a distributed database
      • a data pipeline
      • data collection agents
      • a streaming process
      • a visualization platform
      • metrics and monitoring
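
As a rough illustration of how a few of these components fit together, here is a minimal single-process sketch in Python. It is not taken from the course materials: the names `Reading`, `collect`, `pipeline`, `rolling_mean`, and `run` are hypothetical, the validation rule (drop missing or negative values) is an assumption, and the distributed database and visualization platform are left out entirely.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    """One record produced by a (hypothetical) collection agent."""
    sensor: str
    value: Optional[float]


def collect(raw_rows: Iterable[tuple]) -> Iterator[Reading]:
    """Collection agent: turn raw (sensor, value) rows into typed records."""
    for sensor, value in raw_rows:
        yield Reading(sensor, value)


def pipeline(readings: Iterable[Reading], metrics: Counter) -> Iterator[Reading]:
    """Pipeline stage: drop invalid records and count what was seen and dropped."""
    for reading in readings:
        metrics["seen"] += 1
        # Assumed quality rule: missing or negative values are invalid.
        if reading.value is None or reading.value < 0:
            metrics["dropped"] += 1
            continue
        yield reading


def rolling_mean(readings: Iterable[Reading], window: int = 3) -> Iterator[float]:
    """Streaming stage: emit the mean of the most recent `window` values."""
    recent = deque(maxlen=window)
    for reading in readings:
        recent.append(reading.value)
        yield sum(recent) / len(recent)


def run(raw_rows: Iterable[tuple], window: int = 3):
    """Wire the stages together and return the stream output plus metrics."""
    metrics = Counter()
    means = list(rolling_mean(pipeline(collect(raw_rows), metrics), window))
    return means, dict(metrics)


if __name__ == "__main__":
    rows = [("a", 1.0), ("a", -5.0), ("b", 3.0), ("a", None), ("b", 5.0), ("a", 7.0)]
    means, metrics = run(rows)
    print(means)
    print(metrics)
```

Each stage is a generator, so records flow through one at a time rather than being loaded in bulk, which is the same shape a real streaming system has at a much larger scale. In a full project each stage would typically be a separate service, with the counters exported to a monitoring system.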