DS2002 Data Science Systems

Course materials and documentation for DS2002


Milestone 2

Review the due date in Canvas and begin scheduling your work.

Within your team, discuss how you want to divide the work. Some may prefer to dive deeper into coding; others may lean more into documentation. Keep in mind that this is a team project.

Communicate with each other early and frequently. If you feel like you’re at an impasse, don’t hesitate to reach out for feedback and pointers.

  1. Research Python (or other) packages needed for your project.

  2. Implement the pipeline, including script(s) to query your database.

  3. Commit your code updates regularly.

  4. Add a requirements.txt file that lists all Python packages needed for your project.

  5. Test and debug with test data of your choosing, and inspect the logs and database records. Add a data folder to your GitHub repository with at least one example input file. The ISS project processes JSON data from an API endpoint rather than a conventional pipeline input file, so for that project create a JSON file containing a representative payload returned by the ISS API.

  6. Document deployment/installation and how to use the pipeline or scripts in the README.md file. See How to write a README.

  7. Submit the URL of your GitHub project repository in the Milestone 2 assignment on Canvas.
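
As a rough illustration of steps 2 and 5, the following minimal sketch loads a sample JSON payload from a file, stores it in a database, and queries it back. It is only one possible shape for such a script, under several assumptions: SQLite (from the Python standard library) stands in for whatever database your project uses, and the table name `positions`, the file name `sample_payload.json`, the function names, and the payload fields `timestamp`, `latitude`, and `longitude` are all hypothetical. Check the actual response of the API you are using before relying on any of them.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical payload for illustration only; it is NOT copied from a real
# API response. Replace it with a payload your own pipeline actually receives.
SAMPLE_PAYLOAD = {"timestamp": 1700000000, "latitude": 12.5, "longitude": -45.25}


def load_payload(path):
    """Read one JSON payload from a file such as data/sample_payload.json."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def store_position(conn, payload):
    """Insert one record; the table and column names are assumptions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS positions "
        "(timestamp INTEGER PRIMARY KEY, latitude REAL, longitude REAL)"
    )
    # INSERT OR IGNORE skips a record whose timestamp is already stored,
    # so re-running the script on the same test file adds no duplicates.
    conn.execute(
        "INSERT OR IGNORE INTO positions VALUES (?, ?, ?)",
        (
            int(payload["timestamp"]),
            float(payload["latitude"]),
            float(payload["longitude"]),
        ),
    )
    conn.commit()


def query_positions(conn):
    """Return all stored records, oldest first."""
    return conn.execute(
        "SELECT timestamp, latitude, longitude FROM positions ORDER BY timestamp"
    ).fetchall()


if __name__ == "__main__":
    # Write the sample payload to a temporary file to mimic a file in data/,
    # then run it through the load, store, and query steps.
    with tempfile.TemporaryDirectory() as tmp:
        sample_file = Path(tmp) / "sample_payload.json"
        sample_file.write_text(json.dumps(SAMPLE_PAYLOAD), encoding="utf-8")

        conn = sqlite3.connect(":memory:")
        store_position(conn, load_payload(sample_file))
        print(query_positions(conn))
        conn.close()
```

A script like this uses only the standard library, so it would add nothing to requirements.txt; a project that talks to a database server or calls an API over HTTP would list those client packages there.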