DS2002 Data Science Systems

Course materials and documentation for DS2002


Windows Setup

If you have a Mac or Linux computer, go to the Mac/Linux setup instructions.

Setting Up Your Own Computer for Course Projects (Optional)

Note: If you are new to programming and are not familiar with installing programming tools on your computer, I highly recommend skipping this step and using GitHub Codespaces instead. This will allow you to get started immediately without the hassle of troubleshooting any setup issues.

However, if you want to set up an environment for class work on your own computer, here are the basic steps.

To set up your own computer for all course activities, I strongly encourage you to install all the Python packages in a new environment. Think of an environment as an isolated area for the software packages that a specific project needs, in this case the course activities. Packages in an environment are kept separate from other software on your computer.

VSCode

Download and install Visual Studio Code from the official website. Follow the platform-specific installation instructions for Windows.

Tools & Python

Since we’re using the Linux command line, you will need a Linux-like terminal program. I recommend installing Git Bash, which provides a Linux-style terminal and also installs Git.

1. Install Git Bash

1.1. Download and install Git Bash from the official Git website.

2. Install Miniforge

2.1. Download the Miniforge installer: Go to the conda-forge Miniforge GitHub repository and download the Windows executable file (Miniforge3-Windows-x86_64.exe).

2.2. Run the installer: Double-click the downloaded .exe file to run it.

2.3. Configure bash for Python: Open a new Git Bash terminal (not Windows PowerShell or the Miniforge Command Prompt) and execute the following command:

echo 'echo "Sourcing .bashrc" && eval "$(mamba.exe shell hook --shell bash)" && mamba activate' >> ~/.bashrc

3. Create a conda (mamba) environment.

3.1. Open a new Git Bash terminal (not Windows PowerShell) and execute the following commands:

mamba env create -n ds2002 -c conda-forge python=3.11 pip jq awscli curl git redis-py mongodb
mamba activate ds2002
pip install unzip wget

Note: You can try adding the zip and tar packages to the pip install command, but be aware that they may fail to install on your computer.
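After the installation finishes, it can help to confirm that the command-line tools are actually reachable from Git Bash. The following is a minimal sketch, not part of the official course instructions. The helper name report_missing is hypothetical, and the executable names are assumptions based on the packages listed in step 3.1 (for example, the awscli package is assumed to provide the aws command).

```shell
# report_missing TOOL...: print each name that cannot be found on PATH.
# Hypothetical helper for illustration; not part of Git Bash or mamba.
report_missing() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
    fi
  done
}

# Executable names assumed from the packages in step 3.1.
report_missing python pip jq aws curl git
```

If the command prints nothing, every listed tool was found. Run it with the ds2002 environment active, since that is where the packages were installed.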

3.2. Check your environments:

mamba env list

You should see:

# conda environments:
#
base                 * C:\Users\mst3k\miniforge3
ds2002                 C:\Users\mst3k\miniforge3\envs\ds2002

4. Restart Git Bash and Verify:

4.1. Close and reopen your Git Bash terminal for the changes to take effect.

4.2. Run mamba activate ds2002. The prompt prefix should change from (base) to (ds2002), indicating the switch to your new environment.

4.3. Run the command mamba list. You should see a list of installed packages, and your prompt should show (ds2002) at the beginning, confirming that the ds2002 environment is active.
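The check in step 4 can also be done without reading the prompt. The sketch below assumes that activation exports the CONDA_DEFAULT_ENV variable, as conda and mamba normally do. The helper name in_env is hypothetical.

```shell
# in_env NAME: succeed only if NAME is the currently active conda/mamba environment.
# Assumes activation exports CONDA_DEFAULT_ENV; hypothetical helper for illustration.
in_env() {
  [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_env ds2002; then
  echo "ds2002 is active"
else
  echo "ds2002 is not active; run: mamba activate ds2002"
fi
```

A check like this is handy at the top of a course script, so that it does not silently run against the base environment.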

Note: The first step when opening a new terminal is to run mamba activate ds2002. You can add that command to the ~/.bashrc file if you wish.
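If you do add the activation command to ~/.bashrc, running the append more than once leaves duplicate lines in the file. The following is a minimal sketch of a guard against that. The helper name append_once is hypothetical and is not part of Git Bash or mamba.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line already exists.
# Hypothetical helper for illustration; not part of Git Bash or mamba.
append_once() {
  file=$1
  line=$2
  # Create the file if it does not exist yet.
  touch "$file"
  # -F: fixed string, -x: match the whole line, -q: no output
  if ! grep -Fxq -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example usage (uncomment to modify your own ~/.bashrc):
# append_once ~/.bashrc 'mamba activate ds2002'
```

Because the line is appended at the end of the file, it runs after the mamba shell hook line added in step 2.3, which is the order activation needs.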

Please be aware that we have limited bandwidth to guide you through fixing broken installations on your computer. If installations fail, you can always go back to using GitHub Codespaces.