DS2002 Data Science Systems

Course materials and documentation for DS2002

View the Project on GitHub ksiller/ds2002-course

Scripting in Python

The goal of this activity is to familiarize you with scripting in Python. Python scripting is essential for automating tasks, processing data, orchestrating workflows, and building reusable tools that can save time and reduce errors.

Note: Work through the examples below in your terminal (Codespace or local), experimenting with each command and its various options. If you encounter an error message, don’t be discouraged—errors are learning opportunities. Reach out to your peers or instructor for help when needed, and help each other when you can.

If the initial examples feel like a breeze, challenge yourself with activities in the Advanced Concepts section and explore the resource links at the end of this post.

In-class exercises

Scripting in Python is similar to scripting in Bash, but Python offers far more functionality through its libraries, classes, and functions. A few things to note:

Starting JupyterLab in Codespaces

JupyterLab is pre-installed in your codespace environment. To start it:

  1. Open a terminal in your VS Code codespace (Terminal → New Terminal)
  2. Run the following command:
    jupyter lab --allow-root
    
  3. The terminal will display a URL with an authentication token. Look for a line like:
    http://127.0.0.1:8888/lab?token=...
    

    Copy the token value that follows “token=”. You will need it in step 5.

  4. In VS Code, you should see a notification asking if you want to open the forwarded port, or you can:
    • Click on the “Ports” tab in the bottom panel
    • Find port 8888 in the list
    • Click the “Open in Browser” icon (globe icon) next to port 8888
  5. JupyterLab will open in a new browser tab. Enter the token (from step 3) into the authentication field when prompted.
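
As a small scripting exercise, the token from step 3 can also be pulled out of the URL with Python's standard library instead of copying it by hand. This is only a sketch: the function name `extract_token` and the token value `abc123` are made up for illustration, so substitute the URL printed in your own terminal.

```python
from urllib.parse import urlparse, parse_qs


def extract_token(url):
    """Return the value of the `token` query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("token")
    return values[0] if values else None


# Made-up URL for illustration; paste the one from your own terminal instead.
example_url = "http://127.0.0.1:8888/lab?token=abc123"
print(extract_token(example_url))
```

The printed value is what goes into the authentication field in step 5.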

Note: Port 8888 is automatically forwarded in Codespaces, so you don’t need to manually configure port forwarding.

Alternatively, you can set up the software environment locally on your own computer; see the setup instructions.

Running a Python script

  1. Open a terminal window.
  2. In the terminal run:
     python my_script.py # add command line args as needed if the script is written to handle them.
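
If you do not yet have a script to run, the following minimal sketch could serve as `my_script.py`. It assumes a single optional command line argument (a name to greet); the function names `make_greeting` and `main` are illustrative choices, not part of the course materials.

```python
import sys


def make_greeting(name):
    """Return the greeting text for the given name."""
    return f"Hello, {name}!"


def main(argv):
    # argv[0] is the script name; anything after it was typed by the user.
    name = argv[1] if len(argv) > 1 else "world"
    print(make_greeting(name))


if __name__ == "__main__":
    main(sys.argv)
```

Running `python my_script.py Ada` prints `Hello, Ada!`, and running it without arguments falls back to `Hello, world!`. Keeping the logic in functions and calling `main` under the `if __name__ == "__main__":` guard lets other scripts import and reuse `make_greeting` without triggering the print.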
    

Additional Practice

Advanced Concepts (Optional)

Resources

  • Command line arguments in Python
  • Pandas tutorial on Kaggle