Day 6: Setting Up Your Development Environment

Ian Clemence
Mar 10, 2025

Hey everyone,

Welcome back to “100 Days of Data Science”! Today, we’re talking about setting up your development environment on your local computer — a crucial first step to turning your data ideas into real projects. Think of it as organizing your workspace: when everything is in the right place, you can focus on solving problems and making discoveries without distractions.

1. Python with Anaconda

For most data scientists, Python is the go-to language, and Anaconda is a popular way to get started. Anaconda is a free distribution that bundles Python with widely used libraries (such as NumPy, pandas, and scikit-learn). It also ships with Jupyter Notebook, an interactive notebook environment, and Spyder, an integrated development environment (IDE). Together, these make writing, testing, and debugging code much easier.

Imagine you’re setting up your kitchen with all the right tools before cooking. With Anaconda, you have your ingredients and utensils ready to mix up your data recipes!
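Once everything is installed, it can be reassuring to confirm that the libraries are actually available. Here is a minimal sketch of such a check. The function name `check_libraries` and the list of import names are my own choices for illustration, not part of Anaconda; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys

# Import names for the libraries mentioned above (illustrative list).
# scikit-learn is imported as "sklearn".
LIBRARIES = ["numpy", "pandas", "sklearn"]


def check_libraries(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Show which Python interpreter version is running.
    print(f"Python {sys.version.split()[0]}")
    for name, found in check_libraries(LIBRARIES).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

If any library shows up as missing, the environment you are running in is probably not the one Anaconda set up.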

Learn more & Install:

2. R and RStudio

R is another powerful language, especially if you’re into statistical analysis and data visualization. RStudio, its companion IDE, provides a clean interface to write code, visualize results, and manage your projects — all in one place.

For example, if you need to create stunning visualizations of seasonal sales trends, RStudio’s intuitive panels and robust package support (like ggplot2) make it easy to turn numbers into eye-catching charts.

Learn more & Install:

3. Unix Shell

The Unix shell is a command-line interface that lets you navigate files, manage programs, and run scripts. It is essential when you work with cloud platforms or automate repetitive tasks. Jupyter Notebooks can also run shell commands: prefix the command with an exclamation mark (for example, !ls to list files) and the notebook passes it to the shell.

macOS and Linux users already have a Unix shell built in. Windows users can install tools like Git Bash, Cygwin, or the Windows Subsystem for Linux (WSL) to get similar functionality.
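The exclamation-mark shortcut only works inside Jupyter. In a regular Python script, one way to get a similar result is the standard library's `subprocess` module. The sketch below assumes a Unix-like shell where the `ls` command exists; the helper name `run_command` is hypothetical and just for illustration.

```python
import subprocess


def run_command(args):
    """Run a command (given as a list of strings) and return its text output."""
    # check=True raises an error if the command exits with a failure code.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Roughly what !ls does in a notebook: list files in the current folder.
    # Assumes a Unix-like environment where "ls" is available.
    print(run_command(["ls"]))
```

Passing the command as a list rather than a single string avoids handing it to a shell for parsing, which is usually the safer default.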

Learn more:

4. Git: Version Control

Git is the industry-standard tool for version control. It helps you track changes in your code, collaborate with others, and revert to previous versions when necessary — like having a time machine for your projects!

Imagine working on a complex project and knowing you can always roll back to a previous, working version if something goes wrong — that’s the power of Git.
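To make the "time machine" idea concrete, here is a small sketch that drives Git from Python. It assumes Git is already installed and on your PATH. The file name `analysis.py`, the demo identity, and the helper names `git` and `demo_rollback` are all made up for this example. It commits a working file, overwrites it with a broken one, and then asks Git to restore the committed version.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its text output."""
    result = subprocess.run(
        # A throwaway identity so the commit works without global git config.
        ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com", *args],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout


def demo_rollback():
    """Commit a file, break it, then restore the last committed version."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        script = repo / "analysis.py"  # hypothetical file name

        git(repo, "init")
        script.write_text("print('working version')\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "-m", "Add working analysis script")

        # Simulate a mistake: overwrite the file with broken code.
        script.write_text("print('broken version'\n")

        # Discard the uncommitted change and bring back the committed file.
        git(repo, "checkout", "--", "analysis.py")
        return script.read_text()


if __name__ == "__main__":
    if shutil.which("git"):
        print(demo_rollback())
    else:
        print("Git does not appear to be installed.")
```

In day-to-day work you would type the same commands (git init, git add, git commit, git checkout) directly in the shell; wrapping them in Python here just keeps the example self-contained.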

Learn more & Install:

Bringing It All Together

When you set up these tools — Python/Anaconda, R/RStudio, Unix Shell, and Git — you create a robust ecosystem that supports everything from data cleaning and analysis to modeling and deployment. These tools integrate smoothly; for instance, you can use shell commands directly within Jupyter Notebooks or manage your R scripts with Git in RStudio.

Final Thoughts:
Setting up your development environment is the foundation for every data science project. With a well-organized workspace, you can focus on analyzing data, building models, and generating insights that drive real-world decisions. Whether you’re just starting out or refining your setup, these tools will help you work efficiently and effectively.

Stay tuned for Day 7, where we’ll dive into our next exciting topic. Until then, happy coding, and feel free to drop any questions or tips in the comments!

Written by Ian Clemence

Software Developer | Data Scientist | ML Enthusiast
