

A ready-to-deploy data science project

View the Project on GitHub leoBitto/DSForge

DSForge

A template for Data Science and Data Analytics projects

Overview

DSForge is a template designed to streamline the setup of Data Science projects. It separates development from production, and each environment has its own purpose and configuration. The development environment uses Jupyter Notebook for exploratory data analysis. The production environment, which is not yet implemented, is intended to integrate tools such as Streamlit for user interfaces and Airflow for automated workflows.


Folder Structure

Base Folders


Environments

Development Environment

Production Environment

(To be implemented)


Key Features Added

  1. Development Environment:
    • A Docker-based setup for Jupyter Notebook.
    • A manager.sh script for container lifecycle management.
    • Disabled token-based authentication for easier local access.
  2. Data Organization:
    • A defined data structure with raw, processed, and final stages.
    • Shared data directory between development and production environments.
  3. Example Notebook:
    • An initial example notebook that demonstrates how to read and preview data from the raw data directory.
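
As a rough illustration of the data organization and the "read and preview" step listed above, here is a minimal sketch. It is not the notebook shipped with the template. The directory names under the data root, the file name example.csv, and the helper names ensure_data_dirs and preview_csv are all assumptions made for this example, and it uses only the Python standard library so it runs without extra dependencies.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path

# Assumed layout: <data_root>/raw, <data_root>/processed, <data_root>/final.
STAGES = ("raw", "processed", "final")


def ensure_data_dirs(data_root: Path) -> dict[str, Path]:
    """Create the three stage folders under data_root and return their paths."""
    paths = {stage: data_root / stage for stage in STAGES}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths


def preview_csv(csv_path: Path, n_rows: int = 5) -> list[dict[str, str]]:
    """Return the first n_rows of a CSV file as a list of dicts."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return list(islice(csv.DictReader(handle), n_rows))


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing in a real project is touched.
    with tempfile.TemporaryDirectory() as tmp:
        dirs = ensure_data_dirs(Path(tmp) / "data")
        sample = dirs["raw"] / "example.csv"  # hypothetical file name
        sample.write_text("id,value\n1,10\n2,20\n", encoding="utf-8")
        for row in preview_csv(sample):
            print(row)
```

In a notebook, the same preview_csv call could be pointed at any CSV file placed in the raw directory. A DataFrame library would be a natural replacement for the csv module if one is installed in the image.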

Usage Instructions

Development Setup

  1. Build the Docker image:
    ./manager.sh build
    
  2. Start the Jupyter Notebook container:
    ./manager.sh start
    
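  3. Access the Jupyter interface: open your browser and navigate to http://localhost:8888. No token is requested, because token-based authentication is disabled in the development environment.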
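
If the page does not load, it can help to check from a script whether anything is answering on that address. The following is a minimal sketch using only the Python standard library. The function name is_reachable and the timeout value are assumptions for this example and are not part of DSForge or manager.sh.

```python
import urllib.error
import urllib.request

JUPYTER_URL = "http://localhost:8888"  # URL from the setup steps above


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (e.g. 403 or 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    if is_reachable(JUPYTER_URL):
        print(f"A server is answering at {JUPYTER_URL}")
    else:
        print(f"Nothing is answering at {JUPYTER_URL}; is the container running?")
```

The check only confirms that some HTTP server responds on the port. It does not verify that the server is Jupyter.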

Folder Conventions


Future Work