Big Data NOAA - Weather Analytics

Completed

A big data analytics platform that processes NOAA weather station data using Hadoop MapReduce and HBase, with an interactive Python UI for visualizing global weather trends.

Python Java Hadoop HBase Docker MapReduce

My Role

Team Leader & Contributor

Team Size

3

Duration

3 months

Platform

Web / CLI

Overview

This project is a distributed big data analytics system that processes and analyzes historical weather data from NOAA (National Oceanic and Atmospheric Administration). It uses Hadoop MapReduce for large-scale data processing, HBase for storage, and a Python-based UI for interactive visualization of weather trends across stations worldwide.

Key Features

  • Distributed Data Processing — Hadoop MapReduce jobs for processing large-scale NOAA weather datasets (elements such as TMIN and TMAX)
  • HBase Storage — scalable NoSQL database for storing station metadata and processed weather records
  • Interactive UI — Python-based interface with a map view for exploring weather stations across the globe
  • Global Trend Analysis — compute and visualize temperature trends over time at global and per-station levels
  • Dockerized Infrastructure — fully containerized Hadoop, HBase, and ZooKeeper cluster for local development
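
To illustrate the processing and trend-analysis features above, here is a minimal sketch of the map and reduce steps, simulated in plain Python rather than run on Hadoop. It is not the project's actual job code, which may well be written in Java. It assumes input rows in a comma-separated layout of station ID, date (YYYYMMDD), element, and value, with temperatures stored in tenths of a degree Celsius. The function names (map_record, reduce_mean, run_job, trend_per_year) are hypothetical.

```python
from collections import defaultdict


def map_record(line, element="TMAX"):
    """Map step: emit ((station, year), temp_c) for one CSV row.

    Assumed row layout: station_id,YYYYMMDD,element,value,...
    Values are assumed to be tenths of a degree Celsius.
    """
    fields = line.strip().split(",")
    if len(fields) < 4:
        return  # malformed row, emit nothing
    station, date, elem, value = fields[:4]
    # Defensive filter: skip other elements and missing-value sentinels.
    if elem != element or value in ("", "-9999"):
        return
    yield (station, int(date[:4])), int(value) / 10.0


def reduce_mean(values):
    """Reduce step: average all temperatures that share a key."""
    values = list(values)
    return sum(values) / len(values)


def run_job(lines, element="TMAX"):
    """Simulate shuffle and sort locally: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_record(line, element):
            groups[key].append(value)
    return {key: reduce_mean(vals) for key, vals in groups.items()}


def trend_per_year(yearly_means):
    """Least-squares slope (degrees C per year) for a {year: mean} mapping."""
    years = sorted(yearly_means)
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years to fit a trend")
    mean_x = sum(years) / n
    mean_y = sum(yearly_means[y] for y in years) / n
    num = sum((y - mean_x) * (yearly_means[y] - mean_y) for y in years)
    den = sum((y - mean_x) ** 2 for y in years)
    return num / den
```

A per-station trend then comes from collecting that station's yearly means from run_job and passing them to trend_per_year. A global trend could be built the same way by keying the mapper output on year alone.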

My Responsibilities

  • Led the team of 3, coordinating task assignments and setting project milestones
  • Organized sprint meetings and tracked progress to ensure on-time delivery
  • Developed data download and ingestion scripts for NOAA station datasets
  • Built MapReduce jobs for weather data processing and trend analysis
  • Contributed to the Python CLI and UI for running analytic jobs and viewing results
  • Helped configure the Docker-based distributed environment (HDFS, HBase, ZooKeeper)
  • Reviewed team members’ code and managed pull requests
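
As an illustration of the ingestion work listed above, the sketch below parses one line of station metadata into a record that could feed both a storage layer and a map view. It assumes the fixed-width layout of the GHCN-Daily station list (ID, latitude, longitude, elevation, state, name); the source does not say which NOAA dataset or file format the project actually used, so treat the column offsets and the function name parse_station_line as assumptions.

```python
def parse_station_line(line):
    """Parse one fixed-width station metadata line into a dict.

    Assumed 1-based columns (GHCN-Daily station list layout):
    ID 1-11, latitude 13-20, longitude 22-30, elevation 32-37,
    state 39-40, name 42-71.
    """
    elevation = float(line[31:37])
    return {
        "id": line[0:11].strip(),
        "lat": float(line[12:20]),
        "lon": float(line[21:30]),
        # -999.9 is assumed to mark a missing elevation.
        "elevation_m": None if elevation == -999.9 else elevation,
        "state": line[38:40].strip(),
        "name": line[41:71].strip(),
    }
```

Returning a plain dict keeps the parser independent of any storage client, so the same record can be written to a database or passed straight to the UI.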