sharktopus

Cloud-native GRIB cropper — crop before you download

GFS today · HRRR, NAM, ECMWF open-data coming as the community grows

What is sharktopus?

sharktopus is an open-source Python library that crops GRIB2 weather data in the cloud — by bounding box, variables, and vertical levels — before it hits your disk. It deploys a small serverless wgrib2 worker to AWS Lambda, Google Cloud Run, or Azure Container Apps; each user runs on their own cloud account and pays their own (typically near-zero) bill.

Today it ships with the NOAA Global Forecast System (GFS 0.25°) end-to-end. The internals are deliberately product-agnostic — batch orchestration, byte-range streaming, cropping, inventory, quotas — so adding a new product (HRRR, NAM, RAP, ECMWF open-data) is a matter of plugging in a URL resolver and a catalog, not rewriting the core. See docs/ADDING_A_PRODUCT.md.
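To make the "URL resolver" idea concrete, here is a minimal sketch of what such a plug-in might look like for GFS 0.25°. It is not sharktopus's actual interface: the function name `resolve_gfs_url` is hypothetical, and the bucket name and key layout are assumptions based on NOAA's public GFS archive on AWS Open Data.

```python
from datetime import datetime

# Assumption: NOAA's public GFS bucket on AWS Open Data and its usual key layout.
GFS_BASE = "https://noaa-gfs-bdp-pds.s3.amazonaws.com"


def resolve_gfs_url(cycle: str, forecast_hour: int) -> str:
    """Map a cycle (YYYYMMDDHH) and a forecast hour to a GFS 0.25 degree file URL.

    Hypothetical helper for illustration; not part of the sharktopus API.
    """
    run = datetime.strptime(cycle, "%Y%m%d%H")  # raises ValueError on malformed input
    return (
        f"{GFS_BASE}/gfs.{run:%Y%m%d}/{run:%H}/atmos/"
        f"gfs.t{run:%H}z.pgrb2.0p25.f{forecast_hour:03d}"
    )


print(resolve_gfs_url("2024012100", 6))
```

A resolver for another product would presumably keep the same shape (cycle and forecast hour in, URL out) and change only the base URL and file-name pattern.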

sharktopus is consumer-agnostic: the output is a valid cropped GRIB2 file.

The typical win for a 72-hour regional domain: ~12 GB → ~200 MB of transfer, ~20 min → ~30 s wall time. The defaults are WRF-canonical variable and level sets, reflecting the project's WRF lineage; override them with your own lists for any other consumer.
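As a quick sanity check on those numbers, the short calculation below turns the quoted figures into ratios. It uses only the approximate values stated above and assumes decimal units (1 GB = 1000 MB).

```python
# Approximate figures quoted above for a 72-hour regional domain.
full_mb, cropped_mb = 12_000, 200   # ~12 GB vs ~200 MB, decimal units assumed
full_s, cropped_s = 20 * 60, 30     # ~20 min vs ~30 s

transfer_reduction = full_mb / cropped_mb   # how many times less data is moved
speedup = full_s / cropped_s                # how many times shorter the wall time is
fraction_kept = cropped_mb / full_mb        # share of the original bytes transferred

print(
    f"{transfer_reduction:.0f}x less transfer, {speedup:.0f}x faster, "
    f"{fraction_kept:.1%} of the bytes kept"
)
# prints: 60x less transfer, 40x faster, 1.7% of the bytes kept
```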

Three ways to use it

sharktopus is a Python library with a one-command CLI and an optional local web UI on top. Use whichever matches your workflow — they all produce the same cropped GRIB2 output.

1. CLI — one command

Good for shell pipelines, cron jobs, Makefiles. Install, then run:

pip install sharktopus

sharktopus \
    --start 2024012100 --end 2024012112 --step 6 \
    --lat-s -25 --lat-n -20 --lon-w -45 --lon-e -40 \
    --priority gcloud_crop aws_crop nomads_filter \
    --dest ./my-run/
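The `--start`, `--end`, and `--step` flags describe a range of forecast cycles. The sketch below shows one plausible way such a range expands into individual cycle timestamps. It assumes the step is in hours and the end is inclusive; neither is stated above, and `expand_cycles` is a hypothetical helper, not a sharktopus function.

```python
from datetime import datetime, timedelta

CYCLE_FORMAT = "%Y%m%d%H"  # same YYYYMMDDHH layout as the CLI arguments


def expand_cycles(start: str, end: str, step_hours: int) -> list[str]:
    """List cycle timestamps from start to end (inclusive), step_hours apart.

    Hypothetical helper; the inclusive end and the hour unit are assumptions.
    """
    if step_hours <= 0:
        raise ValueError("step_hours must be positive")
    current = datetime.strptime(start, CYCLE_FORMAT)
    stop = datetime.strptime(end, CYCLE_FORMAT)
    cycles = []
    while current <= stop:
        cycles.append(current.strftime(CYCLE_FORMAT))
        current += timedelta(hours=step_hours)
    return cycles


print(expand_cycles("2024012100", "2024012112", 6))
# prints: ['2024012100', '2024012106', '2024012112']
```

Under these assumptions, the command above would cover three cycles.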

2. Python — a few lines in a script

Good for notebooks, ML pipelines, Airflow/Prefect tasks, or anywhere you already have Python doing the rest of the work:

from sharktopus import fetch_batch

fetch_batch(
    timestamps=["2024012100", "2024012106"],
    bbox=(-25, -20, -45, -40),       # lat_s, lat_n, lon_w, lon_e
    variables=["TMP", "UGRD", "VGRD", "HGT"],
    levels=["500 mb", "850 mb", "surface"],
    priority=["gcloud_crop", "aws_crop", "nomads_filter"],
    dest="./my-run/",
)

See the Python API section of the README for xarray integration and batch-level parallelism controls.
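The `priority` argument lists sources in order. If that list is read as an ordered fallback chain (an assumption; the text above does not spell out the semantics), the logic can be sketched as follows. `fetch_with_priority` and the stand-in fetchers are hypothetical and exist only for this illustration.

```python
from typing import Callable, Mapping, Sequence

# Hypothetical type: a fetcher takes a cycle timestamp and returns an output path.
Fetcher = Callable[[str], str]


def fetch_with_priority(
    timestamp: str,
    priority: Sequence[str],
    fetchers: Mapping[str, Fetcher],
) -> tuple[str, str]:
    """Try each named source in order; return (source_name, output_path)."""
    errors = []
    for name in priority:
        try:
            return name, fetchers[name](timestamp)
        except Exception as exc:  # a real implementation would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))


def _unavailable(timestamp: str) -> str:
    """Stand-in for a source whose cloud worker is not reachable."""
    raise ConnectionError("worker not deployed")


# Stand-in fetchers for illustration only.
demo_fetchers = {
    "gcloud_crop": _unavailable,
    "aws_crop": lambda ts: f"./my-run/gfs_{ts}.grib2",
    "nomads_filter": lambda ts: f"./my-run/gfs_{ts}_nomads.grib2",
}

source, path = fetch_with_priority(
    "2024012100", ["gcloud_crop", "aws_crop", "nomads_filter"], demo_fetchers
)
print(source, path)
# prints: aws_crop ./my-run/gfs_2024012100.grib2
```

Here the first source fails, so the second one serves the request and the third is never contacted.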

3. Web UI — no code

Don't want to write Python? Run sharktopus --ui and drive the whole thing from a local control panel: submit jobs, monitor free-tier quota, manage credentials, browse inventory. The UI binds to 127.0.0.1 only (no auth, no network exposure) so it's safe on any machine you already log into; for remote use, SSH-tunnel the port.

pip install 'sharktopus[ui]'
sharktopus --ui

[Screenshot: sharktopus dashboard page with stats and recent jobs]

The Submit page exposes the full CLI as a form: product picker, calendar-driven cycle selection, Leaflet map for the bounding box, variable / level cascade, source priority, and a directory browser for output paths.

[Screenshot: sharktopus submit form with product picker, dates, and bounding box]


Legal & policies

Who maintains it?

sharktopus was originally developed to support the CONVECT project, “Convective Systems Forecasting: Integrated Analysis of Numerical Modeling, Radar and Satellites” (“Previsão de sistemas convectivos: análise integrada da modelagem numérica, radar e satélites”, CNPq Extreme Events Call 15/2023), coordinated by Dr. Tânia Ocimoto Oda. CONVECT is executed at IEAPM (Instituto de Estudos do Mar Almirante Paulo Moreira, Brazilian Navy) with partner institutions UENF (Universidade Estadual do Norte Fluminense Darcy Ribeiro) and UFPR (Universidade Federal do Paraná). sharktopus itself is maintained as an independent open-source project. Governance is merit-based and documented in GOVERNANCE.md. Contributors retain their own institutional affiliation; see AUTHORS.md.

sharktopus is not a product of, is not endorsed by, and does not represent the Brazilian Navy, CNPq, IEAPM, UENF, or UFPR. Institutional acknowledgement and project funding context do not imply institutional ownership.

Contact

Issues and pull requests: github.com/sharktopus-project/sharktopus/issues
Project email: sharktopus.convect@gmail.com