# GPU Scoring

A small, opinionated toolkit to score GPUs based on memory capacity, memory bandwidth, FP16 compute, and high‑bandwidth interconnect capability. It outputs both a human‑readable table and JSON for downstream automation.

## What this does

- Loads GPU specifications from `gpu_data.json`
- Computes a composite score per GPU on a 0–1 scale, with a configurable minimum floor so no score is exactly 0 (useful when scores are later used as multipliers)
- Prints a sorted table and a JSON array of `{ name, score }`

## Project layout

- `gpu_rankings.py`: scoring logic and CLI entry point
- `gpu_data.json`: GPU specification dataset consumed by the scorer
- `README.md`: this document

## Data schema (`gpu_data.json`)

Each top‑level key is a GPU name. Required fields per GPU:

- `MEMORY_GB` (number): total memory capacity in GB
- `FP16_TFLOPS` (number): FP16 performance (or BF16 if that's what the vendor exposes)
- `MEMORY_BW_GBPS` (number): sustained memory bandwidth in GB/s
- `HIGH_BW_INTERCONNECT_EXISTS` (0 or 1): 1 if NVLink/SXM or an equivalent high‑bandwidth interconnect is supported; otherwise 0

Example:

```json
{
  "H100-80G-SXM5": {
    "MEMORY_GB": 80,
    "FP16_TFLOPS": 1979,
    "MEMORY_BW_GBPS": 3360,
    "HIGH_BW_INTERCONNECT_EXISTS": 1
  }
}
```

Notes:

- If a field is missing or identical across all GPUs, the scorer normalizes gracefully (e.g., returns 1.0 if there is no variation).
- Extra fields in the JSON are ignored by the scorer.

## Scoring method (high level)

For each GPU:

1. Normalize memory capacity to [0, 1]: `mem_score`
2. Normalize memory bandwidth to [0, 1]: `bw_score`
3. Apply a moderate multiplicative bandwidth boost to memory: `bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)`
4. Normalize FP16 TFLOPS to [0, 1]: `compute_score`
5. Add an interconnect bonus: `interconnect_bonus = interconnect_weight * {0 or 1}`
6. Combine: `combined = memory_weight * bandwidth_weighted_memory + compute_weight * compute_score + interconnect_bonus`
7. Min–max normalize across all GPUs and apply a floor epsilon `min_floor`: `score = ((combined - min) / (max - min)) * (1 - min_floor) + min_floor`

Why the floor? To avoid exact zeros when scores are later used as multiplicative factors; every device remains comparable but strictly greater than 0.
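For orientation, here is a minimal sketch of steps 1–7 in pandas. `score_sketch` and `_minmax` are illustrative names, not the toolkit's API; the authoritative implementation is `gpu_score` in `gpu_rankings.py`.

```python
import pandas as pd


def _minmax(series: pd.Series) -> pd.Series:
    """Min-max normalize to [0, 1]; return 1.0 everywhere if there is no variation."""
    lo, hi = series.min(), series.max()
    if hi == lo:
        return pd.Series(1.0, index=series.index)
    return (series - lo) / (hi - lo)


def score_sketch(
    df: pd.DataFrame,
    memory_weight: float = 0.6,
    compute_weight: float = 0.4,
    bandwidth_bonus_weight: float = 0.4,
    interconnect_weight: float = 0.1,
    min_floor: float = 0.05,
) -> pd.Series:
    mem_score = _minmax(df["MEMORY_GB"])        # step 1
    bw_score = _minmax(df["MEMORY_BW_GBPS"])    # step 2
    # step 3: bandwidth boosts the memory component multiplicatively
    bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)
    compute_score = _minmax(df["FP16_TFLOPS"])  # step 4
    # step 5: flat bonus gated on the 0/1 interconnect flag
    interconnect_bonus = interconnect_weight * df["HIGH_BW_INTERCONNECT_EXISTS"]
    combined = (                                # step 6
        memory_weight * bandwidth_weighted_memory
        + compute_weight * compute_score
        + interconnect_bonus
    )
    # step 7: renormalize and lift into [min_floor, 1]
    return _minmax(combined) * (1 - min_floor) + min_floor
```

With the defaults, the best GPU lands at exactly 1.0 and the worst at `min_floor`.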
## Default weights (tunable)

Defaults used in `main()`:

- `memory_weight`: 0.6
- `compute_weight`: 0.4
- `bandwidth_bonus_weight`: 0.4 (up to a +40% boost to the memory component at the highest bandwidth)
- `interconnect_weight`: 0.1
- `min_floor`: 0.05 (final normalized scores lie in [0.05, 1])

Tuning guidance:

- Increase `bandwidth_bonus_weight` to value memory speed more
- Increase `compute_weight` when FP16 compute is more critical
- Increase `interconnect_weight` when NVLink/SXM‑class fabrics are required
- Adjust `min_floor` (e.g., 0.02–0.1) to avoid zeros while preserving rank contrast

## Requirements

- Python 3.10+
- Packages: `pandas`, `numpy`

Install:

```bash
pip install pandas numpy
```

## Running

From the `gpu_scoring` directory:

```bash
python gpu_rankings.py
```

You'll see:

- A table sorted by `score` (descending)
- A JSON array printed after the table:

```json
[
  { "name": "H100-80G-SXM5", "score": 0.995 },
  { "name": "A100-80G-SXM4", "score": 0.872 }
]
```

## Customizing weights

Edit the call to `gpu_score(...)` in `main()` in `gpu_rankings.py`:

```python
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
```

## Library usage (import in your own code)

```python
from gpu_rankings import load_gpu_data, build_df, gpu_score

gpu_dict = load_gpu_data()  # or load_gpu_data("/path/to/gpu_data.json")
df = build_df(gpu_dict)
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
records = (
    df[["name", "score"]]
    .sort_values("score", ascending=False)
    .to_dict(orient="records")
)
```

## Updating the dataset

Edit `gpu_data.json` to add or modify GPUs. Keep field names consistent (a quick validation pass helps; see the sketch at the end of this README):

- `MEMORY_GB`, `FP16_TFLOPS`, `MEMORY_BW_GBPS`, `HIGH_BW_INTERCONNECT_EXISTS`

## Limitations and notes

- Scoring is single‑GPU and spec‑based; it does not model workload‑specific behavior (e.g., comms‑bound vs compute‑bound) or cluster‑level scaling.
- FP16 figures may be reported by vendors with different caveats (e.g., sparsity). Use consistent, non‑sparse figures where possible.
- The interconnect bonus is a coarse 0/1 indicator; adjust the weight or extend the data if you need gradations.
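As noted under "Updating the dataset", misnamed fields are easy to introduce by hand. A minimal validation sketch follows; it is a hypothetical helper, not shipped with the toolkit (the scorer itself tolerates missing fields, so this is purely a pre-commit sanity check):

```python
import json
import sys

REQUIRED_FIELDS = {
    "MEMORY_GB",
    "FP16_TFLOPS",
    "MEMORY_BW_GBPS",
    "HIGH_BW_INTERCONNECT_EXISTS",
}


def validate(path: str = "gpu_data.json") -> int:
    """Return the number of problems found in the dataset file."""
    with open(path) as f:
        data = json.load(f)
    problems = 0
    for name, spec in data.items():
        # Catch missing or misspelled field names per GPU entry.
        missing = REQUIRED_FIELDS - spec.keys()
        if missing:
            print(f"{name}: missing {sorted(missing)}")
            problems += 1
        # The interconnect flag is documented as strictly 0 or 1.
        flag = spec.get("HIGH_BW_INTERCONNECT_EXISTS")
        if flag not in (0, 1, None):
            print(f"{name}: HIGH_BW_INTERCONNECT_EXISTS should be 0 or 1, got {flag!r}")
            problems += 1
    return problems


if __name__ == "__main__":
    sys.exit(1 if validate() else 0)
```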