GPU Scoring

A small, opinionated toolkit to score GPUs based on memory capacity, memory bandwidth, FP16 compute, and high-bandwidth interconnect capability. It outputs both a human-readable table and JSON for downstream automation.

What this does

  • Loads GPU specifications from gpu_data.json
  • Computes a composite score per GPU on a [0, 1] scale with a configurable minimum floor so no score is exactly 0 (useful when scores are later used as multipliers)
  • Prints a sorted table and a JSON array of { name, score }

Project layout

  • gpu_rankings.py: scoring logic and CLI entry point
  • gpu_data.json: GPU specification dataset consumed by the scorer
  • README.md: this document

Data schema (gpu_data.json)

Each top-level key is a GPU name. Required fields per GPU:

  • MEMORY_GB (number): Total memory capacity in GB
  • FP16_TFLOPS (number): FP16 throughput in TFLOPS (or BF16 if that's what the vendor exposes)
  • MEMORY_BW_GBPS (number): Sustained memory bandwidth in GB/s
  • HIGH_BW_INTERCONNECT_EXISTS (0 or 1): 1 if NVLink/SXM or an equivalent high-bandwidth interconnect is supported; otherwise 0

Example:

{
  "H100-80G-SXM5": {
    "MEMORY_GB": 80,
    "FP16_TFLOPS": 1979,
    "MEMORY_BW_GBPS": 3360,
    "HIGH_BW_INTERCONNECT_EXISTS": 1
  }
}

Notes:

  • If a field is missing or identical across all GPUs, the scorer normalizes gracefully (e.g., it returns 1.0 when there's no variation); see the sketch after this list.
  • Extra fields in JSON are ignored by the scorer.
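For concreteness, here is a minimal sketch of such a graceful min-max normalization (illustrative only; the normalize helper name is ours, and the scorer's actual code may differ):

import pandas as pd

def normalize(series: pd.Series) -> pd.Series:
    """Min-max normalize a column to [0, 1]; a column with no variation maps to 1.0."""
    lo, hi = series.min(), series.max()
    if hi == lo:
        # All GPUs share the same value: treat every device as equal
        # instead of dividing by zero.
        return pd.Series(1.0, index=series.index)
    return (series - lo) / (hi - lo)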

Scoring method (high level)

For each GPU:

  1. Normalize memory capacity to [0, 1]: mem_score
  2. Normalize memory bandwidth to [0, 1]: bw_score
  3. Apply a moderate multiplicative bandwidth boost to memory: bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)
  4. Normalize FP16 TFLOPs to [0, 1]: compute_score
  5. Add an interconnect bonus: interconnect_bonus = interconnect_weight * HIGH_BW_INTERCONNECT_EXISTS (0 or 1)
  6. Combine: combined = memory_weight * bandwidth_weighted_memory + compute_weight * compute_score + interconnect_bonus
  7. Min-max normalize across all GPUs, then rescale into [min_floor, 1]: score = ((combined - min) / (max - min)) * (1 - min_floor) + min_floor

Why the floor? To avoid exact zeros when scores are later used as multiplicative factors; every device remains comparable but strictly > 0.
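
Putting the steps together, a sketch of the whole pipeline might look as follows (assuming a pandas DataFrame with the schema columns and the normalize helper sketched above; the actual gpu_rankings.py may differ in details):

import pandas as pd

def gpu_score(
    df: pd.DataFrame,
    memory_weight: float = 0.6,
    compute_weight: float = 0.4,
    bandwidth_bonus_weight: float = 0.4,
    interconnect_weight: float = 0.1,
    min_floor: float = 0.05,
) -> pd.Series:
    # Steps 1-2: normalize memory capacity and bandwidth to [0, 1].
    mem_score = normalize(df["MEMORY_GB"])
    bw_score = normalize(df["MEMORY_BW_GBPS"])
    # Step 3: bandwidth boosts the memory component multiplicatively.
    bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)
    # Step 4: normalize FP16 compute.
    compute_score = normalize(df["FP16_TFLOPS"])
    # Step 5: flat bonus gated by the 0/1 interconnect flag.
    interconnect_bonus = interconnect_weight * df["HIGH_BW_INTERCONNECT_EXISTS"]
    # Step 6: weighted combination of the components.
    combined = (
        memory_weight * bandwidth_weighted_memory
        + compute_weight * compute_score
        + interconnect_bonus
    )
    # Step 7: min-max normalize across GPUs, then rescale into [min_floor, 1].
    return normalize(combined) * (1 - min_floor) + min_floor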

Default weights (tunable)

Defaults used in main():

  • memory_weight: 0.6
  • compute_weight: 0.4
  • bandwidth_bonus_weight: 0.4 (max +40% boost to the memory component at highest bandwidth)
  • interconnect_weight: 0.1
  • min_floor: 0.05 (final normalized scores lie in [0.05, 1])
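
As a quick sanity check of the floor arithmetic: with min_floor = 0.05, the step-7 formula maps the lowest combined value to 0.05, the highest to 1.0, and a GPU exactly halfway between them to 0.5 * (1 - 0.05) + 0.05 = 0.525.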

Tuning guidance:

  • Increase bandwidth_bonus_weight to value memory speed more
  • Increase compute_weight when FP16 compute is more critical
  • Increase interconnect_weight when NVLink/SXM-class fabrics are required
  • Adjust min_floor (e.g., 0.02–0.1) to avoid zeros while preserving rank contrast

Requirements

  • Python 3.10+
  • Packages: pandas, numpy

Install:

pip install pandas numpy

Running

From the gpu_scoring directory:

python gpu_rankings.py

You'll see:

  • A table sorted by score (descending)
  • A JSON array printed after the table:
[
  { "name": "H100-80G-SXM5", "score": 0.995 },
  { "name": "A100-80G-SXM4", "score": 0.872 }
]

Customizing weights

Edit the call to gpu_score(...) inside main() in gpu_rankings.py:

df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)

Library usage (import in your own code)

from gpu_rankings import load_gpu_data, build_df, gpu_score

gpu_dict = load_gpu_data()          # or load_gpu_data("/path/to/gpu_data.json")
df = build_df(gpu_dict)             # one row per GPU, one column per spec field
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
records = df[["name", "score"]].sort_values("score", ascending=False).to_dict(orient="records")
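
From there, records is a plain list of dicts and can be serialized however your pipeline expects, for example:

import json
print(json.dumps(records, indent=2))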

Updating the dataset

Edit gpu_data.json to add or modify GPUs. Keep field names consistent:

  • MEMORY_GB, FP16_TFLOPS, MEMORY_BW_GBPS, HIGH_BW_INTERCONNECT_EXISTS
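
Before rerunning the scorer, it can help to sanity-check the edited file. A standalone sketch (check_gpu_data is illustrative, not part of gpu_rankings.py):

import json

REQUIRED = {"MEMORY_GB", "FP16_TFLOPS", "MEMORY_BW_GBPS", "HIGH_BW_INTERCONNECT_EXISTS"}

def check_gpu_data(path="gpu_data.json"):
    # Report GPUs that are missing required fields or hold non-numeric values.
    with open(path) as f:
        data = json.load(f)
    for name, spec in data.items():
        missing = REQUIRED - spec.keys()
        if missing:
            print(f"{name}: missing {sorted(missing)}")
        for field in REQUIRED & spec.keys():
            if not isinstance(spec[field], (int, float)):
                print(f"{name}: {field} should be a number")

check_gpu_data()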

Limitations and notes

  • Scoring is single-GPU and spec-based; it does not model workload-specific behavior (e.g., comms-bound vs compute-bound) or cluster-level scaling.
  • FP16 figures may be provided by vendors with different caveats (e.g., sparsity). Use consistent, non-sparse figures where possible.
  • The interconnect bonus is a coarse indicator (0/1); adjust the weight or extend the data if you need gradations.