GPU Scoring

A small, opinionated toolkit to score GPUs based on memory capacity, memory bandwidth, FP16 compute, and high-bandwidth interconnect capability. It outputs both a human-readable table and JSON for downstream automation.

What this does

  • Loads GPU specifications from gpu_data.json
  • Computes a composite score per GPU on a [0, 1] scale with a configurable minimum floor so no score is exactly 0 (useful when scores are later used as multipliers)
  • Prints a sorted table and a JSON array of { name, score }

Project layout

  • gpu_rankings.py: scoring logic and CLI entry point
  • gpu_data.json: GPU specification dataset consumed by the scorer
  • README.md: this document

Data schema (gpu_data.json)

Each top-level key is a GPU name. Required fields per GPU:

  • MEMORY_GB (number): Total memory capacity in GB
  • FP16_TFLOPS (number): FP16 throughput in TFLOPS (or BF16 if that's what the vendor exposes)
  • MEMORY_BW_GBPS (number): Sustained memory bandwidth in GB/s
  • HIGH_BW_INTERCONNECT_EXISTS (0 or 1): 1 if NVLink/SXM or an equivalent high-bandwidth interconnect is supported; otherwise 0

Example:

{
  "H100-80G-SXM5": {
    "MEMORY_GB": 80,
    "FP16_TFLOPS": 1979,
    "MEMORY_BW_GBPS": 3360,
    "HIGH_BW_INTERCONNECT_EXISTS": 1
  }
}

Notes:

  • If a field is missing or identical across all GPUs, the scorer normalizes gracefully (e.g., it returns 1.0 when there's no variation); see the sketch after this list.
  • Extra fields in JSON are ignored by the scorer.
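For concreteness, here is a minimal sketch of such a graceful min-max normalization (illustrative only; the normalize helper name is ours, and the scorer's actual code may differ):

import pandas as pd

def normalize(series: pd.Series) -> pd.Series:
    """Min-max normalize a column to [0, 1]; a column with no variation maps to 1.0."""
    lo, hi = series.min(), series.max()
    if hi == lo:
        # All GPUs share the same value: treat every device as equal
        # instead of dividing by zero.
        return pd.Series(1.0, index=series.index)
    return (series - lo) / (hi - lo)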

Scoring method (high level)

For each GPU:

  1. Normalize memory capacity to [0, 1]: mem_score
  2. Normalize memory bandwidth to [0, 1]: bw_score
  3. Apply a moderate multiplicative bandwidth boost to memory: bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)
  4. Normalize FP16 TFLOPs to [0, 1]: compute_score
  5. Add an interconnect bonus: interconnect_bonus = interconnect_weight * HIGH_BW_INTERCONNECT_EXISTS (0 or 1)
  6. Combine: combined = memory_weight * bandwidth_weighted_memory + compute_weight * compute_score + interconnect_bonus
  7. Min-max normalize across all GPUs, then rescale into [min_floor, 1]: score = ((combined - min) / (max - min)) * (1 - min_floor) + min_floor

Why the floor? To avoid exact zeros when scores are later used as multiplicative factors; every device remains comparable but strictly > 0.
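
Putting the steps together, a sketch of the whole pipeline might look as follows (assuming a pandas DataFrame with the schema columns and the normalize helper sketched above; the actual gpu_rankings.py may differ in details):

import pandas as pd

def gpu_score(
    df: pd.DataFrame,
    memory_weight: float = 0.6,
    compute_weight: float = 0.4,
    bandwidth_bonus_weight: float = 0.4,
    interconnect_weight: float = 0.1,
    min_floor: float = 0.05,
) -> pd.Series:
    # Steps 1-2: normalize memory capacity and bandwidth to [0, 1].
    mem_score = normalize(df["MEMORY_GB"])
    bw_score = normalize(df["MEMORY_BW_GBPS"])
    # Step 3: bandwidth boosts the memory component multiplicatively.
    bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)
    # Step 4: normalize FP16 compute.
    compute_score = normalize(df["FP16_TFLOPS"])
    # Step 5: flat bonus gated by the 0/1 interconnect flag.
    interconnect_bonus = interconnect_weight * df["HIGH_BW_INTERCONNECT_EXISTS"]
    # Step 6: weighted combination of the components.
    combined = (
        memory_weight * bandwidth_weighted_memory
        + compute_weight * compute_score
        + interconnect_bonus
    )
    # Step 7: min-max normalize across GPUs, then rescale into [min_floor, 1].
    return normalize(combined) * (1 - min_floor) + min_floor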

Default weights (tunable)

Defaults used in main():

  • memory_weight: 0.6
  • compute_weight: 0.4
  • bandwidth_bonus_weight: 0.4 (max +40% boost to the memory component at highest bandwidth)
  • interconnect_weight: 0.1
  • min_floor: 0.05 (final normalized scores lie in [0.05, 1])
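
As a quick sanity check of the floor arithmetic: with min_floor = 0.05, the step-7 formula maps the lowest combined value to 0.05, the highest to 1.0, and a GPU exactly halfway between them to 0.5 * (1 - 0.05) + 0.05 = 0.525.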

Tuning guidance:

  • Increase bandwidth_bonus_weight to value memory speed more
  • Increase compute_weight when FP16 compute is more critical
  • Increase interconnect_weight when NVLink/SXM-class fabrics are required
  • Adjust min_floor (e.g., 0.02–0.1) to avoid zeros while preserving rank contrast

Requirements

  • Python 3.10+
  • Packages: pandas, numpy

Install:

pip install pandas numpy

Running

From the gpu_scoring directory:

python gpu_rankings.py

You'll see:

  • A table sorted by score (descending)
  • A JSON array printed after the table:
[
  { "name": "H100-80G-SXM5", "score": 0.995 },
  { "name": "A100-80G-SXM4", "score": 0.872 }
]

Customizing weights

Edit the call to gpu_score(...) inside main() in gpu_rankings.py:

df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)

Library usage (import in your own code)

from gpu_rankings import load_gpu_data, build_df, gpu_score

gpu_dict = load_gpu_data()          # or load_gpu_data("/path/to/gpu_data.json")
df = build_df(gpu_dict)             # one row per GPU, one column per spec field
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
records = df[["name", "score"]].sort_values("score", ascending=False).to_dict(orient="records")
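
From there, records is a plain list of dicts and can be serialized however your pipeline expects, for example:

import json
print(json.dumps(records, indent=2))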

Updating the dataset

Edit gpu_data.json to add or modify GPUs. Keep field names consistent:

  • MEMORY_GB, FP16_TFLOPS, MEMORY_BW_GBPS, HIGH_BW_INTERCONNECT_EXISTS
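
Before rerunning the scorer, it can help to sanity-check the edited file. A standalone sketch (check_gpu_data is illustrative, not part of gpu_rankings.py):

import json

REQUIRED = {"MEMORY_GB", "FP16_TFLOPS", "MEMORY_BW_GBPS", "HIGH_BW_INTERCONNECT_EXISTS"}

def check_gpu_data(path="gpu_data.json"):
    # Report GPUs that are missing required fields or hold non-numeric values.
    with open(path) as f:
        data = json.load(f)
    for name, spec in data.items():
        missing = REQUIRED - spec.keys()
        if missing:
            print(f"{name}: missing {sorted(missing)}")
        for field in REQUIRED & spec.keys():
            if not isinstance(spec[field], (int, float)):
                print(f"{name}: {field} should be a number")

check_gpu_data()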

Limitations and notes

  • Scoring is single-GPU and spec-based; it does not model workload-specific behavior (e.g., comms-bound vs compute-bound) or cluster-level scaling.
  • FP16 figures may be provided by vendors with different caveats (e.g., sparsity). Use consistent, non-sparse figures where possible.
  • The interconnect bonus is a coarse indicator (0/1); adjust the weight or extend the data if you need gradations.