# GPU Scoring

A small, opinionated toolkit to score GPUs based on memory capacity, memory bandwidth, FP16 compute, and high‑bandwidth interconnect capability. It outputs both a human‑readable table and JSON for downstream automation.

## What this does

- Loads GPU specifications from `gpu_data.json`
- Computes a composite score per GPU on a 0–1 scale, with a configurable minimum floor so no score is exactly 0 (useful when scores are later used as multipliers)
- Prints a sorted table and a JSON array of `{ name, score }`

## Project layout

- `gpu_rankings.py`: scoring logic and CLI entry point
- `gpu_data.json`: GPU specification dataset consumed by the scorer
- `README.md`: this document

## Data schema (`gpu_data.json`)

Each top‑level key is a GPU name. Required fields per GPU:

- `MEMORY_GB` (number): Total memory capacity in GB
- `FP16_TFLOPS` (number): FP16 performance (or BF16 if that’s what the vendor exposes)
- `MEMORY_BW_GBPS` (number): Sustained memory bandwidth in GB/s
- `HIGH_BW_INTERCONNECT_EXISTS` (0 or 1): 1 if NVLink/SXM or an equivalent high‑bandwidth interconnect is supported; otherwise 0

Example:

```json
{
  "H100-80G-SXM5": {
    "MEMORY_GB": 80,
    "FP16_TFLOPS": 1979,
    "MEMORY_BW_GBPS": 3360,
    "HIGH_BW_INTERCONNECT_EXISTS": 1
  }
}
```

Notes:

- If a field is missing or identical across all GPUs, the scorer normalizes gracefully (e.g., returning 1.0 when there is no variation).
- Extra fields in the JSON are ignored by the scorer.
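
A quick way to sanity‑check the dataset before scoring (a minimal sketch; `check_gpu_data` is a hypothetical helper, not part of `gpu_rankings.py`):

```python
import json

# Required fields per the schema above.
REQUIRED_FIELDS = {
    "MEMORY_GB",
    "FP16_TFLOPS",
    "MEMORY_BW_GBPS",
    "HIGH_BW_INTERCONNECT_EXISTS",
}

def check_gpu_data(path="gpu_data.json"):
    """Report GPU entries that are missing required fields."""
    with open(path) as f:
        data = json.load(f)
    for name, specs in data.items():
        missing = REQUIRED_FIELDS - specs.keys()
        if missing:
            print(f"{name}: missing {sorted(missing)}")
    return data
```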

## Scoring method (high level)

For each GPU:

1) Normalize memory capacity to [0, 1]: `mem_score`
2) Normalize memory bandwidth to [0, 1]: `bw_score`
3) Apply a moderate multiplicative bandwidth boost to the memory component:
   `bandwidth_weighted_memory = mem_score * (1 + bandwidth_bonus_weight * bw_score)`
4) Normalize FP16 TFLOPS to [0, 1]: `compute_score`
5) Add an interconnect bonus: `interconnect_bonus = interconnect_weight * {0 or 1}`
6) Combine the components:
   `combined = memory_weight * bandwidth_weighted_memory + compute_weight * compute_score + interconnect_bonus`
7) Min–max normalize across all GPUs and apply a floor epsilon `min_floor`:
   `score = ((combined - min) / (max - min)) * (1 - min_floor) + min_floor`

Why the floor? To avoid exact zeros when scores are later used as multiplicative factors; every device remains comparable but strictly > 0.
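
A minimal pandas sketch of these steps (it mirrors the formulas above and assumes the DataFrame columns carry the schema field names; the actual code in `gpu_rankings.py` may differ in details):

```python
import pandas as pd

def _minmax(s: pd.Series) -> pd.Series:
    # No variation -> 1.0, matching the "normalize gracefully" note above.
    if s.max() == s.min():
        return pd.Series(1.0, index=s.index)
    return (s - s.min()) / (s.max() - s.min())

def score_sketch(
    df: pd.DataFrame,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
) -> pd.Series:
    mem_score = _minmax(df["MEMORY_GB"])       # step 1
    bw_score = _minmax(df["MEMORY_BW_GBPS"])   # step 2
    # Step 3: bandwidth boosts the memory component multiplicatively.
    bw_weighted_mem = mem_score * (1 + bandwidth_bonus_weight * bw_score)
    compute_score = _minmax(df["FP16_TFLOPS"])  # step 4
    # Step 5: flat bonus for a high-bandwidth interconnect (0/1 field).
    interconnect_bonus = interconnect_weight * df["HIGH_BW_INTERCONNECT_EXISTS"]
    combined = (
        memory_weight * bw_weighted_mem
        + compute_weight * compute_score
        + interconnect_bonus
    )                                           # step 6
    # Step 7: renormalize, then lift into [min_floor, 1].
    return _minmax(combined) * (1 - min_floor) + min_floor
```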

## Default weights (tunable)

Defaults used in `main()`:

- `memory_weight`: 0.6
- `compute_weight`: 0.4
- `bandwidth_bonus_weight`: 0.4 (maximum +40% boost to the memory component at the highest bandwidth)
- `interconnect_weight`: 0.1
- `min_floor`: 0.05 (final normalized scores lie in [0.05, 1])

Tuning guidance:

- Increase `bandwidth_bonus_weight` to value memory speed more
- Increase `compute_weight` when FP16 compute is more critical
- Increase `interconnect_weight` when NVLink/SXM‑class fabrics are required
- Adjust `min_floor` (e.g., 0.02–0.1) to avoid zeros while preserving rank contrast
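
For example, to compare two weightings side by side (this uses the helpers documented under "Library usage" below; the column names are illustrative):

```python
from gpu_rankings import load_gpu_data, build_df, gpu_score

df = build_df(load_gpu_data())

# Baseline weights vs. a compute-heavier profile.
df["balanced"] = gpu_score(
    df, memory_weight=0.6, compute_weight=0.4,
    bandwidth_bonus_weight=0.4, interconnect_weight=0.1, min_floor=0.05,
)
df["compute_heavy"] = gpu_score(
    df, memory_weight=0.4, compute_weight=0.6,
    bandwidth_bonus_weight=0.4, interconnect_weight=0.1, min_floor=0.05,
)
print(df[["name", "balanced", "compute_heavy"]]
      .sort_values("balanced", ascending=False))
```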

## Requirements

- Python 3.10+
- Packages: `pandas`, `numpy`

Install:

```bash
pip install pandas numpy
```

## Running

From the `gpu_scoring` directory:

```bash
python gpu_rankings.py
```

You’ll see:

- A table sorted by `score` (descending)
- A JSON array printed after the table:

```json
[
  { "name": "H100-80G-SXM5", "score": 0.995 },
  { "name": "A100-80G-SXM4", "score": 0.872 }
]
```
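
If you consume the script’s output from another process, one option is to recover the JSON array from stdout (a sketch that assumes the array is the last block printed and that no table line starts with `[`):

```python
import json
import subprocess

out = subprocess.run(
    ["python", "gpu_rankings.py"],
    capture_output=True, text=True, check=True,
).stdout

# The JSON array is printed after the table; grab it from the first
# line that opens with "[".
scores = json.loads(out[out.index("\n[") :])
print({rec["name"]: rec["score"] for rec in scores})
```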

## Customizing weights

Edit the call to `gpu_score(...)` in `main()` in `gpu_rankings.py`:

```python
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
```

## Library usage (import in your own code)

```python
from gpu_rankings import load_gpu_data, build_df, gpu_score

gpu_dict = load_gpu_data()  # or load_gpu_data("/path/to/gpu_data.json")
df = build_df(gpu_dict)
df["score"] = gpu_score(
    df,
    memory_weight=0.6,
    compute_weight=0.4,
    bandwidth_bonus_weight=0.4,
    interconnect_weight=0.1,
    min_floor=0.05,
)
records = (
    df[["name", "score"]]
    .sort_values("score", ascending=False)
    .to_dict(orient="records")
)
```
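
Because `min_floor` keeps every score strictly positive, `records` can feed multiplicative logic directly; for instance, a hypothetical capacity‑allocation pass:

```python
# Hypothetical downstream use: scores as multiplicative factors.
base_allocation = 100  # e.g., jobs, shards, or credits per device
for rec in records:
    print(rec["name"], round(base_allocation * rec["score"]))
```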

## Updating the dataset

Edit `gpu_data.json` to add or modify GPUs. Keep field names consistent:

- `MEMORY_GB`, `FP16_TFLOPS`, `MEMORY_BW_GBPS`, `HIGH_BW_INTERCONNECT_EXISTS`

## Limitations and notes

- Scoring is single‑GPU and spec‑based; it does not model workload‑specific behavior (e.g., comms‑bound vs. compute‑bound) or cluster‑level scaling.
- FP16 figures may be reported by vendors with different caveats (e.g., with sparsity). Use consistent, non‑sparse figures where possible.
- The interconnect bonus is a coarse 0/1 indicator; adjust the weight or extend the data if you need gradations.