AI background giving a sense of power grids and foundational models

GridSFM

Small foundation models for the electric grid

GridSFM is designed around four core tenets:

Topology Agnostic. A single model with shared weights processes grids of any size and shape. Buses are nodes, transmission lines are edges, and the same backbone handles a 500-bus benchmark or a 4,000-bus state-scale topology without per-grid retraining.
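As a rough illustration of the graph view described above, the sketch below turns a bus list and a branch list into contiguous node indices and a bidirectional edge list. The names `build_graph`, `bus_ids`, and `branches` are hypothetical and not taken from GridSFM; the only point is that nothing in the representation depends on the size of the grid.

```python
from typing import Dict, List, Tuple


def build_graph(bus_ids: List[int], branches: List[Tuple[int, int]]):
    """Map arbitrary bus IDs to contiguous node indices and list edges both ways.

    Hypothetical helper for illustration; GridSFM's actual input format may differ.
    """
    index: Dict[int, int] = {bus: i for i, bus in enumerate(bus_ids)}
    sources: List[int] = []
    targets: List[int] = []
    for from_bus, to_bus in branches:
        i, j = index[from_bus], index[to_bus]
        # Store each line in both directions so information can flow either way.
        sources += [i, j]
        targets += [j, i]
    return index, (sources, targets)


# A 3-bus toy grid with non-contiguous bus numbers.
index, edge_index = build_graph([101, 205, 307], [(101, 205), (205, 307)])
print(edge_index)
```

The same function works unchanged for a 500-bus or a 4,000-bus case, since only the lengths of the lists change.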

Feasibility Aware. Infeasibility is a first-class output, not a discarded label. GridSFM classifies every scenario as feasible or infeasible with a confidence score — useful for contingency screening, security assessment, and market-clearing validation. On the held-out test set, the classifier reaches 95.3% balanced accuracy (F1 = 0.945 on the feasible class).
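For readers unfamiliar with the two classifier metrics quoted above, here is a minimal sketch of how balanced accuracy and F1 follow from confusion-matrix counts. It assumes "feasible" is treated as the positive class; the function name and the counts are made up for illustration and are not GridSFM results.

```python
def balanced_accuracy_and_f1(tp: int, fp: int, tn: int, fn: int):
    """Return (balanced accuracy, F1 on the positive class) from confusion counts.

    "Positive" is assumed to mean feasible; the inputs are illustrative only.
    """
    recall_pos = tp / (tp + fn)   # share of feasible scenarios correctly flagged
    recall_neg = tn / (tn + fp)   # share of infeasible scenarios correctly flagged
    precision = tp / (tp + fp)
    balanced_acc = 0.5 * (recall_pos + recall_neg)
    f1 = 2 * precision * recall_pos / (precision + recall_pos)
    return balanced_acc, f1


bal_acc, f1 = balanced_accuracy_and_f1(tp=90, fp=5, tn=45, fn=10)
print(round(bal_acc, 3), round(f1, 3))
```

Balanced accuracy averages the per-class recalls, so it is not inflated when feasible scenarios heavily outnumber infeasible ones.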

Physics Grounded. Branch flows are not predicted directly; they’re derived analytically from predicted bus voltages and angles via the standard π-equivalent branch equations. Physics penalties (power balance, thermal, voltage) regularize training so outputs land on the AC-OPF manifold.
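The analytic step can be sketched as follows, assuming a plain π-equivalent line with series impedance r + jx and total charging susceptance split equally between the two ends. Transformer taps and phase shifters are left out, and the function name `branch_flow` and its parameters are illustrative rather than GridSFM's own.

```python
import cmath
import math


def branch_flow(v_i, theta_i, v_j, theta_j, r, x, b_charging=0.0):
    """Complex power S_ij = P_ij + jQ_ij entering a line at bus i (per unit).

    Pi-equivalent model: series admittance 1 / (r + jx), plus half of the
    total line charging susceptance b_charging at each end. Angles in radians.
    """
    y_series = 1.0 / complex(r, x)
    y_shunt = complex(0.0, b_charging / 2.0)
    volt_i = cmath.rect(v_i, theta_i)
    volt_j = cmath.rect(v_j, theta_j)
    # Current leaving bus i into the branch.
    current_ij = (y_series + y_shunt) * volt_i - y_series * volt_j
    return volt_i * current_ij.conjugate()


# Lossless line (r = 0, x = 0.1 p.u.) with a 0.1 rad angle difference.
s_ij = branch_flow(1.0, 0.1, 1.0, 0.0, r=0.0, x=0.1)
s_ji = branch_flow(1.0, 0.0, 1.0, 0.1, r=0.0, x=0.1)
print(s_ij.real, s_ji.real)
```

Because the flows are computed from the voltages rather than predicted separately, they are consistent with the branch equations by construction; on the lossless line above the active power sent from one end equals the power received at the other.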

Data Efficient. Self-supervised physics constraints supplement supervised solver labels, reducing the per-topology label budget. On a brand-new grid, as few as ~10 fine-tuning scenarios already produce reasonable cost and dispatch estimates, and ~1,000 scenarios recover full in-sample performance.

“GridSFM predicts AC-OPF solutions in milliseconds: bus voltages, generator dispatch, branch power flows, and a feasibility classification without running a solver.”

GridSFM is built around four core tenets: topology-agnostic, feasibility-aware, physics-grounded, and data-efficient.

What the model predicts

Given a grid topology, physical and operating constraints, generation characteristics, and a loading scenario, GridSFM produces a complete operating point — bus voltage magnitudes V and angles θ, generator active and reactive dispatch (Pg, Qg), branch active and reactive flows (Pij, Qij) — plus a feasibility verdict with a continuous margin.

Headline results (GridSFM-Open, 54-grid test corpus)

Metric                                       Value
Cost MAPE                                    3.35% (median 2.85%; 51/54 grids below 5%)
Voltage magnitude MAE                        0.0080 p.u.
Voltage angle MAE                            2.14°
Generator active power MAE                   0.092 p.u.
Feasibility classifier balanced accuracy     95.3%
AC-OPF warm-start speedup over cold start    1.66× geometric mean (wins on 41/54 grids)
Warm-start speedup vs. DC-OPF baseline       1.59× faster than DC warm-start alone
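The two error measures used in the table can be written out as below. This is a generic sketch with made-up numbers; how GridSFM aggregates errors across scenarios and grids is not specified here, so the simple per-sample averaging is an assumption.

```python
def mape(predicted, reference) -> float:
    """Mean absolute percentage error, returned in percent."""
    errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


def mae(predicted, reference) -> float:
    """Mean absolute error, in the units of the inputs."""
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return sum(errors) / len(errors)


cost_mape = mape([103.0, 196.0], [100.0, 200.0])   # made-up objective costs
voltage_mae = mae([1.01, 0.98], [1.00, 1.00])      # made-up p.u. magnitudes
print(cost_mape, voltage_mae)
```

MAPE suits cost because objective values differ by orders of magnitude between small and large grids, while MAE suits per-unit voltages, which already share a common scale.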

Used as a warm-start seed for the PowerModels.jl AC-OPF solver, GridSFM speeds up solves by 1.66× (geometric mean across the test grids) and captures ~61% of the theoretical headroom between a cold solve and the optimal-point ceiling. On the largest cases the speedup reaches ~4× (case1951_rte, case2868_rte), and 6–7× on a few (Texas2k summer peak, case2742_goc).
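The geometric mean is the natural average for speedup ratios, since a 2× speedup on one grid and a 2× slowdown on another cancel to exactly 1×. A minimal sketch, with hypothetical solve times rather than measured ones:

```python
import math


def geometric_mean_speedup(cold_times, warm_times):
    """Geometric mean of per-grid speedups, each defined as cold_time / warm_time."""
    log_ratios = [math.log(c / w) for c, w in zip(cold_times, warm_times)]
    return math.exp(sum(log_ratios) / len(log_ratios))


# Made-up solve times in seconds for three grids; the third grid gets slower.
speedup = geometric_mean_speedup([4.0, 9.0, 1.0], [1.0, 4.5, 2.0])
print(round(speedup, 3))
```

Here the per-grid ratios are 4, 2, and 0.5, so the geometric mean is the cube root of 4, roughly 1.59, whereas an arithmetic mean of the same ratios would report about 2.17.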

Per-grid AC-OPF speedup distribution (log-x axis): GridSFM warm-start vs. DC warm-start vs. ground-truth ceiling, with KDE plots and per-grid dots

Out-of-distribution generalization and fine-tuning

On a grid 1.4× larger than anything seen in training (case6470_rte), zero-shot cost MAPE rises to ~14%. The model has learned generalizable angle and cost structure, but voltage magnitude and the feasibility classifier need calibration to the new grid. Fine-tuning on just 1,000 scenarios from the new grid restores accuracy: cost MAPE drops to 1.12% and feasibility F1 recovers to 0.99. A held-out N-1 contingency split tracks the intact topology closely, indicating that fine-tuning on the base topology transfers cleanly to contingency variants.

Fine-tuning loss curves showing rapid adaptation to a previously unseen 6,470-bus grid using only 1,000 training scenarios.
Train/val loss curves over 10 epochs of fine-tuning on case6470_rte (1,000 train graphs)