sankey.py

Sankey diagram generation for energy system visualization.

This module provides functions to create comprehensive Sankey diagrams from PyPSA network data, showing energy flows including supply/demand balances, transmission losses, trade statistics, and regional energy exchanges. The diagrams are exported as interactive Plotly visualizations with accompanying CSV data files.

The main entry point is view_sankey, which processes PyPSA networks and generates Sankey diagrams aggregated by year, component, location, and carrier.

collect_imbalances(supply, demand)

Collect energy imbalances from link connections.

Processes supply and demand imbalances for multi-carrier links by proportionally distributing supply based on demand shares and calculating losses for each bus carrier type.

Parameters:

Name | Type | Description | Default
supply | pandas.Series | Link supply statistics across all bus carriers. | required
demand | pandas.Series | Link demand statistics across all bus carriers. | required

Returns:

Type | Description
pandas.DataFrame | Concatenated series of imbalances including losses and carrier-specific supply flows.

Source code in evals/views/sankey.py
def collect_imbalances(supply: pd.Series, demand: pd.Series) -> pd.DataFrame:
    """
    Collect energy imbalances from link connections.

    Processes supply and demand imbalances for multi-carrier links by
    proportionally distributing supply based on demand shares and
    calculating losses for each bus carrier type.

    Parameters
    ----------
    supply
        Link supply statistics across all bus carriers.
    demand
        Link demand statistics across all bus carriers.

    Returns
    -------
    :
        Concatenated series of imbalances including losses and
        carrier-specific supply flows.
    """
    bc_in = demand.index.unique("bus_carrier")

    if len(bc_in) > 1:
        to_concat = []
        for _bc in bc_in:
            demand_bc = filter_by(demand, bus_carrier=_bc)
            demand_share = demand_bc.sum() / demand.sum()
            supply_bc = supply * demand_share
            to_concat.append(_process_single_input_link(supply_bc, demand_bc, _bc))
            mapper = {
                v: f"{v} from {_bc}" for v in supply_bc.index.unique("bus_carrier")
            }
            to_concat.append(rename_aggregate(supply_bc, mapper, level="bus_carrier"))
        return pd.concat(to_concat)

    return _process_single_input_link(supply, demand, bc_in.item())
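
The proportional split described above can be illustrated on toy data. This is a minimal sketch, not the module's implementation: the index levels, the CHP carrier, the numbers, and the loss formula (demand plus supply, with demand stored as negative values) are assumptions made for illustration, since `_process_single_input_link` is not shown here.

```python
import pandas as pd

names = ["component", "carrier", "bus_carrier"]

# Hypothetical link carrier with two input bus carriers (demand, negative)
# and two output bus carriers (supply, positive).
demand = pd.Series(
    [-60.0, -40.0],
    index=pd.MultiIndex.from_tuples(
        [("Link", "CHP", "gas"), ("Link", "CHP", "biomass")], names=names
    ),
)
supply = pd.Series(
    [30.0, 50.0],
    index=pd.MultiIndex.from_tuples(
        [("Link", "CHP", "AC"), ("Link", "CHP", "heat")], names=names
    ),
)

rows = {}
for bc in demand.index.unique("bus_carrier"):
    demand_bc = demand.xs(bc, level="bus_carrier")
    # Share of the total input that this bus carrier contributes.
    share = demand_bc.sum() / demand.sum()
    # Attribute the same share of every output to this input.
    supply_bc = supply * share
    # Assumed loss definition: input (negative) plus attributed output.
    rows[bc] = {
        "share": share,
        "supplied": supply_bc.sum(),
        "loss": demand_bc.sum() + supply_bc.sum(),
    }

summary = pd.DataFrame(rows).T
print(summary)
```

With these numbers, gas carries 60 % of the input and is therefore credited with 60 % of each output; the per-input losses add up to the total difference between input and output.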

get_demand(nc, transmission_comps, transmission_carrier, unit)

Extract and process demand statistics from PyPSA networks.

Collects withdrawal statistics excluding transmission components, includes pipeline compression loads, and processes storage demands.

Parameters:

Name | Type | Description | Default
nc | pypsa.NetworkCollection | Dictionary of PyPSA network objects. | required
transmission_comps | list | List of transmission components to exclude from analysis. | required
transmission_carrier | list | List of transmission carriers to exclude from analysis. | required
unit | str | Unit string for the returned series attributes. | required

Returns:

Type | Description
pandas.DataFrame | Demand statistics series including compression loads with specified unit attribute.

Source code in evals/views/sankey.py
def get_demand(
    nc: NetworkCollection,
    transmission_comps: list,
    transmission_carrier: list,
    unit: str,
) -> pd.DataFrame:
    """
    Extract and process demand statistics from PyPSA networks.

    Collects withdrawal statistics excluding transmission components,
    includes pipeline compression loads, and processes storage demands.

    Parameters
    ----------
    nc
        Dictionary of PyPSA network objects.
    transmission_comps
        List of transmission components to exclude from analysis.
    transmission_carrier
        List of transmission carriers to exclude from analysis.
    unit
        Unit string for the returned series attributes.

    Returns
    -------
    :
        Demand statistics series including compression loads with
        specified unit attribute.
    """
    withdrawal = collect_myopic_statistics(
        nc,
        statistic="withdrawal",
        aggregate_components=None,
    )
    compressing = (
        withdrawal.to_frame()
        .query("carrier.str.contains('pipeline') and bus_carrier == 'AC'")
        .squeeze()
    )
    demand = (
        filter_by(
            withdrawal,
            component=transmission_comps,
            carrier=transmission_carrier,
            exclude=True,
        )
        .pipe(
            drop_from_multtindex_by_regex, "co2|process emissions", level="bus_carrier"
        )
        .pipe(
            rename_aggregate,
            {
                "hydro": "hydro demand",
                "PHS": "PHS demand",
                "H2 Store": "H2 Store demand",
                "gas": "gas Store demand",
            },
        )
        .mul(-1)
    )

    result = pd.concat([demand, compressing])
    result.attrs["unit"] = unit

    return result
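
The selection logic in get_demand can be tried on a small, made-up withdrawal series. The sketch below is a simplification under stated assumptions: the index levels and values are invented, a boolean mask stands in for the `query` call, and excluding every pipeline row stands in for the `filter_by(..., exclude=True)` step.

```python
import numpy as np
import pandas as pd

names = ["component", "carrier", "bus_carrier"]
# Hypothetical withdrawal statistics (positive values).
withdrawal = pd.Series(
    [100.0, 5.0, 70.0, 20.0],
    index=pd.MultiIndex.from_tuples(
        [
            ("Load", "electricity", "AC"),
            ("Link", "gas pipeline", "AC"),   # electricity used for compression
            ("Link", "gas pipeline", "gas"),  # transported gas (transmission)
            ("Link", "H2 Electrolysis", "AC"),
        ],
        names=names,
    ),
)

carrier = withdrawal.index.get_level_values("carrier")
bus_carrier = withdrawal.index.get_level_values("bus_carrier")

is_pipeline = np.asarray(carrier.str.contains("pipeline", regex=False), dtype=bool)
is_ac = np.asarray(bus_carrier == "AC", dtype=bool)

# Compression loads: pipeline carriers withdrawing from AC buses.
compressing = withdrawal[is_pipeline & is_ac]

# Simplified stand-in for the transmission exclusion, followed by the sign flip.
demand = withdrawal[~is_pipeline].mul(-1)

result = pd.concat([demand, compressing])
result.attrs["unit"] = "MWh"
print(result)
```

Unlike `DataFrame.squeeze()`, the boolean mask always returns a Series, even when only a single row matches. Note that in this sketch, as in the listing above, the compression rows keep their original sign while the remaining demand is negated.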

get_link_losses(supply, demand)

Calculate losses from Link components.

Processes each carrier type in Link components to identify conversion losses and energy imbalances between supply and demand sides.

Parameters:

Name | Type | Description | Default
supply | pandas.Series | Supply statistics including Link components. | required
demand | pandas.DataFrame | Demand statistics including Link components. | required

Returns:

Type | Description
list[pandas.Series] | List of imbalance series for each Link carrier type. Empty carriers are skipped with a logged warning.

Source code in evals/views/sankey.py
def get_link_losses(supply: pd.Series, demand: pd.DataFrame) -> list[pd.Series]:
    """
    Calculate losses from Link components.

    Processes each carrier type in Link components to identify conversion
    losses and energy imbalances between supply and demand sides.

    Parameters
    ----------
    supply
        Supply statistics including Link components.
    demand
        Demand statistics including Link components.

    Returns
    -------
    :
        List of imbalance series for each Link carrier type.
        Empty carriers are skipped with a logged warning.
    """
    link_losses = []
    link_supply_carrier = filter_by(supply, component="Link").index.unique("carrier")
    link_demand_carrier = filter_by(demand, component="Link").index.unique("carrier")
    link_carrier = link_supply_carrier.union(link_demand_carrier)
    for carrier in link_carrier:
        link_supply = filter_by(supply, carrier=carrier, component="Link")
        link_demand = filter_by(demand, carrier=carrier, component="Link")
        if link_supply.empty or link_demand.empty:
            logger.warning(
                f"Skipping carrier '{carrier}' due to empty supply or demand."
            )
            continue
        link_losses.append(collect_imbalances(link_supply, link_demand))

    return link_losses
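
The carrier loop and the skip rule can be reproduced in isolation. In this sketch, `select` is a hypothetical stand-in for `filter_by`, the data are invented, and a plain sum of supply and (negative) demand stands in for `collect_imbalances`.

```python
import logging

import numpy as np
import pandas as pd

logger = logging.getLogger(__name__)
names = ["component", "carrier", "bus_carrier"]


def make(rows):
    """Build a toy statistics series from (component, carrier, bus_carrier, value)."""
    index = pd.MultiIndex.from_tuples([r[:3] for r in rows], names=names)
    return pd.Series([r[3] for r in rows], index=index)


def select(series, **levels):
    """Hypothetical stand-in for filter_by: keep rows matching every level value."""
    mask = np.ones(len(series), dtype=bool)
    for level, value in levels.items():
        mask &= np.asarray(series.index.get_level_values(level) == value, dtype=bool)
    return series[mask]


supply = make([
    ("Link", "OCGT", "AC", 40.0),
    ("Link", "DAC", "co2 stored", 3.0),  # supply side only
    ("Generator", "solar", "AC", 25.0),
])
demand = make([
    ("Link", "OCGT", "gas", -100.0),
    ("Load", "electricity", "AC", -65.0),
])

link_supply = select(supply, component="Link")
link_demand = select(demand, component="Link")
carriers = link_supply.index.unique("carrier").union(
    link_demand.index.unique("carrier")
)

balances, skipped = {}, []
for carrier in carriers:
    s = select(link_supply, carrier=carrier)
    d = select(link_demand, carrier=carrier)
    if s.empty or d.empty:
        logger.warning("Skipping carrier %r due to empty supply or demand.", carrier)
        skipped.append(carrier)
        continue
    # Simplified imbalance: output plus (negative) input.
    balances[carrier] = s.sum() + d.sum()

print(balances, skipped)
```

Taking the union of supply-side and demand-side carriers means that one-sided carriers, such as the DAC entry here, are visited and reported rather than silently ignored.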

get_supply(nc, transmission_comps, transmission_carrier)

Extract and process supply statistics from PyPSA networks.

Collects supply statistics excluding transmission components, filters out CO2 and process emissions, and renames storage components for clarity.

Parameters:

Name | Type | Description | Default
nc | pypsa.NetworkCollection | Dictionary of PyPSA network objects. | required
transmission_comps | list | List of transmission components to exclude from analysis. | required
transmission_carrier | list | List of transmission carriers to exclude from analysis. | required

Returns:

Type | Description
pandas.Series | Supply statistics series with unit attribute set to MWh.

Source code in evals/views/sankey.py
def get_supply(
    nc: NetworkCollection, transmission_comps: list, transmission_carrier: list
) -> pd.Series:
    """
    Extract and process supply statistics from PyPSA networks.

    Collects supply statistics excluding transmission components, filters out
    CO2 and process emissions, and renames storage components for clarity.

    Parameters
    ----------
    nc
        Dictionary of PyPSA network objects.
    transmission_comps
        List of transmission components to exclude from analysis.
    transmission_carrier
        List of transmission carriers to exclude from analysis.

    Returns
    -------
    :
        Supply statistics series with unit attribute set to MWh.
    """
    supply = (
        collect_myopic_statistics(
            nc,
            statistic="supply",
            aggregate_components=None,
        )
        .pipe(
            filter_by,
            component=transmission_comps,
            carrier=transmission_carrier,
            exclude=True,
        )
        .pipe(
            drop_from_multtindex_by_regex, "co2|process emissions", level="bus_carrier"
        )
        .pipe(
            rename_aggregate,
            {
                "hydro": "hydro supply",
                "PHS": "PHS supply",
                "H2 Store": "H2 Store supply",
                "gas": "gas Store supply",
            },
        )
    )
    supply.attrs["unit"] = "MWh"

    return supply
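
The filtering, renaming, and unit tagging steps can be approximated with plain pandas calls. This is a rough sketch under assumptions: the two-level index and the values are invented, and the mask and `rename` calls only imitate what `drop_from_multtindex_by_regex` and `rename_aggregate` are described as doing (the real helpers may also aggregate duplicate labels).

```python
import numpy as np
import pandas as pd

names = ["carrier", "bus_carrier"]
# Hypothetical supply statistics.
supply = pd.Series(
    [25.0, 8.0, 2.0, 12.0],
    index=pd.MultiIndex.from_tuples(
        [
            ("solar", "AC"),
            ("DAC", "co2 stored"),
            ("cement", "process emissions"),
            ("hydro", "AC"),
        ],
        names=names,
    ),
)

# Drop rows whose bus carrier matches the regular expression.
bus_carrier = supply.index.get_level_values("bus_carrier")
drop = np.asarray(bus_carrier.str.contains("co2|process emissions"), dtype=bool)
kept = supply[~drop]

# Relabel storage carriers so supply and demand sides stay distinguishable.
kept = kept.rename(index={"hydro": "hydro supply"}, level="carrier")

# Attach the unit as metadata on the series.
kept.attrs["unit"] = "MWh"
print(kept, kept.attrs)
```

pandas documents `attrs` as experimental, so code that relies on the unit surviving later operations should check it explicitly or set it again after combining series.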

get_trade_statistics(nc, transmission_comps, transmission_carrier, unit)

Extract energy trade statistics for foreign and domestic exchanges.

Collects import/export statistics for both foreign and domestic trade, filtering out CO2 emissions and applying appropriate grouping labels.

Parameters:

Name | Type | Description | Default
nc | pypsa.NetworkCollection | Dictionary of PyPSA network objects. | required
transmission_comps | list | List of transmission components to include in trade analysis. | required
transmission_carrier | list | List of transmission carriers to include in trade analysis. | required
unit | str | Unit string for the returned series attributes. | required

Returns:

Type | Description
list[pandas.Series] | List of trade statistics series for foreign imports/exports and domestic imports/exports.

Source code in evals/views/sankey.py
def get_trade_statistics(
    nc: NetworkCollection,
    transmission_comps: list,
    transmission_carrier: list,
    unit: str,
) -> list[pd.Series]:
    """
    Extract energy trade statistics for foreign and domestic exchanges.

    Collects import/export statistics for both foreign and domestic trade,
    filtering out CO2 emissions and applying appropriate grouping labels.

    Parameters
    ----------
    nc
        Dictionary of PyPSA network objects.
    transmission_comps
        List of transmission components to include in trade analysis.
    transmission_carrier
        List of transmission carriers to include in trade analysis.
    unit
        Unit string for the returned series attributes.

    Returns
    -------
    :
        List of trade statistics series for foreign imports/exports
        and domestic imports/exports.
    """
    trade_statistics = []
    for scope, direction, alias in [
        (TradeTypes.FOREIGN, "import", Group.import_foreign),
        (TradeTypes.FOREIGN, "export", Group.export_foreign),
        (TradeTypes.DOMESTIC, "import", Group.import_domestic),
        (TradeTypes.DOMESTIC, "export", Group.export_domestic),
    ]:
        trade = (
            collect_myopic_statistics(
                nc,
                statistic="trade_energy",
                scope=scope,
                direction=direction,
                aggregate_components=None,
            )
            # the trade statistic finds transmission between EU -> country buses.
            # Those are dropped by the filter_by statement.
            .pipe(
                filter_by,
                component=transmission_comps,
                carrier=transmission_carrier,
            )
            .pipe(drop_from_multtindex_by_regex, "co2", level="bus_carrier")
            .pipe(rename_aggregate, alias)
        )
        trade.attrs["unit"] = unit
        trade_statistics.append(trade)

    return trade_statistics

net_distribution_grid_losses(supply, demand)

Calculate net electricity distribution grid losses.

Computes grid losses from electricity distribution by summing supply and demand for distribution grid carriers, then removes these carriers from the input series to avoid double counting.

Parameters:

Name | Type | Description | Default
supply | pandas.Series | Supply statistics series (modified in-place). | required
demand | pandas.DataFrame | Demand statistics series (modified in-place). | required

Returns:

Type | Description
pandas.Series | Grid losses series with 'losses' bus carrier label.

Notes

This function modifies the input supply and demand series by removing 'electricity distribution grid' carrier entries.

Source code in evals/views/sankey.py
def net_distribution_grid_losses(supply: pd.Series, demand: pd.DataFrame) -> pd.Series:
    """
    Calculate net electricity distribution grid losses.

    Computes grid losses from electricity distribution by summing supply
    and demand for distribution grid carriers, then removes these carriers
    from the input series to avoid double counting.

    Parameters
    ----------
    supply
        Supply statistics series (modified in-place).
    demand
        Demand statistics series (modified in-place).

    Returns
    -------
    :
        Grid losses series with 'losses' bus carrier label.

    Notes
    -----
    This function modifies the input supply and demand series by removing
    'electricity distribution grid' carrier entries.
    """
    grid_losses = (
        filter_by(supply, carrier="electricity distribution grid")
        .groupby(IDX)
        .sum()
        .add(
            filter_by(demand, carrier="electricity distribution grid")
            .groupby(IDX)
            .sum()
        )
    )
    grid_losses = insert_index_level(grid_losses, "losses", "bus_carrier", pos=4)
    supply.drop("electricity distribution grid", level="carrier", inplace=True)
    demand.drop("electricity distribution grid", level="carrier", inplace=True)

    return grid_losses
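
Because the function mutates its arguments, the effect on the caller is worth seeing on a small example. The sketch below assumes a reduced index (location, carrier, bus_carrier), invented values, and a local `grid_losses` helper that mirrors the sum-then-drop pattern; it omits the `insert_index_level` step.

```python
import pandas as pd

names = ["location", "carrier", "bus_carrier"]
GRID = "electricity distribution grid"

# Hypothetical statistics: the grid delivers 95 and withdraws 100 (negative).
supply = pd.Series(
    [95.0, 30.0],
    index=pd.MultiIndex.from_tuples(
        [("AT", GRID, "low voltage"), ("AT", "solar", "AC")], names=names
    ),
)
demand = pd.Series(
    [-100.0, -25.0],
    index=pd.MultiIndex.from_tuples(
        [("AT", GRID, "AC"), ("AT", "electricity", "low voltage")], names=names
    ),
)


def grid_losses(supply, demand):
    """Net the grid carrier's supply and demand, then drop it from both inputs."""
    by = ["location", "carrier"]
    losses = (
        supply.xs(GRID, level="carrier", drop_level=False)
        .groupby(level=by)
        .sum()
        .add(
            demand.xs(GRID, level="carrier", drop_level=False)
            .groupby(level=by)
            .sum()
        )
    )
    # In-place removal: the caller's objects lose these rows as well.
    supply.drop(GRID, level="carrier", inplace=True)
    demand.drop(GRID, level="carrier", inplace=True)
    return losses


losses = grid_losses(supply, demand)
print(losses)
print(len(supply), len(demand))
```

A caller that still needs the unmodified statistics afterwards can pass `supply.copy()` and `demand.copy()` instead.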

view_sankey(result_path, nc, config)

Generate Sankey diagrams for energy flow visualization.

Creates comprehensive Sankey diagrams showing energy flows, including supply/demand balances, transmission losses, trade statistics, and regional energy exchanges. The function processes PyPSA network data to extract energy flows and exports them as interactive visualizations.

Parameters:

Name | Type | Description | Default
result_path | str | pathlib.Path | Path where the generated Sankey diagrams and data will be saved. | required
nc | pypsa.NetworkCollection | Dictionary of PyPSA network objects containing the energy system data to be analyzed and visualized. | required
config | dict | Configuration dictionary containing view settings, chart specifications, transmission components to exclude, and export parameters. | required

Returns:

Type | Description
None | Exports Sankey diagrams and underlying energy flow data to the specified result path. Generated files include interactive Plotly visualizations and CSV data files containing the processed statistics.

Notes

The function processes several types of energy flows:

- Supply and demand statistics (excluding transmission components)
- Grid losses from electricity distribution
- Trade statistics (foreign and domestic imports/exports)
- Link losses and conversion inefficiencies
- Regional trade balances for specific carriers (oil, coal, lignite, NH3)

All statistics are aggregated by year, component, location, and carrier.

Source code in evals/views/sankey.py
def view_sankey(
    result_path: str | Path,
    nc: NetworkCollection,
    config: dict,
) -> None:
    """
    Generate Sankey diagrams for energy flow visualization.

    Creates comprehensive Sankey diagrams showing energy flows, including supply/demand
    balances, transmission losses, trade statistics, and regional energy exchanges.
    The function processes PyPSA network data to extract energy flows and exports
    them as interactive visualizations.

    Parameters
    ----------
    result_path
        Path where the generated Sankey diagrams and data will be saved.
    nc
        Dictionary of PyPSA network objects containing the energy system data
        to be analyzed and visualized.
    config
        Configuration dictionary containing view settings, chart specifications,
        transmission components to exclude, and export parameters.

    Returns
    -------
    :
        Exports Sankey diagrams and underlying energy flow data to the specified
        result path. Generated files include interactive Plotly visualizations
        and CSV data files containing the processed statistics.

    Notes
    -----
    The function processes several types of energy flows:
    - Supply and demand statistics (excluding transmission components)
    - Grid losses from electricity distribution
    - Trade statistics (foreign and domestic imports/exports)
    - Link losses and conversion inefficiencies
    - Regional trade balances for specific carriers (oil, coal, lignite, NH3)

    All statistics are aggregated by year, component, location, and carrier.
    """
    (
        _,
        transmission_comps,
        transmission_carrier,
        _,
        _,
    ) = _parse_view_config_items(nc, config)

    supply = get_supply(nc, transmission_comps, transmission_carrier)
    demand = get_demand(
        nc, transmission_comps, transmission_carrier, unit=supply.attrs["unit"]
    )

    grid_losses = net_distribution_grid_losses(supply, demand)
    trade_statistics = get_trade_statistics(
        nc, transmission_comps, transmission_carrier, unit=supply.attrs["unit"]
    )
    link_losses = get_link_losses(supply, demand)

    regional_trade = [
        regionalize_statistics(supply, demand, bus_carrier)
        for bus_carrier in BusCarrier.eu_buses()
    ]

    exporter = Exporter(
        statistics=[
            supply,
            demand,
            grid_losses,
        ]
        + trade_statistics
        + regional_trade
        + link_losses,
        view_config=config["view"],
    )

    exporter.defaults.xaxis_title = ""
    exporter.defaults.plotby = [DM.YEAR, DM.LOCATION]
    exporter.defaults.pivot_index = [
        DM.COMPONENT,
        DM.YEAR,
        DM.LOCATION,
        DM.CARRIER,
        DM.BUS_CARRIER,
    ]
    exporter.export(result_path, config["global"]["subdir"])
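
For readers unfamiliar with how flow statistics become a Sankey diagram, the sketch below shows the general node and link encoding that Plotly's Sankey trace expects (node labels plus integer source and target positions). It does not describe what Exporter does internally; the flow records and their names are invented for illustration.

```python
# Hypothetical flow records: (source, target, energy in MWh).
flows = [
    ("solar", "AC", 25.0),
    ("gas", "OCGT", 100.0),
    ("OCGT", "AC", 40.0),
    ("OCGT", "losses", 60.0),
    ("AC", "electricity demand", 65.0),
]

# Collect node labels in order of first appearance.
labels = []
for source, target, _ in flows:
    for name in (source, target):
        if name not in labels:
            labels.append(name)

# Sankey links refer to nodes by their position in the label list.
position = {name: i for i, name in enumerate(labels)}
link = {
    "source": [position[s] for s, _, _ in flows],
    "target": [position[t] for _, t, _ in flows],
    "value": [v for _, _, v in flows],
}

# Sanity check: a conversion node should pass on everything it receives,
# with conversion losses represented as an explicit outgoing flow.
inflow = sum(v for _, t, v in flows if t == "OCGT")
outflow = sum(v for s, _, v in flows if s == "OCGT")
print(labels, link, inflow == outflow)
```

The `labels` list and `link` dictionary have the shape accepted by `plotly.graph_objects.Sankey(node=dict(label=labels), link=link)`. The balance check illustrates why the loss series computed earlier matter: without them, conversion nodes would show more inflow than outflow.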