Hypergraph Partitioners (for DQC): Installation and Usage

June 2026

This guide covers six hypergraph partitioners that you can use as baselines or state-of-the-art references when benchmarking your own partitioning algorithms. All of them solve the same problem: partition the vertices of a hypergraph into k balanced blocks while minimising the number of cut hyperedges. The tutorial was written with Distributed Quantum Computing researchers in mind (you shall see references to "circuits" as QASM), but can be used as a guideline beyond this community.

Contents

Problem statement and common interface
FM — Fiduccia-Mattheyses local search
Greedy — constructive greedy
Stochastic Greedy
Evolutionary Algorithm
KaHyPar — multilevel partitioner
Zoltan PHG — parallel multilevel partitioner
Quick comparison
Bonus: QASM to HDH

1. Problem statement and common interface

A hypergraph H = (V, E) consists of a vertex set V and a set of hyperedges E, where each hyperedge is a subset of V of arbitrary size (unlike a graph, where edges always connect exactly two vertices).

The k-way minimum cut problem asks: partition V into k non-overlapping blocks V₁, …, V_k such that

Balance: each block has at most ⌈|V|/k × (1 + ε)⌉ vertices, and
Objective: the number of cut hyperedges — those whose vertices span more than one block — is minimised.

Common Python interface (HDH library)

About the HDH library: The FM, Greedy, Stochastic Greedy, and Evolutionary partitioners in this guide are part of HDH (Hybrid Dependency Hypergraph), a Python library developed to provide a unified hypergraph-based intermediate representation for quantum programs, along with tools for workflow analysis and partitioning. It will be used as the consistent hypergraph abstraction to evaluate all circuits through this tutorial. Full documentation is available at grageragarces.github.io/HDH.

FM, Greedy, Stochastic Greedy, and the Evolutionary Algorithm are available through the HDH Python library. Install it with:

pip install hdh

All four share the same input representation:

n_vertices  = 8                          # integer — total number of vertices
hyperedges  = [[0,1,2], [1,3,4], [4,5,6,7]]  # list of lists of vertex indices
k           = 2                          # number of output blocks
epsilon     = 0.05                       # allowed imbalance (5 %)

The returned value is the cut count: an integer equal to the number of hyperedges that span more than one block in the computed partition. Where the partition assignment itself is also needed, each function additionally returns a list assignment of length n_vertices, where assignment[v] is the block index (0-based) of vertex v.

2. FM — Fiduccia-Mattheyses local search

Context: Introduced by Fiduccia and Mattheyses in 1982, FM is the cornerstone local-search heuristic for graph and hypergraph partitioning. Each pass sweeps all vertices in decreasing order of gain (the reduction in cut if the vertex moves to another block), "locking" each vertex after it moves. The best balance-feasible prefix of each sweep is kept as the output. The algorithm is fast (O(|V| + |E|) per pass) and effective as a refinement step, but its solution quality depends heavily on the starting partition. In the multilevel pipeline (KaHyPar, Zoltan) it is used as a local refinement kernel.

Installation

Included in HDH:

pip install hdh

Running

from hdh.partitioners import fm_partition

cut, assignment = fm_partition(
    n_vertices,
    hyperedges,
    k,
    epsilon,
    max_passes=25,   # maximum FM sweep iterations
)

Output

cut         # int: number of cut hyperedges
assignment  # list[int] of length n_vertices: block index per vertex

Tip: FM is sensitive to the initial partition. Running it from multiple random starts and keeping the best result is a simple but effective strategy.

3. Greedy — constructive greedy

Context: The greedy partitioner is the simplest constructive baseline. It processes hyperedges in order and assigns each unassigned vertex to the block that already contains the most of that hyperedge's vertices (breaking ties at random). Balance is enforced by capping block sizes. Quality is typically the weakest of the six methods here, but it runs in O(|V| + |E|) time and requires no configuration, making it a reliable sanity-check lower-bound for solution quality.

Installation

pip install hdh

Running

from hdh.partitioners import greedy_partition

cut, assignment = greedy_partition(
    n_vertices,
    hyperedges,
    k,
    epsilon,
)

Output

cut         # int
assignment  # list[int]

4. Stochastic Greedy

Context: Stochastic Greedy (StocG) extends the deterministic greedy with a time-budget loop: it repeatedly runs a greedy pass with a randomised vertex/edge ordering, applies a local-improvement step, and keeps the best-seen solution until a wall-clock deadline. The result monotonically improves with more time, which makes it straightforward to trade speed for quality. It is the default strategy in the DQC benchmarking repository and tends to outperform plain greedy and FM at equal wall-clock time for typical circuit sizes.

Installation

pip install hdh

Running

from hdh.partitioners import stochastic_greedy_partition

cut, assignment = stochastic_greedy_partition(
    n_vertices,
    hyperedges,
    k,
    epsilon=0.05,
    time_cap=30,   # seconds of wall-clock budget
    seed=42,
)

Output

cut         # int: best cut found within time_cap
assignment  # list[int]

Reproducibility: Set seed explicitly when comparing methods across runs.

5. Evolutionary Algorithm

Context: The Evolutionary Algorithm (EA) maintains a population of partitions and evolves it through tournament selection, uniform partition crossover, random mutation, and balance repair. Elite solutions are preserved across generations, and the run terminates early when the best cut stops improving for patience consecutive generations. EA typically produces the best solution quality of the four HDH methods at the cost of longer runtime, making it suitable as a reference when a time budget of minutes is acceptable.

Installation

pip install hdh

Running

from hdh.partitioners import ea_partition

cut, assignment = ea_partition(
    n_vertices,
    hyperedges,
    k,
    epsilon,
    pop_size=40,
    max_generations=80,
    patience=20,      # stop if no improvement for this many generations
    elite_size=2,
    tournament_k=3,
    mut_rate=0.04,
    seed=42,
)

Output

cut         # int: cut of the best individual in the final population
assignment  # list[int]

Note: patience is the most impactful parameter for controlling runtime. Reduce it for a faster but potentially lower-quality run.

6. KaHyPar — multilevel hypergraph partitioner

Context: KaHyPar (Karlsruhe Hypergraph Partitioning) is the current state of the art for sequential hypergraph partitioning. It uses a multilevel strategy: coarsen the hypergraph by contracting clusters of similar vertices, compute an initial partition on the coarsened graph, then uncoarsen step-by-step while refining with FM at each level. Community detection during coarsening and a portfolio of cut-objective variants (k-KaHyPar, r-KaHyPar) make it very competitive on both small and large instances. KaHyPar is the recommended baseline when the best known solution quality is needed.

Installation

KaHyPar must be built from source. See the dedicated guide for step-by-step instructions (tested on Apple M4 Pro, Python 3.12):

Building and Installing KaHyPar Python Bindings →

Running

import kahypar as kh

# Build the hypergraph object.
# edge_weights and vertex_weights can be lists of ones if unweighted.
edge_weights   = [1] * len(hyperedges)
vertex_weights = [1] * n_vertices
# Flatten hyperedges and build the index array.
edge_indices = [0]
edge_vector  = []
for he in hyperedges:
    edge_vector.extend(he)
    edge_indices.append(len(edge_vector))

hg = kh.Hypergraph(
    n_vertices,
    len(hyperedges),
    edge_indices,
    edge_vector,
    k,
    edge_weights,
    vertex_weights,
)

ctx = kh.Context()
ctx.loadINIconfiguration("cut_kKaHyPar_sea20.ini")  # see config/ in source tree
ctx.setK(k)
ctx.setEpsilon(epsilon)
ctx.setSeed(42)

kh.partition(hg, ctx)

cut        = kh.cut(hg)
assignment = [hg.blockID(v) for v in range(n_vertices)]

Configuration files

KaHyPar requires an .ini configuration file that controls coarsening, initial partitioning, and refinement parameters. The canonical files ship with the source:

kahypar/config/
  cut_kKaHyPar_sea20.ini   ← recommended for cut objective, k-way
  cut_rKaHyPar.ini         ← recursive bisection variant

Output

cut        # int — kh.cut(hg) after kh.partition(hg, ctx)
assignment # list[int] — [hg.blockID(v) for v in range(n_vertices)]

7. Zoltan PHG — parallel multilevel partitioner

Context: Zoltan is a parallel combinatorial scientific computing toolkit developed at Sandia National Laboratories. Its PHG (Parallel HyperGraph) component implements a parallel multilevel hypergraph partitioner based on the same coarsen/partition/refine strategy as KaHyPar, but designed for distributed-memory MPI runs. For sequential use on a laptop it typically produces slightly worse cut quality than KaHyPar but is the reference tool for scalability experiments on clusters.

Installation

Option A — Homebrew (macOS)

brew install trilinos    # Zoltan ships as part of Trilinos

Trilinos installs headers and libraries under $(brew --prefix)/include/trilinos and $(brew --prefix)/lib.

Option B — Build from source

git clone --depth=1 https://github.com/trilinos/Trilinos.git
cd Trilinos
mkdir -p build && cd build
cmake .. \
  -DTrilinos_ENABLE_Zoltan=ON \
  -DTrilinos_ENABLE_ALL_PACKAGES=OFF \
  -DBUILD_SHARED_LIBS=ON \
  -DCMAKE_BUILD_TYPE=Release
make -j$(sysctl -n hw.physicalcpu)
sudo make install

Compile the C wrapper

The repository ships a thin C driver (zoltan_phg_partition.c) that reads the hypergraph from stdin, calls Zoltan PHG, and prints the cut count to stdout. Compile it once:

ZOLTAN_PREFIX=$(brew --prefix trilinos)   # or your install prefix
mpicc -O2 -o zoltan_phg_partition zoltan_phg_partition.c \
  -I"$ZOLTAN_PREFIX/include" \
  -L"$ZOLTAN_PREFIX/lib" \
  -lzoltan -lm

Running

The driver reads from stdin using a simple text protocol.

Input format (stdin)

n_vertices n_hyperedges k
v0 v1 v2          ← vertices of hyperedge 0
v0 v3             ← vertices of hyperedge 1
...

Calling from Python via subprocess

import subprocess

def build_stdin(n_vertices, hyperedges, k):
    lines = [f"{n_vertices} {len(hyperedges)} {k}"]
    for he in hyperedges:
        lines.append(" ".join(map(str, he)))
    return "\n".join(lines)

stdin_data = build_stdin(n_vertices, hyperedges, k)

result = subprocess.run(
    ["./zoltan_phg_partition"],
    input=stdin_data,
    capture_output=True,
    text=True,
    check=True,
)

cut = int(result.stdout.strip())

Running with MPI (parallel)

echo "$HYPERGRAPH_STDIN" | mpirun -n 4 ./zoltan_phg_partition

Output

cut    # int printed to stdout — number of cut hyperedges

Note: Zoltan PHG does not currently return the per-vertex assignment through the stdin/stdout driver — only the cut count. If you need the full assignment, link directly against the Zoltan C library and call Zoltan_LB_Partition.

8. Quick comparison

Method	Type	Typical quality	Speed	Install	Parallel
Greedy	Constructive	Lowest baseline	Very fast — O(V+E)	`pip install hdh`	No
FM	Local search	Low–medium	Fast — O(V+E)/pass	`pip install hdh`	No
Stochastic Greedy	Iterated greedy	Medium (grows with time)	Configurable via `time_cap`	`pip install hdh`	No
Evolutionary	Metaheuristic	Good	Slow	`pip install hdh`	No
KaHyPar	Multilevel	State of the art	Medium–fast	Build from source	No
Zoltan PHG	Parallel multilevel	Good	Fast with MPI	Build from source / brew	Yes (MPI)

Recommended comparison setup: Use Greedy as the weakest baseline, KaHyPar as the strongest sequential reference, and Stochastic Greedy or FM as practical mid-tier alternatives with controllable runtime. Fix epsilon = 0.05 and the same seed across all methods for fair comparisons.

9. Bonus: QASM to HDH

All six partitioners expect (n_vertices, hyperedges) as plain Python objects. When your input is a quantum circuit, the natural mapping is:

vertices — qubits, indexed 0 … qc.num_qubits − 1
hyperedges — multi-qubit gates; each gate becomes one hyperedge whose members are the qubit indices it acts on

Single-qubit gates are excluded because they create no inter-qubit connectivity. Non-unitary instructions (barrier, measure, delay, snapshot) are also excluded.

Prerequisites

pip install qiskit hdh

HDH converter (from a QuantumCircuit object)

If you already have a QuantumCircuit in memory, HDH's converter module produces an HDH graph object in one call.

from qiskit import QuantumCircuit
from hdh.converters import from_qiskit

qc = QuantumCircuit.from_qasm_file("circuit.qasm")
hdh_graph = from_qiskit(qc)

# Extract the raw representation for the partitioners
n_vertices = hdh_graph.num_vertices
hyperedges = hdh_graph.hyperedges      # list[list[int]]

Note: The exact attribute names (.num_vertices, .hyperedges) depend on the HDH version installed — check dir(hdh_graph) or the HDH documentation if they differ. Path A is always safe because it only depends on Qiskit.

End-to-end example

Convert a QASM file and run all four HDH partitioners on it:

from qiskit import QuantumCircuit
from hdh.partitioners import (
    greedy_partition,
    fm_partition,
    stochastic_greedy_partition,
    ea_partition,
)

SKIP = {"barrier", "snapshot", "delay", "label", "measure", "reset"}

def qasm_to_hypergraph(path):
    qc = QuantumCircuit.from_qasm_file(path)
    qubit_index = {q: i for i, q in enumerate(qc.qubits)}
    hyperedges = []
    for inst in qc.data:
        if inst.operation.name in SKIP:
            continue
        if len(inst.qubits) >= 2:
            edge = list(dict.fromkeys(qubit_index[q] for q in inst.qubits))
            if len(edge) >= 2:
                hyperedges.append(edge)
    return qc.num_qubits, hyperedges

n_vertices, hyperedges = qasm_to_hypergraph("circuit.qasm")
k       = 2
epsilon = 0.05

results = {
    "greedy": greedy_partition(n_vertices, hyperedges, k, epsilon),
    "fm":     fm_partition(n_vertices, hyperedges, k, epsilon),
    "stocg":  stochastic_greedy_partition(n_vertices, hyperedges, k, epsilon, time_cap=30, seed=42),
    "ea":     ea_partition(n_vertices, hyperedges, k, epsilon, seed=42),
}

for name, (cut, _) in results.items():
    print(f"{name:8s}  cut={cut}")

Common pitfalls:

OpenQASM 2 files use QuantumCircuit.from_qasm_file; OpenQASM 3 requires qiskit.qasm3.load (Qiskit ≥ 1.0).
If num_qubits is large but most qubits never appear in a two-qubit gate, the hypergraph will have many isolated vertices — this is correct and expected.
Gates that act on the same qubit twice (e.g. some custom unitaries) are deduplicated by dict.fromkeys, which keeps the edge valid.