{
 "cells": [
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "# Tutorial\n",
    "\n",
    "In this `energnn` tutorial, we will cover:\n",
    "- How to install `energnn`.\n",
    "- How to interact with typical implementations of `energnn.problem.Problem`, `energnn.problem.ProblemBatch` and `energnn.problem.ProblemLoader`.\n",
    "- How to create a GNN model using `energnn.model.ready_to_use`.\n",
    "- How to train the model with `energnn.trainer.Trainer`."
   ],
   "id": "ef720d484a9302d7"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Installation\n",
    "\n",
    "To install the latest stable release of `energnn` on CPU:\n",
    "```bash\n",
    "pip install energnn\n",
    "```\n",
    "For the GPU version:\n",
    "```bash\n",
    "pip install \"energnn[gpu]\"\n",
    "```"
   ],
   "id": "7e9fb261104f6c9e"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Problem Class\n",
    "\n",
    "Let's consider the following use case: **DC Power Flow** in an electrical network.\n",
    "\n",
    "We want to determine the phase angles $\\theta$ at the different buses (nodes) of the network, given the active power injections $P$ and the physical characteristics of the lines (encoded in the susceptance matrix $B$).\n",
    "\n",
    "The problem is modeled by the following linear system:\n",
    "$$B \\theta = P$$\n",
    "\n",
    "Where:\n",
    "- $B$ is the susceptance matrix (similar to a Laplacian matrix).\n",
    "- $P$ is the vector of active power injections at each bus.\n",
    "- $\\theta$ is the vector of phase angles we wish to predict.\n",
    "\n",
    "Let's generate a random problem instance and explore its interface."
   ],
   "id": "b09f0af02cdce27d"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:29.687861316Z",
     "start_time": "2026-04-24T12:19:26.283604352Z"
    }
   },
   "cell_type": "code",
   "source": [
    "from energnn.problem.example import LinearSystemProblemGenerator\n",
    "\n",
    "pb_generator = LinearSystemProblemGenerator(seed=7, n_max=4)\n",
    "problem = pb_generator.generate_problem()"
   ],
   "id": "7cf6cd1eb3f196fa",
   "outputs": [],
   "execution_count": 1
  },
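  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "For intuition about the system $B \\theta = P$, here is a sketch of its textbook form for a two-bus network joined by a single line of susceptance $b$. This follows the standard DC power flow convention and is an assumption made for illustration: the example generator may build $B$ differently (for instance with a modified diagonal).\n",
    "\n",
    "$$\\begin{pmatrix} b & -b \\\\ -b & b \\end{pmatrix} \\begin{pmatrix} \\theta_0 \\\\ \\theta_1 \\end{pmatrix} = \\begin{pmatrix} P_0 \\\\ P_1 \\end{pmatrix}$$\n",
    "\n",
    "In this convention each row of $B$ sums to zero, so $B$ is singular: a solution exists only if $P_0 + P_1 = 0$, and $\\theta$ is defined up to an additive constant. Fixing the reference angle $\\theta_0 = 0$ gives $\\theta_1 = P_1 / b$."
   ],
   "id": "c4e1a7f09b3d5e21"
  },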
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "### Context Graph\n",
    "\n",
    "The input of our GNN model is referred to as **the context**, instantiated as an `energnn.graph.Graph` object.\n",
    "\n",
    "In our case, the context contains the known data of the problem: the matrix $B$ and the vector $P$. This data is structured as a Hyper Heterogeneous Multi Graph (H2MG):\n",
    "\n",
    "1. **The matrix $B$ (lines)**: The off-diagonal elements of $B$ represent the connections between nodes. We encode them in a hyper-edge set called **line**. Each line has a feature called `susceptance`.\n",
    "2. **The vector $P$ (buses)**: Power injections are associated with the buses of the network. We encode them in a hyper-edge set called **bus**. Each bus has a feature called `active_power_injection`."
   ],
   "id": "3abb3650a64c6437"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:29.714663035Z",
     "start_time": "2026-04-24T12:19:29.690889133Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# Let us explore the context structure\n",
    "print(problem.context_structure)"
   ],
   "id": "6598363220c05d2",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "           Ports                  Features\n",
      "Name                                      \n",
      "line  [from, to]             [susceptance]\n",
      "bus         [id]  [active_power_injection]\n"
     ]
    }
   ],
   "execution_count": 2
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "Each hyper-edge set has **ports** and/or **features**:\n",
    "- **Ports** define the connectivity (e.g., a `line` connects a `from` bus to a `to` bus). They associate a hyper-edge with an integer **address** (the node index).\n",
    "- **Features** are the associated numerical values (e.g., the susceptance value)."
   ],
   "id": "dda008c0c5278ea2"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:29.930288937Z",
     "start_time": "2026-04-24T12:19:29.717226120Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# Print the context graph associated to the problem instance\n",
    "context, _ = problem.get_context()\n",
    "print(context)"
   ],
   "id": "991074778b81c139",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "          ports               features\n",
      "             id active_power_injection\n",
      "object_id                             \n",
      "0           0.0              -0.361243\n",
      "1           1.0               0.397479\n",
      "line\n",
      "          ports         features\n",
      "           from   to susceptance\n",
      "object_id                       \n",
      "0           0.0  1.0    0.818972\n",
      "\n"
     ]
    }
   ],
   "execution_count": 3
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The **context** of this specific problem instance has:\n",
    "\n",
    "- One **line** object (representing a physical link and its susceptance),\n",
    "- Two **bus** objects (representing nodes and their power injections)."
   ],
   "id": "89cbcd6af39ce900"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "### Decision Graph\n",
    "The output of our GNN model is referred to as **the decision**.\n",
    "\n",
    "In our case, it is the vector of phase angles $\\theta$.\n",
    "\n",
    "We also model it as a graph, where each **bus** carries a `phase_angle` feature.\n",
    "\n",
    "This specific problem class has a helper method called `get_zero_decision` that returns a decision of the right structure, filled with zeros. It's a great starting point for optimization."
   ],
   "id": "ab7e7a94933f66f1"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:29.951197378Z",
     "start_time": "2026-04-24T12:19:29.940180770Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# Let us explore the decision structure\n",
    "print(problem.decision_structure)"
   ],
   "id": "6e9b3adbbf79d767",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     Ports       Features\n",
      "Name                     \n",
      "bus   None  [phase_angle]\n"
     ]
    }
   ],
   "execution_count": 4
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "Notice that decisions concern only a subset of the classes available in the context (here the `bus`), and that they have no ports, just features.",
   "id": "ac71675fc43c38fd"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:29.982854590Z",
     "start_time": "2026-04-24T12:19:29.960255120Z"
    }
   },
   "cell_type": "code",
   "source": [
    "# Let us create an initial zero decision\n",
    "decision, _ = problem.get_zero_decision()\n",
    "print(decision)"
   ],
   "id": "b65a282d869a3f9c",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "             features\n",
      "          phase_angle\n",
      "object_id            \n",
      "0                -0.0\n",
      "1                 0.0\n",
      "\n"
     ]
    }
   ],
   "execution_count": 5
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "### Objective Function\n",
    "The score of a decision measures the quality of our prediction.\n",
    "\n",
    "In this case, we use half the squared Euclidean distance to the exact solution $\\theta^\\star$: $\\frac{1}{2} \\Vert \\theta - \\theta^\\star \\Vert^2$. The code below labels it MSE, although it is a sum of squared errors rather than a mean.\n",
    "\n",
    "Let's evaluate this score for our zero decision."
   ],
   "id": "9e2d2d2d7fc8c3c0"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:30.157103171Z",
     "start_time": "2026-04-24T12:19:29.984624164Z"
    }
   },
   "cell_type": "code",
   "source": [
    "score, _ = problem.get_score(decision=decision)\n",
    "print(f\"Initial score (MSE): {score}\")"
   ],
   "id": "38de3eadfb55c47",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Initial score (MSE): 0.08047349750995636\n"
     ]
    }
   ],
   "execution_count": 6
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "### Gradient Graph\n",
    "To learn, the model needs to know the direction in which to modify its predictions: this is the **gradient** of the objective function.\n",
    "\n",
    "In this simple case, the gradient is just the vector $\\theta - \\theta^\\star$.\n",
    "\n",
    "Let's calculate it for our initial decision."
   ],
   "id": "accda6a3f4156630"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:30.186020775Z",
     "start_time": "2026-04-24T12:19:30.159920171Z"
    }
   },
   "cell_type": "code",
   "source": [
    "gradient, _ = problem.get_gradient(decision=decision)\n",
    "print(gradient)"
   ],
   "id": "7c7c31970db2a009",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "             features\n",
      "          phase_angle\n",
      "object_id            \n",
      "0            0.037100\n",
      "1           -0.399463\n",
      "\n"
     ]
    }
   ],
   "execution_count": 7
  },
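  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "As a consistency check (a sketch, assuming the score is exactly the $\\frac{1}{2} \\Vert \\theta - \\theta^\\star \\Vert^2$ given in the Objective Function section), differentiating the score gives the gradient:\n",
    "\n",
    "$$\\nabla_\\theta \\left( \\tfrac{1}{2} \\Vert \\theta - \\theta^\\star \\Vert^2 \\right) = \\theta - \\theta^\\star$$\n",
    "\n",
    "At the zero decision $\\theta = 0$ the gradient equals $-\\theta^\\star$, so the score is half the squared norm of the gradient. With the values printed above:\n",
    "\n",
    "$$\\tfrac{1}{2} \\left( 0.037100^2 + 0.399463^2 \\right) \\approx 0.0805$$\n",
    "\n",
    "which agrees with the initial score reported earlier."
   ],
   "id": "d7b2f48c1a6e9035"
  },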
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "Notice that this gradient is of the exact same type as the decision.\n",
    "\n",
    "Just as a quick sanity check, we can perform a simple manual gradient descent to see if the objective decreases."
   ],
   "id": "fbcebd90c60a171"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:30.253901623Z",
     "start_time": "2026-04-24T12:19:30.188281333Z"
    }
   },
   "cell_type": "code",
   "source": [
    "alpha = 0.5  # Learning rate\n",
    "\n",
    "objective, _ = problem.get_score(decision=decision)\n",
    "print(f\"Step 0, objective = {objective}\")\n",
    "\n",
    "for i in range(10):\n",
    "    gradient, _ = problem.get_gradient(decision=decision)\n",
    "\n",
    "    # Update decision (theta = theta - alpha * gradient)\n",
    "    decision.feature_flat_array -= alpha * gradient.feature_flat_array\n",
    "\n",
    "    objective, _ = problem.get_score(decision=decision)\n",
    "    print(f\"Step {i + 1}, objective = {objective}\")"
   ],
   "id": "7f90b981f57b5f6a",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Step 0, objective = 0.08047349750995636\n",
      "Step 1, objective = 0.02011837437748909\n",
      "Step 2, objective = 0.005029592663049698\n",
      "Step 3, objective = 0.0012573988642543554\n",
      "Step 4, objective = 0.0003143500944133848\n",
      "Step 5, objective = 7.858734170440584e-05\n",
      "Step 6, objective = 1.964674265764188e-05\n",
      "Step 7, objective = 4.911732503387611e-06\n",
      "Step 8, objective = 1.2279560905881226e-06\n",
      "Step 9, objective = 3.069773981678736e-07\n",
      "Step 10, objective = 7.67386012512361e-08\n"
     ]
    }
   ],
   "execution_count": 8
  },
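  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The constant ratio between successive objectives can be derived by hand. This is a sketch that assumes the score $f(\\theta) = \\frac{1}{2} \\Vert \\theta - \\theta^\\star \\Vert^2$ and the gradient $\\theta - \\theta^\\star$ given earlier; $\\theta_k$ denotes the decision after $k$ steps (notation introduced here for illustration):\n",
    "\n",
    "$$\\theta_{k+1} - \\theta^\\star = \\theta_k - \\alpha (\\theta_k - \\theta^\\star) - \\theta^\\star = (1 - \\alpha)(\\theta_k - \\theta^\\star)$$\n",
    "\n",
    "$$f(\\theta_{k+1}) = (1 - \\alpha)^2 f(\\theta_k)$$\n",
    "\n",
    "With $\\alpha = 0.5$ the objective is divided by 4 at every step, which matches the printed values (for example $0.0805 / 4 \\approx 0.0201$)."
   ],
   "id": "e9a3c16d5b7f2048"
  },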
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The objective function successfully decreases! This confirms the gradient definition is consistent with the score we wish to minimize.\n",
    "Now, let's explore how multiple problems can be batched together."
   ],
   "id": "bc0b4c88f5cdbdbd"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Problem Batch\n",
    "\n",
    "Interacting with a single problem instance is useful at inference time, or for debugging purposes.\n",
    "But to train a whole Graph Neural Network model, it is necessary to process batches of problem instances together."
   ],
   "id": "71bfe788a6c66eab"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:23:03.080380006Z",
     "start_time": "2026-04-24T12:23:02.945599003Z"
    }
   },
   "cell_type": "code",
   "source": [
    "from energnn.problem.example import LinearSystemProblemGenerator\n",
    "\n",
    "pb_generator = LinearSystemProblemGenerator(seed=9, n_max=3)\n",
    "problem_batch = pb_generator.generate_problem_batch(batch_size=3)\n",
    "\n",
    "# Let us explore the context and decision structures\n",
    "print(\"Context Structure:\\n\", problem_batch.context_structure, \"\\n\")\n",
    "print(\"Decision Structure:\\n\", problem_batch.decision_structure)"
   ],
   "id": "7bb9c8ce6533b092",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context Structure:\n",
      "            Ports                  Features\n",
      "Name                                      \n",
      "line  [from, to]             [susceptance]\n",
      "bus         [id]  [active_power_injection] \n",
      "\n",
      "Decision Structure:\n",
      "      Ports       Features\n",
      "Name                     \n",
      "bus   None  [phase_angle]\n"
     ]
    }
   ],
   "execution_count": 21
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "Here, contexts are still graphs, but this time with an extra dimension for the batch:",
   "id": "6f5d3607d07f1524"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:31.752640132Z",
     "start_time": "2026-04-24T12:19:31.531404736Z"
    }
   },
   "cell_type": "code",
   "source": [
    "context, _ = problem_batch.get_context()\n",
    "print(context)"
   ],
   "id": "b9d80c6edb467ccb",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "                   ports               features\n",
      "                      id active_power_injection\n",
      "batch_id object_id                             \n",
      "0        0           0.0              -1.216858\n",
      "         1           1.0               1.103963\n",
      "         2           0.0               0.000000\n",
      "1        0           0.0              -2.028237\n",
      "         1           1.0              -0.351886\n",
      "         2           2.0               2.323155\n",
      "2        0           0.0               1.087987\n",
      "         1           1.0              -1.157142\n",
      "         2           0.0               0.000000\n",
      "line\n",
      "                   ports         features\n",
      "                    from   to susceptance\n",
      "batch_id object_id                       \n",
      "0        0           0.0  1.0    1.001875\n",
      "         1           0.0  0.0    0.000000\n",
      "         2           0.0  0.0    0.000000\n",
      "1        0           0.0  1.0    0.918508\n",
      "         1           0.0  2.0    0.748101\n",
      "         2           1.0  2.0    0.584060\n",
      "2        0           0.0  1.0    1.047838\n",
      "         1           0.0  0.0    0.000000\n",
      "         2           0.0  0.0    0.000000\n",
      "\n"
     ]
    }
   ],
   "execution_count": 10
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "Notice that the different networks in the batch do not necessarily have the same connectivity, or the same number of **line** or **bus** objects.\n",
    "To group graphs of varying sizes together, `energnn` uses **padding**: smaller graphs are filled with zeros, which appear as the all-zero rows in the output above.\n",
    "\n",
    "Still, a `ProblemBatch` can be handled in a very similar way to a single `Problem`."
   ],
   "id": "b40fd27724c4e0cf"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:31.992641967Z",
     "start_time": "2026-04-24T12:19:31.756455297Z"
    }
   },
   "cell_type": "code",
   "source": [
    "alpha = 0.5\n",
    "\n",
    "decision, _ = problem_batch.get_zero_decision()\n",
    "objective, _ = problem_batch.get_score(decision=decision)\n",
    "print(f\"Step 0, average objective = {objective}\")\n",
    "\n",
    "for i in range(10):\n",
    "    gradient, _ = problem_batch.get_gradient(decision=decision)\n",
    "\n",
    "    # Update decision on the whole batch\n",
    "    decision.feature_flat_array -= alpha * gradient.feature_flat_array\n",
    "\n",
    "    objective, _ = problem_batch.get_score(decision=decision)\n",
    "    print(f\"Step {i + 1}, average objective = {objective}\")"
   ],
   "id": "ea998f304f523f81",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Step 0, average objective = [0.41525667905807495, 0.6843823790550232, 0.25396728515625]\n",
      "Step 1, average objective = [0.10381416976451874, 0.1710955947637558, 0.0634918212890625]\n",
      "Step 2, average objective = [0.025953546166419983, 0.04277390241622925, 0.015872955322265625]\n",
      "Step 3, average objective = [0.006488386541604996, 0.010693473741412163, 0.00396823650225997]\n",
      "Step 4, average objective = [0.0016220954712480307, 0.0026733684353530407, 0.0009920591255649924]\n",
      "Step 5, average objective = [0.00040552386781200767, 0.0006683421670459211, 0.00024801530526019633]\n",
      "Step 6, average objective = [0.00010138096695300192, 0.0001670852507231757, 6.200382631504908e-05]\n",
      "Step 7, average objective = [2.534558552724775e-05, 4.1771614633034915e-05, 1.550095657876227e-05]\n",
      "Step 8, average objective = [6.3362235778186005e-06, 1.0442954589962028e-05, 3.875235961459111e-06]\n",
      "Step 9, average objective = [1.5839691513974685e-06, 2.6107645680895075e-06, 9.688423006082303e-07]\n",
      "Step 10, average objective = [3.9603560253453907e-07, 6.5266902993244e-07, 2.4221057515205757e-07]\n"
     ]
    }
   ],
   "execution_count": 11
  },
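  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "Notice that there is one score per element of the batch: despite the \"average objective\" label, the printed values are per-instance objectives, not an average.",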
   "id": "165719406ae4d4c8"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Problem Loader\n",
    "\n",
    "Processing problem instances in batches is useful, but not enough.\n",
    "To train a Graph Neural Network, we need to iterate over multiple minibatches of problem instances.\n",
    "That's where the `ProblemLoader` class comes in."
   ],
   "id": "5254b7c5997c180c"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:24:12.711627786Z",
     "start_time": "2026-04-24T12:24:12.649555356Z"
    }
   },
   "cell_type": "code",
   "source": [
    "from energnn.problem.example import LinearSystemProblemLoader\n",
    "\n",
    "problem_loader = LinearSystemProblemLoader(batch_size=4, seed=7, dataset_size=16, n_max=4)\n",
    "\n",
    "# Let us explore the context and decision structures\n",
    "print(\"Context Structure:\\n\", problem_loader.context_structure, \"\\n\")\n",
    "print(\"Decision Structure:\\n\", problem_loader.decision_structure)"
   ],
   "id": "dd5473e512693b31",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Context Structure:\n",
      "            Ports                  Features\n",
      "Name                                      \n",
      "line  [from, to]             [susceptance]\n",
      "bus         [id]  [active_power_injection] \n",
      "\n",
      "Decision Structure:\n",
      "      Ports       Features\n",
      "Name                     \n",
      "bus   None  [phase_angle]\n"
     ]
    }
   ],
   "execution_count": 22
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "The loader lets us iterate over batches of problems.",
   "id": "86dca13395d3671d"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:34.854571189Z",
     "start_time": "2026-04-24T12:19:32.022254443Z"
    }
   },
   "cell_type": "code",
   "source": [
    "for problem_batch in problem_loader:\n",
    "    context, _ = problem_batch.get_context()\n",
    "    decision, _ = problem_batch.get_zero_decision()\n",
    "    objective, _ = problem_batch.get_score(decision=decision)\n",
    "    print(\"Objective:\", objective)"
   ],
   "id": "fc259caef1889511",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Objective: [0.04023674875497818, 1.0828040838241577, 0.5069718956947327, 0.6397796869277954]\n",
      "Objective: [0.15303818881511688, 0.17635096609592438, 0.9519702792167664, 1.7408232688903809]\n",
      "Objective: [0.028324833139777184, 0.35810908675193787, 0.463958740234375, 1.2205294370651245]\n",
      "Objective: [0.047384973615407944, 0.9067906141281128, 0.3180186450481415, 0.6611507534980774]\n"
     ]
    }
   ],
   "execution_count": 13
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "Each iteration yields a new batch of problems.",
   "id": "15db88e89257dc8c"
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Graph Neural Network Model\n",
    "\n",
    "Let us instantiate a small Graph Neural Network model that fits the context and decision structures of our problem class."
   ],
   "id": "4dc0b2d26d8e1803"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:36.388628427Z",
     "start_time": "2026-04-24T12:19:34.871404231Z"
    }
   },
   "cell_type": "code",
   "source": [
    "from energnn.model.ready_to_use import TinyRecurrentEquivariantGNN\n",
    "\n",
    "model = TinyRecurrentEquivariantGNN(\n",
    "    in_structure=problem_loader.context_structure,\n",
    "    out_structure=problem_loader.decision_structure\n",
    ")"
   ],
   "id": "d4bb80109ff4964a",
   "outputs": [],
   "execution_count": 14
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "Make sure that your model is in evaluation mode first!",
   "id": "79d82d3ff1c81392"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:36.418716884Z",
     "start_time": "2026-04-24T12:19:36.409445065Z"
    }
   },
   "cell_type": "code",
   "source": [
    "model.eval()  # Set the model in evaluation mode.\n",
    "# model.train()  # To set the model in train mode."
   ],
   "id": "c5b4523e3d2f4db5",
   "outputs": [],
   "execution_count": 15
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "The model takes a context as input and returns a decision.",
   "id": "18b362addbd8340d"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:38.711400201Z",
     "start_time": "2026-04-24T12:19:36.420426178Z"
    }
   },
   "cell_type": "code",
   "source": [
    "problem = pb_generator.generate_problem()\n",
    "context, _ = problem.get_context()\n",
    "decision, _ = model(context)\n",
    "print(decision)"
   ],
   "id": "93031eef643e7e61",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "             features\n",
      "          phase_angle\n",
      "object_id            \n",
      "0           -1.241976\n",
      "1            0.972012\n",
      "2            0.320894\n",
      "\n"
     ]
    }
   ],
   "execution_count": 16
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "It can also process batches of contexts and return batches of decisions.",
   "id": "e97644121d4df05c"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:41.143176113Z",
     "start_time": "2026-04-24T12:19:38.733412532Z"
    }
   },
   "cell_type": "code",
   "source": [
    "problem_batch = pb_generator.generate_problem_batch(batch_size=4)\n",
    "context, _ = problem_batch.get_context()\n",
    "decision, _ = model.forward_batch(graph=context)\n",
    "print(decision)"
   ],
   "id": "acafbbc147579bdd",
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "bus\n",
      "                      features\n",
      "                   phase_angle\n",
      "batch_id object_id            \n",
      "0        0            3.879246\n",
      "         1           -4.336865\n",
      "         2            0.713661\n",
      "1        0           -1.430068\n",
      "         1            0.131218\n",
      "         2            1.134047\n",
      "2        0           -0.827532\n",
      "         1            0.785140\n",
      "         2           -0.000000\n",
      "3        0           -0.009739\n",
      "         1            0.016669\n",
      "         2           -0.786488\n",
      "\n"
     ]
    }
   ],
   "execution_count": 17
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "## Trainer\n",
    "\n",
    "Let us train our Graph Neural Network model over a problem loader. The core training loop is defined by the following pseudocode.\n",
    "\n",
    "```python\n",
    "for problem_batch in problem_loader:\n",
    "    context, _ = problem_batch.get_context()\n",
    "    decision, _ = model.forward_batch(graph=context)\n",
    "    gradient, _ = problem_batch.get_gradient(decision=decision)\n",
    "    # Pseudocode: propagate the decision gradient back through the model weights\n",
    "    model.backprop(gradient)\n",
    "```\n",
    "\n",
    "In practice, `energnn.trainer` implements the training logic and lets us use:\n",
    "\n",
    "- `optax` for the optimizer,\n",
    "- `orbax` for checkpointing and saving/loading models."
   ],
   "id": "3052b7a2076e8527"
  },
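  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "The `model.backprop(gradient)` step of the pseudocode can be read through the chain rule. This is a sketch under assumed notation, not a description of the `energnn` internals: $w$ denotes the model weights, $x$ a context, $\\hat{\\theta}_w(x)$ the decision returned by the model and $f$ the score.\n",
    "\n",
    "$$\\nabla_w f\\big(\\hat{\\theta}_w(x)\\big) = \\left( \\frac{\\partial \\hat{\\theta}_w(x)}{\\partial w} \\right)^\\top \\nabla_\\theta f\\big(\\hat{\\theta}_w(x)\\big)$$\n",
    "\n",
    "The problem supplies the right-hand factor $\\nabla_\\theta f$ (the gradient graph), the model contributes the Jacobian term, and the optimizer uses the resulting $\\nabla_w f$ to update $w$."
   ],
   "id": "f0b5d27e8c4a1963"
  },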
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:41.347441925Z",
     "start_time": "2026-04-24T12:19:41.165114527Z"
    }
   },
   "cell_type": "code",
   "source": [
    "from energnn.trainer import Trainer\n",
    "import optax\n",
    "\n",
    "trainer = Trainer(model=model, gradient_transformation=optax.adam(learning_rate=3e-4))"
   ],
   "id": "e4ba246080df3c5b",
   "outputs": [],
   "execution_count": 18
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": "The training is performed by iterating over a **train** loader, and the validation score is periodically computed on a **validation** loader.",
   "id": "bd42d36691734ce6"
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:19:41.363457957Z",
     "start_time": "2026-04-24T12:19:41.351374862Z"
    }
   },
   "cell_type": "code",
   "source": [
    "train_loader = LinearSystemProblemLoader(seed=7, dataset_size=64, batch_size=4, n_max=3)\n",
    "val_loader = LinearSystemProblemLoader(seed=8, dataset_size=8, batch_size=4, n_max=3)"
   ],
   "id": "9da2e52f08b95bda",
   "outputs": [],
   "execution_count": 19
  },
  {
   "metadata": {
    "ExecuteTime": {
     "end_time": "2026-04-24T12:20:22.618732059Z",
     "start_time": "2026-04-24T12:19:41.366570073Z"
    }
   },
   "cell_type": "code",
   "source": [
    "_ = trainer.train(\n",
    "    train_loader=train_loader,\n",
    "    val_loader=val_loader,\n",
    "    eval_before_training=True,\n",
    "    n_epochs=10,\n",
    ")"
   ],
   "id": "2d9389b0169d78c9",
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Validation: 100%|██████████| 2/2 [00:02<00:00,  1.13s/batch, score=1.0137e+01]\n",
      "Epoch 1/10: 100%|██████████| 16/16 [00:08<00:00,  1.90batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 10.12batch/s, score=2.6406e+00]\n",
      "Epoch 2/10: 100%|██████████| 16/16 [00:02<00:00,  5.37batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 11.16batch/s, score=2.5781e+00]\n",
      "Epoch 3/10: 100%|██████████| 16/16 [00:03<00:00,  5.04batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 10.24batch/s, score=2.5190e+00]\n",
      "Epoch 4/10: 100%|██████████| 16/16 [00:03<00:00,  5.31batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 10.84batch/s, score=2.4622e+00]\n",
      "Epoch 5/10: 100%|██████████| 16/16 [00:03<00:00,  5.33batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 11.87batch/s, score=2.4078e+00]\n",
      "Epoch 6/10: 100%|██████████| 16/16 [00:03<00:00,  4.97batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 11.17batch/s, score=2.3555e+00]\n",
      "Epoch 7/10: 100%|██████████| 16/16 [00:03<00:00,  5.15batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 10.00batch/s, score=2.3042e+00]\n",
      "Epoch 8/10: 100%|██████████| 16/16 [00:03<00:00,  4.81batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 10.22batch/s, score=2.2553e+00]\n",
      "Epoch 9/10: 100%|██████████| 16/16 [00:03<00:00,  4.89batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00,  8.78batch/s, score=2.2075e+00]\n",
      "Epoch 10/10: 100%|██████████| 16/16 [00:03<00:00,  4.56batch/s]\n",
      "Validation: 100%|██████████| 2/2 [00:00<00:00, 11.20batch/s, score=2.1607e+00]\n"
     ]
    }
   ],
   "execution_count": 20
  },
  {
   "metadata": {},
   "cell_type": "markdown",
   "source": [
    "Et voilà! The GNN has started to learn how to solve the sparse linear system derived from the DC Power Flow problem.\n",
    "\n",
    "Notice that the approach is not limited to linear systems and can be used for a much broader set of applications!"
   ],
   "id": "6adb93eb2ba67ded"
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}