reformat

Can-Zhao · Can-Zhao · commit f448e9eadab5 · 2025-03-11T22:06:36.000Z
Signed-off-by: Can-Zhao <canz@nvidia.com>
diff --git a/generation/maisi/maisi_diff_unet_training_tutorial.ipynb b/generation/maisi/maisi_diff_unet_training_tutorial.ipynb
@@ -31,6 +31,30 @@
     "`[Release Note (March 2025)]:` We are excited to announce the new MAISI Version `'maisi-rflow'`. Compared with the previous version `'maisi-ddpm'`, it accelerated latent diffusion model inference by 33x. Please see the detailed difference in the following section."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "aa51792d",
+   "metadata": {},
+   "source": [
+    "## Set up the MAISI version\n",
+    "\n",
+    "Choose between `'maisi-ddpm'` and `'maisi-rflow'`. The differences are:\n",
+    "- `'maisi-ddpm'` uses the basic DDPM noise scheduler, while `'maisi-rflow'` uses a Rectified Flow scheduler, which can make inference 33 times faster.\n",
+    "- `'maisi-ddpm'` requires training images to be labeled with body regions (`\"top_region_index\"` and `\"bottom_region_index\"`), while `'maisi-rflow'` has no such requirement. In other words, training data is easier to prepare for `'maisi-rflow'`.\n",
+    "- For the released model weights, `'maisi-rflow'` can generate better-quality images for the head region and for small output volumes, and images of comparable quality in other cases, compared with `'maisi-ddpm'`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "828b9ece-7759-40e8-ac4c-6467c3399701",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "maisi_version = \"maisi-ddpm\"\n",
+    "assert maisi_version in [\"maisi-ddpm\", \"maisi-rflow\"]"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "c9ecfb90",
@@ -41,7 +65,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
    "id": "58cbde9b",
    "metadata": {},
    "outputs": [],
@@ -59,7 +83,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 3,
    "id": "e3bf0346",
    "metadata": {},
    "outputs": [
@@ -136,8 +160,8 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
-   "id": "828b9ece-7759-40e8-ac4c-6467c3399701",
+   "execution_count": 4,
+   "id": "31684f74",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -159,7 +183,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 5,
    "id": "fc32a7fe",
    "metadata": {},
    "outputs": [],
@@ -181,15 +205,15 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
    "id": "1b199078",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:46:47.184][ INFO](notebook) - Generated simulated images.\n"
+      "[2025-03-11 22:05:02.952][ INFO](notebook) - Generated simulated images.\n"
      ]
     }
    ],
@@ -228,26 +252,26 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
    "id": "6c7b434c",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:46:47.199][ INFO](notebook) - files and folders under work_dir: ['predictions', 'config_maisi.json', 'models', 'sim_dataroot', 'config_maisi_diff_model.json', 'embeddings', 'environment_maisi_diff_model.json', 'sim_datalist.json'].\n",
-      "[2025-03-11 21:46:47.199][ INFO](notebook) - number of GPUs: 1.\n"
+      "[2025-03-11 22:05:02.966][ INFO](notebook) - files and folders under work_dir: ['predictions', 'config_maisi.json', 'models', 'sim_dataroot', 'config_maisi_diff_model.json', 'embeddings', 'environment_maisi_diff_model.json', 'sim_datalist.json'].\n",
+      "[2025-03-11 22:05:02.966][ INFO](notebook) - number of GPUs: 1.\n"
      ]
     }
    ],
    "source": [
     "env_config_path = \"./configs/environment_maisi_diff_model.json\"\n",
     "model_config_path = \"./configs/config_maisi_diff_model.json\"\n",
-    "if maisi_version == 'maisi-ddpm':\n",
+    "if maisi_version == \"maisi-ddpm\":\n",
     "    model_def_path = \"./configs/config_maisi-ddpm.json\"\n",
     "    include_body_region = True\n",
-    "elif maisi_version == 'maisi-rflow':\n",
+    "elif maisi_version == \"maisi-rflow\":\n",
     "    model_def_path = \"./configs/config_maisi-rflow.json\"\n",
     "    include_body_region = False\n",
     "else:\n",
@@ -315,7 +339,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 8,
    "id": "95ea6972",
    "metadata": {},
    "outputs": [],
@@ -375,24 +399,24 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
    "id": "f45ea863",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:46:47.210][ INFO](notebook) - Creating training data...\n"
+      "[2025-03-11 22:05:02.977][ INFO](notebook) - Creating training data...\n"
      ]
     },
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "\n",
-      "[2025-03-11 21:46:57.369][ INFO](creating training data) - Using device cuda:0\n",
-      "[2025-03-11 21:46:58.170][ INFO](creating training data) - filenames_raw: ['tr_image_001.nii.gz', 'tr_image_002.nii.gz']\n",
+      "[2025-03-11 22:05:10.881][ INFO](creating training data) - Using device cuda:0\n",
+      "[2025-03-11 22:05:11.686][ INFO](creating training data) - filenames_raw: ['tr_image_001.nii.gz', 'tr_image_002.nii.gz']\n",
       "\n"
      ]
     }
@@ -428,17 +452,17 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 10,
    "id": "0221a658",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:47:00.412][ INFO](notebook) - data: {'dim': (64, 64, 32), 'spacing': [0.875, 0.875, 0.75], 'top_region_index': [0, 1, 0, 0], 'bottom_region_index': [0, 0, 1, 0]}.\n",
-      "[2025-03-11 21:47:00.414][ INFO](notebook) - data: {'dim': (64, 64, 32), 'spacing': [0.875, 0.875, 0.75], 'top_region_index': [0, 1, 0, 0], 'bottom_region_index': [0, 0, 1, 0]}.\n",
-      "[2025-03-11 21:47:00.415][ INFO](notebook) - Completed creating .json files for all embedding files.\n"
+      "[2025-03-11 22:05:13.881][ INFO](notebook) - data: {'dim': (64, 64, 32), 'spacing': [0.875, 0.875, 0.75], 'top_region_index': [0, 1, 0, 0], 'bottom_region_index': [0, 0, 1, 0]}.\n",
+      "[2025-03-11 22:05:13.884][ INFO](notebook) - data: {'dim': (64, 64, 32), 'spacing': [0.875, 0.875, 0.75], 'top_region_index': [0, 1, 0, 0], 'bottom_region_index': [0, 0, 1, 0]}.\n",
+      "[2025-03-11 22:05:13.885][ INFO](notebook) - Completed creating .json files for all embedding files.\n"
      ]
     }
    ],
@@ -467,10 +491,7 @@
     "        spacing = [float(_item) for _item in spacing]\n",
     "\n",
     "        # Create the dictionary with the specified keys and values\n",
-    "        data = {\n",
-    "                \"dim\": dimensions,\n",
-    "                \"spacing\": spacing\n",
-    "        }\n",
+    "        data = {\"dim\": dimensions, \"spacing\": spacing}\n",
     "        if include_body_region:\n",
     "            # The region can be selected from one of four regions from top to bottom.\n",
     "            # [1,0,0,0] is the head and neck, [0,1,0,0] is the chest region, [0,0,1,0]\n",
@@ -510,42 +531,42 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 11,
    "id": "ade6389d",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:47:00.420][ INFO](notebook) - Training the model...\n"
+      "[2025-03-11 22:05:13.892][ INFO](notebook) - Training the model...\n"
      ]
     },
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - Using cuda:0 of 1\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] ckpt_folder -> ./temp_work_dir/./models.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] data_root -> ./temp_work_dir/./embeddings.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] data_list -> ./temp_work_dir/sim_datalist.json.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] lr -> 0.0001.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] num_epochs -> 2.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - [config] num_train_timesteps -> 1000.\n",
-      "[2025-03-11 21:47:09.081][ INFO](training) - num_files_train: 2\n",
-      "[2025-03-11 21:47:10.815][ INFO](training) - Training from scratch.\n",
-      "[2025-03-11 21:47:11.273][ INFO](training) - Scaling factor set to 1.159977912902832.\n",
-      "[2025-03-11 21:47:11.273][ INFO](training) - scale_factor -> 1.159977912902832.\n",
-      "[2025-03-11 21:47:11.276][ INFO](training) - torch.set_float32_matmul_precision -> highest.\n",
-      "[2025-03-11 21:47:11.276][ INFO](training) - Epoch 1, lr 0.0001.\n",
-      "[2025-03-11 21:47:12.253][ INFO](training) - [2025-03-11 21:47:12] epoch 1, iter 1/2, loss: 0.7979, lr: 0.000100000000.\n",
-      "[2025-03-11 21:47:12.535][ INFO](training) - [2025-03-11 21:47:12] epoch 1, iter 2/2, loss: 0.7931, lr: 0.000056250000.\n",
-      "[2025-03-11 21:47:12.572][ INFO](training) - epoch 1 average loss: 0.7955.\n",
-      "[2025-03-11 21:47:14.031][ INFO](training) - Epoch 2, lr 2.5e-05.\n",
-      "[2025-03-11 21:47:14.420][ INFO](training) - [2025-03-11 21:47:14] epoch 2, iter 1/2, loss: 0.7883, lr: 0.000025000000.\n",
-      "[2025-03-11 21:47:14.517][ INFO](training) - [2025-03-11 21:47:14] epoch 2, iter 2/2, loss: 0.7893, lr: 0.000006250000.\n",
-      "[2025-03-11 21:47:14.594][ INFO](training) - epoch 2 average loss: 0.7888.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - Using cuda:0 of 1\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] ckpt_folder -> ./temp_work_dir/./models.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] data_root -> ./temp_work_dir/./embeddings.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] data_list -> ./temp_work_dir/sim_datalist.json.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] lr -> 0.0001.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] num_epochs -> 2.\n",
+      "[2025-03-11 22:05:24.419][ INFO](training) - [config] num_train_timesteps -> 1000.\n",
+      "[2025-03-11 22:05:24.420][ INFO](training) - num_files_train: 2\n",
+      "[2025-03-11 22:05:26.152][ INFO](training) - Training from scratch.\n",
+      "[2025-03-11 22:05:26.539][ INFO](training) - Scaling factor set to 1.159977912902832.\n",
+      "[2025-03-11 22:05:26.539][ INFO](training) - scale_factor -> 1.159977912902832.\n",
+      "[2025-03-11 22:05:26.542][ INFO](training) - torch.set_float32_matmul_precision -> highest.\n",
+      "[2025-03-11 22:05:26.542][ INFO](training) - Epoch 1, lr 0.0001.\n",
+      "[2025-03-11 22:05:28.578][ INFO](training) - [2025-03-11 22:05:28] epoch 1, iter 1/2, loss: 0.7974, lr: 0.000100000000.\n",
+      "[2025-03-11 22:05:28.719][ INFO](training) - [2025-03-11 22:05:28] epoch 1, iter 2/2, loss: 0.7943, lr: 0.000056250000.\n",
+      "[2025-03-11 22:05:28.762][ INFO](training) - epoch 1 average loss: 0.7958.\n",
+      "[2025-03-11 22:05:30.615][ INFO](training) - Epoch 2, lr 2.5e-05.\n",
+      "[2025-03-11 22:05:31.002][ INFO](training) - [2025-03-11 22:05:31] epoch 2, iter 1/2, loss: 0.7898, lr: 0.000025000000.\n",
+      "[2025-03-11 22:05:31.105][ INFO](training) - [2025-03-11 22:05:31] epoch 2, iter 2/2, loss: 0.7886, lr: 0.000006250000.\n",
+      "[2025-03-11 22:05:31.168][ INFO](training) - epoch 2 average loss: 0.7892.\n",
       "\n"
      ]
     }
@@ -583,40 +604,40 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 12,
    "id": "1626526d",
    "metadata": {},
    "outputs": [
     {
      "name": "stderr",
      "output_type": "stream",
      "text": [
-      "[2025-03-11 21:47:18.262][ INFO](notebook) - Running inference...\n",
-      "[2025-03-11 21:47:35.148][ INFO](notebook) - Completed all steps.\n"
+      "[2025-03-11 22:05:35.033][ INFO](notebook) - Running inference...\n",
+      "[2025-03-11 22:05:50.259][ INFO](notebook) - Completed all steps.\n"
      ]
     },
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "\n",
-      "[2025-03-11 21:47:27.859][ INFO](inference) - Using cuda:0 of 1 with random seed: 99760\n",
-      "[2025-03-11 21:47:27.859][ INFO](inference) - [config] ckpt_filepath -> ./temp_work_dir/./models/diff_unet_ckpt.pt.\n",
-      "[2025-03-11 21:47:27.860][ INFO](inference) - [config] random_seed -> 99760.\n",
-      "[2025-03-11 21:47:27.860][ INFO](inference) - [config] output_prefix -> unet_3d.\n",
-      "[2025-03-11 21:47:27.860][ INFO](inference) - [config] output_size -> (256, 256, 128).\n",
-      "[2025-03-11 21:47:27.860][ INFO](inference) - [config] out_spacing -> (1.0, 1.0, 0.75).\n",
-      "[2025-03-11 21:47:27.860][ INFO](root) - `controllable_anatomy_size` is not provided.\n",
-      "[2025-03-11 21:47:30.510][ INFO](inference) - checkpoints ./temp_work_dir/./models/diff_unet_ckpt.pt loaded.\n",
-      "[2025-03-11 21:47:30.512][ INFO](inference) - scale_factor -> 1.159977912902832.\n",
-      "[2025-03-11 21:47:30.512][ INFO](inference) - num_downsample_level -> 4, divisor -> 4.\n",
-      "[2025-03-11 21:47:30.514][ INFO](inference) - noise: cuda:0, torch.float32, <class 'torch.Tensor'>\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - Using cuda:0 of 1 with random seed: 7854\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - [config] ckpt_filepath -> ./temp_work_dir/./models/diff_unet_ckpt.pt.\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - [config] random_seed -> 7854.\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - [config] output_prefix -> unet_3d.\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - [config] output_size -> (256, 256, 128).\n",
+      "[2025-03-11 22:05:43.502][ INFO](inference) - [config] out_spacing -> (1.0, 1.0, 0.75).\n",
+      "[2025-03-11 22:05:43.502][ INFO](root) - `controllable_anatomy_size` is not provided.\n",
+      "[2025-03-11 22:05:45.793][ INFO](inference) - checkpoints ./temp_work_dir/./models/diff_unet_ckpt.pt loaded.\n",
+      "[2025-03-11 22:05:45.795][ INFO](inference) - scale_factor -> 1.159977912902832.\n",
+      "[2025-03-11 22:05:45.796][ INFO](inference) - num_downsample_level -> 4, divisor -> 4.\n",
+      "[2025-03-11 22:05:45.798][ INFO](inference) - noise: cuda:0, torch.float32, <class 'torch.Tensor'>\n",
       "\n",
       "  0%|          | 0/10 [00:00<?, ?it/s]\n",
-      " 10%|█         | 1/10 [00:00<00:07,  1.24it/s]\n",
-      " 60%|██████    | 6/10 [00:00<00:00,  8.39it/s]\n",
-      "100%|██████████| 10/10 [00:01<00:00,  9.80it/s]\n",
-      "[2025-03-11 21:47:33.116][ INFO](inference) - Saved ./temp_work_dir/./predictions/unet_3d_seed99760_size256x256x128_spacing1.00x1.00x0.75_20250311214732_rank0.nii.gz.\n",
+      " 10%|█         | 1/10 [00:00<00:05,  1.78it/s]\n",
+      " 60%|██████    | 6/10 [00:00<00:00, 11.19it/s]\n",
+      "100%|██████████| 10/10 [00:00<00:00, 12.88it/s]\n",
+      "[2025-03-11 22:05:48.356][ INFO](inference) - Saved ./temp_work_dir/./predictions/unet_3d_seed7854_size256x256x128_spacing1.00x1.00x0.75_20250311220547_rank0.nii.gz.\n",
       "\n"
      ]
     }
@@ -654,7 +675,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 13,
    "id": "0d8a344d",
    "metadata": {},
    "outputs": [