add cobra python

marouenbg · marouenbg · commit 7763a557c933 · 2023-10-11T14:17:45.000-04:00
diff --git a/netbooks/Welcome_to_netBooks.ipynb b/netbooks/Welcome_to_netBooks.ipynb
@@ -102,10 +102,8 @@
     "        \n",
     "        - [Uncovering Associations among Genes and Phenotypes with SEAHORSE](netZooR/seahorse.ipynb)\n",
     "        \n",
-    "        - [Decomposing gene co-expression networks with COBRA (R version)](netZooR/COBRA.ipynb)\n",
+    "        - [Decomposing gene co-expression networks with COBRA](netZooR/COBRA.ipynb)\n",
     "        \n",
-    "        - [Decomposing gene co-expression networks with COBRA (Python version)](netZooPy/cobra.ipynb)\n",
-    "    \n",
     "    - Case studies\n",
     "    \n",
     "        - [Building PANDA and LIONESS Regulatory Networks from GTEx Gene Expression Data in R](netZooR/ApplicationinGTExData.ipynb)\n",
@@ -152,6 +150,8 @@
     "        - [Identifying mutation networks using SAMBAR](netZooPy/sambar_tutorial.ipynb)\n",
     "        \n",
     "        - [DRAGON: Determining Regulatory Associations using Graphical models on multi-Omic Networks](netZooPy/dragon_tutorial.ipynb)\n",
+    "        \n",
+    "        - [Decomposing gene co-expression networks with COBRA](netZooPy/cobra.ipynb)\n",
     "\n",
     "    - Case studies\n",
     "\n",
@@ -192,7 +192,7 @@
    "mimetype": "text/x-r-source",
    "name": "R",
    "pygments_lexer": "r",
-   "version": "4.3.0"
+   "version": "4.2.2"
   }
  },
  "nbformat": 4,
diff --git a/netbooks/netZooPy/cobra.ipynb b/netbooks/netZooPy/cobra.ipynb
@@ -2,7 +2,6 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "ff46b205",
    "metadata": {},
    "source": [
     "# Decomposing gene co-expression networks with COBRA (Python version)\n",
@@ -13,7 +12,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "71d4f5e4",
    "metadata": {},
    "source": [
     "## 1. Introduction\n",
@@ -23,31 +21,40 @@
     "\n",
     "COBRA is now part of the [netZooPy package](https://github.com/netZoo/netZooPy). Please follow the installation guidelines in the [README](https://github.com/netZoo/netZooPy/blob/master/README.md). If you need help or have any questions about netZoo, feel free to start with [discussions](https://github.com/netZoo/netZooPy/discussions). To report a bug, please open a new [issue](https://github.com/netZoo/netZooPy/issues). \n",
     "\n",
-    "To illustrate how to use COBRA for different tasks, we import thyroid carcinoma (THCA) data from the TCGA project <sup>1</sup>. "
+    "To illustrate how to use COBRA for different tasks, we import thyroid carcinoma (THCA) data from the TCGA project <sup>1</sup>. \n",
+    "\n",
+    "This vignette can be run on the netbooks server or locally by setting the `runserver` parameter."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "runserver=1"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "On the server, we need to change the working directory to the `data` folder of the current user."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
-   "id": "69f50403",
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "/home/soel/Desktop/netbooks/netbooks\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
-    "cd .."
+    "if runserver==1:\n",
+    "    ppath='/opt/data/netZooPy/cobra/'"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
-   "id": "4d8dd9bf",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -58,43 +65,29 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
-   "id": "27ba906e",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
-    "gene_expression = pd.read_csv(\"data/gene_expression_thca.csv\", index_col = 0).to_numpy()\n",
-    "metadata = pd.read_csv(\"data/thca_metadata.csv\", index_col = 0)\n",
+    "gene_expression = pd.read_csv(ppath+\"gene_expression_thca.csv\", index_col = 0).to_numpy()\n",
+    "metadata = pd.read_csv(ppath+\"thca_metadata.csv\", index_col = 0)\n",
     "batch = metadata['batch'].to_numpy()\n",
     "cancer = metadata['status'].to_numpy()\n",
     "sex = metadata['sex'].to_numpy()"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "1b97bf47",
    "metadata": {},
    "source": [
     "Here gene_expression is a gene expression matrix for 19711 genes and 572 samples. Batch, cancer, and sex are sample-specific metadata as vectors of length 572."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
-   "id": "62cfa651",
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Gene expression shape = (19711, 572)\n",
-      "Batch vector length = 572\n",
-      "Cancer vector length = 572\n",
-      "Sex vector length = 572\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "print(\"Gene expression shape = \" + str(gene_expression.shape))\n",
     "print(\"Batch vector length = \" + str(len(batch)))\n",
@@ -104,7 +97,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "71d4f89a",
    "metadata": {},
    "source": [
     "## 2. Applications of COBRA\n",
@@ -121,37 +113,23 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
-   "id": "d9bcc699",
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "17"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "len(np.unique(batch))"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "d9dd368e",
    "metadata": {},
    "source": [
     "For batch correction, the design matrix must contain an intercept in the first column, and the batches (encoded using dummy coding for identifiability) in the remaining columns. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
-   "id": "9b1c0f8b",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -162,45 +140,30 @@
   },
   {
    "cell_type": "markdown",
-   "id": "32c6fb4f",
    "metadata": {},
    "source": [
     "We get a design matrix with 17 covariates (an intercept and 16 for the dummy coding) for the 572 samples in our study. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
-   "id": "672634c4",
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "(572, 17)"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
    "source": [
     "X.shape"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "711ee90f",
    "metadata": {},
    "source": [
     "We are now ready to fit COBRA"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
-   "id": "9d6ebeb7",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -209,16 +172,14 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fafd4006",
    "metadata": {},
    "source": [
     "The batch-corrected network considers only the mean effect after removing the contribution of the batch variables. It is computed as follows. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
-   "id": "d9db88a2",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -227,7 +188,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7d007b0f",
    "metadata": {},
    "source": [
     "### 3.2 Differential co-expression analysis\n",
@@ -236,8 +196,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
-   "id": "cc2e923c",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -246,16 +205,14 @@
   },
   {
    "cell_type": "markdown",
-   "id": "88330e19",
    "metadata": {},
    "source": [
     "In this case, the design matrix contains an intercept and a second column with an indicator for cancer/healthy. The additional columns are for the variables we want to adjust for. As before, we consider the batch variable. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
-   "id": "2659f530",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -267,16 +224,14 @@
   },
   {
    "cell_type": "markdown",
-   "id": "1a9b9bd4",
    "metadata": {},
    "source": [
     "We are now ready to fit COBRA and extract the component corresponding to the differential co-expression. Since the indicator variable for cancer is the second column in our design matrix, the COBRA-adjusted differential co-expression network corresponds to the second component of COBRA's decomposition. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
-   "id": "a03f1d0e",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -286,7 +241,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "ea5f0359",
    "metadata": {},
    "source": [
     "### 3.3 Identifying the component for a covariate of interest\n",
@@ -296,8 +250,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
-   "id": "63764747",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -308,16 +261,14 @@
   },
   {
    "cell_type": "markdown",
-   "id": "55d09ce6",
    "metadata": {},
    "source": [
     "With this design, the last component of COBRA's decomposition describes the sex differences in cancer between males and females. "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
-   "id": "e46405d0",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -327,7 +278,6 @@
   },
   {
    "cell_type": "markdown",
-   "id": "afdf8a4d",
    "metadata": {},
    "source": [
     "## Reference\n",
@@ -338,7 +288,7 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3",
    "language": "python",
    "name": "python3"
   },
@@ -352,7 +302,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.7"
   }
  },
  "nbformat": 4,
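
Note on the `runserver` switch in netZooPy/cobra.ipynb: as committed, `ppath` is assigned only when `runserver==1`, so a local run would reach the `pd.read_csv` calls with `ppath` undefined. The sketch below shows one possible way to cover both cases. The server path is the one used in the notebook; the helper name `resolve_data_dir` and the local default `data/` (borrowed from the relative paths this commit removes) are assumptions for illustration, not part of netZooPy.

```python
import os

# Server-side data folder, as set in the notebook when runserver == 1.
SERVER_DATA_DIR = "/opt/data/netZooPy/cobra/"
# Assumption: a local checkout keeps the CSV files in a relative "data/" folder.
LOCAL_DATA_DIR = "data/"


def resolve_data_dir(runserver, local_dir=LOCAL_DATA_DIR):
    """Return the folder holding the COBRA input files (hypothetical helper)."""
    return SERVER_DATA_DIR if runserver == 1 else local_dir


# Local run: ppath is always defined, whatever the value of runserver.
ppath = resolve_data_dir(runserver=0)
expression_file = os.path.join(ppath, "gene_expression_thca.csv")
```

With this in place, the loading cell could pass `expression_file` to `pd.read_csv` unchanged on the server and locally.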