malaysia-ai
diff --git a/accuracy/models-accuracy.ipynb b/accuracy/models-accuracy.ipynb
Lines changed: 85 additions & 553 deletions
diff --git a/docs/generate b/docs/generate
Lines changed: 1 addition & 0 deletions
diff --git a/docs/index.rst b/docs/index.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/load-dependency.ipynb b/docs/load-dependency.ipynb
Lines changed: 11 additions & 2 deletions
diff --git a/docs/load-emotion.ipynb b/docs/load-emotion.ipynb
Lines changed: 57 additions & 13 deletions
diff --git a/docs/load-entities.ipynb b/docs/load-entities.ipynb
Lines changed: 9 additions & 14 deletions
diff --git a/docs/load-language-detection.ipynb b/docs/load-language-detection.ipynb
Lines changed: 14 additions & 19 deletions
diff --git a/docs/load-pos.ipynb b/docs/load-pos.ipynb
Lines changed: 11 additions & 11 deletions
@@ -1,5 +1,6 @@
 #!/bin/bash
 
 cp ../README.rst .
+cp ../accuracy/models-accuracy.ipynb .
 rm -rf _build/html && make html
 find . -name "*Magic*"  -exec rm  -rf {} \;
@@ -20,6 +20,7 @@ Contents:
    Dataset
    Installation
    load-cache
+   models-accuracy
    running-on-windows
    Api
    Contributing
 
@@ -38,8 +38,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 5.15 s, sys: 925 ms, total: 6.07 s\n",
-      "Wall time: 6.8 s\n"
+      "CPU times: user 6.15 s, sys: 1.31 s, total: 7.46 s\n",
+      "Wall time: 9.21 s\n"
      ]
     }
    ],
@@ -48,6 +48,15 @@
     "import malaya"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Models accuracy\n",
+    "\n",
+    "We use `sklearn.metrics.classification_report` for accuracy reporting, check at https://malaya.readthedocs.io/en/latest/models-accuracy.html#dependency-parsing"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
 
@@ -38,8 +38,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 6 s, sys: 1.23 s, total: 7.23 s\n",
-      "Wall time: 8.33 s\n"
+      "CPU times: user 6.19 s, sys: 1.27 s, total: 7.46 s\n",
+      "Wall time: 9.01 s\n"
      ]
     }
    ],
@@ -48,6 +48,53 @@
     "import malaya"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Models accuracy\n",
+    "\n",
+    "We use `sklearn.metrics.classification_report` for accuracy reporting, check at https://malaya.readthedocs.io/en/latest/models-accuracy.html#emotion-analysis"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### labels supported\n",
+    "\n",
+    "Default labels for emotion module."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['anger', 'fear', 'happy', 'love', 'sadness', 'surprise']"
+      ]
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "malaya.emotion.label"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Example texts\n",
+    "\n",
+    "Copy pasted from random tweets."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 2,
@@ -413,15 +460,6 @@
     "malaya.emotion.available_transformer()"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Make sure you can check accuracy chart from here first before select a model, https://malaya.readthedocs.io/en/latest/Accuracy.html#emotion-analysis\n",
-    "\n",
-    "**You might want to use Tiny-Albert, a very small size, 22.4MB, but the accuracy is still on the top notch.**"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -726,7 +764,11 @@
     "\n",
     "```python\n",
     "def predict_words(\n",
-    "    self, string: str, method: str = 'last', visualization: bool = True\n",
+    "    self,\n",
+    "    string: str,\n",
+    "    method: str = 'last',\n",
+    "    bins_size: float = 0.05,\n",
+    "    visualization: bool = True,\n",
     "):\n",
     "    \"\"\"\n",
     "    classify words.\n",
@@ -740,12 +782,14 @@
     "        * ``'last'`` - attention from last layer.\n",
     "        * ``'first'`` - attention from first layer.\n",
     "        * ``'mean'`` - average attentions from all layers.\n",
+    "    bins_size: float, optional (default=0.05)\n",
+    "        default bins size for word distribution histogram.\n",
     "    visualization: bool, optional (default=True)\n",
     "        If True, it will open the visualization dashboard.\n",
     "\n",
     "    Returns\n",
     "    -------\n",
-    "    result: dict\n",
+    "    dictionary: results\n",
     "    \"\"\"\n",
     "```\n",
     "\n",
 
@@ -48,6 +48,15 @@
     "import malaya"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Models accuracy\n",
+    "\n",
+    "We use `sklearn.metrics.classification_report` for accuracy reporting, check at https://malaya.readthedocs.io/en/latest/models-accuracy.html#entities-recognition and https://malaya.readthedocs.io/en/latest/models-accuracy.html#entities-recognition-ontonotes5"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -487,13 +496,6 @@
     "malaya.entity.available_transformer()"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Make sure you can check accuracy chart from here first before select a model, https://malaya.readthedocs.io/en/latest/models-accuracy.html#Entities-Recognition"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -643,13 +645,6 @@
     "malaya.entity.available_transformer_ontonotes5()"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Make sure you can check accuracy chart from here first before select a model, https://malaya.readthedocs.io/en/latest/models-accuracy.html#Entities-Recognition-Ontonotes5"
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": 36,
 
@@ -34,26 +34,12 @@
    "execution_count": 1,
    "metadata": {},
    "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_addons/utils/ensure_tf_install.py:68: UserWarning: Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.2.0 and strictly below 2.4.0 (nightly versions are not supported). \n",
-      " The versions of TensorFlow you are currently using is 2.4.1 and is not supported. \n",
-      "Some things might work, some things might not.\n",
-      "If you were to encounter a bug, do not file an issue.\n",
-      "If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the TensorFlow Addons's version. \n",
-      "You can find the compatibility matrix in TensorFlow Addon's readme:\n",
-      "https://github.com/tensorflow/addons\n",
-      "  UserWarning,\n"
-     ]
-    },
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 5.17 s, sys: 990 ms, total: 6.16 s\n",
-      "Wall time: 6.67 s\n"
+      "CPU times: user 5.72 s, sys: 1.14 s, total: 6.87 s\n",
+      "Wall time: 8.29 s\n"
      ]
     }
    ],
@@ -67,7 +53,18 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### List available language detected"
+    "### Models accuracy\n",
+    "\n",
+    "We use `sklearn.metrics.classification_report` for accuracy reporting, check at https://malaya.readthedocs.io/en/latest/models-accuracy.html#language-detection"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### labels supported\n",
+    "\n",
+    "Default labels for language detection module."
    ]
   },
   {
@@ -572,8 +569,6 @@
    "source": [
     "### Load Deep learning model\n",
     "\n",
-    "Deep learning model is slightly more accurate then fast-text model, can check accuracy comparison at here, https://malaya.readthedocs.io/en/latest/Accuracy.html#language-detection\n",
-    "\n",
     "```python\n",
     "def deep_model(quantized: bool = False, **kwargs):\n",
     "    \"\"\"\n",
 
@@ -38,8 +38,8 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "CPU times: user 4.09 s, sys: 556 ms, total: 4.65 s\n",
-      "Wall time: 3.75 s\n"
+      "CPU times: user 5.94 s, sys: 1.17 s, total: 7.11 s\n",
+      "Wall time: 8.41 s\n"
      ]
     }
    ],
@@ -48,6 +48,15 @@
     "import malaya"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Models accuracy\n",
+    "\n",
+    "We use `sklearn.metrics.classification_report` for accuracy reporting, check at https://malaya.readthedocs.io/en/latest/models-accuracy.html#pos-recognition"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -320,15 +329,6 @@
     "malaya.pos.available_transformer()"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Make sure you can check accuracy chart from here first before select a model, https://malaya.readthedocs.io/en/latest/Accuracy.html#pos-recognition\n",
-    "\n",
-    "**You might want to use Tiny-Albert, a very small size, 22.4MB, but the accuracy is still on the top notch.**"
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": 4,