Commit fd185bd

merged from devel
2 parents d556910 + 8a39222


61 files changed: +3096 −1244 lines

.github/workflows/ci.yml

Lines changed: 38 additions & 2 deletions
@@ -6,6 +6,8 @@ on:
   branches:
     - master
     - devel
+  tags:
+    - "[0-9]+.[0-9]+.[0-9]+"
 
 jobs:

@@ -28,7 +30,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install "qunfold @ git+https://github.com/mirkobunse/qunfold@v0.1.4"
+          python -m pip install "qunfold @ git+https://github.com/mirkobunse/qunfold@main"
           python -m pip install -e .[bayes,tests]
       - name: Test with unittest
         run: python -m unittest
@@ -47,7 +49,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip setuptools wheel "jax[cpu]"
-          python -m pip install "qunfold @ git+https://github.com/mirkobunse/qunfold@v0.1.4"
+          python -m pip install "qunfold @ git+https://github.com/mirkobunse/qunfold@main"
           python -m pip install -e .[neural,docs]
       - name: Build documentation
         run: sphinx-build -M html docs/source docs/build
@@ -66,3 +68,37 @@ jobs:
           branch: gh-pages
           directory: __gh-pages/
           github_token: ${{ secrets.GITHUB_TOKEN }}
+
+  release:
+    name: Build & Publish Release
+    runs-on: ubuntu-latest
+    if: startsWith(github.ref, 'refs/tags/')
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: Install build dependencies
+        run: |
+          python -m pip install --upgrade pip build twine
+      - name: Build package
+        run: python -m build
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          user: __token__
+          password: ${{ secrets.PYPI_API_TOKEN }}
+      - name: Create GitHub Release
+        id: create_release
+        uses: actions/create-release@v1
+        with:
+          tag_name: ${{ github.ref_name }}
+          release_name: Release ${{ github.ref_name }}
+          body: |
+            Changes in this release:
+            - see commit history for details
+          draft: false
+          prerelease: false
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
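The new release job only fires on tags matching the numeric pattern in the `on: tags:` filter. A minimal local sketch (a hypothetical helper, not part of the repository; grep's regex is used here as an approximation of GitHub's filter syntax) of checking a tag name before pushing it:

```shell
# Check that a tag looks like X.Y.Z before pushing it, since only such tags
# trigger the Build & Publish Release job (hypothetical local check).
tag="0.2.0"
if echo "$tag" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
  echo "valid: $tag"      # safe to run: git push origin "$tag"
else
  echo "invalid: $tag"
fi
```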

CHANGE_LOG.txt

Lines changed: 35 additions & 0 deletions
@@ -1,3 +1,38 @@
+Change Log 0.2.0
+----------------
+
+CLEAN TODO-FILE
+
+- Base code refactor:
+    - Removed the coupling between LabelledCollection and the quantification methods; the fit interface changes:
+          def fit(data: LabelledCollection): -> def fit(X, y):
+    - Added the function "predict" (the function "quantify" is still present as an alias, for the nostalgic).
+    - The behavior of aggregative methods in terms of fit_classifier, and how val_split is treated, is now
+      indicated exclusively at construction time; it is no longer possible to indicate it at fit time.
+      This is because, in v<=0.1.9, one could create a method (e.g., ACC) and then indicate:
+          my_acc.fit(tr_data, fit_classifier=False, val_split=val_data)
+      in which case the first argument is unused, and this was ambiguous with
+          my_acc.fit(the_data, fit_classifier=False)
+      in which case the_data is to be used for validation purposes. However, val_split could also be set as a
+      fraction, indicating that only part of the_data was to be used for validation, and the rest wasted... it was certainly confusing.
+    - This change imposes a versioning constraint on qunfold, which now must be >= 0.1.6.
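The new fit(X, y) / predict(X) interface can be sketched with a toy "Classify & Count" quantifier. This is an illustrative sketch only: ToyCC and MajorityClassifier are hypothetical names, not QuaPy's actual implementation.

```python
# Toy sketch of the decoupled fit(X, y) / predict(X) interface (illustrative
# only; ToyCC and MajorityClassifier are hypothetical, not part of QuaPy).
from collections import Counter

class MajorityClassifier:
    """Trivial stand-in for any classifier exposing fit/predict."""
    def fit(self, X, y):
        self.majority = Counter(y).most_common(1)[0][0]
        return self
    def predict(self, X):
        return [self.majority] * len(X)

class ToyCC:
    """Classify & Count taking plain (X, y), not a LabelledCollection."""
    def __init__(self, classifier):
        self.classifier = classifier
    def fit(self, X, y):                 # new interface: def fit(X, y)
        self.classifier.fit(X, y)
        self.classes_ = sorted(set(y))
        return self
    def predict(self, X):                # renamed from "quantify"
        counts = Counter(self.classifier.predict(X))
        return [counts.get(c, 0) / len(X) for c in self.classes_]
    quantify = predict                   # alias kept for the nostalgic

model = ToyCC(MajorityClassifier()).fit([[0], [1], [1]], [0, 1, 1])
print(model.predict([[0], [1]]))  # majority class is 1 -> [0.0, 1.0]
```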
+- EMQ has been modified so that the representation function "classify" now only provides posterior
+  probabilities and, if required, these are recalibrated (e.g., by "bcts") during the aggregation function.
+    - A new parameter "on_calib_error" is passed to the constructor, indicating the policy to follow
+      in case the abstention calibration functions fail (which happens sometimes). Options include:
+        - 'raise': raises a RuntimeException (default)
+        - 'backup': reruns, silently avoiding calibration
+    - The parameter "recalib" has been renamed "calib".
+- Added aggregative bootstrap for deriving confidence regions (confidence intervals, ellipses in the simplex, or
+  ellipses in the CLR space). This method is efficient because it leverages the two phases of aggregative quantifiers:
+  resampling is applied only to the aggregation phase, thus avoiding training many quantifiers or
+  classifying the instances of a sample multiple times. See:
+    - quapy/method/confidence.py (new)
+    - the new example no. 16.confidence_regions.py
+- BayesianCC moved to confidence.py, where methods having to do with confidence intervals belong.
+- Improved documentation of the qp.plot module.
+
+
 Change Log 0.1.9
 ----------------
 
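Why the aggregative bootstrap is cheap can be illustrated with a minimal sketch (hypothetical helper, not QuaPy's code): the classifier runs once over the sample, and only the inexpensive aggregation step is repeated over resampled classifier outputs.

```python
# Minimal sketch of aggregative bootstrap (hypothetical helper, not QuaPy's
# implementation): resample the precomputed predictions, not the classifier.
import random

def bootstrap_prevalences(predictions, classes, n_trials=100, seed=0):
    """Re-aggregate bootstrap resamples of the (fixed) classifier outputs."""
    rng = random.Random(seed)
    n = len(predictions)
    estimates = []
    for _ in range(n_trials):
        resample = [predictions[rng.randrange(n)] for _ in range(n)]
        estimates.append([resample.count(c) / n for c in classes])
    return estimates

# The spread of these 100 prevalence estimates yields a confidence region
# without retraining quantifiers or re-classifying the sample.
ests = bootstrap_prevalences([0, 1, 1, 0, 1], classes=[0, 1])
```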

README.md

Lines changed: 19 additions & 19 deletions
@@ -13,8 +13,8 @@ for facilitating the analysis and interpretation of the experimental results.
 
 ### Last updates:
 
-* Version 0.1.9 is released! Major changes can be consulted [here](CHANGE_LOG.txt).
-* The developer API documentation is available [here](https://hlt-isti.github.io/QuaPy/index.html)
+* Version 0.2.0 is released! Major changes can be consulted [here](CHANGE_LOG.txt).
+* The developer API documentation is available [here](https://hlt-isti.github.io/QuaPy/build/html/modules.html)
 
 ### Installation
 
@@ -46,15 +46,15 @@ of the test set.
 ```python
 import quapy as qp
 
-dataset = qp.datasets.fetch_UCIBinaryDataset("yeast")
-training, test = dataset.train_test
+training, test = qp.datasets.fetch_UCIBinaryDataset("yeast").train_test
 
 # create an "Adjusted Classify & Count" quantifier
 model = qp.method.aggregative.ACC()
-model.fit(training)
+Xtr, ytr = training.Xy
+model.fit(Xtr, ytr)
 
-estim_prevalence = model.quantify(test.X)
-true_prevalence = test.prevalence()
+estim_prevalence = model.predict(test.X)
+true_prevalence = test.prevalence()
 
 error = qp.error.mae(true_prevalence, estim_prevalence)
 print(f'Mean Absolute Error (MAE)={error:.3f}')

@@ -67,8 +67,7 @@ class prevalence of the training set. For this reason, any quantification model
 should be tested across many samples, even ones characterized by class prevalence
 values different or very different from those found in the training set.
 QuaPy implements sampling procedures and evaluation protocols that automate this workflow.
-See the [documentation](https://hlt-isti.github.io/QuaPy/manuals/protocols.html)
-and the [examples directory](https://github.com/HLT-ISTI/QuaPy/tree/master/examples) for detailed examples.
+See the [documentation](https://hlt-isti.github.io/QuaPy/build/html/) for detailed examples.
 
 ## Features
 
@@ -80,8 +79,8 @@ quantification methods based on structured output learning, HDy, QuaNet, quantif
 * 32 UCI Machine Learning datasets.
 * 11 Twitter quantification-by-sentiment datasets.
 * 3 product reviews quantification-by-sentiment datasets.
-* 4 tasks from LeQua 2022 competition
-* 4 tasks from LeQua 2024 competition (_new in v0.1.9!_)
+* 4 tasks from LeQua 2022 competition and 4 tasks from LeQua 2024 competition
+* IFCB for Plankton quantification
 * Native support for binary and single-label multiclass quantification scenarios.
 * Model selection functionality that minimizes quantification-oriented loss functions.
 * Visualization tools for analysing the experimental results.

@@ -102,22 +101,23 @@ In case you want to contribute improvements to quapy, please generate pull reque
 
 ## Documentation
 
-The developer API documentation is available [here](https://hlt-isti.github.io/QuaPy/).
+The [developer API documentation](https://hlt-isti.github.io/QuaPy/build/html/modules.html) is available [here](https://hlt-isti.github.io/QuaPy/build/html/index.html).
 
-Check out our [Manuals](https://hlt-isti.github.io/QuaPy/manuals.html), in which many examples
+Check out the [Manuals](https://hlt-isti.github.io/QuaPy/manuals.html), in which many code examples
 are provided:
 
 * [Datasets](https://hlt-isti.github.io/QuaPy/manuals/datasets.html)
 * [Evaluation](https://hlt-isti.github.io/QuaPy/manuals/evaluation.html)
-* [Explicit loss minimization](https://hlt-isti.github.io/QuaPy/manuals/explicit-loss-minimization.html)
+* [Protocols](https://hlt-isti.github.io/QuaPy/manuals/protocols.html)
 * [Methods](https://hlt-isti.github.io/QuaPy/manuals/methods.html)
-* [Model Selection](https://hlt-isti.github.io/QuaPy/manuals/datasets.html)
+* [SVMperf](https://hlt-isti.github.io/QuaPy/manuals/explicit-loss-minimization.html)
+* [Model Selection](https://hlt-isti.github.io/QuaPy/manuals/model-selection.html)
 * [Plotting](https://hlt-isti.github.io/QuaPy/manuals/plotting.html)
-* [Protocols](https://hlt-isti.github.io/QuaPy/manuals/protocols.html)
 
 ## Acknowledgments:
 
-This work has been funded by the QuaDaSh project (P2022TB5JF) "Finanziato dall’Unione europea- Next Generation EU, Missione 4 Componente 2 CUP B53D23026250001".
-
-<img src="docs/source/EUfooter.png" alt="EUcommission" width="1000"/>
 <img src="docs/source/SoBigData.png" alt="SoBigData++" width="250"/>
+
+This work has been supported by the QuaDaSh project
+_"Finanziato dall’Unione europea---Next Generation EU,
+Missione 4 Componente 2 CUP B53D23026250001"_.
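The error computed at the end of the updated README snippet can be reproduced with a self-contained sketch: a plain-Python re-implementation of the mean absolute error between prevalence vectors (`qp.error.mae` is the actual API; the helper below only mirrors its arithmetic).

```python
# Self-contained sketch of mean absolute error between prevalence vectors,
# mirroring what qp.error.mae computes in the README example.
def mae(true_prev, estim_prev):
    return sum(abs(t - e) for t, e in zip(true_prev, estim_prev)) / len(true_prev)

error = mae([0.7, 0.3], [0.6, 0.4])
print(f'Mean Absolute Error (MAE)={error:.3f}')  # -> MAE=0.100
```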

TODO.txt

Lines changed: 54 additions & 0 deletions
@@ -1,6 +1,60 @@
+Adapt examples; remaining: example 4 onwards
+    not working: 15 (qunfold)
+
+Solve the warnings issue; right now there is a warning ignore in method/__init__.py
+
+Add 'platt' to the calib options in EMQ?
+
+Allow n_prevpoints in APP to be specified by a user-defined grid?
+
+Update READMEs, wiki, & examples for the new fit-predict interface
+
+Add the fix suggested by Alexander:
+
+    For a more general application, I would maybe first establish a per-class threshold value of plausible prevalence
+    based on the number of actual positives and the required sample size; e.g., for sample_size=100 and actual
+    positives [10, 100, 500] -> [0.1, 1.0, 1.0], meaning that class 0 can be sampled at most at 0.1 prevalence, while
+    the others can be sampled up to 1.0 prevalence. Then, when a prevalence value is requested, e.g., [0.33, 0.33, 0.33],
+    we may either clip each value and normalize (as you suggest for the extreme case, e.g., [0.1, 0.33, 0.33]/sum) or
+    scale each value by the per-class thresholds, i.e., [0.33*0.1, 0.33*1, 0.33*1]/sum.
+    - This affects LabelledCollection
+    - This functionality should be accessible via sampling protocols and evaluation functions
+
+Solve the pre-trained classifier issues. An example is the coptic-codes script I did, which needed a mock_lr
+to work in order to have access to classes_; think also of the case in which the precomputed outputs are
+already generated, as in the unifying-problems code.
+
+To remove LabelledCollection from the methods:
+
+- The mess comes from the confusing semantics of fit in aggregative methods, which receives 3 parameters:
+    - data: LabelledCollection, which can be:
+        - the training set, if the classifier must be trained
+        - None, if the classifier does not need to be trained
+        - the validation set (which conflicts with val_split), if the classifier does not need to be trained
+    - fit_classifier: says whether the classifier must be trained or not, and this changes the semantics of the other parameters
+    - val_split: which can be:
+        - a number: the number of k-fold cross-validation folds, which implies fit_classifier=True and data=the whole training set
+        - a fraction in [0,1]: the portion used for validation; implies fit_classifier=True and data=train+val
+        - a LabelledCollection: the specific validation set; implies neither fit_classifier=True nor False
+- The way to remove the methods' dependency on LabelledCollection should be:
+    - The constructor indicates whether the classifier received as a parameter must be trained or is already trained;
+      that is, there is a fit_classifier=True or False.
+    - fit_classifier=True:
+        - data in fit is the whole training set, including validation
+        - val_split:
+            - int: number of folds in k-fold cross-validation
+            - proportion in [0,1]
+    - fit_classifier=False:
+
+
+- [TODO] document confidence in the manuals
+- [TODO] test return_type="index" in protocols and finish the "distributing_samples.py" example
+- [TODO] add EDy (an implementation is available at quantificationlib)
 - [TODO] add ensemble methods SC-MQ, MC-SQ, MC-MQ
 - [TODO] add HistNetQ
 - [TODO] add CDE-iteration and Bayes-CDE methods
 - [TODO] add Friedman's method and DeBias
 - [TODO] check ignore warning stuff
     check https://docs.python.org/3/library/warnings.html#temporarily-suppressing-warnings
+- [TODO] nmd and md are not selectable from qp.evaluation.evaluate as a string
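The clip-and-normalize policy Alexander suggests in the TODO above can be sketched as follows (a hypothetical helper with assumed names; nothing like it is in QuaPy yet):

```python
# Sketch of the clip-and-normalize policy from the TODO: cap each requested
# class prevalence by what the available positives support at the given
# sample size, then renormalize so the vector sums to 1. Hypothetical helper.
def clip_and_normalize(requested, actual_positives, sample_size):
    # per-class maximum plausible prevalence, e.g. positives [10, 100, 500]
    # with sample_size=100 -> caps [0.1, 1.0, 1.0]
    caps = [min(n / sample_size, 1.0) for n in actual_positives]
    clipped = [min(p, c) for p, c in zip(requested, caps)]
    total = sum(clipped)
    return [p / total for p in clipped]

# Requesting [0.33, 0.33, 0.33] yields a renormalized vector whose first
# component is capped by the scarce class-0 positives.
prev = clip_and_normalize([0.33, 0.33, 0.33], [10, 100, 500], 100)
```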
