Commit 7f698b5

cleaning readmes
1 parent 02b6a0c commit 7f698b5

2 files changed: +3 -39 lines changed

CHANGE_LOG.txt

Lines changed: 0 additions & 2 deletions
@@ -1,8 +1,6 @@
 Change Log 0.2.0
 -----------------
 
-CLEAN TODO-FILE
-
 - Base code Refactor:
 - Removing coupling between LabelledCollection and quantification methods; the fit interface changes:
 def fit(data:LabelledCollection): -> def fit(X, y):
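
To make the interface change in the change-log entry concrete, here is a minimal sketch of the two calling conventions. `ToyCollection`, `OldStyleQuantifier`, and `NewStyleQuantifier` are hypothetical stand-ins invented for illustration (they are not the library's classes), and the "model" merely memorizes training prevalence; only the shape of `fit` is the point.

```python
import numpy as np
from dataclasses import dataclass


@dataclass
class ToyCollection:
    """Hypothetical stand-in for LabelledCollection: bundles instances and labels."""
    X: np.ndarray
    y: np.ndarray


class OldStyleQuantifier:
    """Old convention: fit receives the bundled collection object."""

    def fit(self, data: ToyCollection):
        self.classes_ = np.unique(data.y)
        # Toy "model": remember the training prevalence of each class.
        self.prevalence_ = np.array([(data.y == c).mean() for c in self.classes_])
        return self


class NewStyleQuantifier:
    """New convention: fit receives plain arrays, decoupled from the collection type."""

    def fit(self, X, y):
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.prevalence_ = np.array([(y == c).mean() for c in self.classes_])
        return self


X = np.arange(8).reshape(-1, 1)
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])

old = OldStyleQuantifier().fit(ToyCollection(X, y))  # fit(data)
new = NewStyleQuantifier().fit(X, y)                 # fit(X, y)
```

With `fit(X, y)`, callers that already hold arrays no longer need to wrap them in a collection object before training.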

TODO.txt

Lines changed: 3 additions & 37 deletions
@@ -1,53 +1,19 @@
-Adapt examples; remaining: example 4-onwards
-not working: 15 (qunfold)
-
 Solve the warnings issue; right now there is a warning ignore in method/__init__.py:
 
 Add 'platt' to calib options in EMQ?
 
 Allow n_prevpoints in APP to be specified by a user-defined grid?
 
-Update READMEs, wiki, & examples for new fit-predict interface
-
-Add the fix suggested by Alexander:
-
-For a more general application, I would maybe first establish a per-class threshold value of plausible prevalence
+Add the fix suggested by Alexander?
+"For a more general application, I would maybe first establish a per-class threshold value of plausible prevalence
 based on the number of actual positives and the required sample size; e.g., for sample_size=100 and actual
 positives [10, 100, 500] -> [0.1, 1.0, 1.0], meaning that class 0 can be sampled at most at 0.1 prevalence, while
 the others can be sampled up to 1. prevalence. Then, when a prevalence value is requested, e.g., [0.33, 0.33, 0.33],
 we may either clip each value and normalize (as you suggest for the extreme case, e.g., [0.1, 0.33, 0.33]/sum) or
-scale each value by per-class thresholds, i.e., [0.33*0.1, 0.33*1, 0.33*1]/sum.
+scale each value by per-class thresholds, i.e., [0.33*0.1, 0.33*1, 0.33*1]/sum."
 - This affects LabelledCollection
 - This functionality should be accessible via sampling protocols and evaluation functions
 
-Solve the pre-trained classifier issues. An example is the coptic-codes script I did, which needed a mock_lr to
-work for having access to classes_; think also the case in which the precomputed outputs are already generated
-as in the unifying problems code.
-
-To remove LabelledCollection from the methods:
-
-- The mess comes from the confusing semantics of fit in aggregative methods, which takes 3 parameters:
-- data: LabelledCollection, which can be:
-- the training set, if the classifier has to be trained
-- None, if the classifier does not have to be trained
-- the validation set, which conflicts with val_split, if the classifier does not have to be trained
-- fit_classifier: says whether the classifier has to be trained or not, and this changes the semantics of the other parameters
-- val_split: which can be:
-- a number: the number of folds for kfcv, which implies fit_classifier=True and data=the whole training set
-- a fraction in [0,1]: the part used for validation; implies fit_classifier=True and data=train+val
-- a labelled collection: the specific validation set; implies neither fit_classifier=True nor False
-- The way to remove the methods' dependency on LabelledCollection should be as follows:
-- The constructor states whether the classifier received as a parameter has to be trained or is already trained;
-that is, there is a fit_classifier=True or False.
-- fit_classifier=True:
-- data in fit is the whole training set, including the validation data
-- val_split:
-- int: number of folds in kfcv
-- proportion in [0,1]
-- fit_classifier=False:
-
-
-
 - [TODO] document confidence in manuals
 - [TODO] Test the return_type="index" in protocols and finish the "distributing_samples.py" example
 - [TODO] Add EDy (an implementation is available at quantificationlib)
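
The fix quoted in TODO.txt (per-class prevalence thresholds, then either clipping or scaling the requested prevalence) can be sketched as follows. This is only an illustration of the arithmetic in the quote: the function names `max_prevalence`, `clip_and_normalize`, and `scale_and_normalize` are hypothetical and not part of the library, and the threshold is assumed to be `min(count / sample_size, 1)`, which reproduces the quoted `[10, 100, 500] -> [0.1, 1.0, 1.0]`.

```python
import numpy as np


def max_prevalence(class_counts, sample_size):
    """Per-class upper bound on prevalence (assumed: count / sample_size, capped at 1)."""
    counts = np.asarray(class_counts, dtype=float)
    return np.minimum(counts / sample_size, 1.0)


def clip_and_normalize(requested, thresholds):
    """Option 1 from the quote: clip each requested value to its threshold, then renormalize."""
    clipped = np.minimum(np.asarray(requested, dtype=float), thresholds)
    return clipped / clipped.sum()


def scale_and_normalize(requested, thresholds):
    """Option 2 from the quote: multiply each requested value by its threshold, then renormalize."""
    scaled = np.asarray(requested, dtype=float) * thresholds
    return scaled / scaled.sum()


thresholds = max_prevalence([10, 100, 500], sample_size=100)
requested = [0.33, 0.33, 0.33]

clipped = clip_and_normalize(requested, thresholds)  # [0.1, 0.33, 0.33] / 0.76
scaled = scale_and_normalize(requested, thresholds)  # [0.033, 0.33, 0.33] / 0.693
```

In this example, renormalizing after clipping lifts class 0 back above its 0.1 threshold (0.1 / 0.76 is roughly 0.13), while the scaled variant stays below it (roughly 0.05); that difference may matter when choosing between the two options.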
