
Commit

Fix typos in the doc
e10e3 committed Jan 31, 2025
1 parent f85f12c commit e6deef7
Showing 13 changed files with 18 additions and 18 deletions.
4 changes: 2 additions & 2 deletions docs/examples/batch-to-online.ipynb
@@ -60,7 +60,7 @@
" ('lin_reg', linear_model.LogisticRegression(solver='lbfgs'))\n",
"])\n",
"\n",
"# Define a determistic cross-validation procedure\n",
"# Define a deterministic cross-validation procedure\n",
"cv = model_selection.KFold(n_splits=5, shuffle=True, random_state=42)\n",
"\n",
"# Compute the MSE values\n",
@@ -356,7 +356,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The results seem to be exactly the same! The twist is that the running statistics won't be very accurate for the first few observations. In general though this doesn't matter too much. Some would even go as far as to say that this descrepancy is beneficial and acts as some sort of regularization...\n",
"The results seem to be exactly the same! The twist is that the running statistics won't be very accurate for the first few observations. In general though this doesn't matter too much. Some would even go as far as to say that this discrepancy is beneficial and acts as some sort of regularization...\n",
"\n",
"Now the idea is that we can compute the running statistics of each feature and scale them as they come along. The way to do this with River is to use the `StandardScaler` class from the `preprocessing` module, as so:"
]
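To make the excerpt above concrete, here is a minimal sketch of scaling a stream with running statistics, assuming River's `preprocessing.StandardScaler` API:

```python
from river import preprocessing

# Update running statistics and scale each sample as it arrives.
scaler = preprocessing.StandardScaler()

for x in [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]:
    scaler.learn_one(x)             # update the running mean and variance
    print(scaler.transform_one(x))  # scale using the statistics seen so far
```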
2 changes: 1 addition & 1 deletion docs/examples/building-a-simple-nowcasting-model.ipynb
@@ -446,7 +446,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We've managed to get a good looking prediction curve with a reasonably simple model. What's more our model has the advantage of being interpretable and easy to debug. There surely are more rocks to squeeze (e.g. tune the hyperparameters, use an ensemble model, etc.) but we'll leave that as an exercice to the reader.\n",
"We've managed to get a good looking prediction curve with a reasonably simple model. What's more our model has the advantage of being interpretable and easy to debug. There surely are more rocks to squeeze (e.g. tune the hyperparameters, use an ensemble model, etc.) but we'll leave that as an exercise to the reader.\n",
"\n",
"As a finishing touch we'll rewrite our pipeline using the `|` operator, which is called a \"pipe\"."
]
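For reference, a sketch of the `|` operator mentioned in this cell — it composes steps into a pipeline, equivalent to building one explicitly (the estimators here are illustrative):

```python
from river import compose, linear_model, preprocessing

# The `|` ("pipe") operator chains estimators into a compose.Pipeline.
model = preprocessing.StandardScaler() | linear_model.LinearRegression()
assert isinstance(model, compose.Pipeline)
```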
2 changes: 1 addition & 1 deletion docs/examples/content-personalization.ipynb
@@ -319,7 +319,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"A good recommender model should at the very least understand what kind of items each user prefers. One of the simplest and yet performant way to do this is Simon Funk's SGD method he developped for the Netflix challenge and wrote about [here](https://sifter.org/simon/journal/20061211.html). It models each user and each item as latent vectors. The dot product of these two vectors is the expected preference of the user for the item."
"A good recommender model should at the very least understand what kind of items each user prefers. One of the simplest and yet performant way to do this is Simon Funk's SGD method he developed for the Netflix challenge and wrote about [here](https://sifter.org/simon/journal/20061211.html). It models each user and each item as latent vectors. The dot product of these two vectors is the expected preference of the user for the item."
]
},
{
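A toy illustration of the latent-vector idea described above (the numbers are made up):

```python
# Hypothetical 3-dimensional latent vectors for one user and one item.
user = [0.2, -0.1, 0.7]
item = [0.5, 0.3, 0.9]

# Their dot product is the expected preference of the user for the item.
preference = sum(u * i for u, i in zip(user, item))
print(preference)  # 0.1 - 0.03 + 0.63 = 0.7
```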
2 changes: 1 addition & 1 deletion docs/examples/sentence-classification.ipynb
@@ -814,7 +814,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The command below allows you to download the pre-trained embeddings that spaCy makes available. More informations about spaCy and its installation may be found here [here](https://spacy.io/usage)."
"The command below allows you to download the pre-trained embeddings that spaCy makes available. More information about spaCy and its installation may be found here [here](https://spacy.io/usage)."
]
},
{
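As a sketch, downloading and using one of spaCy's pre-trained models typically looks like this (`en_core_web_md` is an assumed model name, not necessarily the one used in the notebook):

```python
# Run once in a shell first: python -m spacy download en_core_web_md
import spacy

nlp = spacy.load("en_core_web_md")
doc = nlp("An example sentence.")
print(doc.vector.shape)  # the sentence vector, averaged from word embeddings
```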
2 changes: 1 addition & 1 deletion docs/faq/index.md
@@ -58,4 +58,4 @@ There are many great open-source libraries for building neural network models. W

## Who are the authors of this library?

-We are research engineers, graduate students, PhDs and machine learning researchers. The members of the develompent team are mainly located in France, Brazil and New Zealand.
+We are research engineers, graduate students, PhDs and machine learning researchers. The members of the development team are mainly located in France, Brazil and New Zealand.
2 changes: 1 addition & 1 deletion docs/introduction/basic-concepts.md
@@ -44,7 +44,7 @@ Dictionaries are therefore a perfect fit. They're native to Python and have exce

In production, you're almost always going to face data streams which you have to react to, such as users visiting your website. The advantage of online machine learning is that you can design models that make predictions as well as learn from this data stream as it flows.

-But of course, when you're developping a model, you don't usually have access to a real-time feed on which to evaluate your model. You usually have an offline dataset which you want to evaluate your model on. River provides some datasets which can be read in online manner, one sample at a time. It is however crucial to keep in mind that the goal is to reproduce a production scenario as closely as possible, in order to ensure your model will perform just as well in production.
+But of course, when you're developing a model, you don't usually have access to a real-time feed on which to evaluate your model. You usually have an offline dataset which you want to evaluate your model on. River provides some datasets which can be read in an online manner, one sample at a time. It is however crucial to keep in mind that the goal is to reproduce a production scenario as closely as possible, in order to ensure your model will perform just as well in production.
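For instance, a minimal sketch of streaming over one of River's bundled datasets (`datasets.Phishing` is one such dataset):

```python
from river import datasets

# Each iteration yields one sample: a dict of features and its label.
for x, y in datasets.Phishing():
    print(x, y)
    break
```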

## Model evaluation

@@ -179,7 +179,7 @@
}
},
"source": [
"We see that `ADWIN` successfully indicates the presence of drift (red vertical lines) close to the begining of a new data distribution.\n",
"We see that `ADWIN` successfully indicates the presence of drift (red vertical lines) close to the beginning of a new data distribution.\n",
"\n",
"\n",
"---\n",
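A minimal sketch of the usage this cell describes, assuming the current `drift.ADWIN` API where values are fed one by one and drifts are read off the `drift_detected` property:

```python
import random

from river import drift

random.seed(42)
adwin = drift.ADWIN()

# A synthetic stream whose mean jumps halfway through.
stream = [random.gauss(0, 1) for _ in range(500)] + [random.gauss(5, 1) for _ in range(500)]

for i, val in enumerate(stream):
    adwin.update(val)
    if adwin.drift_detected:
        print(f"drift detected around index {i}")
```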
2 changes: 1 addition & 1 deletion docs/recipes/active-learning.ipynb
@@ -196,7 +196,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Active learning is primarly used to label data in an efficient manner. However, in an online setting, active learning can also be used simply to speed up training. The point is that you can achieve a very good performance without training on an entire dataset. Active learning is a powerful way to decide which samples to train on."
"Active learning is primarily used to label data in an efficient manner. However, in an online setting, active learning can also be used simply to speed up training. The point is that you can achieve a very good performance without training on an entire dataset. Active learning is a powerful way to decide which samples to train on."
]
},
{
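As a sketch of the pattern this recipe builds on — only ask for labels when the model is uncertain — assuming River's `active.EntropySampler` wrapper (check the current API for the exact signature):

```python
from river import active, datasets, linear_model

model = active.EntropySampler(linear_model.LogisticRegression())

n_labels_used = 0
for x, y in datasets.Phishing():
    y_pred, ask = model.predict_one(x)  # `ask` says whether to request the label
    if ask:
        n_labels_used += 1
        model.learn_one(x, y)

print(n_labels_used)  # typically far fewer than the dataset size
```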
6 changes: 3 additions & 3 deletions docs/recipes/cloning-and-mutating.ipynb
@@ -13,7 +13,7 @@
"source": [
"Sometimes you might want to reset a model, or edit (what we call mutate) its attributes. This can be useful in an online environment. Indeed, if you detect a drift, then you might want to mutate a model's attributes. Or if you see that a model's performance is plummeting, then you might to reset it to its \"factory settings\".\n",
"\n",
"Anyway, this is not to convince you, but rather to say that a model's attributes don't have be to set in stone throughout its lifetime. In particular, if you're developping your own model, then you might want to have good tools to do this. This is what this recipe is about."
"Anyway, this is not to convince you, but rather to say that a model's attributes don't have be to set in stone throughout its lifetime. In particular, if you're developing your own model, then you might want to have good tools to do this. This is what this recipe is about."
]
},
{
@@ -332,9 +332,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"All attributes are immutable by default. Under the hood, each model can specify a set of mutable attributes via the `_mutable_attributes` property. In theory this can be overriden. But the general idea is that we will progressively add more and more mutable attributes with time.\n",
"All attributes are immutable by default. Under the hood, each model can specify a set of mutable attributes via the `_mutable_attributes` property. In theory this can be overridden. But the general idea is that we will progressively add more and more mutable attributes with time.\n",
"\n",
"And that concludes this recipe. Arguably, this recipe caters to advanced users, and in particular users who are developping their own models. And yet, one could also argue that modifying parameters of a model on-the-fly is a great tool to have at your disposal when you're doing online machine learning."
"And that concludes this recipe. Arguably, this recipe caters to advanced users, and in particular users who are developing their own models. And yet, one could also argue that modifying parameters of a model on-the-fly is a great tool to have at your disposal when you're doing online machine learning."
]
}
],
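A short sketch of the two tools this recipe covers, assuming River's `clone`/`mutate` API (`l2` is used here as an example of a mutable attribute):

```python
from river import linear_model

model = linear_model.LinearRegression(l2=0.05)

# clone() returns an unfitted copy with the same hyperparameters.
fresh = model.clone()

# mutate() edits whitelisted attributes of the live model, in place.
model.mutate({"l2": 0.1})
print(model.l2)  # 0.1
```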
6 changes: 3 additions & 3 deletions docs/recipes/on-hoeffding-trees.ipynb
@@ -26,7 +26,7 @@
"\n",
"In this guide, we are going to:\n",
"\n",
"1. summarize the differences accross the multiple HT versions available;\n",
"1. summarize the differences across the multiple HT versions available;\n",
"2. learn how to inspect tree models;\n",
"3. learn how to manage the memory usage of HTs;\n",
"4. compare numerical tree splitters and understand their impact on the iDT induction process.\n",
@@ -888,7 +888,7 @@
"- $n$: Number of observations seen so far.\n",
"- $c$: the number of classes.\n",
"- $s$: the number of split points to evaluate (which means that this is a user-given parameter).\n",
"- $h$: the number of histogram bins or hash slots. Tipically, $h \\ll n$.\n",
"- $h$: the number of histogram bins or hash slots. Typically, $h \\ll n$.\n",
"\n",
"### 4.1. Classification tree splitters\n",
"\n",
@@ -906,7 +906,7 @@
"- The number of split points can be configured in the Gaussian splitter. Increasing this number makes this splitter slower, but it also potentially increases the quality of the obtained query points, implying enhanced tree accuracy. \n",
"- The number of stored bins can be selected in the Histogram splitter. Increasing this number increases the memory footprint and running time of this splitter, but it also potentially makes its split candidates more accurate and positively impacts on the tree's final predictive performance.\n",
"\n",
"Next, we provide a brief comparison of the classification splitters using 10K instances of the Random RBF synthetic dataset. Note that the tree equiped with the Exhaustive splitter does not use Naive Bayes leaves."
"Next, we provide a brief comparison of the classification splitters using 10K instances of the Random RBF synthetic dataset. Note that the tree equipped with the Exhaustive splitter does not use Naive Bayes leaves."
]
},
{
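To ground the comparison above, a sketch of configuring different numerical splitters on a Hoeffding Tree (parameter names are assumptions — check the splitter docstrings):

```python
from river import tree
from river.tree.splitter import GaussianSplitter, HistogramSplitter

# More split points / bins: better candidate splits, higher cost.
gaussian_ht = tree.HoeffdingTreeClassifier(splitter=GaussianSplitter(n_splits=32))
histogram_ht = tree.HoeffdingTreeClassifier(splitter=HistogramSplitter(n_bins=64))
```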
2 changes: 1 addition & 1 deletion docs/releases/0.12.0.md
@@ -29,7 +29,7 @@
## drift

- Refactor the concept drift detectors to match the remaining of River's API. Warnings are only issued by detectors that support this feature.
-- Drifts can be assessed via the property `drift_detected`. Warning signals can be acessed by the property `warning_detected`. The `update` now returns `self`.
+- Drifts can be assessed via the property `drift_detected`. Warning signals can be accessed via the property `warning_detected`. The `update` method now returns `self`.
- Ensure all detectors automatically reset their inner states after a concept drift detection.
- Streamline `DDM`, `EDDM`, `HDDM_A`, and `HDDM_W`. Make the configurable parameters names match their respective papers.
- Fix bugs in `EDDM` and `HDDM_W`.
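A sketch of the refactored API these notes describe — `update` returns the detector, and signals are exposed as properties (`drift.DDM` matches this release; the binary detectors later moved under `drift.binary`):

```python
from river import drift

detector = drift.DDM()

# Feed a stream of 0/1 error indicators, one at a time.
for i, error in enumerate([0, 0, 1, 0, 1, 1, 1, 1]):
    detector.update(error)
    if detector.warning_detected:
        print(f"warning zone at step {i}")
    if detector.drift_detected:
        print(f"drift detected at step {i}")
```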
2 changes: 1 addition & 1 deletion docs/releases/0.19.0.md
@@ -30,7 +30,7 @@ Calling `learn_one` in a pipeline will now update each part of the pipeline in t
## forest

- Fixed issue with `forest.ARFClassifier` which couldn't be passed a `CrossEntropy` metric.
-- Fixed a bug in `forest.AMFClassifier` which slightly improves predictive accurary.
+- Fixed a bug in `forest.AMFClassifier` which slightly improves predictive accuracy.
- Added `forest.AMFRegressor`.

## multioutput
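For illustration, the fixed behaviour mentioned above would be exercised roughly like this (constructor arguments are assumptions):

```python
from river import forest, metrics

# ARFClassifier can now be passed a CrossEntropy metric.
model = forest.ARFClassifier(n_models=5, metric=metrics.CrossEntropy(), seed=42)
```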
2 changes: 1 addition & 1 deletion docs/releases/0.8.0.md
@@ -28,6 +28,6 @@

## tree

-- Unifed base class structure applied to all tree models.
+- Unified base class structure applied to all tree models.
- Bug fixes.
- Added `tree.SGTClassifier` and `tree.SGTRegressor`.
