
Commit c38354e

Update docs for 0.3.0
1 parent 4941f09 commit c38354e

30 files changed: 2386 additions, 218 deletions

README.md

Lines changed: 68 additions & 36 deletions
Large diffs are not rendered by default.

docs/.vitepress/config.mts

Lines changed: 36 additions & 15 deletions
@@ -11,7 +11,7 @@ export default defineConfig({
 {text: 'Home', link: '/'},
 {text: 'Docs', link: '/introduction'},
 {
-    text: '0.1.x',
+    text: '0.3.x',
     items: [
         {
             text: 'Changelog',
@@ -41,31 +41,52 @@ export default defineConfig({
         collapsed: false,
         link: '/pipelines',
         items: [
-            {text: 'Text Classification', link: '/text-classification'},
-            {text: 'Fill Mask', link: '/fill-mask'},
-            {text: 'Zero Shot Classification', link: '/zero-shot-classification'},
-            {text: 'Question Answering', link: '/question-answering'},
-            {text: 'Token Classification', link: '/token-classification'},
-            {text: 'Feature Extraction', link: '/feature-extraction'},
-            {text: 'Text to Text Generation', link: '/text-to-text-generation'},
-            {text: 'Translation', link: '/translation'},
-            {text: 'Summarization', link: '/summarization'},
-            {text: 'Text Generation', link: '/text-generation'},
+            {
+                text: 'NLP Tasks',
+                collapsed: true,
+                items: [
+                    {text: 'Text Classification', link: '/text-classification'},
+                    {text: 'Fill Mask', link: '/fill-mask'},
+                    {text: 'Zero Shot Classification', link: '/zero-shot-classification'},
+                    {text: 'Question Answering', link: '/question-answering'},
+                    {text: 'Token Classification', link: '/token-classification'},
+                    {text: 'Feature Extraction', link: '/feature-extraction'},
+                    {text: 'Text to Text Generation', link: '/text-to-text-generation'},
+                    {text: 'Translation', link: '/translation'},
+                    {text: 'Summarization', link: '/summarization'},
+                    {text: 'Text Generation', link: '/text-generation'},
+                ]
+            },
+            {
+                text: 'Computer Vision Tasks',
+                collapsed: true,
+                items: [
+                    {text: 'Image Classification', link: '/image-classification'},
+                    {text: 'Zero Shot Image Classification', link: '/zero-shot-image-classification'},
+                    {text: 'Object Detection', link: '/object-detection'},
+                    {text: 'Zero Shot Object Detection', link: '/zero-shot-object-detection'},
+                    {text: 'Image Feature Extraction', link: '/image-feature-extraction'},
+                    {text: 'Image To Text', link: '/image-to-text'},
+                    {text: 'Image To Image', link: '/image-to-image'},
+                ]
+            }
         ]
     },
     {
         text: 'Advanced Usage',
         collapsed: false,
         items: [
-            {text: 'Auto Models', link: '/auto-models'},
-            {text: 'Auto Tokenizers', link: '/auto-tokenizers'},
+            {text: 'Models', link: '/models'},
+            {text: 'Tokenizers', link: '/tokenizers'},
         ]
     },
     {
         text: 'Utilities',
         collapsed: false,
         items: [
-            {text: 'Generation', link: '/generation'},
+            {text: 'Generation', link: '/utils/generation'},
+            {text: 'Image', link: '/utils/image'},
+            {text: 'Tensor', link: '/utils/tensor'},
         ]
     }
 ],
@@ -77,7 +98,7 @@ export default defineConfig({
 
 footer: {
     message: 'Released under the MIT License.',
-    copyright: 'Copyright © 2024 <a href="https://github.com/CodeWithKyrian">Kyrian Obikwelu</a>'
+    copyright: 'Copyright © 2024 <a href="https://twitter.com/CodeWithKyrian">Kyrian Obikwelu</a>'
 },
 
 editLink: {

docs/auto-models.md

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs/auto-tokenizers.md

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs/configuration.md

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,9 +14,11 @@ models, and the remote path template. These settings allow you to tailor how and
 
 ```php
 use Codewithkyrian\Transformers\Transformers;
+use Codewithkyrian\Transformers\Utils\ImageDriver;
 
 Transformers::setup()
     ->setCacheDir('/path/to/models')
+    ->setImageDriver(ImageDriver::IMAGICK)
     ->setRemoteHost('https://yourmodelshost.com')
     ->setRemotePathTemplate('custom/path/{model}/{file}')
     ->setAuthToken('your-token')
@@ -94,6 +96,20 @@ Transformers::setup()
     ->apply();
 ```
 
+### `setImageDriver(ImageDriver $imageDriver)`
+
+This setting allows you to specify the image backend to use for image processing tasks. By default, TransformersPHP uses
+the `IMAGICK` image driver. You can change this to `GD` or `VIPS` if you prefer; just make sure the required
+extensions are installed.
+
+```php
+use Codewithkyrian\Transformers\Utils\ImageDriver;
+
+Transformers::setup()
+    ->setImageDriver(ImageDriver::GD)
+    ->apply();
+```
+
 ## Applying Configuration
 
 ::: danger VERY IMPORTANT
@@ -127,7 +143,7 @@ use Codewithkyrian\Transformers\Transformers;
 ### Laravel Projects
 
 In a Laravel project, you can add global configuration in the `AppServiceProvider` class. Laravel service providers are
-excellent locations for bootstrap code, making them the best place to set up global configurations. It's recommended to
+excellent locations for bootstrap code, making them the best place to set up global configurations. It's recommended to
 set the cache directory to a subdirectory of the `storage` directory, as it's writable and not publicly accessible.
 
 ::: code-group
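
The `setImageDriver` section added in this file asks you to have the matching extension installed before picking a driver. A minimal sketch of that check follows. `pickImageBackend` is a hypothetical helper, not part of TransformersPHP, and it only covers the `imagick` and `gd` PHP extensions; `VIPS` is left out because how it is provided depends on the binding in use.

```php
<?php

/**
 * Hypothetical helper (not part of TransformersPHP): returns the name of the
 * first image backend whose PHP extension is loaded, or null if none is.
 *
 * @param callable|null $isLoaded Extension check, injectable for testing.
 */
function pickImageBackend(?callable $isLoaded = null): ?string
{
    $isLoaded ??= 'extension_loaded';

    // Candidate backends in order of preference, keyed by PHP extension name.
    $candidates = ['imagick' => 'IMAGICK', 'gd' => 'GD'];

    foreach ($candidates as $extension => $driver) {
        if ($isLoaded($extension)) {
            return $driver;
        }
    }

    return null;
}

echo pickImageBackend() ?? 'none', PHP_EOL;
```

The returned name could then be mapped to `ImageDriver::IMAGICK` or `ImageDriver::GD` before calling `setImageDriver(...)`.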

docs/getting-started.md

Lines changed: 9 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -21,17 +21,14 @@ You can install the library via Composer. This is the recommended way to install
 composer require codewithkyrian/transformers
 ```
 
-After installation, you need to initialize the package to download the necessary shared libraries for running the ONNX
-models:
-
-```bash
-./vendor/bin/transformers install
-```
+The ONNX runtime will be installed automatically as well. For Windows users, it may take more time to install the ONNX
+library compared to Linux or macOS users (no shade 😅).
 
 > [!CAUTION]
-> These shared libraries to be downloaded are platform-specific, so it's important to run this command on the target
-> platform where the code will be executed. For example, if you're using a Docker container, run the `install` command
-> inside that container.
+> The ONNX library is platform-specific, so it's important to run the composer require command on the target platform
+> where the code will be executed. In most cases, this will be your development machine or a server where you deploy
+> your
+> application, but if you're using a Docker container, run the `composer require` command inside that container.
 
 This command sets up everything you need to start using pre-trained ONNX models with TransformersPHP.
 
@@ -142,7 +139,7 @@ in PHP 7.4 and later, but it may not be enabled by default. To check if the FFI
 command:
 
 ```bash
-php -m | grep ffi
+php -m | grep FFI
 ```
 
 If the FFI extension is not enabled, you can enable it by uncommenting (remove the `;` from the beginning of the line)
@@ -153,7 +150,7 @@ following line in your `php.ini` file:
 extension = ffi
 ```
 
-Also, you need to set the `ffi.enable` directive to `true` in your `php.ini` file:
+TransformersPHP does not support FFI preloading yet, so you need to enable the `ffi.enable` directive in your `php.ini` file:
 
 ```ini
 ffi.enable = true
@@ -166,7 +163,7 @@ After making these changes, restart your web server or PHP-FPM service, and you
 Just-In-Time (JIT) compilation is a feature that allows PHP to compile and execute code at runtime. JIT compilation can
 improve the performance of your application by compiling frequently executed code paths into machine code. While you
 can use TransformersPHP without JIT compilation, enabling it can provide a significant performance boost (> 2x in some
-cases).
+cases) since there are many matrix multiplications and other mathematical operations involved in running ONNX models.
 
 JIT compilation is available in PHP 8.0 and later, but it may not be enabled by default. To enable JIT compilation,
 change the `opcache.jit` directive in your `php.ini` file:
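
The FFI and JIT settings discussed in this file can also be inspected from PHP itself, which helps when the CLI and the web server load different `php.ini` files. The following is a small sketch using only built-in functions; `runtimeStatus` is a hypothetical helper name, not something the library provides.

```php
<?php

/**
 * Hypothetical helper: reports the PHP settings this guide asks you to check.
 *
 * @return array{ffi_loaded: bool, ffi_enable: string|false, opcache_jit: string|false}
 */
function runtimeStatus(): array
{
    return [
        'ffi_loaded'  => extension_loaded('ffi'),  // is the FFI extension loaded?
        'ffi_enable'  => ini_get('ffi.enable'),    // false if the directive is unknown
        'opcache_jit' => ini_get('opcache.jit'),   // false if OPcache is not loaded
    ];
}

foreach (runtimeStatus() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

Running the script once from the command line and once through the web server shows whether both environments are configured the same way.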

docs/image-classification.md

Lines changed: 101 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,101 @@
+---
+outline: deep
+---
+
+# Image Classification <Badge type="tip" text="^0.3.0" />
+
+Image classification is a computer vision task that involves assigning a label or class to an image. An image
+is expected to have only one label in this task. The labels to be selected from are predefined by the model.
+This task accepts image inputs and returns the classification label and the confidence score.
+
+## Task ID
+
+- `image-classification`
+
+## Default Model
+
+- `Xenova/vit-base-patch16-224`
+
+## Use Cases
+
+Image classification models find application in various scenarios, including:
+
+- **Stock Photography Keywording:** Assigning keywords to images in stock photography databases.
+- **Image Search:** Organizing and categorizing photo galleries on devices or in the cloud based on multiple keywords or
+tags.
+- **Content Filtering:** Filtering and categorizing images for content moderation purposes.
+- **Medical Imaging:** Assisting in the diagnosis and classification of medical images such as X-rays and MRI scans.
+
+## Running an Inference Session
+
+Here's how to perform image classification using the pipeline:
+
+```php
+use function Codewithkyrian\Transformers\Pipelines\pipeline;
+
+$classifier = pipeline('image-classification');
+
+$result = $classifier('path/to/image.jpg');
+```
+
+::: details Click to view output
+
+```php
+['label' => 'tiger, Panthera tigris', 'score' => 0.63534494664876]
+```
+
+:::
+
+## Pipeline Input Options
+
+When running the `image-classification` pipeline, you can use the following options:
+
+- ### `texts` *(string)*
+The image(s) to classify. It can be a local file path, a file resource, a URL to an image (local or remote), or an
+array of these inputs. It's the first argument so there's no need to pass it as a named argument.
+```php
+$result = $classifier('https://example.com/image.jpg');
+```
+
+- ### `topK` *(int)*
+The number of top labels to return. The default is `1`.
+```php
+$result = $classifier('https://example.com/image.jpg', topK: 3);
+```
+::: details Click to view output
+
+```php
+[
+    ['label' => 'tiger, Panthera tigris', 'score' => 0.63534494664876],
+    ['label' => 'zebra', 'score' => 0.123456789],
+    ['label' => 'lion, Panthera leo', 'score' => 0.098765432]
+]
+```
+:::
+
+## Pipeline Outputs
+
+The output of the pipeline is an array containing the classification label and the confidence score. The confidence
+score is a value between 0 and 1, with 1 being the highest confidence.
+
+Since the actual labels depend on the model, it's crucial to consult the model's documentation for the specific labels
+it uses. Here are examples demonstrating how outputs might differ:
+
+For a single image:
+
+```php
+['label' => 'tiger, Panthera tigris', 'score' => 0.63534494664876]
+```
+
+For multiple images:
+
+```php
+[
+    ['label' => 'tiger, Panthera tigris', 'score' => 0.63534494664876],
+    ['label' => 'cat', 'score' => 0.987654321],
+    ['label' => 'dog', 'score' => 0.87654321]
+]
+```
+
+
+
101+

docs/image-feature-extraction.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,74 @@
+---
+outline: deep
+---
+
+# Image Feature Extraction <Badge type="tip" text="^0.3.0" />
+
+Image feature extraction is a computer vision task that involves extracting high-level features from images. These
+features can be used for various purposes, such as image similarity search, image retrieval, and content-based image
+retrieval. The task accepts image inputs and returns a feature vector that represents the image.
+
+## Task ID
+
+- `image-feature-extraction`
+
+## Default Model
+
+- `Xenova/vit-base-patch16-224-in21k`
+
+## Use Cases
+
+Image feature extraction models find application in various scenarios, including:
+
+- **Image Retrieval:** Generating feature vectors for images to enable similarity search and retrieval of similar images
+from a database.
+- **Content-Based Image Retrieval:** Enabling search engines to retrieve images based on their visual content rather
+than textual metadata.
+- **Image Similarity Search:** Finding visually similar images based on their feature representations.
+- **Visual Search:** Enhancing e-commerce platforms by allowing users to search for products using images rather than
+text.
+
+## Running an Inference Session
+
+Here's how to perform image feature extraction using the pipeline:
+
+```php
+use function Codewithkyrian\Transformers\Pipelines\pipeline;
+
+$extractor = pipeline('image-feature-extraction');
+
+$result = $extractor('path/to/image.jpg');
+```
+
+## Pipeline Input Options
+
+When running the `image-feature-extraction` pipeline, you can use the following options:
+
+- ### `texts` *(string|array)*
+The image(s) from which features are extracted. You can pass a single image path or an array of image paths for batch
+processing. It's required and is the first argument, so there's no need to pass it as a named argument.
+
+- ### `pool` *(bool)*
+When set to `true`, it averages the feature vectors across all patches in the image. Before using this option, make
+sure the model has a pooler layer. The default value is `false`.
+
+## Pipeline Output
+
+The output of the `image-feature-extraction` pipeline is a feature vector that represents the input image. The shape
+and size of the feature vector depend on the model architecture and configuration. Without pooling, the shape is
+usually `[X, Y, Z]` where:
+
+- `X` represents the batch size (1 for single image input).
+- `Y` denotes the sequence length, i.e. the number of tokens or patches the image is split into, including any
+special tokens the model adds.
+- `Z` represents the size of the feature vector extracted from each patch. This dimension is typically the same for
+every patch.
+
+For example, with certain models, such as those based on the Vision Transformer (ViT) architecture, the feature vector's
+shape might be `[1, 197, 768]`.
+
+When pooling is applied, the output shape is typically `[X, Z]`, where `Z` represents the size of the pooled feature
+vector.
+Pooling aggregates information from all the tokens or patches into a single feature vector, resulting in a
+reduced-dimensional representation of the input image, e.g. `[1, 768]`.
+
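
The relationship between the unpooled `[X, Y, Z]` output and the pooled `[X, Z]` output, and the similarity-search use case listed in this file, can be sketched in plain PHP. This is an illustration on toy arrays under stated assumptions: `meanPool` and `cosineSimilarity` are hypothetical helpers, the numbers are made up, and the library's own pooling may work differently (for instance through a model's pooler layer).

```php
<?php

/**
 * Averages patch vectors for one image: [patches][dim] -> [dim].
 */
function meanPool(array $patches): array
{
    $count = count($patches);
    $pooled = array_fill(0, count($patches[0]), 0.0);

    foreach ($patches as $vector) {
        foreach ($vector as $i => $value) {
            $pooled[$i] += $value / $count;
        }
    }

    return $pooled;
}

/**
 * Cosine similarity between two equal-length, non-zero vectors.
 */
function cosineSimilarity(array $a, array $b): float
{
    $dot = $normA = $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy data: two "images", each with 2 patches of 3 features.
$imageA = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]];
$imageB = [[2.0, 0.0, 4.0], [2.0, 0.0, 4.0]];

$a = meanPool($imageA); // [2.0, 0.0, 2.0]
$b = meanPool($imageB); // [2.0, 0.0, 4.0]

printf("similarity: %.4f\n", cosineSimilarity($a, $b));
```

A score close to 1 means the two pooled vectors point in nearly the same direction. The sketch does not guard against all-zero vectors, which would cause a division by zero.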
