docs/en/models/generative.md
Lines changed: 90 additions & 97 deletions
@@ -11,58 +11,57 @@ Generative recommendation models are an emerging approach that leverages generat
### Description
-HSTU (Hierarchical Sequence Transformer Unit) is a hierarchical sequence transformation unit designed for large-scale sequence recommendation, capable of supporting trillion-parameter recommendation systems.
+HSTU (Hierarchical Sequential Transduction Units) is an autoregressive sequence recommender for next-item prediction. In Torch-RecHub, `HSTUModel` consumes padded item-token sequences plus optional per-position time-difference features and returns logits over the item vocabulary at every sequence position.
### Core Principles
-- **Hierarchical Structure**: Uses hierarchical design to decompose long sequences into multiple sub-sequences, improving model parallelism and scalability
-- **Transformer Architecture**: Based on Transformer architecture, capable of capturing long-range dependencies
-- **Large-scale Pretraining**: Supports large-scale pretraining, learning universal representations from massive data
+- **Eq. 2 UVQK projection**: applies one `SiLU` to the joint `UVQK` projection before splitting, so `U`, `V`, `Q`, and `K` all pass through the same non-linearity.
+- **Eq. 3 attention bias**: adds per-head bucketed relative position/time bias `rab^{p,t}` to attention scores before `silu(scores) / max_seq_len`.
+- **Eq. 4 gated output**: projects `LayerNorm(A V) * U` through one output linear layer, without concat-u/x bypasses or a separate FFN.
+- **External residuals**: each layer is wrapped as `x = x + HSTULayer(x)` in `HSTUBlock`.
+- **Generative training**: predicts the next token in the sequence and masks PAD token `0` in the loss.
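As a rough illustration of the first four points above, the following PyTorch sketch wires them together. It is not the Torch-RecHub implementation: the class name `HSTULayerSketch`, the argument names, the tensor layout, the explicit causal mask, and passing the `rab^{p,t}` bias in as a precomputed tensor are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HSTULayerSketch(nn.Module):
    """Illustrative layer only; names and shapes are assumptions."""

    def __init__(self, d_model: int, n_heads: int, max_seq_len: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        self.max_seq_len = max_seq_len
        self.uvqk = nn.Linear(d_model, 4 * d_model)  # joint U, V, Q, K projection
        self.norm = nn.LayerNorm(d_model)
        self.out = nn.Linear(d_model, d_model)       # single output projection

    def forward(self, x, rab=None):
        # x: (B, L, D); rab: optional bias broadcastable to (B, H, L, L).
        B, L, D = x.shape
        # Eq. 2: one SiLU over the joint projection, then split into U, V, Q, K.
        u, v, q, k = F.silu(self.uvqk(x)).chunk(4, dim=-1)

        def heads(t):
            return t.reshape(B, L, self.n_heads, self.head_dim).transpose(1, 2)

        q, k, v = heads(q), heads(k), heads(v)
        scores = q @ k.transpose(-1, -2)             # (B, H, L, L)
        if rab is not None:
            scores = scores + rab                    # Eq. 3: add bias before SiLU
        attn = F.silu(scores) / self.max_seq_len     # pointwise SiLU, no softmax
        # Assumed causal mask so position i only sees positions j <= i.
        causal = torch.tril(torch.ones(L, L, dtype=torch.bool, device=x.device))
        attn = attn.masked_fill(~causal, 0.0)
        av = (attn @ v).transpose(1, 2).reshape(B, L, D)
        # Eq. 4: gate the normalized attention output with U, then project.
        return self.out(self.norm(av) * u)


torch.manual_seed(0)
layer = HSTULayerSketch(d_model=16, n_heads=4, max_seq_len=8)
x = torch.randn(2, 8, 16)
rab = torch.zeros(4, 8, 8)   # placeholder bias; a real one would be bucketed
y = x + layer(x, rab)        # external residual, as in `x = x + HSTULayer(x)`
```

Because the attention weights come from a pointwise `SiLU` rather than a softmax, masked entries are set to zero instead of negative infinity.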
### Usage
```python
-from torch_rechub.models.generative import HSTUModel
-from torch_rechub.basic.features import SparseFeature, SequenceFeature
@@ -307,4 +300,4 @@ A: Try the following approaches:
- Develop more efficient model training and inference methods
- Achieve distributed and scalable generative recommendation systems
-Generative recommendation is an important development direction for recommendation systems, capable of providing richer, more natural, and more personalized recommendation experiences. Torch-RecHub provides various advanced generative recommendation models for developers to choose based on business requirements. With the continuous development of large language models and generative AI technologies, generative recommendation will be applied in more scenarios, providing users with better recommendation experiences.
+Generative recommendation is an important development direction for recommendation systems, capable of providing richer, more natural, and more personalized recommendation experiences. Torch-RecHub provides various advanced generative recommendation models for developers to choose based on business requirements. With the continuous development of large language models and generative AI technologies, generative recommendation will be applied in more scenarios, providing users with better recommendation experiences.