The channel dimension of a color image #5

Hello! I am studying your code and have a question about how the model handles color images, because I can't find the RGB channel dimension when a frame sequence is fed into the model.
In _multi_head_attention.py_, at the beginning of the _call_ method (after _self.wq(q)_; I know _self.wq_ is a conv layer), your comment says `#(batch_size, num_heads, seq_len_q, rows, cols, depth)`. Where is the channel dimension? My understanding of the dimensions is: **seq_len_q** is the length of the frame sequence; **num_heads** × **depth** = **d_model**; **rows** is the image height H; **cols** is the image width W.
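
To make the shapes in question concrete, here is a minimal NumPy sketch of one plausible reading, not confirmed by the repository: if the conv layer maps the 3 RGB channels to **d_model** feature channels, the channel axis would be absorbed into **d_model** and then split into **num_heads** × **depth**. The function name `project_and_split_heads`, the 1x1 convolution standing in for _self.wq_, and all sizes are assumptions for illustration only.

```python
import numpy as np

def project_and_split_heads(frames, weight, num_heads):
    """frames: (batch, seq_len, rows, cols, channels); weight: (channels, d_model)."""
    # Hypothetical stand-in for self.wq: a 1x1 convolution is a per-pixel
    # linear map over the channel axis, so RGB (3) becomes d_model features.
    projected = np.einsum("bthwc,cd->bthwd", frames, weight)
    batch, seq_len, rows, cols, d_model = projected.shape
    depth = d_model // num_heads
    # Split the feature axis into (num_heads, depth) ...
    split = projected.reshape(batch, seq_len, rows, cols, num_heads, depth)
    # ... and move the head axis next to the batch axis.
    return split.transpose(0, 4, 1, 2, 3, 5)

rng = np.random.default_rng(0)
frames = rng.random((2, 5, 16, 16, 3))   # assumed: 2 clips, 5 RGB frames of 16x16
weight = rng.random((3, 8))              # assumed: d_model = 8
q = project_and_split_heads(frames, weight, num_heads=2)
print(q.shape)  # (2, 2, 5, 16, 16, 4) = (batch, num_heads, seq_len_q, rows, cols, depth)
```

Under this reading there is no separate RGB axis after the projection; whether the actual layer behaves this way is exactly what the question asks.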

I sincerely hope you can clear up my doubts. If you don't mind, could I also ask you for some pointers on the field of video prediction? I am trying to do some research on predicting image sequences with Transformers.
