Hello, I have some small questions about the code.
- First, the MLP block uses the `self.pos` layer, which the author does not mention in the paper. Together with `self.fc2` it acts like a depth-wise separable convolution, but it also adds extra parameters. Is the effect of this layer really that significant?
- Second, in the Block code the default kernel_size for `self.a` is 11 with padding 5; however, in the last stage (stage 4) the feature map is only 7x7 (for 224x224 inputs), so using kernel_size = 11 for that convolution seems strange (see the sanity check after this list).
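As a quick sanity check on the second point: with padding 5, an 11x11 kernel still maps a 7x7 input to a 7x7 output ((7 + 2*5 - 11) + 1 = 7), even though most of the kernel then overlaps zero padding. A minimal sketch (the 64 channels and the depth-wise grouping are just assumptions for illustration):

```python
import torch
import torch.nn as nn

# 11x11 depth-wise conv with padding 5, as described for self.a,
# applied to a stage-4-sized 7x7 feature map.
conv = nn.Conv2d(64, 64, kernel_size=11, padding=5, groups=64)
x = torch.randn(1, 64, 7, 7)
print(conv(x).shape)  # torch.Size([1, 64, 7, 7])
```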
Thanks for your reply!
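For completeness, `LayerNorm` in the snippet below is not `nn.LayerNorm` but a custom channels-first variant. A minimal sketch, assuming it follows the usual ConvNeXt-style implementation (an assumption, not copied from this repo):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm(nn.Module):
    """LayerNorm supporting channels_first (N, C, H, W) inputs.
    Sketch of the ConvNeXt-style variant this repo presumably uses."""
    def __init__(self, normalized_shape, eps=1e-6, data_format="channels_last"):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps
        self.data_format = data_format

    def forward(self, x):
        if self.data_format == "channels_last":
            return F.layer_norm(x, self.weight.shape, self.weight, self.bias, self.eps)
        # channels_first: normalize over the channel dimension by hand
        u = x.mean(1, keepdim=True)
        s = (x - u).pow(2).mean(1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.eps)
        return self.weight[:, None, None] * x + self.bias[:, None, None]
```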
```python
import torch.nn as nn


class MLP(nn.Module):
    def __init__(self, dim, mlp_ratio=4):
        super().__init__()
        # Custom channels-first LayerNorm (see sketch above), not nn.LayerNorm.
        self.norm = LayerNorm(dim, eps=1e-6, data_format="channels_first")
        self.fc1 = nn.Conv2d(dim, dim * mlp_ratio, 1)
        # The pos layer in question: a depth-wise 3x3 convolution.
        self.pos = nn.Conv2d(dim * mlp_ratio, dim * mlp_ratio, 3, padding=1,
                             groups=dim * mlp_ratio)
        self.fc2 = nn.Conv2d(dim * mlp_ratio, dim, 1)
        self.act = nn.GELU()

    def forward(self, x):
        B, C, H, W = x.shape
        x = self.norm(x)
        x = self.fc1(x)
        x = self.act(x)
        # Residual depth-wise branch around the hidden activation.
        x = x + self.act(self.pos(x))
        x = self.fc2(x)
        return x
```
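And a quick check of the extra parameter cost of `self.pos`, assuming the classes above are defined (dim=64 is a hypothetical value, not from the paper):

```python
import torch

mlp = MLP(dim=64)  # hidden width = 64 * 4 = 256
x = torch.randn(1, 64, 14, 14)
print(mlp(x).shape)  # torch.Size([1, 64, 14, 14])

# Depth-wise 3x3 pos conv: 9 weights + 1 bias per hidden channel.
n_pos = sum(p.numel() for p in mlp.pos.parameters())
print(n_pos)  # 256 * (9 + 1) = 2560
```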