PositionwiseFeedForward module

class positionwiseFeedForward.PositionwiseFeedForward(d_model, d_ff=2048)

Bases: Module

Position-wise Feed Forward Network block from Attention is All You Need.

Applies the same two linear transformations to each position of the input separately and identically. We implement them as 1D convolutions. Input and output both have shape (batch_size, K, d_model).

Parameters
  • d_model (int) – Dimension of input tensor.

  • d_ff (Optional[int]) – Dimension of hidden layer, default is 2048.
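The block described above can be sketched as follows. This is a minimal re-implementation, assuming PyTorch; it uses kernel-size-1 `Conv1d` layers as the two linear transformations, as the description states, and is not the library's actual source:

```python
import torch
import torch.nn as nn


class PositionwiseFeedForward(nn.Module):
    """Sketch of the position-wise feed-forward block: two linear
    transformations implemented as 1D convolutions with kernel size 1."""

    def __init__(self, d_model: int, d_ff: int = 2048):
        super().__init__()
        # Conv1d expects (batch_size, channels, length), so channels = d_model.
        self._conv1 = nn.Conv1d(d_model, d_ff, kernel_size=1)
        self._conv2 = nn.Conv1d(d_ff, d_model, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch_size, K, d_model) -> (batch_size, d_model, K) for Conv1d
        hidden = torch.relu(self._conv1(x.transpose(1, 2)))
        # Back to (batch_size, K, d_model) after the second transformation.
        return self._conv2(hidden).transpose(1, 2)
```

A kernel-size-1 convolution applies the same weight matrix at every position, which is exactly the "separately and identically" behavior required.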

forward(x)

Propagate the input forward through the PFF block.

Apply the first linear transformation, then a ReLU activation, and finally the second linear transformation.

Parameters

x (Tensor) – Input tensor with shape (batch_size, K, d_model).

Return type

Tensor

Returns

Output tensor with shape (batch_size, K, d_model).
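In tensor terms, forward computes FFN(x) = max(0, xW1 + b1)W2 + b2 at every position. A self-contained sketch with plain matrix products (equivalent to the kernel-size-1 convolutions; all weights here are random placeholders, not the module's parameters):

```python
import torch

d_model, d_ff = 512, 2048
# Hypothetical weights standing in for the two linear transformations.
W1, b1 = torch.randn(d_model, d_ff), torch.zeros(d_ff)
W2, b2 = torch.randn(d_ff, d_model), torch.zeros(d_model)

x = torch.randn(4, 16, d_model)            # (batch_size, K, d_model)
y = torch.relu(x @ W1 + b1) @ W2 + b2      # same weights at every position
assert y.shape == (4, 16, d_model)         # shape is preserved
```

As the Returns entry states, the output shape matches the input shape, so the block can be stacked freely inside a Transformer layer.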