PositionwiseFeedForward module

class positionwiseFeedForward.PositionwiseFeedForward(d_model, d_ff=2048)

Bases: Module

Position-wise Feed Forward Network block from Attention is All You Need.

Applies the same two linear transformations to each position of the input separately and identically. We implement them as 1D convolutions. Input and output both have shape (batch_size, K, d_model).

Parameters
  • d_model (int) – Dimension of input tensor.

  • d_ff (Optional[int]) – Dimension of hidden layer, default is 2048.
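The block described above can be sketched as follows. This is a minimal re-implementation, assuming PyTorch; it uses kernel-size-1 `Conv1d` layers as the two linear transformations, as the description states, and is not the library's actual source:

```python
import torch
import torch.nn as nn


class PositionwiseFeedForward(nn.Module):
    """Sketch of the position-wise feed-forward block: two linear
    transformations implemented as 1D convolutions with kernel size 1."""

    def __init__(self, d_model: int, d_ff: int = 2048):
        super().__init__()
        # Conv1d expects (batch_size, channels, length), so channels = d_model.
        self._conv1 = nn.Conv1d(d_model, d_ff, kernel_size=1)
        self._conv2 = nn.Conv1d(d_ff, d_model, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch_size, K, d_model) -> (batch_size, d_model, K) for Conv1d
        hidden = torch.relu(self._conv1(x.transpose(1, 2)))
        # Back to (batch_size, K, d_model) after the second transformation.
        return self._conv2(hidden).transpose(1, 2)
```

A kernel-size-1 convolution applies the same weight matrix at every position, which is exactly the "separately and identically" behavior required.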

forward(x)

Propagate the input forward through the PFF block.

Apply the first linear transformation, then a ReLU activation, and finally the second linear transformation.

Parameters

x (Tensor) – Input tensor with shape (batch_size, K, d_model).

Return type

Tensor

Returns

Output tensor with shape (batch_size, K, d_model).
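In tensor terms, forward computes FFN(x) = max(0, xW1 + b1)W2 + b2 at every position. A self-contained sketch with plain matrix products (equivalent to the kernel-size-1 convolutions; all weights here are random placeholders, not the module's parameters):

```python
import torch

d_model, d_ff = 512, 2048
# Hypothetical weights standing in for the two linear transformations.
W1, b1 = torch.randn(d_model, d_ff), torch.zeros(d_ff)
W2, b2 = torch.randn(d_ff, d_model), torch.zeros(d_model)

x = torch.randn(4, 16, d_model)            # (batch_size, K, d_model)
y = torch.relu(x @ W1 + b1) @ W2 + b2      # same weights at every position
assert y.shape == (4, 16, d_model)         # shape is preserved
```

As the Returns entry states, the output shape matches the input shape, so the block can be stacked freely inside a Transformer layer.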