transformer_implementation.blocks package

Submodules

transformer_implementation.blocks.DecoderBlock module

class transformer_implementation.blocks.DecoderBlock.DecoderBlock(config)

Bases: Module

Implements a decoder block module in PyTorch, as part of a transformer architecture.

This class is a child of the PyTorch nn.Module class. It includes self-attention, cross-attention with the encoder’s output, and a feed-forward network.

Attributes

ln_1, ln_2, ln_3 : LayerNorm

Layer normalization layers that normalize the input tensor before the attention and feed-forward networks.

attn1, attn2 : MultiHeadAttention

Multi-head attention layers. attn1 is for self-attention, while attn2 is for cross-attention with the encoder’s output.

ffw : FeedForward

The feed-forward network layer.

Methods

forward(x: torch.Tensor, encoder_output: torch.Tensor, src_mask: Optional[torch.Tensor]=None, tgt_mask: Optional[torch.Tensor]=None) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:

Computes the forward pass of the network.

Parameters

config : object

A configuration object with the following attributes:

n_embd (int): The size of the input and output feature vectors.

bias (bool): If True, the layer normalization layers will include a bias term.
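A minimal config object satisfying these two attributes can be sketched with a dataclass. The class name `Config` and the default values are illustrative assumptions, not part of the package:

```python
from dataclasses import dataclass

@dataclass
class Config:
    """Hypothetical config carrying the attributes DecoderBlock reads."""
    n_embd: int = 512   # size of the input/output feature vectors (assumed default)
    bias: bool = True   # whether LayerNorm layers include a bias term

# Any object exposing .n_embd and .bias would work equally well.
cfg = Config(n_embd=256, bias=False)
```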

forward(x, encoder_output, src_mask=None, tgt_mask=None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

Implements the forward pass of the decoder block. The method applies self-attention, cross-attention with the encoder’s output, and a feed-forward network to the input tensor.

Parameters

x : torch.Tensor

The input tensor from the previous decoder block or the embedding layer.

encoder_output : torch.Tensor

The output tensor from the encoder.

src_mask : Optional[torch.Tensor], default=None

The source mask tensor. If provided, it will be used in the cross-attention layer.

tgt_mask : Optional[torch.Tensor], default=None

The target mask tensor. If provided, it will be used in the self-attention layer.

Returns

Tuple[torch.Tensor, torch.Tensor, torch.Tensor]

The output tensor from the decoder block, and the attention matrices from the self-attention and cross-attention layers.
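The documented behavior can be sketched as a pre-norm decoder block. This is a hypothetical re-implementation, not the package's code: `torch.nn.MultiheadAttention` stands in for the package's `MultiHeadAttention`, and the `n_head` argument plus the 4x feed-forward expansion are assumptions the documentation above does not state.

```python
import torch
import torch.nn as nn

class DecoderBlockSketch(nn.Module):
    """Illustrative sketch of the documented DecoderBlock interface."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        # The real class also honors config.bias in its LayerNorm layers.
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn1 = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.attn2 = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_3 = nn.LayerNorm(n_embd)
        # Feed-forward network; the 4x expansion factor is an assumption.
        self.ffw = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x, encoder_output, src_mask=None, tgt_mask=None):
        # Self-attention on the target sequence (tgt_mask), with residual add.
        h = self.ln_1(x)
        sa_out, sa_weights = self.attn1(h, h, h, attn_mask=tgt_mask)
        x = x + sa_out
        # Cross-attention: queries from the decoder, keys/values from the
        # encoder output; src_mask applies here.
        h = self.ln_2(x)
        ca_out, ca_weights = self.attn2(h, encoder_output, encoder_output,
                                        attn_mask=src_mask)
        x = x + ca_out
        # Position-wise feed-forward network with residual add.
        x = x + self.ffw(self.ln_3(x))
        return x, sa_weights, ca_weights

blk = DecoderBlockSketch(n_embd=32, n_head=4)
x = torch.randn(2, 5, 32)            # (batch, target length, n_embd)
enc = torch.randn(2, 7, 32)          # (batch, source length, n_embd)
out, sa_w, ca_w = blk(x, enc)
```

The returned tuple mirrors the documented signature: the block output plus the self-attention and cross-attention weight matrices.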

transformer_implementation.blocks.EncoderBlock module

class transformer_implementation.blocks.EncoderBlock.EncoderBlock(config)

Bases: Module

Implements an encoder block of a Transformer model in PyTorch.

This class is a child of the PyTorch nn.Module class. It consists of two main components: multi-head attention and feed-forward neural network, each followed by a layer normalization.

Attributes

ln_1 : LayerNorm

Layer normalization before the multi-head attention block.

attn : MultiHeadAttention

Multi-head attention block.

ln_2 : LayerNorm

Layer normalization before the feed-forward neural network block.

ffw : FeedForward

Feed-forward neural network block.

Methods

forward(x: torch.Tensor, mask: Optional[torch.Tensor]=None) -> Tuple[torch.Tensor, torch.Tensor]:

Computes the forward pass of the encoder block.

Parameters

config : object

A configuration object with the following attributes:

n_embd (int): The size of the input and output feature vectors.

bias (bool): If True, the layer normalization layers will include a bias term.

forward(x, mask=None) → Tuple[torch.Tensor, torch.Tensor]

Implements the forward pass of the encoder block.

First, it applies layer normalization and then applies multi-head attention. The input is then added to the output of the multi-head attention and passed through another layer normalization. Finally, it applies the feed-forward network and adds its output to the input of the feed-forward network.

Parameters

x : torch.Tensor

The input tensor, with feature dimension n_embd.

mask : Optional[torch.Tensor], default=None

An optional mask tensor to be applied on the attention mechanism.

Returns

Tuple[torch.Tensor, torch.Tensor]

The output tensor from the encoder block and the attention tensor from its multi-head attention layer.
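The forward pass described above (pre-norm attention, residual add, pre-norm feed-forward, residual add) can be sketched as follows. As before this is an illustrative stand-in, not the package's implementation: `torch.nn.MultiheadAttention` replaces the package's `MultiHeadAttention`, and `n_head` and the 4x feed-forward expansion are assumptions.

```python
import torch
import torch.nn as nn

class EncoderBlockSketch(nn.Module):
    """Illustrative sketch of the documented EncoderBlock interface."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        # The real class also honors config.bias in its LayerNorm layers.
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.ffw = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x, mask=None):
        # Layer norm, then multi-head self-attention; residual add of the input.
        h = self.ln_1(x)
        attn_out, attn_weights = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        # Second layer norm, feed-forward network, residual add.
        x = x + self.ffw(self.ln_2(x))
        return x, attn_weights

enc = EncoderBlockSketch(n_embd=32, n_head=4)
x = torch.randn(2, 7, 32)        # (batch, sequence length, n_embd)
out, attn_w = enc(x)
```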

Module contents