transformer_implementation.blocks package¶
Submodules¶
transformer_implementation.blocks.DecoderBlock module¶
- class transformer_implementation.blocks.DecoderBlock.DecoderBlock(config)¶
Bases:
Module
Implements a decoder block module in PyTorch, as part of a transformer architecture.
This class is a child of the PyTorch nn.Module class. It includes self-attention, cross-attention with the encoder’s output, and a feed-forward network.
Attributes¶
- ln_1, ln_2, ln_3 : LayerNorm
Layer normalization layers that normalize the input tensor before the attention and feed-forward networks.
- attn1, attn2 : MultiHeadAttention
Multi-head attention layers. attn1 is for self-attention, while attn2 is for cross-attention with the encoder’s output.
- ffw : FeedForward
The feed-forward network layer.
Methods¶
- forward(x: torch.Tensor, encoder_output: torch.Tensor, src_mask: Optional[torch.Tensor]=None, tgt_mask: Optional[torch.Tensor]=None) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
Computes the forward pass of the network.
Parameters¶
- config : object
- A configuration object with the following attributes:
n_embd (int): The size of the input and output feature vectors.
bias (bool): If True, the layer normalization layers will include a bias term.
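Any object exposing these fields will do; a minimal sketch using a dataclass (the class name and defaults here are illustrative, not part of the package — a real config would also carry attention settings such as the number of heads):

```python
from dataclasses import dataclass

@dataclass
class Config:
    n_embd: int = 512   # size of the input and output feature vectors
    bias: bool = True   # whether LayerNorm layers include a bias term

config = Config(n_embd=256, bias=False)
```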
- forward(x, encoder_output, src_mask=None, tgt_mask=None) → Tuple[Tensor, Tensor, Tensor]¶
Implements the forward pass of the decoder block. The method applies self-attention, cross-attention with the encoder’s output, and a feed-forward network to the input tensor.
Parameters¶
- x : torch.Tensor
The input tensor from the previous decoder block or the embedding layer.
- encoder_output : torch.Tensor
The output tensor from the encoder.
- src_mask : Optional[torch.Tensor], default=None
The source mask tensor. If provided, it will be used in the cross-attention layer.
- tgt_mask : Optional[torch.Tensor], default=None
The target mask tensor. If provided, it will be used in the self-attention layer.
Returns¶
- Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
The output tensor from the decoder block, and the attention matrices from the self-attention and cross-attention layers.
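The package's internals are not reproduced here, but the documented structure (pre-norm residuals around self-attention, cross-attention, and a feed-forward network) can be sketched with `torch.nn.MultiheadAttention` standing in for the package's `MultiHeadAttention`; the class name, head count, and MLP shape below are assumptions:

```python
import torch
import torch.nn as nn

class DecoderBlockSketch(nn.Module):
    """Illustrative stand-in for DecoderBlock, not the package's code."""

    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn1 = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.attn2 = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_3 = nn.LayerNorm(n_embd)
        # Hypothetical two-layer MLP in place of the package's FeedForward.
        self.ffw = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x, encoder_output, src_mask=None, tgt_mask=None):
        # Self-attention over the target sequence, with a residual connection.
        h = self.ln_1(x)
        sa, sa_weights = self.attn1(h, h, h, attn_mask=tgt_mask)
        x = x + sa
        # Cross-attention: queries from the decoder, keys/values from the encoder.
        h = self.ln_2(x)
        ca, ca_weights = self.attn2(h, encoder_output, encoder_output,
                                    attn_mask=src_mask)
        x = x + ca
        # Position-wise feed-forward network, again with a residual connection.
        x = x + self.ffw(self.ln_3(x))
        return x, sa_weights, ca_weights

block = DecoderBlockSketch(n_embd=16, n_head=4)
x = torch.randn(2, 5, 16)            # (batch, tgt_len, n_embd)
enc_out = torch.randn(2, 7, 16)      # (batch, src_len, n_embd)
out, sa_w, ca_w = block(x, enc_out)
```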
transformer_implementation.blocks.EncoderBlock module¶
- class transformer_implementation.blocks.EncoderBlock.EncoderBlock(config)¶
Bases:
Module
Implements an encoder block of a Transformer model in PyTorch.
This class is a child of the PyTorch nn.Module class. It consists of two main components: multi-head attention and feed-forward neural network, each followed by a layer normalization.
Attributes¶
- ln_1 : LayerNorm
Layer normalization before the multi-head attention block.
- attn : MultiHeadAttention
Multi-head attention block.
- ln_2 : LayerNorm
Layer normalization before the feed-forward neural network block.
- ffw : FeedForward
Feed-forward neural network block.
Methods¶
- forward(x: torch.Tensor, mask: Optional[torch.Tensor]=None) -> Tuple[torch.Tensor, torch.Tensor]:
Computes the forward pass of the encoder block.
Parameters¶
- config : object
- A configuration object with the following attributes:
n_embd (int): The size of the input and output feature vectors.
bias (bool): If True, the layer normalization layers will include a bias term.
- forward(x, mask=None) → Tuple[Tensor, Tensor]¶
Implements the forward pass of the encoder block.
First, it applies layer normalization followed by multi-head attention, and adds the result back to the input (a residual connection). The sum is normalized again and passed through the feed-forward network, whose output is added to its input via a second residual connection.
Parameters¶
- xtorch.Tensor
The input tensor with a size of n_embd.
- mask : Optional[torch.Tensor], default=None
An optional mask tensor to be applied on the attention mechanism.
Returns¶
- Tuple[torch.Tensor, torch.Tensor]
The output tensor from the encoder block and the attention tensor from the multi-head attention layer.
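The residual structure described above can be sketched in plain PyTorch, with `torch.nn.MultiheadAttention` as a stand-in for the package's `MultiHeadAttention`; the class name and MLP shape are assumptions, not the package's actual code:

```python
import torch
import torch.nn as nn

class EncoderBlockSketch(nn.Module):
    """Illustrative stand-in for EncoderBlock, not the package's code."""

    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(n_embd)
        # Hypothetical two-layer MLP in place of the package's FeedForward.
        self.ffw = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x, mask=None):
        # Normalize, attend, then add the attention output back to the input.
        h = self.ln_1(x)
        att, att_weights = self.attn(h, h, h, attn_mask=mask)
        x = x + att
        # Normalize again, apply the feed-forward network, and add it back.
        x = x + self.ffw(self.ln_2(x))
        return x, att_weights

block = EncoderBlockSketch(n_embd=16, n_head=4)
x = torch.randn(2, 5, 16)   # (batch, seq_len, n_embd)
out, att_w = block(x)
```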