gromo.modules.linear_growing_module.LinearGrowingModule#

class gromo.modules.linear_growing_module.LinearGrowingModule(in_features: int, out_features: int, use_bias: bool = True, post_layer_function: Module = Identity(), extended_post_layer_function: Module | None = None, previous_module: GrowingModule | MergeGrowingModule | None = None, next_module: GrowingModule | MergeGrowingModule | None = None, allow_growing: bool = False, device: device | None = None, name: str | None = None, target_in_features: int | None = None)[source]#

LinearGrowingModule is a GrowingModule for a Linear layer.

Parameters:
  • in_features (int) – input features

  • out_features (int) – output features

  • use_bias (bool, optional) – use bias, by default True

  • post_layer_function (torch.nn.Module, optional) – activation function, by default torch.nn.Identity()

  • extended_post_layer_function (torch.nn.Module | None, optional) – extended activation function, by default None

  • previous_module (GrowingModule | MergeGrowingModule | None, optional) – the preceding growing module, by default None

  • next_module (GrowingModule | MergeGrowingModule | None, optional) – the succeeding growing module, by default None

  • allow_growing (bool, optional) – allow growth of this module, by default False

  • device (torch.device | None, optional) – default device, by default None

  • name (str | None, optional) – name of the module, by default None

  • target_in_features (int | None, optional) – target fan-in size, by default None

add_parameters(matrix_extension: Tensor | None, bias_extension: Tensor | None, added_in_features: int = 0, added_out_features: int = 0) None[source]#

Add new parameters to the layer.

Parameters:
  • matrix_extension (torch.Tensor | None) – extension of the weight matrix of the layer; if None, the layer is extended with zeros. Should be of shape (out_features, added_in_features) if added_in_features > 0, or (added_out_features, in_features) if added_out_features > 0.

  • bias_extension (torch.Tensor | None) – extension of the bias vector of the layer, of shape (added_out_features,); if None, the layer is extended with zeros.

  • added_in_features (int, optional) – number of input features added; if 0, the number of input features is not changed, by default 0

  • added_out_features (int, optional) – number of output features added; if 0, the number of output features is not changed, by default 0

Raises:

AssertionError – if we try to add input and output features at the same time
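The shape constraints above can be illustrated with plain tensors (a sketch, not the module's own code): extending on the input side appends columns to the weight matrix, extending on the output side appends rows, and a None extension corresponds to a zero block.

```python
import torch

out_features, in_features = 2, 3
weight = torch.randn(out_features, in_features)

# Input-side extension: added_in_features = 2 -> block of shape (out_features, 2).
in_ext = torch.zeros(out_features, 2)          # zeros, as when matrix_extension is None
grown_in = torch.cat([weight, in_ext], dim=1)  # shape (2, 5)

# Output-side extension: added_out_features = 1 -> block of shape (1, in_features).
out_ext = torch.zeros(1, in_features)
grown_out = torch.cat([weight, out_ext], dim=0)  # shape (3, 3)

print(grown_in.shape, grown_out.shape)
```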

compute_cross_covariance_update() tuple[Tensor, int][source]#

Compute the update of the tensor P := B[-2]^T B[-1].

Returns:

  • torch.Tensor – update of the tensor P

  • int – number of samples used to compute the update

Raises:
  • ValueError – if there is no previous module

  • NotImplementedError – if the previous module is not of type LinearGrowingModule or LinearMergeGrowingModule
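In plain tensors, this update is a single matmul between the stored inputs of the previous and current layers (a sketch; B[-2] and B[-1] denote those stored input batches):

```python
import torch

n = 8                        # number of samples in the batch
b_prev = torch.randn(n, 4)   # B[-2]: input of the previous layer
b_curr = torch.randn(n, 3)   # B[-1]: input of this layer

p_update = b_prev.T @ b_curr  # update of P, shape (4, 3)
print(p_update.shape, n)
```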

compute_m_prev_update(desired_activation: Tensor | None = None) tuple[Tensor, int][source]#

Compute the update of the tensor M_{-2} := B[-2]^T dA.

Parameters:

desired_activation (torch.Tensor | None) – desired variation direction of the output of the layer

Returns:

  • torch.Tensor – update of the tensor M_{-2}

  • int – number of samples used to compute the update

Raises:
  • ValueError – if there is no previous module

  • NotImplementedError – if the previous module is not of type LinearGrowingModule or LinearMergeGrowingModule

compute_m_update(desired_activation: Tensor | None = None) tuple[Tensor, int][source]#

Compute the update of the tensor M. With the input tensor B[-1] and dLoss/dA the gradient of the loss with respect to the pre-activity, the update is M = B[-1]^T dA.

Parameters:

desired_activation (torch.Tensor | None) – desired variation direction of the output of the layer

Returns:

  • torch.Tensor – update of the tensor M

  • int – number of samples used to compute the update
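As a plain-tensor sketch of the formula above (not the module's own code), with a batch of stored inputs and the corresponding pre-activity gradients:

```python
import torch

n = 8
b = torch.randn(n, 3)   # B[-1]: input of the layer, 3 input features
da = torch.randn(n, 2)  # dLoss/dA: gradient w.r.t. the pre-activity, 2 output features

m_update = b.T @ da     # update of M, shape (in_features, out_features)
print(m_update.shape)
```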

compute_n_update() tuple[Tensor, int][source]#

Compute the update of the tensor N. With the input tensor X and V[+1] the projected desired update at the next layer (V[+1] = dL/dA[+1] - dW[+1]* B), the update is U^{j k} = X^{i j} V[+1]^{i k}.

Returns:

  • torch.Tensor – update of the tensor N

  • int – number of samples used to compute the update

Raises:

TypeError – if the next module is not of type LinearGrowingModule

compute_s_update() tuple[Tensor, int][source]#

Compute the update of the tensor S. With the input tensor B, the update is U^{j k} = B^{i j} B^{i k}.

Returns:

  • torch.Tensor – update of the tensor S

  • int – number of samples used to compute the update
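The S update is the Gram matrix of the (bias-extended) input, as a plain-tensor sketch; when a bias is used, the input is first extended with a column of ones (cf. the input_extended property):

```python
import torch

n, in_features = 8, 3
b = torch.randn(n, in_features)

# With a bias, the input is extended with a column of ones.
b_ext = torch.cat([b, torch.ones(n, 1)], dim=1)  # shape (n, in_features + 1)

s_update = b_ext.T @ b_ext  # update of S: symmetric, shape (4, 4)
print(s_update.shape)
```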

create_layer_in_extension(extension_size: int) None[source]#

Create the layer input extension of given size.

Parameters:

extension_size (int) – size of the extension to create

create_layer_out_extension(extension_size: int) None[source]#

Create the layer output extension of given size.

Parameters:

extension_size (int) – size of the extension to create

static get_fan_in_from_layer(layer: Linear) int[source]#

Get the fan_in (number of input features) from a given layer.

Parameters:

layer (torch.nn.Linear) – layer to get the fan_in from

Returns:

fan_in of the layer

Return type:

int

property in_features: int#

Fan-in size

Returns:

fan-in size

Return type:

int

property in_neurons: int#

Fan-in size

Returns:

fan-in size

Return type:

int

property input_extended: Tensor#

Return the input extended with a column of ones if the bias is used.

Returns:

input extended

Return type:

torch.Tensor

property input_volume: int#

Expected input volume. For linear layers this reduces to the number of input features.

Returns:

input volume

Return type:

int

layer_in_extension(weight: Tensor) None[source]#

Extend the layer with the given parameters, assuming that the input of the layer is extended but not the output.

Parameters:

weight (torch.Tensor) – weight of the extension of shape (out_features, K)

layer_of_tensor(weight: Tensor, bias: Tensor | None = None, force_bias: bool = True) Linear[source]#

Create a layer with the same characteristics (except the shape), with weight as the weight and bias as the bias.

Parameters:
  • weight (torch.Tensor) – weight of the layer

  • bias (torch.Tensor | None) – bias of the layer

  • force_bias (bool) – if True, the created layer requires a bias if self.use_bias is True

Returns:

layer with the same characteristics

Return type:

torch.nn.Linear
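What such a construction amounts to in plain PyTorch (a sketch of the idea, not the method's implementation): build a torch.nn.Linear of matching shape and copy the given tensors into its parameters.

```python
import torch

weight = torch.randn(2, 3)  # (out_features, in_features)
bias = torch.randn(2)

# Build a Linear of matching shape and load the given weight and bias into it.
layer = torch.nn.Linear(in_features=3, out_features=2, bias=True)
with torch.no_grad():
    layer.weight.copy_(weight)
    layer.bias.copy_(bias)

# The resulting layer computes x @ weight.T + bias.
x = torch.randn(5, 3)
print(torch.allclose(layer(x), x @ weight.T + bias, atol=1e-5))
```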

layer_out_extension(weight: Tensor, bias: Tensor | None = None) None[source]#

Extend the layer with the given parameters, assuming that the output of the layer is extended but not the input.

Parameters:
  • weight (torch.Tensor) – weight of the extension with shape (K, in_features)

  • bias (torch.Tensor | None, optional) – bias of the extension, if needed, of shape (K,)
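Extending on the output side appends K rows to the weight and K entries to the bias (a plain-tensor sketch, not the module's own code):

```python
import torch

weight = torch.randn(2, 3)  # (out_features, in_features)
bias = torch.randn(2)

k = 2                       # K new output neurons
w_ext = torch.randn(k, 3)   # extension weight, shape (K, in_features)
b_ext = torch.randn(k)      # extension bias, shape (K,)

new_weight = torch.cat([weight, w_ext], dim=0)  # shape (4, 3)
new_bias = torch.cat([bias, b_ext], dim=0)      # shape (4,)
print(new_weight.shape, new_bias.shape)
```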

number_of_parameters() int[source]#

Return the number of parameters of the layer.

Returns:

number of parameters

Return type:

int

property out_features: int#

Fan-out size

Returns:

fan-out size

Return type:

int

property output_volume: int#

Expected output volume. For linear layers this reduces to the number of output features.

Returns:

output volume

Return type:

int

property tensor_n: Tensor#

Compute the tensor N for the layer with the current M_{-2}, P and optimal delta.

Returns:

N

Return type:

torch.Tensor

Examples using gromo.modules.linear_growing_module.LinearGrowingModule#

GroMo tutorial
