gromo.modules.growing_module.MergeGrowingModule#

class gromo.modules.growing_module.MergeGrowingModule(post_merge_function: Module = Identity(), previous_modules: list[MergeGrowingModule | GrowingModule] | None = None, next_modules: list[MergeGrowingModule | GrowingModule] | None = None, allow_growing: bool = False, tensor_s_shape: tuple[int, int] | None = None, device: device | None = None, name: str | None = None)[source]#

Module to connect multiple modules with a merge operation. This module does not perform the merge operation itself; that is left to the user.
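As a rough illustration of the role this class plays, here is a plain-Python sketch of the bookkeeping (a toy stand-in, not the gromo implementation: the real class is a torch.nn.Module and tracks far more state):

```python
class MergeSketch:
    """Toy stand-in: tracks previous/next modules and applies a
    post-merge function to an input that the caller already merged."""

    def __init__(self, post_merge_function=lambda x: x):
        self.post_merge_function = post_merge_function
        self.previous_modules = []
        self.next_modules = []

    def add_previous_module(self, module):
        self.previous_modules.append(module)

    def add_next_module(self, module):
        self.next_modules.append(module)

    def forward(self, x):
        # The merge itself (e.g. summing branch outputs) is done by
        # the caller; this module only post-processes the result.
        return self.post_merge_function(x)


# Hypothetical usage: two branches feed the merge point, and the
# post-merge function here is a scalar ReLU.
merge = MergeSketch(post_merge_function=lambda x: max(x, 0.0))
merge.add_previous_module("branch_a")
merge.add_previous_module("branch_b")
merge.forward(-2.0)  # post_merge_function clamps negatives to 0.0
```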

add_next_module(module: MergeGrowingModule | GrowingModule) None[source]#

Add a module to the next modules of the current module.

Parameters:

module (MergeGrowingModule | GrowingModule) – next module to add

add_previous_module(module: MergeGrowingModule | GrowingModule) None[source]#

Add a module to the previous modules of the current module.

Parameters:

module (MergeGrowingModule | GrowingModule) – previous module to add

compute_optimal_delta(update: bool = True, return_deltas: bool = False, force_pseudo_inverse: bool = False, dtype: dtype = torch.float32) list[tuple[Tensor, Tensor]] | None[source]#

Compute the optimal delta for each previous layer using the current S and M tensors: dW* = M S[-1]^-1 (using the pseudo-inverse when needed). Compute dW* (and dBias* if needed) and update the optimal_delta_layer attribute.

Parameters:
  • update (bool, optional) – if True update the optimal delta layer attribute, by default True

  • return_deltas (bool, optional) – if True return the deltas, by default False

  • force_pseudo_inverse (bool, optional) – if True, use the pseudo-inverse to compute the optimal delta even if the matrix is invertible, by default False

  • dtype (torch.dtype) – dtype for S and M during the computation

Returns:

optimal delta for the weights and the biases if needed

Return type:

list[tuple[torch.Tensor, torch.Tensor]] | None
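The linear algebra behind the formula can be illustrated with plain NumPy. This is a hedged sketch of the math only, not gromo's implementation; `optimal_delta` is a hypothetical helper, and S and M stand for the second-moment and cross-term tensors named in the docstring:

```python
import numpy as np


def optimal_delta(M: np.ndarray, S: np.ndarray,
                  force_pseudo_inverse: bool = False) -> np.ndarray:
    """Compute dW* = M @ S^-1, falling back to the Moore-Penrose
    pseudo-inverse when S is singular (or when forced)."""
    if not force_pseudo_inverse:
        try:
            return M @ np.linalg.inv(S)
        except np.linalg.LinAlgError:
            pass  # S is singular: fall through to the pseudo-inverse
    return M @ np.linalg.pinv(S)


# Invertible S: the plain inverse is used.
S = np.array([[2.0, 0.0], [0.0, 4.0]])
M = np.array([[2.0, 4.0]])
dW = optimal_delta(M, S)  # [[1.0, 1.0]]
```

A singular S (e.g. a rank-deficient second-moment matrix) silently triggers the pseudo-inverse branch, which is why the `force_pseudo_inverse` flag exists: it lets you skip the failed factorization attempt entirely.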

compute_previous_m_update() tuple[Tensor, int][source]#

Compute the update of the tensor M for the input of all previous modules.

Returns:

  • torch.Tensor – update of the tensor M

  • int – number of samples used to compute the update

Raises:

NotImplementedError – abstract method

compute_previous_s_update() tuple[Tensor, int][source]#

Compute the update of the tensor S for the input of all previous modules.

Returns:

  • torch.Tensor – update of the tensor S

  • int – number of samples used to compute the update

Raises:

NotImplementedError – abstract method

compute_s_update() tuple[Tensor, int][source]#

Compute the update of the tensor S. Must be implemented for each type of layer.

Returns:

  • torch.Tensor – update of the tensor S

  • int – number of samples used to compute the update

Raises:

NotImplementedError – abstract method

delete_update(include_previous: bool = False) None[source]#

Delete the update of the optimal added parameters.

forward(x: Tensor) Tensor[source]#

Forward pass of the module. If needed, store the activity and pre-activity tensors.

Parameters:

x (torch.Tensor) – input tensor

Returns:

output tensor

Return type:

torch.Tensor

grow()[source]#

Function to call after growing previous or next modules.

init_computation() None[source]#

Initialize the computation of the optimal added parameters.

property input_volume: int#

Expected input volume

Returns:

input volume

Return type:

int

Raises:

NotImplementedError – abstract method

property number_of_parameters: int#

Get the number of parameters of the layer

Returns:

number of parameters

Return type:

int

property number_of_predecessors: int#

Get the number of preceding modules

Returns:

number of previous modules

Return type:

int

property number_of_successors: int#

Get the number of succeeding modules

Returns:

number of next modules

Return type:

int

property output_volume: int#

Expected output volume

Returns:

output volume

Return type:

int

Raises:

NotImplementedError – abstract method

parameters(recurse: bool = True) Iterator[Parameter][source]#

Parameter iterator

Parameters:

recurse (bool, optional) – use recursion, by default True

Returns:

parameters iterator

Return type:

Iterator[torch.nn.Parameter]

property pre_activity: Tensor#

Get the pre-activity of the layer

Returns:

pre-activity tensor

Return type:

torch.Tensor

projected_v_goal() Tensor[source]#

Compute the projected gradient of the goal with respect to the activity of the layer.

dLoss/dA_proj := dLoss/dA - dW B[-1], where A is the pre-activation vector of the layer and dW is the optimal delta for all the previous layers.

Returns:

projected gradient of the goal with respect to the activity of the next layer: dLoss/dA - dW B[-1]

Return type:

torch.Tensor
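The projection above can be sketched in NumPy. The names dLdA, dW, and B are placeholders for the quantities in the formula (rows are samples in this sketch); this illustrates the algebra only, not gromo's code:

```python
import numpy as np

# dLoss/dA: gradient of the goal w.r.t. the pre-activity (n samples x d_out)
dLdA = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
# B: input activity of the previous layers (n samples x d_in)
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])
# dW: optimal delta from compute_optimal_delta (d_out x d_in)
dW = np.array([[0.5, 0.0],
               [0.0, 0.5]])

# Projected gradient: the part of dLoss/dA that the optimal update
# of the previous layers cannot already explain.
v_proj = dLdA - B @ dW.T
```

Intuitively, new neurons are then fitted against `v_proj` rather than the raw gradient, so they do not duplicate what the optimal delta of the existing weights already captures.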

reset_computation() None[source]#

Reset the computation of the optimal added parameters.

set_next_modules(next_modules: list[MergeGrowingModule | GrowingModule]) None[source]#

Set the next modules of the current module.

Parameters:

next_modules (list[MergeGrowingModule | GrowingModule]) – list of next modules

Raises:

NotImplementedError – abstract method

set_previous_modules(previous_modules: list[MergeGrowingModule | GrowingModule]) None[source]#

Set the previous modules of the current module.

Parameters:

previous_modules (list[MergeGrowingModule | GrowingModule]) – list of previous modules

Raises:

NotImplementedError – abstract method

sum_in_features(with_bias: bool = False) int[source]#

Count the total in_features of the previous modules.

Parameters:

with_bias (bool, optional) – include the bias term in the sum, by default False

Returns:

sum of previous in_features

Return type:

int

sum_out_features() int[source]#

Count the total out_features of the next modules.

Returns:

sum of next out_features

Return type:

int
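A plain-Python sketch of what these two helpers compute. The `Stub` dataclass is a stand-in for a gromo module, and counting one extra input per module when `with_bias` is set is an assumption about how the bias is accounted for:

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """Hypothetical stand-in exposing only the feature counts."""
    in_features: int
    out_features: int


previous_modules = [Stub(4, 8), Stub(6, 8)]
next_modules = [Stub(8, 3), Stub(8, 5)]


def sum_in_features(modules, with_bias=False):
    # Total fan-in of the previous modules; with_bias (assumed here)
    # counts one extra input per module for the bias column.
    return sum(m.in_features + (1 if with_bias else 0) for m in modules)


def sum_out_features(modules):
    # Total fan-out of the next modules.
    return sum(m.out_features for m in modules)


sum_in_features(previous_modules)                   # 4 + 6 = 10
sum_in_features(previous_modules, with_bias=True)   # (4+1) + (6+1) = 12
sum_out_features(next_modules)                      # 3 + 5 = 8
```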

update_computation() None[source]#

Update the computation of the optimal added parameters.

update_scaling_factor(scaling_factor: Tensor | float) None[source]#

Update the scaling factor of all next modules and the _next_module_scaling_factor of the previous modules. Only works if both the previous and next modules are of type GrowingModule.

Parameters:

scaling_factor (torch.Tensor | float) – scaling factor to apply to the optimal delta

Raises:

TypeError – if the previous and next modules are not of type GrowingModule

update_size() None[source]#

Update the size of the module: check the number of previous modules and update the input channels and tensor sizes accordingly.