gromo.utils.tools.compute_optimal_added_parameters

gromo.utils.tools.compute_optimal_added_parameters(matrix_s: Tensor | None, matrix_n: Tensor, numerical_threshold: float = 1e-06, statistical_threshold: float = 0.001, maximum_added_neurons: int | None = None, alpha_zero: bool = False, omega_zero: bool = False, ignore_singular_values: bool = False) -> tuple[Tensor, Tensor, Tensor]

Compute the optimal added parameters for a given layer.

This function takes primitive options (thresholds and boolean flags) rather than method names.

Parameters:
  • matrix_s (torch.Tensor | None) – Square matrix S of shape (s, s). If None, the identity matrix is used.

  • matrix_n (torch.Tensor) – Matrix N (correlation matrix) of shape (s, t).

  • numerical_threshold (float) – Threshold below which an eigenvalue of S is treated as zero when computing S^{-1/2}, the inverse square root of S.

  • statistical_threshold (float) – Threshold below which a singular value is treated as zero in the SVD.

  • maximum_added_neurons (int | None) – Maximum number of added neurons. If None, all significant neurons are kept.

  • alpha_zero (bool) – If True, set alpha (incoming weights) to zero, else compute from SVD.

  • omega_zero (bool) – If True, set omega (outgoing weights) to zero, else compute from SVD.

  • ignore_singular_values (bool) – If True, ignore the actual singular values and treat them as 1 for computing alpha and omega, effectively only using the singular vectors for the update direction.

Returns:

  • torch.Tensor – Optimal added weights alpha, shape (k, s), where k is the number of added neurons.

  • torch.Tensor – Optimal added weights omega, shape (t, k).

  • torch.Tensor – Singular values s, shape (k,).

Raises:

torch.linalg.LinAlgError – If the SVD of S^{-1/2} N fails.
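
The steps named in the parameter descriptions (a thresholded S^{-1/2}, then a truncated SVD of S^{-1/2} N) can be sketched in plain PyTorch. This is an illustrative re-derivation, not the gromo implementation: the helper name `sketch_added_parameters` is hypothetical, the even split of each singular value between alpha and omega is an assumption, and the `alpha_zero`, `omega_zero`, and `ignore_singular_values` options are not modeled.

```python
import torch


def sketch_added_parameters(
    matrix_s,
    matrix_n,
    numerical_threshold=1e-6,
    statistical_threshold=1e-3,
    maximum_added_neurons=None,
):
    """Hypothetical helper: a sketch only, NOT the gromo implementation."""
    # Use the identity for S when it is not provided, as documented.
    if matrix_s is None:
        matrix_s = torch.eye(matrix_n.shape[0], dtype=matrix_n.dtype)

    # S^{-1/2} from the eigendecomposition of the symmetric matrix S.
    # Assumption: eigenvalues at or below the numerical threshold contribute 0.
    eigenvalues, eigenvectors = torch.linalg.eigh(matrix_s)
    keep = eigenvalues > numerical_threshold
    inv_sqrt = torch.zeros_like(eigenvalues)
    inv_sqrt[keep] = eigenvalues[keep].rsqrt()
    s_inv_sqrt = (eigenvectors * inv_sqrt) @ eigenvectors.T

    # Reduced SVD of S^{-1/2} N, which has shape (s, t).
    # Singular values come back sorted in descending order.
    u, sigma, vh = torch.linalg.svd(s_inv_sqrt @ matrix_n, full_matrices=False)

    # Keep the singular values above the statistical threshold,
    # capped at maximum_added_neurons when a cap is given.
    k = int((sigma > statistical_threshold).sum())
    if maximum_added_neurons is not None:
        k = min(k, maximum_added_neurons)
    u, sigma, vh = u[:, :k], sigma[:k], vh[:k, :]

    # Assumed scaling: sqrt(sigma) goes to each of alpha and omega.
    root = sigma.sqrt()
    alpha = ((s_inv_sqrt @ u) * root).T  # shape (k, s)
    omega = vh.T * root                  # shape (t, k)
    return alpha, omega, sigma


torch.manual_seed(0)
x = torch.randn(50, 4, dtype=torch.float64)
matrix_s = x.T @ x / 50                            # (s, s) with s = 4
matrix_n = torch.randn(4, 3, dtype=torch.float64)  # (s, t) with t = 3
alpha, omega, sigma = sketch_added_parameters(
    matrix_s, matrix_n, maximum_added_neurons=2
)
```

Under these assumptions, when S is well conditioned and no singular value is discarded, the product `omega @ alpha` equals N^T S^{-1}. This gives a quick sanity check on the shapes and scaling.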