gromo.utils.tools.compute_optimal_added_parameters
- gromo.utils.tools.compute_optimal_added_parameters(matrix_s: Tensor | None, matrix_n: Tensor, numerical_threshold: float = 1e-06, statistical_threshold: float = 0.001, maximum_added_neurons: int | None = None, alpha_zero: bool = False, omega_zero: bool = False, ignore_singular_values: bool = False) -> tuple[Tensor, Tensor, Tensor]
Compute the optimal added parameters for a given layer.
This function takes primitive options (thresholds and boolean flags) rather than method names.
- Parameters:
matrix_s (torch.Tensor | None) – Square matrix S of shape (s, s). If None, identity matrix is used.
matrix_n (torch.Tensor) – Matrix N (correlation matrix) of shape (s, t).
numerical_threshold (float) – Threshold below which an eigenvalue of S is treated as zero when computing the square root of the inverse of S.
statistical_threshold (float) – Threshold below which a singular value is treated as zero in the SVD.
maximum_added_neurons (int | None) – Maximum number of added neurons. If None, all significant neurons are kept.
alpha_zero (bool) – If True, set alpha (incoming weights) to zero, else compute from SVD.
omega_zero (bool) – If True, set omega (outgoing weights) to zero, else compute from SVD.
ignore_singular_values (bool) – If True, ignore the actual singular values and treat them as 1 for computing alpha and omega, effectively only using the singular vectors for the update direction.
- Returns:
torch.Tensor – Optimal added weights alpha, shape (k, s).
torch.Tensor – Optimal added weights omega, shape (t, k).
torch.Tensor – Singular values s, shape (k,).
- Raises:
torch.linalg.LinAlgError – If SVD of S^{-1/2} N fails.
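The parameter descriptions above suggest a pipeline: form S^{-1/2} from an eigendecomposition, take the SVD of S^{-1/2} N, discard small singular values, and build alpha and omega from the singular vectors. The following is a minimal sketch of that pipeline in plain PyTorch, not the gromo source. The function name sketch_added_parameters is hypothetical, and the way each singular value is split as a square root between alpha and omega is an assumption. The alpha_zero, omega_zero and ignore_singular_values flags are left out.

```python
import torch


def sketch_added_parameters(
    matrix_s,
    matrix_n,
    numerical_threshold=1e-6,
    statistical_threshold=1e-3,
    maximum_added_neurons=None,
):
    """Hypothetical illustration of the documented steps; not the gromo code."""
    if matrix_s is None:
        # Documented behaviour: a missing S is replaced by the identity.
        matrix_s = torch.eye(
            matrix_n.shape[0], dtype=matrix_n.dtype, device=matrix_n.device
        )

    # S^{-1/2} from a symmetric eigendecomposition. Eigenvalues below the
    # numerical threshold are treated as zero (assumption: their inverse
    # square root is set to 0 rather than raising an error).
    eigenvalues, eigenvectors = torch.linalg.eigh(matrix_s)
    keep = eigenvalues > numerical_threshold
    inv_sqrt = torch.zeros_like(eigenvalues)
    inv_sqrt[keep] = eigenvalues[keep].rsqrt()
    s_inv_sqrt = (eigenvectors * inv_sqrt) @ eigenvectors.T

    # SVD of S^{-1/2} N; singular values come back in descending order.
    u, sigma, vh = torch.linalg.svd(s_inv_sqrt @ matrix_n, full_matrices=False)

    # Keep singular values above the statistical threshold, then cap the count.
    k = int((sigma >= statistical_threshold).sum())
    if maximum_added_neurons is not None:
        k = min(k, maximum_added_neurons)
    u, sigma, vh = u[:, :k], sigma[:k], vh[:k, :]

    # Assumed split: sqrt(sigma) is assigned to each side, so that
    # alpha.T @ omega.T equals S^{-1/2} U diag(sigma) V^T.
    root = sigma.sqrt()
    alpha = ((s_inv_sqrt @ u) * root).T  # shape (k, s)
    omega = vh.T * root                  # shape (t, k)
    return alpha, omega, sigma


# N has singular values 3 and 1e-5; only the first passes the default threshold.
matrix_n = torch.tensor([[3.0, 0.0], [0.0, 1e-5], [0.0, 0.0]])
alpha, omega, sigma = sketch_added_parameters(None, matrix_n)
print(alpha.shape, omega.shape, sigma.shape)
```

With S taken as the identity, the product alpha.T @ omega.T in this sketch is the rank-k truncation of N, which makes it easy to see the effect of statistical_threshold and maximum_added_neurons on the number of returned rows and columns.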