SENN

SENN [MMKM24] extends GradMax by taking a natural gradient descent step instead of a standard gradient descent step, and by switching from the initialization \(\boldsymbol{\Psi}= 0\) to \(\boldsymbol{\Omega}= 0\). Using a K-FAC approximation, it maximizes the norm of the natural gradient with respect to \(\begin{pmatrix} \boldsymbol{\Omega}\\ W_l \end{pmatrix}\). In practice, however, SENN maximizes only the norm of the gradient with respect to \(\boldsymbol{\Omega}\), backpropagating the residual gradient \(\boldsymbol{G}^\perp\) to avoid redundancy with existing neurons. This norm also serves as a trigger: SENN extends a layer when and where the norm exceeds a predefined threshold.
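The trigger mechanism can be sketched as follows. This is a minimal illustration, not SENN's actual implementation: it omits the natural gradient and the K-FAC approximation entirely, and the shapes, the function names, and the simplified gradient formula (outer product of the residual gradient with the candidate's activations) are all assumptions made for the sake of the example.

```python
import numpy as np

def expansion_score(G_perp: np.ndarray, h: np.ndarray) -> float:
    """Score a candidate neuron by the norm of the loss gradient
    w.r.t. its fan-out weights Omega (initialized to zero).

    G_perp : residual gradient at the layer output, shape (outputs, batch),
             with the part explainable by existing neurons projected out.
    h      : candidate neuron activations over the batch, shape (batch,).

    Hypothetical simplification: with Omega = 0, the gradient of the loss
    w.r.t. Omega reduces to G_perp @ h; we score by its Euclidean norm.
    """
    grad_omega = G_perp @ h          # shape: (outputs,)
    return float(np.linalg.norm(grad_omega))

def should_expand(G_perp: np.ndarray, h: np.ndarray,
                  threshold: float) -> bool:
    """Trigger: extend the layer when the score exceeds the threshold."""
    return expansion_score(G_perp, h) > threshold
```

Because the score is compared against a fixed threshold rather than optimized to convergence, the same quantity decides both *when* to grow (score above threshold) and *where* (the layer and candidate attaining it).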

References

[MMKM24]

Rupert Mitchell, Robin Menzenbach, Kristian Kersting, and Martin Mundt. Self-Expanding Neural Networks. 2024. arXiv:2307.04526. URL: http://arxiv.org/abs/2307.04526, doi:10.48550/arXiv.2307.04526.