Network Morphism

Neuron splitting is not the only form of function-preserving network morphism. Indeed, for a linear network, i.e. one without activation functions, any decomposition of the weight matrix \(\boldsymbol{W} = \boldsymbol{A}\boldsymbol{B}\) into two shape-compatible matrices is a valid function-preserving morphism. Network Morphism [WWRC16] describes a set of formal requirements for a morphism \(\mathcal{T}\) to be function-preserving. For example, rather than splitting individual neurons, one can add \(k\) new neurons in opposite-signed pairs: for any matrices \(V \in \mathbb{R}^{k/2\times C_{l-2}}\) and \(Z \in \mathbb{R}^{C_l \times k/2}\), choosing the new incoming weights \(\boldsymbol{\Psi}\) and outgoing weights \(\boldsymbol{\Omega}\) as

\[\begin{split}\begin{aligned} \boldsymbol{\Psi}= \begin{bmatrix} V \\ V \end{bmatrix}, \qquad \boldsymbol{\Omega}= \begin{bmatrix} Z & -Z \end{bmatrix} \end{aligned}\end{split}\]

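ensures that the contributions of the new weights cancel, preserving the network function: for any input \(x\) and elementwise activation \(\sigma\), \(\boldsymbol{\Omega}\,\sigma(\boldsymbol{\Psi}x) = Z\sigma(Vx) - Z\sigma(Vx) = 0\).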
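The cancellation can be checked numerically. The following is a minimal sketch, not the implementation from [WWRC16]: it assumes a bias-free two-layer network with a ReLU activation, and the helper names `forward` and `widen_with_cancelling_neurons` as well as the layer sizes are hypothetical choices made for illustration.

```python
import numpy as np


def relu(x):
    """Elementwise ReLU; the activation choice is an assumption of this sketch."""
    return np.maximum(x, 0.0)


def forward(W_in, W_out, x):
    """Bias-free two-layer network: x -> W_out @ relu(W_in @ x)."""
    return W_out @ relu(W_in @ x)


def widen_with_cancelling_neurons(W_in, W_out, V, Z):
    """Append k = 2 * V.shape[0] hidden neurons whose contributions cancel.

    Psi = [V; V] supplies the incoming weights of the new neurons and
    Omega = [Z, -Z] their outgoing weights, following the text above.
    """
    Psi = np.vstack([V, V])      # shape (k, C_in)
    Omega = np.hstack([Z, -Z])   # shape (C_out, k)
    W_in_new = np.vstack([W_in, Psi])
    W_out_new = np.hstack([W_out, Omega])
    return W_in_new, W_out_new


# Hypothetical sizes: C_in plays the role of C_{l-2}, C_out the role of C_l.
rng = np.random.default_rng(0)
C_in, C_hidden, C_out, k = 5, 4, 3, 6

W_in = rng.normal(size=(C_hidden, C_in))
W_out = rng.normal(size=(C_out, C_hidden))
V = rng.normal(size=(k // 2, C_in))
Z = rng.normal(size=(C_out, k // 2))
x = rng.normal(size=(C_in,))

W_in_new, W_out_new = widen_with_cancelling_neurons(W_in, W_out, V, Z)

y_old = forward(W_in, W_out, x)
y_new = forward(W_in_new, W_out_new, x)

# The widened network should agree with the original up to rounding error.
outputs_match = np.allclose(y_old, y_new)
```

Because both copies of \(V\) see the same input and pass through the same activation, the paired terms cancel for arbitrary \(V\) and \(Z\), so the new weights can be initialized randomly without disturbing the function. The comparison uses `np.allclose` rather than exact equality because floating-point summation order can leave a tiny residual.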

References

[WWRC16]

Tao Wei, Changhu Wang, Yong Rui, and Chang Wen Chen. Network Morphism. In ICML. March 2016. arXiv:1603.01670. URL: http://arxiv.org/abs/1603.01670, doi:10.48550/arXiv.1603.01670.