How to grow?
| Algorithm | How to grow |
|---|---|
|  | Add blocks with random initialization. |
|  | Gradient-based splits and additions. |
|  | \(\boldsymbol{\Psi}=0\); maximize gradient norm. |
|  | Sparse neuron or edge addition. |
|  | Function-preserving neuron splitting. |
|  | Function-preserving identity layer insertion. |
|  | Function-preserving morphism. |
|  | Add orthogonal neurons. |
|  | \(\boldsymbol{\Omega}=0\); maximize natural-gradient objective. |
|  | Split along the most unstable direction. |
|  | Low-rank residual-gradient matching. |
|  | Variance-preserving widening with stage-wise learning rates. |
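
To make one of the strategies in the table concrete, here is a minimal NumPy sketch of function-preserving neuron splitting on a two-layer ReLU network. It is an illustration under stated assumptions, not the implementation of any particular algorithm listed here: the helper names `forward` and `split_neuron`, the network shape, and the even 0.5/0.5 split of the outgoing weights are all hypothetical choices made for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(params, x):
    """Two-layer MLP: y = W2 @ relu(W1 @ x + b1) + b2."""
    W1, b1, W2, b2 = params
    return W2 @ relu(W1 @ x + b1) + b2

def split_neuron(params, j):
    """Split hidden neuron j into two copies without changing the output.

    Assumption for this sketch: the copy receives the same incoming
    weights and bias, and the outgoing weights are divided evenly.
    """
    W1, b1, W2, b2 = params
    # Duplicate the incoming weights and bias of neuron j.
    W1_new = np.vstack([W1, W1[j:j + 1]])
    b1_new = np.append(b1, b1[j])
    # Duplicate the outgoing column, then halve both copies so that
    # their summed contribution equals the original one.
    W2_new = np.hstack([W2, W2[:, j:j + 1]])
    W2_new[:, j] *= 0.5
    W2_new[:, -1] *= 0.5
    return W1_new, b1_new, W2_new, b2

rng = np.random.default_rng(0)
params = (rng.normal(size=(4, 3)), rng.normal(size=4),
          rng.normal(size=(2, 4)), rng.normal(size=2))
x = rng.normal(size=3)

grown = split_neuron(params, j=1)
# The hidden layer is one neuron wider, yet the function is unchanged.
assert grown[0].shape == (5, 3)
assert np.allclose(forward(params, x), forward(grown, x))
```

With an exact copy, the two resulting neurons receive identical gradients, so a practical variant would likely add a small perturbation to one of them to break the symmetry. That makes the function only approximately preserved.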