Neural Network Efficiency?

I’m sure that this is covered by some advanced AI/Machine Learning class that I haven’t made it to yet, but I wanted to capture it now, just in case.

I’ve been giving some thought (okay, not much, yet) to putting a number to the ‘efficiency’ of a neural network. Specifically, I’d like to be able to make a statement about the number of neurons and synapses used to map an input to an output. Arguably this number would inform actual ‘power use’ efficiency in some fashion, but I’m honestly less interested in that.

Far more interesting to me is observing how compactly an RNN can represent the transform it is performing.

Some questions that I would like to be able to answer for some smallish neural nets (e.g. the MNIST digits, or recognizing faces):

  • How many neurons are being reused for what we might consider wildly disparate tasks? This is more of a network-visualization question than an efficiency question, though high reuse obviously indicates a ‘more efficient’ network (there’s a rough sketch of this just after the list).
  • Relative efficiency between training methods, algorithms, learning rates, cost functions, etc.
  • ???? I’m sure I’ll find more questions in the future.
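As a toy version of the reuse question, here’s a rough sketch (Python/NumPy) that counts, for each hidden unit, how many classes it ever activates for. The weights and data here are random stand-ins for a trained MNIST-ish network; all names and shapes are placeholders, not an actual experiment.

```python
import numpy as np

def hidden_activations(X, W1, b1):
    """ReLU hidden-layer activations for a batch of inputs."""
    return np.maximum(0.0, X @ W1 + b1)

def reuse_counts(X, y, W1, b1, n_classes=10, threshold=0.0):
    """For each hidden unit, count how many classes it activates for
    on at least one example. High counts = heavily reused units."""
    n_hidden = W1.shape[1]
    active_per_class = np.zeros((n_classes, n_hidden), dtype=bool)
    for c in range(n_classes):
        acts = hidden_activations(X[y == c], W1, b1)
        active_per_class[c] = (acts > threshold).any(axis=0)
    return active_per_class.sum(axis=0)  # shape: (n_hidden,)

# Toy usage with random stand-ins for MNIST-sized data and weights.
rng = np.random.default_rng(0)
X = rng.random((1000, 784))
y = rng.integers(0, 10, size=1000)
W1, b1 = rng.standard_normal((784, 64)) * 0.1, np.zeros(64)
print(reuse_counts(X, y, W1, b1))  # a unit used by all 10 classes scores 10
```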

I stumbled onto this question while thinking about the function of sleep in human learning and memory formation. As we sleep we consolidate what we’ve learned, strengthen memories, etc. We do this by dreaming – or, perhaps, dreams are the result of this process; it depends on how one defines dreaming, I suppose. In any case, when we do something new, like practice a new fingering on the guitar, we create a set of weak memories of that action. When we next sleep, our brains replay the memory of that action, strengthening it – but also ‘trimming’ the network of neurons that aren’t really needed to perform the task[1].

Obviously we could come up with hundreds of different ways of measuring the efficiency of a neural network. And many measures would have to take into account the task. Is there a generalizable form? Something like

(Foo) * (validation set accuracy?) * (input count) * (values per input) / ((non-input neuron count) * (synapse count))

Foo might turn out to be optional, but I see it being some function of each neuron’s activations averaged over some set of inputs (like the training set). E.g., run the training set through the network and count the number of times each neuron fires.
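A literal, hedged translation of that formula into code might look like the sketch below, with ‘Foo’ taken as the average firing frequency over a set of inputs. All of the numbers and names are placeholders (784 inputs, 256 grayscale values per input, a dense 784-64-10 net), not results.

```python
import numpy as np

def firing_frequency(activations, threshold=0.0):
    """Fraction of (example, neuron) pairs where the neuron fires.
    `activations` has shape (n_examples, n_neurons)."""
    return float((activations > threshold).mean())

def efficiency(foo, val_accuracy, input_count, values_per_input,
               non_input_neurons, synapses):
    """The back-of-the-envelope efficiency number from the post."""
    return (foo * val_accuracy * input_count * values_per_input) / (
        non_input_neurons * synapses)

# Fake non-input activations (64 hidden + 10 output units) over 10k examples.
acts = np.random.default_rng(1).random((10000, 74))
foo = firing_frequency(acts)
print(efficiency(foo, val_accuracy=0.97, input_count=784, values_per_input=256,
                 non_input_neurons=74, synapses=784 * 64 + 64 * 10))
```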

Hm. It occurs to me in writing the above that a step that calculates this ‘Foo’ could also be used to find neurons that never fire, and possibly neurons whose activations never result in the activation of another neuron. Although probably not the latter, at least not cheaply. Then again, the amazing power of neural nets is due in great part to avoiding building in too many prior assumptions, so it’s probably best not to mess too much with what already works incredibly well. Still, fun stuff to play with.
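For what it’s worth, if per-neuron firing counts are already being tallied for ‘Foo’, the never-fire check falls out almost for free. A hypothetical helper:

```python
import numpy as np

def dead_neurons(activations, threshold=0.0):
    """Indices of neurons that never exceed `threshold` on any example.
    `activations` has shape (n_examples, n_neurons)."""
    fired = (activations > threshold).any(axis=0)
    return np.flatnonzero(~fired)

acts = np.array([[0.0, 1.2, 0.0],
                 [0.0, 0.3, 0.0]])
print(dead_neurons(acts))  # -> [0 2]
```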

There’s probably a set of nice statistics-based algorithms already developed just for these types of questions. It will be interesting to revisit this post after I’ve seen them. Regardless, I’ll play with this more soon – once I’ve finished this MineSweeper bot…

[1] I believe this to be true, but can’t cite a source. The brain obviously has methods in place to prevent simple tasks from using all the neurons in the head.