On the performance of network parallel training in artificial neural networks


Artificial Neural Networks (ANNs) have received increasing attention in recent years, with applications spanning a wide range of disciplines, including vital domains such as medicine, network security, and autonomous transportation. However, neural network architectures are becoming increasingly complex, and with the growing need to obtain real-time results from such models, it has become pivotal to use parallelization to speed up network training and deployment. In this work we propose an implementation of Network Parallel Training based on Cannon's Algorithm for matrix multiplication. We show that increasing the number of processes speeds up training until the point where process communication costs become prohibitive; this point varies with network complexity. We also show, through empirical efficiency calculations, that the speedup obtained is superlinear.

In arXiv preprint arXiv:1701.05130
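The paper's distributed implementation is not reproduced here, but the core of Cannon's Algorithm can be illustrated with a minimal single-process sketch that simulates the q × q process grid in plain NumPy. The function name, the block layout, and the use of NumPy in place of real message passing are illustrative assumptions, not the authors' code: in an actual MPI implementation each block would live on its own process and the "shifts" would be point-to-point sends along the grid.

```python
import numpy as np

def cannon_matmul(A, B, q):
    """Compute A @ B via Cannon's algorithm on a simulated q x q process grid.

    Each simulated process (i, j) holds one block of A and one block of B;
    A blocks shift left and B blocks shift up at every step, as in the
    distributed version. Illustrative sketch, not the paper's implementation.
    """
    n = A.shape[0]
    assert A.shape == B.shape == (n, n) and n % q == 0
    b = n // q  # block size per simulated process
    # Partition A and B into q x q grids of b x b blocks.
    Ab = [[A[i*b:(i+1)*b, j*b:(j+1)*b].copy() for j in range(q)] for i in range(q)]
    Bb = [[B[i*b:(i+1)*b, j*b:(j+1)*b].copy() for j in range(q)] for i in range(q)]
    Cb = [[np.zeros((b, b)) for _ in range(q)] for _ in range(q)]
    # Initial alignment: rotate row i of A left by i, column j of B up by j,
    # so process (i, j) starts with A[i, (i+j) mod q] and B[(i+j) mod q, j].
    Ab = [row[i:] + row[:i] for i, row in enumerate(Ab)]
    Bb = [[Bb[(i + j) % q][j] for j in range(q)] for i in range(q)]
    for _ in range(q):
        # Local multiply-accumulate on every simulated process.
        for i in range(q):
            for j in range(q):
                Cb[i][j] += Ab[i][j] @ Bb[i][j]
        # Rotate A blocks left by one and B blocks up by one.
        Ab = [row[1:] + row[:1] for row in Ab]
        Bb = [[Bb[(i + 1) % q][j] for j in range(q)] for i in range(q)]
    return np.block(Cb)
```

After q shift-and-multiply steps, each grid position has accumulated the full inner-product sum over k, so reassembling the blocks yields the exact product; the communication pattern (nearest-neighbour shifts only) is what makes the algorithm attractive for distributed training.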




