Check out "Linear Algebra and Learning from Data" by Gilbert Strang. It includes a nice introduction to linear algebra, touches on the relevant statistics and optimization, then puts them all together in chapters on neural networks. It's a textbook, so exercises are included.