I Don't Like NumPy

Summary

They say you can’t truly hate someone unless you loved them first. I don’t know if that’s true as a general principle, but it certainly describes my relationship with NumPy.

NumPy, by the way, is some software that does computations on arrays in Python. It’s insanely popular and has had a huge influence on all the popular machine learning libraries like PyTorch. These libraries share most of the same issues I discuss below, but I’ll stick to NumPy for concreteness.

NumPy makes easy things easy. Say A is a 5×5 matrix, x is a length-5 vector, and you want to find the vector y such that Ay=x. In NumPy, that would be:

    y = np.linalg.solve(A, x)

So elegant! So clear!

But say the situation is even a little more complicated. Say A is a stack of 100 5×5 matrices, given as a 100×5×5 array. And say x is a stack of 100 length-5 vectors, given as a 100×5 array. And say you want to solve Aᵢyᵢ=xᵢ for 1≤i≤100.

If you could use loops, this would be easy:

    y = np.empty_like(x)
    for i in range(100):
        y[i,:] = np.linalg.solve(A[i,:,:], x[i,:])

But you can’t use loops. To some degree, this is a limitation of loops being slow in Python. But nowadays, everything is GPU and if you’ve got big arrays, you probably don’t want to use loops in any language. To get all those transistors firing, you need to call special GPU functions that will sort of split up the arrays into lots of little pieces and process them in parallel.

The good news is that NumPy knows about those special routines (at least if you use JAX or CuPy), and if you call np.linalg.solve correctly, it will use them.

The bad news is that no one knows how to do that.

Don’t believe me? OK, which of these is right?

    y = linalg.solve(A,x)
    y = linalg.solve(A,x,axis=0)
    y = linalg.solve(A,x,axes=[[1,2],1])
    y = linalg.solve(A.T, x.T)
    y = linalg.solve(A.T, x).T
    y = linalg.solve(A, x[None,:,:])
    y = linalg.solve(A,x[:,:,None])
    y = linalg.solve(A,x[:,:,None])[:,:,0]
    y = linalg.solve(A[:,:,:,None],x[:,None,None,:])
    y = linalg.solve(A.transpose([1,2,0...
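
One way to settle a question like this without trusting anyone's memory of the docs is to test a candidate against the explicit loop on random data. The sketch below does that for one of the candidates listed above, the one that adds a trailing axis to x and strips it off afterwards. The helper names `solve_loop` and `solve_batched` and the random test data are made up for illustration. As far as I can tell this form agrees with the loop on both NumPy 1.x and 2.x, but treat that as something to verify on your own version rather than a guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: a stack of 100 5x5 matrices and 100 length-5 vectors.
# Adding a multiple of the identity keeps the random matrices comfortably invertible.
A = rng.normal(size=(100, 5, 5)) + 10.0 * np.eye(5)
x = rng.normal(size=(100, 5))


def solve_loop(A, x):
    """Reference answer: solve each 5x5 system separately."""
    y = np.empty_like(x)
    for i in range(A.shape[0]):
        y[i, :] = np.linalg.solve(A[i, :, :], x[i, :])
    return y


def solve_batched(A, x):
    """Candidate: treat each vector as a 5x1 matrix, solve the stack, drop the extra axis."""
    return np.linalg.solve(A, x[:, :, None])[:, :, 0]


y_loop = solve_loop(A, x)
y_batched = solve_batched(A, x)

# Compare the candidate with the loop, within floating-point tolerance.
print(np.allclose(y_loop, y_batched))
```

The same harness works for any other candidate: swap the body of `solve_batched`, and a shape error or a mismatch against the loop means that candidate isn't doing what you wanted.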

First seen: 2025-05-15 18:40

Last seen: 2025-05-16 04:42