Slow sparse matrix-vector product with symmetric matrices

@kristoffer.carlsson In your opinion, can the above implementation be improved? What would be missing for a PR? All this stuff?