We implement parallel and distributed versions of the sparse matrix-vector product and of a sequence of matrix-vector product operations, using OpenMP, MPI, and the ARM SVE intrinsic functions, for different matrix storage formats. We investigate the efficiency of these implementations on one and two A64FX processors, using a variety of sparse matrices as input. The matrices differ in size, sparsity, and regularity. We observe that a parallel and distributed implementation scales well on two nodes when the matrix is close to a diagonal matrix, but that performance degrades quickly as the sparsity or regularity of the input varies.
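For readers unfamiliar with the two kernels, the following is a minimal shared-memory sketch, not the implementation evaluated here. It assumes the CSR storage format, one of several possible formats, and it assumes that the sequence of products means repeatedly applying the same square matrix to the vector. The names `spmv_csr` and `spmv_sequence` are hypothetical, and the MPI distribution and SVE vectorization are omitted.

```c
#include <stddef.h>

/* y = A * x, with A stored in CSR format (assumed format, hypothetical names).
 * row_ptr has n_rows + 1 entries; col_idx and val hold the nonzeros row by row. */
void spmv_csr(size_t n_rows, const size_t *row_ptr, const size_t *col_idx,
              const double *val, const double *x, double *y)
{
    /* Rows are independent, so the outer loop can be split across threads.
     * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n_rows; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];
        y[i] = sum;
    }
}

/* Sequence of products (assumed meaning): x <- A * x, repeated `steps` times.
 * A must be square (n x n); tmp is caller-provided scratch space of length n. */
void spmv_sequence(size_t n, const size_t *row_ptr, const size_t *col_idx,
                   const double *val, double *x, double *tmp, int steps)
{
    for (int s = 0; s < steps; ++s) {
        spmv_csr(n, row_ptr, col_idx, val, x, tmp);
        for (size_t i = 0; i < n; ++i)
            x[i] = tmp[i];
    }
}
```

In this sketch the accesses to `x[col_idx[k]]` are indirect. For a nearly diagonal matrix they stay close to the row index, whereas an irregular sparsity pattern scatters them across the vector, which is consistent with the sensitivity to sparsity and regularity reported above.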