In the middle of the second decade of the new millennium, the development of IT has reached unprecedented heights. As one consequence of Moore's law, operating systems have evolved from 16 bits to 32 bits and now to 64 bits, and most modern computing platforms are in transition to 64-bit versions. In the coming decades, the IT industry will inevitably favor software and systems that can efficiently utilize the new 64-bit hardware resources. In particular, with massive volumes of data now being produced regularly, memory-efficient software and systems will lead the future.

In this paper, we study the practical Walsh-Hadamard Transform (WHT). The WHT is widely used in image and video coding, speech processing, data compression, digital logic design, and communications, to name just a few applications. Its power and simplicity have stimulated research interest in the (noisy) sparse WHT across interdisciplinary areas including, but not limited to, signal processing and cryptography. Loosely speaking, the sparse WHT refers to the case where the number of nonzero Walsh coefficients is much smaller than the dimension; the noisy sparse WHT refers to the case where the number of large Walsh coefficients is much smaller than the dimension, while a large number of small nonzero Walsh coefficients also exist. Clearly, the general Walsh-Hadamard Transform is a first solution to the noisy sparse WHT, since it obtains all Walsh coefficients larger than a given threshold together with their index positions. In this work, we study efficient implementations of the general WHT in very large dimensions. We believe our work sheds light on the noisy sparse WHT, which remains a major open challenge. Meanwhile, the main idea behind it should help in the study of parallel data-intensive computing, which has a broad range of applications.
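
To make concrete the remark that a general WHT solves the noisy sparse problem by returning every coefficient above a threshold together with its index, the following is a minimal sketch in Python. It uses the textbook butterfly recursion, unnormalized and in natural (Hadamard) ordering. The function names `fwht` and `large_coefficients`, the choice of ordering, and the absence of normalization are assumptions made for illustration; this is not the memory-efficient implementation studied in this paper.

```python
def fwht(values):
    """Unnormalized Walsh-Hadamard transform in natural (Hadamard) order.

    Assumption: the input length is a power of two. Runs in O(n log n)
    additions and returns a new list, leaving the input untouched.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h (one butterfly stage).
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def large_coefficients(values, threshold):
    """Return (index, coefficient) pairs whose magnitude exceeds the threshold."""
    coeffs = fwht(values)
    return [(k, c) for k, c in enumerate(coeffs) if abs(c) > threshold]


if __name__ == "__main__":
    signal = [1, 0, 1, 0, 0, 1, 1, 0]
    print(fwht(signal))
    print(large_coefficients(signal, threshold=3))
```

Note that this sketch holds the full length-n vector in memory, which is exactly the cost that becomes the bottleneck when the dimension is very large.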