Memory disaggregation is an emerging alternative to the traditional server architecture that addresses the issue of memory under-utilization in data centers. Memory disaggregation offers a large pool of remote memory connected via a high-speed network fabric, thereby allowing independent upgrades of memory components. A disaggregated design improves the adaptability of servers to the ever-increasing memory requirements of applications such as HPC workloads, video analytics, and big data analytics. However, disaggregated memory architectures incur significantly higher memory access latency due to the limited speed of the interconnects currently used. In this paper, we propose a rack-scale disaggregated memory architecture and discuss the various software/hardware design aspects of such an architecture. We analyze the remote memory access latency in this architecture and observe that a significant part of the latency arises not only from the interconnect, but also from contention at the queues of the remote memory. To address this issue, we propose two-phase memory allocation policies for disaggregated memory environments that can significantly reduce the average memory access latency compared to conventional policies. We built a simulator that combines a trace-based front-end module with a cycle-accurate memory simulator to evaluate the memory access latency in a disaggregated memory system. We conduct experiments with different allocation policies across various benchmarks representing applications with diverse memory access patterns. Our study shows encouraging results toward the adoption of rack-scale memory disaggregation with acceptable memory latency, which could be further reduced with other system-level optimizations.