This paper presents the architecture of an "almost-all" optical packet switch that does not rely on recirculating loops for storage, and discusses issues in its implementation. The architecture is based on two rearrangeably nonblocking stages interconnected by optical delay lines with different amounts of delay. We investigate the probability of loss and the switch latency as functions of link utilization and switch size. In general, with a proper choice of the number of delay lines, the switch can achieve an arbitrarily low probability of loss. Growability patterns and an extension of the design to the dense wavelength division multiplexing (WDM) case are also shown. In particular, we discuss an extension of the architecture whereby, through the use of WDM, the switch capacity may be increased several times with only minor changes to the switch design. Practical implementation issues are also discussed; for example, we show a scheme for optical packet synchronization in the synchronously operated switch. With this scheme, the switch may serve as a central component in the design of future all-optical, packet-switched networks.
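
To give a feel for how loss probability can fall as delay lines are added, the following is a minimal numerical sketch under a deliberately simplified model; it is not the two-stage architecture analyzed in the paper. It assumes a slotted switch with `n_ports` inputs, independent Bernoulli arrivals with uniformly chosen destinations, and a single output whose delay lines together behave like a FIFO buffer holding one packet per line. The function names, the traffic model, and the one-packet-per-line capacity are all illustrative assumptions.

```python
from math import comb


def arrival_pmf(n_ports, load):
    """P(k packets are addressed to one given output in a slot), k = 0..n_ports.

    Assumes each input carries a packet with probability `load` and picks
    its destination uniformly among the n_ports outputs.
    """
    p = load / n_ports
    return [comb(n_ports, k) * p**k * (1 - p) ** (n_ports - k)
            for k in range(n_ports + 1)]


def loss_probability(n_ports, load, n_delay_lines, slots=2000):
    """Approximate steady-state packet loss for one output port.

    Hypothetical model: one packet leaves per slot, and up to
    `n_delay_lines` packets can wait (one per delay line).
    """
    pmf = arrival_pmf(n_ports, load)
    capacity = n_delay_lines
    # state[q] = probability that q packets are waiting at the start of a slot
    state = [1.0] + [0.0] * capacity
    lost = 0.0
    for _ in range(slots):
        nxt = [0.0] * (capacity + 1)
        lost = 0.0  # expected number of packets dropped in this slot
        for q, pq in enumerate(state):
            if pq == 0.0:
                continue
            for k, pk in enumerate(pmf):
                total = q + k
                remaining = total - 1 if total > 0 else 0  # one packet is sent
                lost += pq * pk * max(0, remaining - capacity)
                nxt[min(remaining, capacity)] += pq * pk
        state = nxt
    # Expected arrivals per slot at this output equal `load`.
    return lost / load


if __name__ == "__main__":
    for b in (0, 2, 4, 8, 16):
        print(b, loss_probability(n_ports=16, load=0.8, n_delay_lines=b))
```

Under these assumptions the computed loss decreases monotonically as `n_delay_lines` grows, which matches the qualitative claim above. The actual figures for the proposed switch depend on its scheduling and delay-line arrangement and are not reproduced by this sketch.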