The universal approximation theorem establishes the density of specific families of neural networks in the space of continuous functions, and in certain Bochner spaces, defined between any two Euclidean spaces. We extend and refine this result by proving that there exist dense neural network architectures on a larger class of function spaces and that these architectures may be written down using only a small number of functions. We prove that, upon appropriately randomly selecting the neural network architecture's activation function, we may still obtain a dense set of neural networks with positive probability. This result is used to overcome the difficulty of appropriately selecting an activation function in more exotic architectures. Conversely, we show that given any neural network architecture on a set of continuous functions between two T0 topological spaces, there exists a unique finest topology on that set of functions which makes the neural network architecture into a universal approximator. Several examples are considered throughout the paper.