Background: The scientific understanding of complex systems and of deep neural networks (DNNs) is among the important unsolved problems of science, and DNNs are evidently complex systems. Meanwhile, conservative symmetry is arguably the most important concept in physics, and P. W. Anderson, Nobel laureate in physics, speculated that increasingly sophisticated broken symmetry in many-body systems correlates with increasing complexity and functional specialization. Furthermore, in complex systems such as DNA molecules, different nucleotide sequences consist of different weak bonds with similar free energy; energy fluctuations would break the symmetries that conserve the free energy of the nucleotide sequences, and these broken symmetries, selected by the environment, would lead to organisms with different phenotypes.
Purpose: When the molecule is very large, we might speculate that, statistically, the system is poised in a state from which it is equally likely to transition to any of a large number of adjacent possible states; that is, an adaptive symmetry whose breaking is selected by feedback signals from the environment. In physics, quantitative changes can accumulate into a qualitative revolution in which previously paradoxical behaviors are reconciled under a new paradigm of higher dimensionality (e.g., wave-particle duality in quantum physics). The emergence of adaptive symmetry and complexity might thus be understood, speculatively, as an accumulation in the sophistication and quantity of conservative symmetries that leads to a paradigm shift, which might clarify the behaviors of DNNs.
Results: In this work, theoretically and experimentally, we characterize the optimization process of a DNN system as an extended symmetry-breaking process in which novel examples are informational perturbations to the system that break adaptive symmetries.
One particular finding is that a hierarchically large DNN would have a large reservoir of adaptive symmetries, and when the information capacity of the reservoir exceeds the complexity of the dataset, the system can absorb all perturbations of the examples and self-organize into a functional structure with zero training error as measured by a certain surrogate risk. In this diachronically extended process, complexity emerges from the quantitative accumulation of adaptive-symmetry breaking.
Method: More specifically, this process is characterized by a statistical-mechanical model that can be regarded as a generalization of statistical physics to the organized complex system of DNNs, and that characterizes regularities in higher dimensionality. The model consists of three constituents that can be regarded as the counterparts of the Boltzmann distribution, the Ising model, and conservative symmetry, respectively: (1) a stochastic definition/interpretation of DNNs as a multilayer probabilistic graphical model, (2) a formalism of circuits that perform biological computation, and (3) a circuit symmetry from which self-similarity between the microscopic and the macroscopic adaptability manifests. The model is analyzed with a method referred to as the statistical as...