As a key component of electromechanical equipment in intelligent manufacturing, rolling bearings play an important role in ensuring safe, stable, and efficient operation. Deep learning can guide data-driven fault diagnosis, but it requires that all data be independent and identically distributed (i.i.d.). When the equipment operates under multiple working conditions, the collected samples violate the i.i.d. assumption, which makes it difficult to extract accurate features from the data. This paper proposes a deep learning based fault diagnosis model that recursively fuses multiscale features across working conditions, so that data without working condition labels can also be used to train a satisfactory deep learning model for fault diagnosis of bearings operated under multiple working conditions. When only a small number of training samples are available for an individual working condition, the proposed fusion mechanism aims to establish joint learning between different working conditions. To verify the effectiveness of the proposed algorithm, experimental validation was performed on the public Case Western Reserve University (CWRU) rolling bearing data set. The experimental results show that the proposed method can make full use of a small amount of labeled data with working condition information and a large amount of labeled data without working condition information. In fault diagnosis tasks with ten fault types of different fault sizes, the fault diagnosis accuracy exceeds 94% for 4 working conditions and 86% for 8 working conditions.

INDEX TERMS Fault diagnosis, feature fusion, multiple working conditions, rolling bearing.