Over the past few years, deep learning methods have notably advanced the prediction of the remaining useful life (RUL) of rotating equipment. However, existing RUL prediction models tend to determine the first prediction time (FPT) used for stage division separately from the RUL prediction itself, ignoring the potential correlation between the two in the degradation process. To address this issue, this paper proposes a dual-task prediction framework based on Patch ModernTCN-Mixer (PMTCN-Mixer), which adaptively and jointly performs FPT detection and RUL prediction. Firstly, a hard parameter-sharing feature extractor module, Patch ModernTCN, is designed to learn the temporal dependence and spatial correlation of degradation features. Secondly, a dynamic semi-soft thresholding (DST) module is constructed to eliminate redundant information and noise during the feature extraction phase and thereby improve the precision of detection and prognosis. Lastly, the dual-task learning network PMTCN-Mixer is built by combining Patch ModernTCN with DST, and GradNorm is used to balance the gradients of the FPT detection and RUL prediction tasks so that the two predictions are fused. The performance of the PMTCN-Mixer framework is validated on the XJTU-SY Bearing Datasets and the IEEE PHM 2012 Challenge Datasets. Compared with the best results of state-of-the-art networks, the RUL prediction metrics root mean square error (RMSE), mean absolute error (MAE), Score and were improved by 17.31%, 23.59%, 10.66%, and 4.76%, respectively. The findings confirm that the dual-task prediction model PMTCN-Mixer effectively captures both spatial and temporal semantic information from degradation data, accurately integrates FPT detection with RUL prediction, and generalizes well while outperforming the compared methods.
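
To make the thresholding idea concrete, the following is a minimal sketch of a generic semi-soft (firm) shrinkage function in plain Python. It is not the DST module described above: the function names `semi_soft_threshold` and `denoise`, and the fixed thresholds `lower` and `upper`, are assumptions made for illustration, whereas the DST module presumably adapts its thresholds during training. The sketch only shows how small-magnitude values are zeroed, large-magnitude values are kept unchanged, and values in between are shrunk linearly.

```python
def semi_soft_threshold(x, lower, upper):
    """Generic semi-soft (firm) shrinkage of one value.

    Assumed illustrative form, not the paper's DST module:
      |x| <= lower          -> 0 (treated as noise)
      lower < |x| < upper   -> linearly shrunk toward 0
      |x| >= upper          -> kept unchanged
    """
    if not 0 <= lower < upper:
        raise ValueError("expected 0 <= lower < upper")
    magnitude = abs(x)
    if magnitude <= lower:
        return 0.0
    if magnitude >= upper:
        return float(x)
    sign = 1.0 if x > 0 else -1.0
    # Linear ramp that reaches 0 at `lower` and `upper` at `upper`,
    # so the function is continuous at both thresholds.
    return sign * upper * (magnitude - lower) / (upper - lower)


def denoise(features, lower, upper):
    """Apply the shrinkage element-wise to a list of feature values."""
    return [semi_soft_threshold(v, lower, upper) for v in features]
```

Unlike hard thresholding, this form has no jump at the cutoff, and unlike soft thresholding, it does not bias large-magnitude features toward zero, which is the usual motivation for a semi-soft compromise.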