Several deep learning-based tools for RNA 3D structure prediction have recently emerged, including DRfold, DeepFoldRNA, RhoFold, RoseTTAFoldNA, trRosettaRNA, and AlphaFold3. In this study, we systematically evaluate these six models on three datasets: RNA Puzzles, CASP15 RNA targets, and a newly generated large dataset of sequentially distinct RNAs that serves as a benchmark for generalization. For a more stringent test, we also introduce a fourth dataset containing RNAs that are both sequentially and structurally distinct. We observe that each model predicts the best structure for certain RNAs, and we therefore evaluate whether two commonly used scoring functions, the Rosetta score and ARES, can reliably identify the most accurate structure among the predictions. Finally, because many RNA chains in the Protein Data Bank are part of complexes, we compare the performance of RoseTTAFoldNA and AlphaFold3 in predicting RNA structures within complexes versus as isolated chains extracted from those complexes. This evaluation highlights the strengths and limitations of current deep learning-based tools and offers insights for advancing RNA 3D structure prediction.