Risk assessments of hand-intensive and repetitive work are commonly performed using observational methods, and it is important that these methods are reliable and valid. However, comparisons of the reliability and validity of methods are hampered by differences between studies, e.g., regarding the background and competence of the observers, the complexity of the observed work tasks, and the statistical methodology. The purpose of the present study was to evaluate six risk assessment methods with respect to inter- and intra-observer reliability and concurrent validity, using the same methodological design and statistical parameters in the analyses. Twelve experienced ergonomists were recruited to perform risk assessments of ten video-recorded work tasks twice, and consensus assessments for the concurrent validity were carried out by three experts. For all methods, the total-risk linearly weighted kappa values for inter-observer reliability (when all tasks were set to the same duration) were lower than 0.5 (0.15–0.45). Moreover, the concurrent validity values were in a similar range with regard to total-risk linearly weighted kappa (0.31–0.54). Although these levels are often considered fair to substantial, they denote agreement lower than 50% once the agreement expected by chance has been compensated for. Hence, the risk of misclassification is substantial. The intra-observer reliability was only somewhat higher (0.16–0.58). Regarding the methods ART (Assessment of repetitive tasks of the upper limbs) and HARM (Hand Arm Risk Assessment Method), it is worth noting that the work task duration has a high impact on the risk level calculation, which needs to be taken into account in studies of reliability. This study indicates that when experienced ergonomists use systematic methods, the reliability is low. As seen in other studies, hand/wrist postures were especially difficult to rate.
In light of these results, complementing observational risk assessments with technical methods should be considered, especially when evaluating the effects of ergonomic interventions.
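
To make the reported statistic concrete, the following is a minimal sketch of a linearly weighted Cohen's kappa for two observers rating the same tasks on an ordinal scale. It is an illustration only, not the study's analysis code: the function name `linear_weighted_kappa`, the coding of risk levels as integers 0 to k-1, and the example ratings are all assumptions, and the way the study aggregated kappa across its twelve observers is not described here.

```python
from collections import Counter


def linear_weighted_kappa(ratings_a, ratings_b, n_categories):
    """Linearly weighted Cohen's kappa for two raters.

    Ratings are assumed (hypothetically) to be integers 0..n_categories-1,
    e.g. ordered risk levels. Disagreement weights are |i - j| / (k - 1).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    k = n_categories

    # Observed mean weighted disagreement between the two raters.
    observed = sum(abs(a - b) for a, b in zip(ratings_a, ratings_b)) / (n * (k - 1))

    # Weighted disagreement expected by chance, from each rater's marginals.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(
        count_a[i] * count_b[j] * abs(i - j)
        for i in range(k)
        for j in range(k)
    ) / (n * n * (k - 1))

    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")

    return 1.0 - observed / expected


# Hypothetical ratings of six tasks on a three-level scale (0 = low, 2 = high).
observer_1 = [0, 1, 2, 0, 1, 2]
observer_2 = [0, 1, 2, 1, 2, 0]
kappa = linear_weighted_kappa(observer_1, observer_2, n_categories=3)
```

In this made-up example the two observers agree exactly on half of the tasks, yet the chance-corrected value is only 0.25, which illustrates why kappa values below 0.5 imply a substantial risk of misclassification.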