Social work case files hold rich detail about the lives and needs of vulnerable groups. Traditional case-reading studies that aim to produce generalisable knowledge are resource-intensive, however, and sample sizes are limited as a result. The advent of ‘big data’ technology and of vast repositories of centrally stored electronic records offers social work researchers novel alternatives, including data linkage and predictive risk modelling using administrative data. Free-text documents (including assessments, reports, and case chronologies) nevertheless remain a largely untapped resource. This paper describes how 5000 social work court statements held by the Children and Family Court Advisory and Support Service (Cafcass) in England were analysed using natural language processing (NLP) based on simple rules and mathematical principles. Thirteen factors relating to harm and risk to children involved in care proceedings in England were identified by automated computer techniques, and almost 90% agreement with professional readers was achieved when the factors were clear-cut. The study represents an innovative approach to social work research on complex social problems. The paper concludes by discussing learning points, practical implications, future research avenues, and the technical and ethical challenges of NLP.