Background: Self-management is essential to caring for high-need, high-cost (HNHC) populations. Advances in mobile phone technology, coupled with increased availability and adoption of health-focused mobile apps, have made self-management more achievable, but the extent and quality of the literature supporting their use are not well defined.
Objective: The purpose of this review was to assess the breadth, quality, bias, and types of outcomes measured in the literature supporting the use of apps targeting HNHC populations.
Methods: Data sources included articles in PubMed and MEDLINE (National Center for Biotechnology Information), EMBASE (Elsevier), the Cochrane Central Register of Controlled Trials (EBSCO), Web of Science (Thomson Reuters), and the NTIS (National Technical Information Service) Bibliographic Database (EBSCO) published since 2008. We selected studies involving the use of patient-facing iOS or Android mobile health apps. Extraction was performed by 1 reviewer; 40 randomly selected articles were evaluated by 2 reviewers to assess agreement.
Results: Our final analysis included 175 studies. The populations most commonly targeted by apps included patients with obesity, physical handicaps, diabetes, older age, and dementia. Only 30.3% (53/175) of the apps studied in the reviewed literature were identifiable and available to the public through app stores. Many of the studies were cross-sectional analyses (42.9%, 75/175), small (median number of participants=31, interquartile range 11.0-207.2, maximum 11,690), or performed by an app's developers (61.1%, 107/175). Only 20.6% (36/175) of the studies evaluated a clinical outcome.
Conclusions: Most apps described in the literature could not be located on the iOS or Android app stores, and existing research does not robustly evaluate the potential of mobile apps. Although apps may be useful for patients with chronic conditions, the data do not yet support this.
One limitation of our study is that, although 2-3 reviewers screened abstracts and assessed their eligibility, only 1 reviewer abstracted the data. For the 40 articles (22.9%, 40/175) that were assigned to 2 reviewers (of which 3 articles were excluded), inter-rater agreement was significant on the majority of items (17 of 30) but fair-to-moderate on the others.
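For readers who want to see how agreement between 2 reviewers on a single extraction item might be quantified, the following is a minimal sketch. It assumes Cohen's kappa as the statistic and the Landis and Koch descriptive bands; the text above does not name the statistic that was used, so both are assumptions. The function names and the ratings are hypothetical and are not data from this review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


def landis_koch_band(kappa):
    """Descriptive label for kappa (Landis and Koch bands, assumed here)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings of one yes/no item across 37 double-reviewed articles.
reviewer_1 = ["yes"] * 20 + ["no"] * 17
reviewer_2 = ["yes"] * 15 + ["no"] * 5 + ["yes"] * 4 + ["no"] * 13

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3), landis_koch_band(kappa))  # 0.512 moderate
```

In this made-up example the reviewers agree on 28 of 37 articles, yet kappa is only about 0.51 because roughly half of that agreement would be expected by chance alone. This illustrates why an item can show high raw agreement and still fall in a fair-to-moderate band.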