Dr Berntsen 1 makes some interesting points about our review on the ability of accelerometry-based devices to estimate energy expenditure in adults and children. 2 Conducting a meta-analysis may indeed have strengthened the review. The decision not to conduct a meta-analysis was, however, made for a number of reasons. As discussed in the review, a major finding during its compilation was the need for a standardized protocol, or the development of best practices, to guide research with accelerometers. As it stands, there is large variation in the characteristics of sample populations, protocols employed, outcome measures, data extraction methods, and analytical methods. In an attempt to minimize some of this variation, and to include samples that are reasonably representative of the general population, only studies of apparently healthy adults and children were included in the review. Instead of selecting discrete groups such as children (younger than 10 years), adolescents (10 to 20 years), adults (20 to 60 years), and older adults (60 years and older), however, many studies included participants within a narrow age range of one to two years (e.g. 12 to 14 years), 3 within a wide age range (e.g. 15 to 61 years), 4 or participants exclusively of one sex. 5 Pooling these data would provide little insight into how variable results would be if the sample population were more representative of the general population.
A further problem with combining results was the lack of appropriate analytical methods used by studies. The majority of studies compared group estimates from the monitor of interest and the criterion method, which would allow for pooling of data but may be misleading. Such a group-level analysis may reveal non-significant differences between groups despite there being substantial errors in individual estimates.
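The point about group comparisons can be illustrated with a minimal sketch. The values below are hypothetical, chosen only to make the arithmetic transparent; they are not data from any study in the review, and the variable names are our own. The monitor overestimates half of the participants and underestimates the other half by the same amount, so the group means agree exactly even though every individual estimate is in error.

```python
from statistics import mean

# Hypothetical energy expenditure values (kcal/min) for six participants.
# Illustrative numbers only, not taken from any reviewed study.
criterion = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
monitor = [5.5, 3.5, 7.5, 5.5, 9.5, 7.5]

# Individual differences (monitor minus criterion): +1.5 or -1.5 for everyone.
differences = [m - c for m, c in zip(monitor, criterion)]

# Group-level comparison: difference between the two group means.
group_bias = mean(monitor) - mean(criterion)

# Individual-level comparison: average size of each participant's error.
mean_abs_error = mean(abs(d) for d in differences)

print(f"Group mean difference: {group_bias:.2f} kcal/min")
print(f"Mean absolute individual error: {mean_abs_error:.2f} kcal/min")
```

In this constructed case the group mean difference is 0.00 kcal/min while the mean absolute individual error is 1.50 kcal/min, so a comparison of group means alone would suggest perfect agreement.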
Although many studies reported correlation coefficients as an indication of the association between the estimated energy expenditure and the criterion measure, correlations are not sufficient for establishing monitor validity. 6,7 It is possible for two measures to be correlated but provide different estimates. The aim when validating a monitor is ultimately to determine how much the monitor differs from an established measure, and whether the difference is small enough not to cause problems when making clinical decisions or developing guidelines. Bland-Altman 95% limits of agreement, calculated as the mean difference between the methods ± 1.96 standard deviations of the differences, provide the range within which most differences between measurements by two methods will lie. 8 If differences within the observed limits of agreement are not clinically important, the two measurement methods can be used interchangeably. Few studies calculated limits of agreement, making it difficult to determine whether the difference between the monitor and the criterion measure is acceptable. It is also not appropriate to pool Bland-Altman limits of agreement as it would be impossible to interpret the clinical relevance of the pooled limits of...