BackgroundRelatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications.ResultsWe evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions.ConclusionBootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method’s ability to link genes in the same functional pathway.Electronic supplementary materialThe online version of this article (10.1186/s12859-018-2402-0) contains supplementary material, which is available to authorized users.