In this article, we present a parallel geometric multigrid algorithm for solving variable-coefficient elliptic partial differential equations on the unit box (with Dirichlet or Neumann boundary conditions) using highly nonuniform, octree-based, conforming finite element discretizations. Our octrees are 2:1 balanced, that is, we allow no more than one octree-level difference between octants that share a face, edge, or vertex. We describe a parallel algorithm whose input is an arbitrary 2:1 balanced fine-grid octree and whose output is a set of coarser 2:1 balanced octrees that are used in the multigrid scheme. Also, we derive matrix-free schemes for the discretized finite element operators and the intergrid transfer operations. The overall scheme is second-order accurate for sufficiently smooth right-hand sides and material properties; its complexity for nearly uniformwhere N is the number of octree nodes and np is the number of processors. Our implementation uses the Message Passing Interface standard. We present numerical experiments for the Laplace and Navier (linear elasticity) operators that demonstrate the scalability of our method. Our largest run was a highly nonuniform, 8-billion-unknown, elasticity calculation using 32,000 processors on the Teragrid system, "Ranger," at the Texas Advanced Computing Center. Our implementation is publically available in the Dendro library, which is built on top of the PETSc library from Argonne National Laboratory. 1. Introduction. Various physical and biological processes are modeled using elliptic operators such as the Laplacian operator. They are encountered in diffusive transportation of energy, mass and momentum [21], electromagnetism [29], quantum mechanics [30], models for tumor growth [4], protein folding and binding [43], and cardiac electrophysiology [45]. They are also used in nonphysical applications such as mesh generation [51], image segmentation [27], and image registration [42].The finite element method is a popular technique for solving elliptic partial differential equations (PDEs) numerically. Finite element methods require grid generation (or meshing) to generate function approximation spaces. Regular grids are easy to generate but can be quite expensive when the solution of the PDE problem is highly localized. Localized solutions can be captured more efficiently using nonuniform or unstructured grids. However, the flexibility of unstructured grids comes at a price: they are difficult to construct in parallel, they are difficult to precondition, they incur the overhead of explicitly constructing element-to-node connectivity information, and