2021
DOI: 10.48550/arxiv.2111.05803
Preprint

Gradients are Not All You Need

Abstract: Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics simulation to training learned optimizers. We trace this failure to the spectrum of the Jacobian of the system under s…
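To make the failure mode concrete, here is a minimal sketch (ours, not the paper's code) that unrolls the logistic map in JAX at a parameter value where the dynamics are chaotic. The gradient of the final state with respect to the initial condition is a product of per-step Jacobians whose magnitudes typically exceed one, so it grows exponentially with rollout length; the initial value 0.3, the parameter r = 3.9, and the step counts are arbitrary choices for illustration.

```python
# Minimal illustration (not from the paper) of the chaos-based failure mode:
# gradients through a long unroll of a chaotic map explode, because the
# chain rule multiplies per-step Jacobians whose magnitudes exceed 1.
import jax
import jax.numpy as jnp

def rollout(x0, r, steps):
    # Iterate the logistic map x_{t+1} = r * x_t * (1 - x_t);
    # the dynamics are chaotic for r = 3.9.
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

for steps in (10, 25, 50, 100):
    # d x_T / d x_0 via reverse-mode autodiff through the unrolled loop.
    g = jax.grad(rollout)(0.3, 3.9, steps)
    print(f"steps={steps:4d}  |dx_T/dx_0| = {abs(float(g)):.3e}")
```

Longer unrolls only make the magnitude larger, which is the regime where unrolled gradients stop carrying useful optimization signal.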

Cited by 20 publications (31 citation statements). References 25 publications.
“…We used a progressively larger rollout for each round of training: 4, 8, and 12-step losses, corresponding to 1, 2, and 3-day rollouts, for the three rounds of training. Using even larger rollouts is enticing, but there are probably diminishing returns [Metz et al., 2021], and in practice we obtained only slightly worse results when using a 4-step loss throughout.…”
Section: Multi-step Loss
Mentioning; confidence: 77%
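As background for the rollout losses this excerpt describes, the following is a hedged sketch of the general pattern, not the citing paper's code; `step_fn`, the parameter pytree, and the array shapes are hypothetical placeholders. A one-step model is applied autoregressively and the loss averages the error at each intermediate step, so increasing the rollout length (e.g. 4, 8, then 12 steps) simply means differentiating through more applications of the model.

```python
# Hedged sketch of a multi-step rollout loss; names and shapes are
# hypothetical, not taken from the citing paper.
import jax
import jax.numpy as jnp

def multi_step_loss(params, step_fn, x0, targets, num_steps):
    """Mean per-step MSE over a rollout of length num_steps."""
    x = x0
    loss = 0.0
    for t in range(num_steps):
        x = step_fn(params, x)                  # one autoregressive step
        loss = loss + jnp.mean((x - targets[t]) ** 2)
    return loss / num_steps

# Toy stand-in dynamics model so the sketch runs end to end.
def step_fn(params, x):
    return params["A"] @ x + params["b"]

params = {"A": 0.9 * jnp.eye(3), "b": jnp.zeros(3)}
x0 = jnp.ones(3)
targets = jnp.zeros((4, 3))                     # targets for a 4-step rollout
grads = jax.grad(multi_step_loss)(params, step_fn, x0, targets, num_steps=4)
```

Because gradients flow through every step of the rollout, longer horizons are where the diminishing returns, and, for chaotic dynamics, the exploding gradients discussed in Metz et al. (2021), show up.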
“…As an aside we note that, while it may be tempting to replace this heuristic with a more end-to-end-learned approach, (i) you would still have to use human judgement to pick a metric to optimize (e.g. globe-averaged Z500 at a 10-day forecast horizon) and (ii) directly optimizing over tens of rollout steps might not be effective [Metz et al., 2021], even if you are able to fit the gradient into GPU memory.…”
Section: Discussion
Mentioning; confidence: 99%
“…We quite robustly addressed this by isolating nondimensionalized functions that were carefully implemented to obtain the correct asymptotic result in all cases. For more discussion of both numerical and analytical issues with gradients, we refer the interested reader to [Johnson and Fedkiw 2022; Metz et al. 2021].…”
Section: Discussion
Mentioning; confidence: 99%
“…But much of this progress is restricted to systems that rely on gradient descent, a highly effective optimization method when we provide it with a well-defined, differentiable objective function. But in areas such as artificial life, complex systems, computational biology, and even classical physics [17], much of the interesting behavior we observe takes place near the chaotic states, where a system is constantly transitioning between order and disorder. It can be argued that intelligent life and even civilization are all complex systems operating at the edge of chaos [3,15].…”
Section: Introduction
Mentioning; confidence: 99%