5th Neural Scaling Laws workshop

We're organizing the 5th workshop on machine learning models, scaling laws, and much more, with presentations, posters, and discussions.

By CERC - Autonomous Artificial Intelligence

When and where


Honolulu TBO Honolulu, HI 96813

Refund Policy

No Refunds

About this event

  • 10 hours
  • Mobile eTicket

This is the 5th workshop in a series that started in October 2021, motivated by recent advances in the rapidly developing area of foundation models, i.e., large-scale neural network models pretrained in an unsupervised way on very large and diverse datasets. Such models often demonstrate significantly better few-shot generalization than their smaller-scale counterparts across a wide range of downstream tasks.

Practical information

The 5th Neural Scaling Laws workshop will be held in person in Hawaii, with both in-person and remote presentations.

If you book an onsite ticket, please make sure you will attend the workshop onsite, as we have a limited number of spots.

An online version is also available; please check our registration page.

The schedule is available here and will be updated soon.

Main topic

The theme of this workshop is emergent behaviors and phase transitions in deep learning. In recent years, many emergent capabilities have appeared in models at scale, such as in-context learning (Brown et al. 2020b) and reasoning and systematic generalization in language and vision models (Wei et al. 2022, Ramesh et al. 2021). On the other hand, even toy models have been found to exhibit sudden changes in their behavior or performance, as with the grokking phenomenon (Power et al. 2022) and the appearance of induction heads in Transformers (Olsson et al. 2022). Researchers have studied emergent behaviors using various approaches, including scaling laws (Kaplan et al. 2020, Brown et al. 2020a), statistical mechanics (Bahri et al. 2020, Zdeborová 2020), and mechanistic interpretability (Olsson et al. 2022, Nanda et al. 2023).

Why is this topic important and timely?

There is increasing debate in the machine learning community around scaling laws and emergent phenomena in models. The discussions primarily revolve around the mechanisms that underlie these phenomena and their implications for the design and generalization performance of machine learning models (Tay et al. 2022). A key question is whether scaling laws and phase transitions in machine learning models are purely empirical phenomena or whether they can be explained by fundamental principles of statistical mechanics or other fields of physics. Some groups argue that scaling laws and phase transitions in machine learning are purely phenomenological and can be described empirically (Caballero et al. 2022), while others argue that these phenomena can be explained by deeper principles and mechanisms (Sharma and Kaplan 2022, Bahri et al. 2021). Whichever view holds, there is no doubt that understanding the scaling behavior of machine learning models is crucial for designing efficient and scalable algorithms. This workshop is timely because the development of new theoretical frameworks and empirical methods for studying these phenomena is likely to be an active area of research in the coming years. Our objective is to encourage more research on the mechanisms behind these emergent phenomena and on how these behaviors may be predicted.

Our objectives

A set of questions we aim to address in the workshop includes, but is not limited to, the following:

1. What are the underlying mechanisms and principles that govern the behavior of machine learning models?

2. When and under what conditions can scaling laws be applied to machine learning models?

3. What are the critical behaviors of machine learning models near phase transitions, and how can these behaviors be explained and predicted?

4. Can we find progress measures that underlie sudden performance improvements (Nanda et al. 2023, Barak et al. 2022)?

5. What are the future directions and opportunities for research in emergent phenomena in machine learning, and how can this research contribute to the development of more robust and generalizable machine learning models?


About the organizer