Free

Natural Language Processing Symposium

Event Information

Date and Time

Location

661 University Avenue

Toronto, ON M5G 1M1

Canada

About this Event

Natural Language Processing Symposium

September 15 & 16, 2020

10:00 am - 1:30 pm ET

To be held virtually

The Vector Institute is hosting a Natural Language Processing (NLP) Symposium showcasing the NLP project with academic-industry collaborators to facilitate interaction between our industry sponsors, researchers, students and faculty members.

In June 2019, the Vector Institute launched a multi-phase industry-academic collaborative project focusing on recent advances in NLP. Participants replicated a state-of-the-art NLP model called BERT and fine-tuned it using a transfer learning approach to optimize domain-specific tasks in areas such as health, law and finance.

To follow up on the outcomes of the project, a two-day symposium will be held featuring presentations and hands-on workshops delivered by the project participants and Vector researchers.

The symposium will support knowledge transfer and provide an exclusive opportunity for Vector’s industry sponsors to engage with talent in the NLP domain.

Workshop Information:

Level of workshops: Beginner/Intermediate

Required skill set: Fundamentals of machine learning and deep learning; knowledge of language modelling and/or transformers; experience programming in Python and any of the deep learning frameworks (TensorFlow, PyTorch); experience using GPUs for accelerated deep learning training; experience using Jupyter Notebook and/or Google Colab.

* Participants must be individuals actively involved in NLP research and/or development.

September 15 Workshops:

WS1: Performing down-stream NLP tasks with transformers

Facilitators: Nidhi Arora, Intact; Faiza Khan Khattak, Manulife; Max Tian, former Goldspot

Training NLP models from scratch requires large amounts of computational resources, which may not be financially feasible for most organizations. By leveraging pre-trained models and transfer learning, we can fine-tune NLP models for a specific task with a fraction of the time and resources. In this workshop, we will explore how to use HuggingFace to fine-tune Transformer models to perform specific downstream tasks. The purpose of this workshop is to provide learning through demonstration and hands-on experience.

WS2: Distributed multi-node pre-training

Facilitators: Jacob Lin, Vector Institute, University of Toronto; Gennady Pekhimenko, Assistant Professor, Department of Computer Science, University of Toronto, Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair; Filippo Pompili, Thomson Reuters; Kuhan Wang, CIBC

To significantly reduce training time when dealing with large datasets, we will demonstrate multi-node distributed training, which allows us to efficiently parallelize the training updates of deep neural networks across multiple nodes.
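As a rough illustration of the idea (not the workshop's actual material), the sketch below simulates synchronous data-parallel training in plain Python. Each "node" is just a list holding one shard of toy data, and `all_reduce_mean` is a hypothetical stand-in for the gradient-averaging step that a real distributed framework would perform over the network.

```python
# Simulated synchronous data-parallel training: no real cluster is used.
# Each "node" is a plain Python list holding one shard of the data.

def local_gradient(w, shard):
    """Mean gradient of the squared error (w*x - y)^2 w.r.t. w on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Hypothetical stand-in for an all-reduce: returns the mean of all values."""
    return sum(values) / len(values)

def train(shards, w=0.0, lr=0.01, steps=200):
    for _ in range(steps):
        # On real hardware these gradients would be computed in parallel.
        grads = [local_gradient(w, shard) for shard in shards]
        # Every node applies the same averaged update, so weights stay in sync.
        w -= lr * all_reduce_mean(grads)
    return w

# Toy data for y = 3x, split evenly across four simulated nodes.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]

w_parallel = train(shards)   # four nodes, one shard each
w_single = train([data])     # one node holding all the data
```

With equally sized shards, the averaged per-node gradient equals the full-batch gradient, so both runs should arrive at essentially the same weight; the benefit on real hardware is that each node processes only its own shard per step.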

September 16 Workshops:

WS3: How to use Fairseq (Facebook AI Research Sequence-to-Sequence Toolkit)

Facilitators: Joey Cheng, Machine Learning Research Scientist, Layer 6; Gary Huang, Machine Learning Research Scientist, Layer 6; Felipe Perez, Senior Machine Learning Research Scientist, Layer 6

The Fairseq library is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. In this short workshop, we will walk participants through the basics of the Fairseq library. We will dive into its codebase and learn how to modify existing modules to create and keep track of new applications. The purpose of this workshop is to provide learning through demonstration and hands-on experience.

WS4: Kaggle COVID-19 Open Research Dataset Challenge

Facilitators: TBD

In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19).

This workshop includes Kaggle CORD-19 challenge notebooks and demonstrates how to apply text mining tools that could help the medical community develop answers to high-priority scientific questions. The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. It gives the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide.
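To give a flavour of what a basic text mining approach can look like, here is a minimal sketch that ranks documents against a question using TF-IDF scoring. It is an assumption-laden illustration, not the workshop notebooks: the `rank` helper is hypothetical, and the three `abstracts` are invented placeholder strings, not entries from CORD-19.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, documents):
    """Return document indices sorted by TF-IDF overlap with the query, best first."""
    doc_tokens = [Counter(tokenize(d)) for d in documents]
    n = len(documents)

    def idf(term):
        # Smoothed inverse document frequency: rarer terms weigh more.
        df = sum(1 for counts in doc_tokens if term in counts)
        return math.log((1 + n) / (1 + df)) + 1.0

    query_terms = set(tokenize(query))
    scores = []
    for counts in doc_tokens:
        total = sum(counts.values()) or 1
        scores.append(sum((counts[t] / total) * idf(t) for t in query_terms))
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

# Invented placeholder abstracts (not real CORD-19 records).
abstracts = [
    "Incubation period estimates for a novel coronavirus",
    "Transformer models for legal document classification",
    "Effect of masks on transmission of respiratory viruses",
]

best = rank("coronavirus incubation period", abstracts)[0]
```

Here `best` is the index of the abstract sharing the most distinctive terms with the question. Transformer-based retrieval would replace this term-overlap score with learned representations, but the ranking workflow is similar.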

Who should attend:

- Individuals who are interested in learning more about natural language processing

- Vector sponsors involved in the NLP project

- Technical experts from Vector Sponsor companies

- Vector PGA, Alumni, Scholarship recipient students interested in NLP

Please visit the events section of the Vector website for the most up-to-date event information.

September 15: Agenda

10:00 am - 10:10 am Opening Remarks

10:10 am - 10:40 am Keynote Presentation, Kyunghyun Cho, Associate Professor of Computer Science and Data Science, New York University

10:40 am - 11:00 am Keynote Presentation, Jimmy Ba, Assistant Professor, Department of Computer Science, University of Toronto, Machine Learning Group, University of Toronto, Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair

11:00 am - 11:20 am Keynote Presentation, Gennady Pekhimenko, Assistant Professor, Department of Computer Science, University of Toronto, Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair

11:20 am - 12:00 noon Project Presentations

12:00 noon - 12:30 pm Networking and Poster Session

12:30 pm - 1:30 pm Workshops

WS1: Performing down-stream NLP tasks with Transformers

WS2: Distributed multi-node pre-training

September 16: Agenda

10:00 am - 10:10 am Opening Remarks, Cameron Schuler, Chief Commercialization Officer and VP, Industry Innovations, Vector Institute

10:10 am - 10:40 am Keynote Presentation

10:40 am - 11:00 am Keynote Presentation, Frank Rudzicz, Associate Scientist, International Centre for Surgical Safety, Li Ka Shing Institute, St. Michael’s Hospital, Associate Professor, Department of Computer Science, University of Toronto, Director of AI, Surgical Safety Technologies Inc., Co-Founder, WinterLight Labs Inc., Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair

11:00 am - 11:30 am Project Presentations

11:30 am - 12:15 pm Panel Discussion: Business impact

Moderator: Frank Rudzicz, Associate Scientist, International Centre for Surgical Safety, Li Ka Shing Institute, St. Michael's Hospital, Associate Professor, Department of Computer Science, University of Toronto, Director of AI, Surgical Safety Technologies Inc., Co-Founder, WinterLight Labs Inc., Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair

Panelists: Khalid Al-Kofahi, Vice President Research and Operations, Thomson Reuters

Stephanie Lapierre, Chief Executive Officer, Tealbook

Yevgeniy Vahlis, Head of Artificial Intelligence Capabilities, BMO Financial Group

Ozge Yeloglu, VP Enterprise Advanced Analytics, CIBC

12:00 noon - 12:30 pm Networking and Poster Session

12:30 pm - 1:30 pm Workshops

WS3: How to use Fairseq (Facebook AI Research Sequence-to-Sequence Toolkit)

WS4: Kaggle COVID-19 Open Research Dataset Challenge
