CNN Architectures for Large-Scale Audio Classification

Event Information

Location

Microsoft Canada

222 Bay Street

Suite 1201

Toronto, ON M5K 1E7

Canada

Refund Policy

Contact the organizer to request a refund.

Eventbrite's fee is nonrefundable.

Event description
About this Event

Event and livestreaming details

If you are an AISC member, please use your discount code. If you're interested in becoming an AISC member, please fill out the registration form.

Lead: Amber Ma

Facilitators: Abul Hassan Sheikh, Zak Semenov

Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio. We use various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels. We examine fully connected Deep Neural Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We investigate varying the size of both training set and label vocabulary, finding that analogs of the CNNs used in image classification do well on our audio classification task, and larger training and label sets help up to a point. A model using embeddings from these classifiers does much better than raw features on the Audio Set [5] Acoustic Event Detection (AED) classification task.
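To give a feel for the scale quoted in the abstract, here is a small back-of-the-envelope sketch that works out the average soundtrack length implied by the two figures above (70M training videos, 5.24 million hours). The constant and function names are made up for illustration and are not from the paper.

```python
# Figures quoted in the abstract above.
TRAINING_VIDEOS = 70_000_000   # 70M training videos
TOTAL_HOURS = 5_240_000        # 5.24 million hours of audio


def average_minutes_per_video(total_hours: float, num_videos: int) -> float:
    """Return the mean duration per video in minutes (hypothetical helper)."""
    return total_hours * 60 / num_videos


avg_minutes = average_minutes_per_video(TOTAL_HOURS, TRAINING_VIDEOS)
print(f"{avg_minutes:.2f} minutes per video")  # prints "4.49 minutes per video"
```

So the training set averages roughly four and a half minutes of audio per video.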

Link to the paper.

Please note: If you arrive late, the doors will be closed and someone will need to come get you. Message us on Slack (#center_circle channel) if you're stuck downstairs. We won't be checking Slack after the circles end (6:15), so please try your best to arrive before then.
