
Using the Universal Category System and Deep Learning to Automate Audio Categorization and Embedded Metadata

Class of 2025
Major:
Computer Science
Minor:
Outdoor Studies
Michael (Cooper) Anderson is a Computer Science major and Outdoor Studies minor at St. Lawrence University. As a St. Lawrence University Fellow, he has developed deep learning models for audio classification using TensorFlow and PyTorch. Cooper is also an experienced Teaching Assistant, helping students with programming languages and fundamental CS...
Semester:
Summer 2024
Description

During the St. Lawrence Summer Fellowship, Cooper addressed a problem that sound editors and designers face: large audio databases lack standardized organization. He curated a dataset of more than 77,000 audio files, totaling 151 hours, drawn from the work of various sound editors, including his father’s. Using the machine learning frameworks TensorFlow and PyTorch, he developed deep learning models, including audio spectrogram transformers, to categorize sound effects. A data processing pipeline was designed to handle the large volume of audio data efficiently, streamlining categorization and making it feasible to categorize backlogged libraries. The goal was an AI engine that automatically assigns each audio file a category according to the Universal Category System (UCS) v8.2. The work produced a large, organized audio dataset and an efficient classification model, developed despite device storage and hardware limitations.

