Visual pitch estimation
In this work, we propose the task of automatically estimating pitch (fundamental frequency) from video frames of violin playing using vision alone. Here, we consider only monophonic violin playing (where only one note is being played at a time). In order to investigate this task, we curate a new dat...
Main Authors: | Koepke, A, Wiles, O, Zisserman, A |
---|---|
Format: | Conference item |
Published: | Society for Sound and Music Computing, 2019 |
Similar Items
-
Sight to Sound: An End-to-End Approach for Visual Piano Transcription
by: Koepke, S, et al.
Published: (2020) -
X2Face: A network for controlling face generation using images, audio, and pose codes
by: Wiles, O, et al.
Published: (2018) -
Self-supervised learning of class embeddings from video
by: Wiles, O, et al.
Published: (2020) -
Self-supervised learning of a facial attribute embedding from video
by: Wiles, O, et al.
Published: (2018)