ACCENTED ENGLISH AUTOMATIC SPEECH RECOGNITION WORKSHOP & CHALLENGE

Organizers

  • CCF Task Force on Speech Dialogue and Auditory Processing

  • Shaanxi Provincial Key Laboratory of Speech & Image Information Processing

  • Xi'an Software Park

  • Shanxi Kunpeng Ecological Innovation Center

  • Speech Lab, Shanghai Jiao Tong University, China

  • School of Computer Science and Engineering, Nanyang Technological University, Singapore

  • Center for Language and Speech Processing, Johns Hopkins University, United States

  • Datatang (Beijing) Technology Co., Ltd.

  • CHALLENGE BACKGROUND

    INTERSPEECH 2020 Accented English Speech Recognition Workshop

    INTERSPEECH has grown into the world's largest technical conference focused on speech processing and applications, with over 1000 attendees and over 600 papers. The conference emphasizes interdisciplinary approaches addressing all aspects of speech science and technology, ranging from basic theories to advanced applications.

    As the flagship technical activity, the Accented English Automatic Speech Recognition Workshop will be held on October 25, 2020 in Shanghai, and the award ceremony will take place during the workshop. Datatang is joining with the CCF Task Force on Speech Dialogue and Auditory Processing and the Northwestern Polytechnical University Audio Speech and Language Processing Research Group ([email protected]) to organize the Accented English Speech Recognition Workshop and to launch the Accented English Speech Recognition Challenge. The workshop also receives strong support from the Software Park Development Center of Xi'an High-tech Industrial Technology Development Zone.

  • CHALLENGE INTRODUCTION

    Accented English Speech Recognition Challenge

    English is the most influential universal language in the world, and English speech recognition is one of the areas of greatest interest in both academia and industry. Advanced ASR systems now perform well and meet most requirements for standard English. Recognizing accented English, however, remains a challenging task. The difficulty of building an accented English ASR system arises mainly from variation in pronunciation accuracy, intonation, speaking rate, and the pronunciation of particular syllables. In addition, the shortage of accented speech data limits research in this area.

    The Interspeech 2020 Accented English Speech Recognition Challenge (AESRC) will release 8 sets of accented English data from different countries to participants, covering a variety of pronunciation characteristics and accents, with the aim of promoting discussion and exchange on English language research and accented speech recognition. We hope that researchers from academia and industry will learn from each other and benefit from participating in the challenge and workshop.

    Computing resources will be provided by Huawei.

TRACK SETTING

Track1

Accents Recognition

Use the official multi-accent English training data to train accent classification models, and submit the accent recognition results on the test set.

Note: There are no restrictions on the models and training techniques used, but the use of any data other than the officially specified datasets is prohibited. Accent recognition accuracy is the only evaluation criterion.
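As a rough illustration of the accuracy criterion, the sketch below computes the fraction of test utterances whose predicted accent matches the reference label. The function name `accent_accuracy` and the sample labels are hypothetical; the organizers' actual scoring script and submission format are not described here.

```python
from typing import Sequence


def accent_accuracy(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the fraction of utterances whose predicted accent matches the reference."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("at least one utterance is required")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)


# Hypothetical labels for four test utterances (invented for illustration).
reference = ["US", "UK", "India", "Japan"]
predicted = ["US", "UK", "China", "Japan"]
print(accent_accuracy(reference, predicted))  # 3 of 4 correct -> 0.75
```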
Track2

English Speech Recognition With Different Accents.

Use the official multi-accent English training data to train speech recognition models, and submit the recognition results on the test set. The test set will include some accents that are not present in the training datasets, in order to test the generalization performance of the models.

Note: The use of model fusion techniques, including ROVER, is prohibited, and audio training data is limited to the 160 hours of official accented English speech data. Language models may be trained only on the transcripts corresponding to the audio data; no other text may be used. Speech data augmentation may be based only on the restricted data.

Specified data

Datatang officially provides participants with 20 hours of labelled speech data for each accent (Russia, Korea, US, Portugal, Japan, India, UK, China), 160 hours in total.

Duration: 20 hours × 8
Language & Accent: Accented English from Russia, Korea, US, Portugal, Japan, India, UK, China
Speaker: 40 – 110 speakers per accent
Audio Format: 16 kHz, 16-bit, single-channel WAV
Recording environment: Indoor, mobile phone
Speech content: Daily communication, interaction with smart devices, etc.
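Participants who prepare additional or augmented audio may want to confirm that it matches the stated audio format. The following is a minimal sketch using Python's standard `wave` module; the function name `matches_audio_spec` is hypothetical and the check is not part of any official tooling.

```python
import wave

EXPECTED_RATE_HZ = 16000   # 16 kHz sampling rate
EXPECTED_SAMPLE_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1      # single channel


def matches_audio_spec(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, single channel."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

Calling `matches_audio_spec("utt.wav")` on a file path of your own (the name here is a placeholder) returns `False` for resampled or stereo files, which can then be converted before training.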

Datasets will be released with metadata files organized in the following format:

FIELD    DESCRIPTION
SEX      Speaker gender
AGE      Speaker age
ACT      Accent type
MIT      Recording device
SCC      Recording environment
LBR      Utterance duration
ORS      Raw text
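The on-disk layout of the metadata files is not specified here. Purely as an illustration, the sketch below assumes each line holds one field code and its value separated by a tab; that layout, the function name `parse_metadata`, and the sample values are all assumptions and should be adjusted to the released files.

```python
from typing import Dict

# Field codes listed above, mapped to readable names.
FIELD_NAMES = {
    "SEX": "speaker_gender",
    "AGE": "speaker_age",
    "ACT": "accent_type",
    "MIT": "recording_device",
    "SCC": "recording_environment",
    "LBR": "utterance_duration",
    "ORS": "raw_text",
}


def parse_metadata(text: str) -> Dict[str, str]:
    """Parse metadata text, ASSUMING one 'FIELD<TAB>value' pair per line."""
    record: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        code, _, value = line.partition("\t")
        code = code.strip()
        if code in FIELD_NAMES:  # ignore codes that are not in the list
            record[FIELD_NAMES[code]] = value.strip()
    return record


# Hypothetical sample; the values are invented for illustration.
sample = "SEX\tFemale\nAGE\t25\nACT\tIndia\nORS\tturn on the light"
print(parse_metadata(sample))
```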

LibriSpeech data may also be used in both tracks (http://www.openslr.org/12/).

Challenge Schedule

Awards

Note: All prize amounts include tax.

International Scientific Committee

(Names listed in no particular order)

Lei Xie

Professor

Northwestern Polytechnical University, China

Yanmin Qian

Associate Professor

Shanghai Jiao Tong University, China

Shinji Watanabe

Associate Research Professor

Johns Hopkins University, United States

Chng Eng Siong

Associate Professor

Nanyang Technological University, Singapore

Qiangze Feng

CTO

Datatang (Beijing) Technology Co., Ltd., China

Participants

The challenge is open to the public: individuals from universities, scientific research institutes, Internet companies, and other organizations may register.

Note: Employees of the challenge organizers and technical support units who have access to business, products, or data related to the challenge are automatically excluded from the challenge and forfeit eligibility.

Registration

  • If you are interested in the challenge, please contact us by email at [email protected]
  • Download the registration form (either the English or the Chinese version), fill in the information, and send it to the email address above. The registration deadline is Aug 20, 2020.
  • The organizing committee will review and verify the qualifications of the participating teams within 5 working days. Teams that pass the review will sign the challenge data usage agreement and then qualify to join the challenge.
  • The training data will be released on Aug 21, 2020, and the download method will be provided to participants who have passed the review and signed the agreement.

Anti-cheating Statement

  • Participants may not submit multiple applications; the results of any participant who does so will be cancelled.

  • Participants may not improve their ranking by any means outside the designated assessment, such as exploiting loopholes in the rules, exploiting technical loopholes, or using additional data. If such conduct is found, the results will be cancelled.

All rights reserved by Datatang (Beijing) Technology Co., Ltd.
