TRATAHI - Bringing Down Language Barriers on the Internet through Human-in-the-Loop Joint Transcription and Translation

"This project joins the efforts of the Carnegie Mellon University with the INESC-ID research institute in Portugal and the startup Unbabel to bring robust man-in-the-loop joint transcription and translation of audio and video documents. The project provides an exciting opportunity to put in practice existing and prospective research in uncertain representations of information and its integration with human-in-the-loop systems. From an entrepreneurial point of view, the project also represents an opportunity to explore potential products utilizing state of the art research." Ramón Fernández Astudillo and Bhiksha Raj
Portuguese PI: Ramón Fernández Astudillo (INESC ID/IST-UL)
CMU PI: Bhiksha Raj


Research teams: INESC ID/IST-UL; CMU
Organizations: Unbabel 
Funding Reference: FCT CMUP-EPB/TIC/0065/2013 
Duration: 12 months
Keywords: Automatic Speech Recognition; Machine Translation; Uncertainty propagation; Human in the loop

Every minute, 100 hours of video are uploaded to YouTube, and hours of new video content spread daily through social networks. Video has become a fundamental means of communication on the internet, and consequently there is a growing demand to break down the language barriers in this type of content. This project aims to respond to that demand by designing a first prototype for human-in-the-loop transcription and translation of audio content. The tool will provide an initial transcription and a translation into a target language. The user may then choose to correct the transcription, the translation, or both, according to his or her expertise. Upon correction, the tool will incorporate this new knowledge into the system to produce improved outputs. Through an optimal integration of human and machine, this prototype is expected to produce quality results with little user effort, even for noisy or low-quality audio content.
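
The correction loop described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the `Segment` class, the confidence scores, and the toy word lexicon standing in for a machine translation system are all hypothetical and chosen only to show how a human correction to the transcription can propagate to the translation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Segment:
    """One audio segment with competing transcription hypotheses (hypothetical)."""
    hypotheses: Dict[str, float]      # candidate text -> toy confidence score
    corrected: Optional[str] = None   # set when a human fixes the transcription

    def best(self) -> str:
        # A human correction always overrides the machine hypotheses.
        if self.corrected is not None:
            return self.corrected
        return max(self.hypotheses, key=self.hypotheses.get)


# Toy word-level lexicon standing in for a real machine translation system.
TOY_LEXICON = {"good": "bom", "morning": "dia", "mourning": "luto"}


def translate(text: str) -> str:
    # Words missing from the lexicon pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def run_pipeline(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Return (transcription, translation) pairs for every segment."""
    return [(s.best(), translate(s.best())) for s in segments]


# Initial pass: the recognizer prefers the wrong hypothesis for noisy audio.
segments = [Segment({"good mourning": 0.55, "good morning": 0.45})]
first_pass = run_pipeline(segments)

# The user corrects the transcription; the translation is regenerated from it.
segments[0].corrected = "good morning"
second_pass = run_pipeline(segments)
```

In this sketch the user only edits the transcription, and the improved translation follows automatically. A user who knows only the target language could instead correct the translation, which a fuller system would need to feed back in the other direction.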


Phase II of the Carnegie Mellon Portugal Program emphasizes advanced education and research that can lead to significant entrepreneurial impact. The Early Bird Projects are designed to help small teams of researchers from Portuguese institutions, Carnegie Mellon University, and industry partners jumpstart activities with high-impact potential and strategic relevance for the Program.
