What is Real-time Closed Captioning?
Real-time closed captioning is the process of converting audible speech from television broadcasts or online streaming video into text as it is spoken.
The text data is encoded into the transmission at the broadcast facility and displayed on televisions equipped with decoder chips, or in computer browsers with caption-decoding software, so that Deaf and hard-of-hearing viewers can follow the context of the broadcast.
The captioned text is created by individuals using stenographic systems, or voice-captioning systems based on speech-to-text technologies combined with stenographic theory, as they view the programming live along with the general audience.
Closed captioning was developed as a federally mandated technology in the 1980s to provide Deaf and hard-of-hearing individuals access to broadcast video programming. In 1990 Congress passed the Television Decoder Circuitry Act, requiring all televisions sold in the U.S. with a screen 13 inches or larger to be equipped with a chip to decode captioning text encoded in a broadcaster’s TV signal.
In 1997, the FCC adopted a Report and Order mandating that closed captioning be phased in on all video programming networks (broadcast and cable) starting in 2000, with the goal of achieving 20 hours a day of closed captioning on every broadcast and cable network in the U.S.
Real-time closed captioning is also a valuable tool for promoting the use and comprehension of language by individuals learning to read or studying English as a second language. Its use has become ubiquitous among the general public, i.e., individuals not using it as an assistive access tool.
Closed Captioning promotes accessibility to program content in news, sports, entertainment, government or business, large gatherings, and other scenarios, for a wide range of individuals whose access to real-time information is essential.
AI vs Real-time Captioning:
What’s In Your Head-end?
Advances in artificial intelligence, spurred by developments in neural network research, computer processing power, and, in some respects, the Internet, have given rise to widespread speculation about deploying this technology in speech-to-text applications.
Although recent advances have been impressive, commercially available speech-to-text systems still fail to reach 90% accuracy on even the most common transcription applications.
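To make concrete what an accuracy figure like "90%" implies, captioning and transcription accuracy is conventionally measured by word error rate (WER). The article does not name this metric; the sketch below, with hypothetical example sentences, simply shows how it is computed and why a 10% error rate can still garble critical words.

```python
def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed as word-level Levenshtein edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

# Hypothetical emergency announcement: dropping one word out of six
# already yields a WER of about 0.167 -- worse than "90% accurate".
ref = "seek shelter in the basement immediately"
hyp = "seek shelter in basement immediately"
print(round(wer(ref, hyp), 3))
```

A system advertised at 90% accuracy (WER 0.10) is, on average, getting roughly one word in ten wrong or missing, which matters when the dropped word is "basement" rather than "the".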
Accurate cognitive resolution of speech into text remains elusive, and awaits further advances in neural network and cognitive recognition by these computing systems.
Stations contemplating the use of such systems should assess whether they can deliver the level of accuracy, as well as the other elements of quality, that will serve individuals who rely on this accessibility tool, particularly when emergencies or emergency coverage arise.
Stations can ill afford to transmit incomplete or inaccurate information during emergency announcements/live coverage of special reports. Real-time captioning by highly skilled, trained captioners is the best and most reliable technology available today to ensure that timely, accurate, potentially life-saving information can be delivered to individuals dependent on captioning for vital information.
In this respect, stations located in areas prone to emergency weather, and those in large DMAs, should view real-time captioning on regularly scheduled programming as another form of business insurance.
A broadcaster that fails to provide accurate, accessible information through captioning in an emergency, particularly if injury or death follows that failure, could face substantial liability and penalties.
Stations should also consider the cost of maintaining an AI backbone capable of highly accurate captioning, including on-site management, engineering, and IT network support for such a system.
“What’s in your head-end?” is the question engineering personnel need to consider in order to assess all the costs and benefits before implementing speech-to-text systems that are not ready for prime time.