"On-Demand" Remote Sign Language Interpretation
Kitch Barnicle, Gregg Vanderheiden, Al Gilman
Trace R & D Center, University of Wisconsin - Madison
Experiments were carried out at the Supercomputing 99 (SC99) conference in Portland, Oregon, to assess the feasibility of providing sign language "interpreter-on-demand" services to conference attendees who are deaf. In the future, such "pop-up interpreters" could be accessed through standard web browsers, making interpreters reachable on any web-capable device with a video display and a sufficiently fast connection.
Traditionally, interpreters provide sign language interpretation services in person for individuals who are deaf. However, in-person delivery limits the availability of these services. Web-based video communication technology promises to widen access to interpretation services. Access to remote interpreters via the web can eliminate the time that interpreters spend traveling to a location to provide services, lowering cost and increasing the availability of interpreters. Similarly, "on-demand" interpreters can provide interpretation for as little as a few minutes at a time, rather than the customary two-hour minimum, yielding additional cost savings. Finally, with remote access, interpreters from anywhere in the world, including interpreters with expertise in a particular discipline, can be hired. As people and businesses gain access to the web, on-demand, anytime, anywhere interpretation services become feasible.
Sign language interpretation involves rapid hand, arm, and finger movements, changes in facial expression, and lip movements. These fast, often small movements can be difficult to detect unless the video achieves high fidelity in both detail and timing. Data communication over today's commodity Internet is subject to performance limitations and fluctuations that degrade video fidelity to an unacceptable degree. Fortunately, working with SC99 provided us with access to advanced networks, allowing us to avoid this problem and carry out the experiments.
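A back-of-envelope calculation illustrates why commodity connections of the period could not carry such video uncompressed. The resolution, color depth, and frame rate below are illustrative assumptions for this sketch, not parameters measured in these experiments:

```python
# Rough bandwidth estimate for an uncompressed video stream.
# Resolution, color depth, and frame rate are illustrative assumptions,
# not values measured in the SC99 experiments.

width, height = 320, 240      # assumed modest video resolution (pixels)
bytes_per_pixel = 3           # 24-bit color
frames_per_second = 20        # near the rate estimated during the sessions

bytes_per_frame = width * height * bytes_per_pixel
bits_per_second = bytes_per_frame * frames_per_second * 8
megabits_per_second = bits_per_second / 1_000_000

print(f"{megabits_per_second:.1f} Mbit/s uncompressed")  # 36.9 Mbit/s
```

Even this modest stream requires tens of megabits per second before compression, orders of magnitude beyond a dial-up or early broadband link. Video compression closes much of the gap, but the artifacts and delay it introduces are exactly what degrades the small, fast movements that signing depends on.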
The two primary objectives for this project were to 1) demonstrate to the high performance computing community the potential application of high-speed networks for the provision of remote sign language interpretation and 2) develop an understanding of the technical issues surrounding the provision of remote sign language interpretation over high performance and wireless networks.
- Interpretation of Keynote and Plenary Sessions - Interpreters at the remote site listened to the keynote and plenary sessions on a speakerphone and signed the sessions. The video image of the interpreter was sent back to the convention center via Microsoft NetMeeting and Internet2. This image was projected onto an 8' screen in a room that held over 1,000 people.
- Interpretation of Informal Conversations - An individual who is deaf used an interpreter-on-demand during informal conversations. He carried a Sony PictureBook mini-notebook computer with a wireless network connection as he roamed the convention center. Upon request, a remote interpreter signed informal conversations. Audio and video were transmitted back and forth via NetMeeting and a wireless network. The interpreter's image was displayed on the PictureBook.
- Individualized Interpretation of a Conference Session - Tests were also carried out to see if the wireless system could support the user during an individual conference session. A wireless assistive listening device fed the audio from the speaker's presentation into the PictureBook and then to the interpreter over the wireless and Internet infrastructure of the conference. Since the PictureBook had a built-in camera, the user could also sign back to the interpreter to confirm a sign or request clarification.
- Interpretation Delivered through a Head Mounted Display - A final series of tests was carried out using a head mounted display (HMD). With this configuration, the user was able to view the presenter and presentation screen by looking "through" the HMD while simultaneously viewing the interpreter on the HMD.
Findings and Next Steps
- Feedback from both the user who was deaf and the interpreters who were present at the SC99 sessions indicated that the remote interpretation provided for the keynote and plenary sessions was of sufficient quality to convey the content of the sessions. Although no direct measurements were possible with the setup in place, it was estimated that frame rates of 20-25 frames per second were attained. Future experiments will include performance monitoring techniques.
- Although the remote interpretation delivered the session content adequately, feedback suggested that simultaneous text captions would also be useful. Future experiments will combine remote sign interpretation and captions.
- Despite the high bandwidth of Internet2, momentary freezes in the video still occurred, although they were few and amounted to only a lost word or so. Thus, high bandwidth alone is not sufficient to support remote interpretation. Future experiments will attempt to identify and address potential end-to-end limiting factors such as software buffering, processing speed, video capture devices, and network bottlenecks.
- Interpreters at the remote site, who could not see the user who was deaf, the speaker, or the overheads, commented on the contextual information that would be available to in-person interpreters but was missing here, such as the speaker's body language, presentation slides, room layout, and visual confirmation of understanding from clients. Future experiments will include testing two-way video transmissions that will provide the remote interpreters with access to visual information in the environment.
- The Sony PictureBook served as a valuable remote interpretation tool. Its small size and built-in camera made it a convenient device for two-way sign communication, although the performance provided over the wireless network varied. Industry developments in the area of mobile devices with video capabilities will be monitored and considered for future implementations.
- The head mounted display led to user eye fatigue after relatively short viewing periods. Future experiments will include displays that do not require the user to gaze upwards to view the display and that do not attenuate the background image, in this case the presenter and slide screen.
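The performance monitoring planned for future experiments could begin with something as simple as tracking the achieved frame rate from frame-arrival times. The sketch below is a hypothetical helper written for illustration; it was not part of the SC99 setup, and the window size is an arbitrary assumption:

```python
from collections import deque


class FrameRateMonitor:
    """Estimate achieved frames per second from frame-arrival timestamps.

    A hypothetical sketch of one performance-monitoring technique;
    not part of the SC99 experimental setup.
    """

    def __init__(self, window=30):
        # Sliding window of the most recent frame-arrival times (seconds).
        self.times = deque(maxlen=window)

    def record(self, timestamp):
        """Note the arrival time of one decoded frame."""
        self.times.append(timestamp)

    def fps(self):
        """Frames per second over the current window (0.0 if too few frames)."""
        if len(self.times) < 2:
            return 0.0
        elapsed = self.times[-1] - self.times[0]
        return (len(self.times) - 1) / elapsed if elapsed > 0 else 0.0


# Example: frames arriving every 50 ms correspond to 20 frames per second.
monitor = FrameRateMonitor()
for i in range(10):
    monitor.record(i * 0.05)
print(round(monitor.fps(), 1))  # 20.0
```

Instrumenting the receiving end this way would let future experiments report measured frame rates, and sudden drops in the windowed estimate would flag the momentary freezes noted above for correlation with buffering, processing, or network conditions.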
The research team sought to integrate off-the-shelf hardware and software with high-speed networks in order to demonstrate a useful and practical application of these technologies: the delivery of remote interpretation services. These experiments, as well as developments in the areas of networks with Quality of Service (QoS) capabilities, high-speed networks, and mobile devices, suggest that remote interpretation services are feasible and can be practical in the near future.
By working with research programs and emerging commercial services, the goal is to eventually create mechanisms for combining computer speech recognition and translation technologies with human assistance when and where needed to yield low cost text and sign language "interpretation on demand." Even before these types of devices can become a standard tool, "pop-up interpreter" windows could be built into standard browsers so that wherever there was a browser, there could be an interpreter.
This project was funded by the National Institute on Disability and Rehabilitation Research (NIDRR) and the Education, Outreach and Training (EOT) program of the Partnerships for Advanced Computational Infrastructure, which is funded by the National Science Foundation (NSF).