"On-Demand" Remote Sign Language Interpretation
Experiments were carried out at the Supercomputing 99 (SC99) conference in
Portland, Oregon to assess the feasibility of providing sign language
"interpreter-on-demand" services to conference attendees who are
deaf. In the future, such "pop-up interpreters" could be accessed
through standard web browsers, making interpreters reachable on any
web-capable device with a video display and a sufficiently fast connection.
Traditionally, interpreters provide sign language interpretation services
in-person for individuals who are deaf. However, in-person delivery limits the
availability of these services. Web-based video communication technology
promises to widen access to interpretation services. Access to remote
interpreters via the web can eliminate the time that interpreters spend
traveling to a location to provide services. This can lower cost and increase
the availability of interpreters. Similarly, "on-demand" interpreters
can be used to provide interpretation for as little as a few minutes at a time,
rather than a two-hour minimum, leading to additional cost savings. Finally,
with remote access, interpreters from anywhere in the world, including
interpreters with expertise in a particular discipline, can be hired. As people
and businesses gain access to the web, on-demand, anytime, anywhere
interpretation services become feasible.
Sign language interpretation involves rapid hand, arm and finger movements,
changes in facial expressions and lip movements. These fast, often small
movements can be difficult to detect unless the video achieves high fidelity
both in detail and in timing. Data communication over today's commodity
Internet is subject to performance limitations and fluctuations which degrade
video fidelity to an unacceptable degree. Fortunately, working with SC99
provided us access to advanced networks and we were able to avoid this problem
and carry out the experiments.
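To give a rough sense of the scale involved, the bandwidth needed for interpreter video can be estimated from resolution, frame rate, and codec compression. The sketch below uses illustrative figures (320x240 pixels at 25 frames per second, a nominal 100:1 compression ratio), not measurements from these experiments:

```python
# Back-of-the-envelope bandwidth estimate for interpreter video.
# Resolution, frame rate, and compression ratio are illustrative
# assumptions, not values from the SC99 setup.

def video_bandwidth_mbps(width, height, fps,
                         bits_per_pixel=24, compression_ratio=100):
    """Approximate video bandwidth in megabits per second."""
    raw_bps = width * height * bits_per_pixel * fps
    return raw_bps / compression_ratio / 1e6

# Uncompressed 320x240 color video at 25 fps:
raw = video_bandwidth_mbps(320, 240, 25, compression_ratio=1)
# With an assumed ~100:1 codec compression ratio:
compressed = video_bandwidth_mbps(320, 240, 25)
print(f"raw: {raw:.1f} Mbps, compressed: {compressed:.2f} Mbps")
# prints: raw: 46.1 Mbps, compressed: 0.46 Mbps
```

Even the compressed figure assumes the network can sustain that rate steadily; as the findings below note, raw bandwidth alone does not guarantee the timing fidelity that signing requires.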
The two primary objectives for this project were 1) to demonstrate to the
high performance computing community the potential application of high-speed
networks for the provision of remote sign language interpretation and 2) to
develop an understanding of the technical issues surrounding the provision of
remote sign language interpretation over high performance and wireless networks.
- Interpretation of Keynote and Plenary Sessions -
Interpreters at the remote site listened to the keynote and plenary sessions on
a speaker phone and signed the session. The video image of the interpreter was
sent back to the convention center via Microsoft NetMeeting over Internet2.
This image was projected onto an 8' screen in a room that held over 1,000 attendees.
- Interpretation of Informal Conversations - An individual
who is deaf used an interpreter-on-demand during informal
conversations. He carried a Sony PictureBook mini-notebook computer with a
wireless network connection as he roamed the convention center. Upon request, a
remote interpreter signed informal conversations. Audio and video were
transmitted back and forth via NetMeeting and a wireless network. The
interpreter's image was displayed on the PictureBook.
- Individualized Interpretation of a Conference Session -
Tests were also carried out to see if the wireless system could provide the
user with support during an individual conference session. A wireless assisted
listening device was used to feed the audio from the speaker's presentation
into the PictureBook and then to the interpreter over the wireless and Internet
infrastructure of the conference. Since the PictureBook had a built-in camera,
the user could also sign back to the interpreter to confirm a sign or request
clarification.
- Interpretation Delivered through a Head Mounted Display -
A final series of tests was carried out using a head mounted display (HMD).
With this configuration, the user was able to view the presenter and
presentation screen by looking "through" the HMD while simultaneously
viewing the interpreter on the HMD.
Findings and Next Steps
- Feedback from both the user who was deaf and the interpreters who were
present at the SC99 sessions indicated that the remote interpretation provided
for the keynote and plenary sessions was of sufficient quality to convey the
content of the sessions. Although no direct measurements were possible with the
setup in place, it was estimated that frame rates of 20-25 frames per second
attained. Future experiments will include performance monitoring techniques.
- Although this remote interpretation provided sufficient content delivery,
feedback suggested that simultaneous text captions would also be useful. Future
experiments will combine remote sign interpretation and captions.
- Despite the high bandwidth of Internet2, momentary freezes in the video
still occurred, although they were few and only amounted to a lost word or so.
Thus, high bandwidth alone is not sufficient to support remote interpretation.
Future experiments will attempt to identify and address potential end-to-end
limiting factors such as software buffering, processing speed, video capture
devices, and network bottlenecks.
- Interpreters at the remote site, who could not see the user who was deaf,
the speaker, or the overheads, commented on the missing contextual
information that would be available to in-person interpreters, such as the
speaker's body language, presentation slides, room layout, and visual
confirmation of understanding from clients. Future experiments will include
testing two-way video transmissions that will provide the remote interpreters
with access to visual information in the environment.
- The Sony PictureBook served as a valuable remote interpretation tool. Its
small size and built-in camera provided a convenient device for two-way sign
communication, although the performance provided over the wireless network
varied. Industry developments in the area of mobile devices with video
capabilities will be monitored and considered for future implementations.
- The head mounted display led to user eye fatigue after relatively short
viewing periods. Future experiments will include displays that do not require
the user to gaze upwards to view the display and that do not attenuate the
background image, in this case the presenter and slide screen.
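The performance monitoring that the findings call for could begin with something as simple as timestamping arriving frames on the receiving side. The sketch below derives the delivered frame rate and flags "freeze" gaps; the freeze threshold (three times the median inter-frame gap) is an arbitrary choice for illustration:

```python
import statistics

def analyze_frame_times(timestamps, freeze_factor=3.0):
    """Given arrival times (in seconds) of successive video frames,
    return (mean_fps, freezes): the average delivered frame rate and
    a list of inter-frame gaps longer than freeze_factor times the
    median gap (an illustrative freeze criterion)."""
    if len(timestamps) < 2:
        return 0.0, []
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_fps = (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])
    typical = statistics.median(gaps)
    freezes = [g for g in gaps if g > freeze_factor * typical]
    return mean_fps, freezes

# Simulated trace: nominal 25 fps with one half-second freeze midway.
ts = [i * 0.04 for i in range(100)]
ts = ts[:50] + [t + 0.5 for t in ts[50:]]
fps, freezes = analyze_frame_times(ts)
# mean_fps drops to roughly 22.2; freezes lists the one ~0.54 s gap
```

Comparing gaps against the median rather than the mean keeps a single long freeze from inflating the baseline and masking itself.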
The research team sought to integrate off-the-shelf hardware and software
and high speed networks in order to demonstrate a useful and practical
application of these technologies: the delivery of remote interpretation
services. These experiments, as well as developments in the areas of networks
with Quality of Service (QoS) capabilities, high speed networks, and mobile
devices suggest that remote interpretation services are feasible and can be
practical in the near future.
By working with research programs and emerging commercial services, the
goal is to eventually create mechanisms for combining computer speech
recognition and translation technologies with human assistance when and where
needed to yield low cost text and sign language "interpretation on
demand." Even before these types of devices can become a standard tool,
"pop-up interpreter" windows could be built into standard browsers so
that wherever there was a browser, there could be an interpreter.
This project was funded by the National Institute on Disability and
Rehabilitation Research (NIDRR) and the Education Outreach and Training Program
(EOT) of the Partnership for Advanced Computation Infrastructure which is
funded by the National Science Foundation (NSF).