American Sign Language, or ASL, is the preferred language of people born deaf in the United States. Signers communicate through hand and body movement along with facial expression. It is important to recognize that ASL is not a direct gestural expression of English; it is its own unique language. The Deaf community, a minority group of about 500,000 in the United States and Canada, views the use of ASL as an integral part of a shared culture in which its members grow, develop, and share similar experiences.
ASL is used by both the totally deaf and the hard of hearing. Although some can read, speak, or lip-read well enough to communicate successfully with the hearing, this portion of the community is small and consists mainly of those who lost their hearing later in life. Most often an interpreter is needed for a successful exchange between Deaf and hearing people. Unfortunately, interpreters are expensive to hire and are often unavailable on short notice or for short, simple exchanges. For these scenarios we are creating a digital interpreter that uses voice recognition to translate speech into ASL, providing the necessary bridge between the hearing and Deaf communities.
A critical component of an automatic translation system is a signing avatar capable of producing the flowing motion characteristic of ASL. However, current avatars are not well accepted by the Deaf community, whose members prefer recordings of human signers because current avatars move too mechanically.
Creating the motion is not simply a matter of concatenating animations of individual signs (the lexical items of ASL), because signs can change depending on the context in which they are used. Coarticulation refers to these context-dependent changes, which occur at the subphonemic level. Recently, linguists have begun to study coarticulation in ASL and have found clear analogs with coarticulation in spoken language. Although a French research group has begun exploring the use of coarticulation in French sign language generation, no one has considered incorporating coarticulation into ASL generation.
This project proposes the incorporation of coarticulation into a generative model of ASL animation with the goal of creating an avatar capable of more natural motion. The proposed project will be part of a larger, ongoing effort to create a 3D animation program that synthesizes depictions of ASL that are not only grammatically correct, but appear realistic and are easy to comprehend.
The major questions we will address are:
It has been a busy fall term. Marie and Melissa both presented at the 2013 International Symposium on Sign Language Translation and Avatar Technology (sltat.cs.depaul.edu) and received valuable feedback on their work.
This section reviews each of the four project questions, and gives a status update on each one.
This is not a straightforward matter. Developing a list of known coarticulation types has involved an intensive literature search in both the linguistic and computer science disciplines. Some of the most important work in this area is a set of dissertations published in French, and we are currently finishing the translation of the most relevant passages.
We have identified two challenges involving data availability. The first is identifying known characteristics that could be used to detect coarticulation. The French effort involved French Sign Language (LSF), and only a few characterizations were available, the most salient being that signs are produced more quickly in isolation than in the context of a sentence. To be fair, the French study addressed a very restricted domain (train station announcements), so only a sparse data set was available for analysis.
The second challenge is the availability of ASL corpora. Only one has been published, and it is not in a format amenable to analysis. To address this, we wrote a separate conversion application. We are clearing up a few issues with the newly formatted data; some stem from bugs in the conversion application, while others are intrinsic to the original corpus. Our next step is to run statistical analyses on the data, which appears to be noisy.
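As a sketch of the kind of duration analysis we have in mind, the following compares sign durations in isolation against durations in sentence context using Welch's t-test. The function name and all the millisecond values are illustrative assumptions, not measurements from the corpus:

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two
    independent samples with possibly unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    se_a = variance(sample_a) / na   # squared standard error, sample a
    se_b = variance(sample_b) / nb   # squared standard error, sample b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (na - 1) + se_b ** 2 / (nb - 1))
    return t, df

# Hypothetical durations (milliseconds) of the same sign produced in
# isolation and within a sentence -- illustrative numbers only.
isolated    = [612, 580, 640, 605, 598, 622]
in_sentence = [540, 515, 560, 532, 548, 525]

t, df = welch_t(isolated, in_sentence)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large positive t here would be consistent with the LSF observation that signs in isolation run longer than signs in context; with real corpus data the comparison would be run per sign and corrected for multiple tests.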
Even if the results of the analyses do not reach statistical significance at the 0.05 level, we are hopeful that the trends in the data can guide a coarticulation model to be applied to sign language animation.
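One simple form such a model might take is a timing-aware blend between adjacent signs: interpolate from the final pose of one sign toward the initial pose of the next over a transition window whose length is informed by the corpus trends. The sketch below is a minimal illustration under assumed conventions (poses as dictionaries of scalar joint values, a linear ramp); a data-driven model would replace the linear ramp with timing curves estimated from the corpus:

```python
def blend_transition(end_pose, start_pose, n_frames):
    """Generate n_frames transition frames between the final pose of one
    sign (end_pose) and the initial pose of the next (start_pose).
    Each pose is a dict mapping joint names to scalar values."""
    frames = []
    for i in range(1, n_frames + 1):
        t = i / (n_frames + 1)  # interpolation parameter in (0, 1)
        frames.append({joint: (1 - t) * end_pose[joint] + t * start_pose[joint]
                       for joint in end_pose})
    return frames

# Hypothetical one-joint example: move a wrist coordinate from 0.0 to 1.0.
end = {"wrist_x": 0.0}
start = {"wrist_x": 1.0}
for frame in blend_transition(end, start, n_frames=3):
    print(frame)  # wrist_x steps through 0.25, 0.5, 0.75
```

Shortening or lengthening `n_frames` per sign pair is one place a duration trend from the corpus could plug in directly.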
Our plans are to generate animations that incorporate coarticulation and to ask members of the Deaf community to evaluate them for clarity, acceptability, and naturalness. We need definitive results from questions 2 and 3 before we can move forward on this question. However, while working on the corpus analysis, we are also improving the avatar that will carry the animation. Previous feedback from members of the Deaf community indicated that refining the facial expressions of affect and improving the mouth positions would help with the clarity and acceptability of any animations we produce.
Study conducted by: Marie Stumbo and Melissa Bialek. Their blogs are:
Research supervisors: Dr. Rosalee Wolfe and Dr. John C. McDonald
The CREU project is sponsored by the Computing Research Association Committee on the Status of Women in Computing Research (CRA-W) and the Coalition to Diversify Computing (CDC). Funding for this project is provided by the National Science Foundation.