Last updated on December 5, 2020
Podcast duration: 42:29 (39.0 MB)
In this podcast episode, Tech Talk: Language, ASL, & Machine Learning, we interview software engineer KJ Price, whose many work projects include the creation of a website to help people find datasets and a Machine Learning model project involving ASL (American Sign Language). Machine Learning is a sub-field of Data Science, a popular field of study that has expanded tremendously in the last decade. Machine Learning is a form of AI, or artificial intelligence.
Language, as it is used by humans to communicate, involves complex sets of rules for sounds, sentences, and meaning. The use of language allows humans to collect information into broad bodies of knowledge about life and the world. Future generations draw on that collected knowledge expressed in language and do not have to start from scratch every time.
Language seems to be a uniquely human ability.
But as of yet, there is no single finding that scientists agree explains the exclusive connection of language to humans. Although chimps and other apes have some capacity for language, it is primarily expressed in the use of signs. Their ability to respond to gestures and pointing varies with their environment and exposure to human interaction. One especially bright gorilla named Koko was reportedly able to understand a thousand or more signs and two thousand or so words of spoken English.
There are many debates about how language skills are acquired by humans.
One of the more prominent language theories came from linguist Noam Chomsky in 1957. He suggested that human beings may be born with an innate understanding of how language works. Which language we learn, whether it is French, Arabic, Chinese, or sign language, is determined by the circumstances of our lives. Chomsky believed, as many linguists did and still do, that humans have some kind of genetic wiring that predisposes them to understanding the structure of communication.
Machines can be taught language-type skills using machine learning methods.
Machine Learning works by using datasets to teach a computer application to recognize patterns in the data, so that the machine can automatically perform tasks that used to be done manually. A simple example of this type of task would be to “teach” a machine to distinguish between a cat and a dog in an image. After a few hundred examples of images that contain cats and other images that contain dogs, a Machine Learning model could be “taught” to recognize the difference. Then no one would need to manually look through each image and label it as “cat” or “dog,” as a computer could label hundreds of these images in just a couple of seconds.
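The train-then-label idea above can be sketched in a few lines of Python. This is only a toy illustration, not how a real image model works: each "image" is reduced to two made-up numbers (a hypothetical ear-pointiness score and snout-length score), all of the data values are invented, and the nearest-centroid method is used simply because it fits in a short example with no extra libraries.

```python
from math import dist
from statistics import mean

# Invented training data: (ear_pointiness, snout_length) -> label.
# Real systems learn from pixels; these two features are assumptions.
TRAINING = [
    ((0.90, 0.20), "cat"),
    ((0.80, 0.30), "cat"),
    ((0.85, 0.25), "cat"),
    ((0.30, 0.80), "dog"),
    ((0.20, 0.90), "dog"),
    ((0.35, 0.70), "dog"),
]

def train(examples):
    """Learn one average feature vector (centroid) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(axis) for axis in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train(TRAINING)

# "Unlabeled" examples the model has not seen before.
unlabeled = [(0.88, 0.22), (0.25, 0.85)]
labels = [predict(centroids, features) for features in unlabeled]
print(labels)
```

Once `train` has run, labeling a new example is a single call to `predict`, which is the step that replaces the manual sorting described above.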
Siri uses Machine Learning to handle requests from Apple users. Siri’s process is related to a data science field called “Natural Language Processing.” Natural Language Processing, as used by Siri, is an exciting field, and having computers and humans communicate using human language is a bit of a holy grail.
Another example of this type of Machine Learning is teaching a machine to tell whether a body of text represents a positive attitude or a negative attitude. For example, you could analyze published articles and see whether they are in favor of or against a political party. This is called “Sentiment Analysis,” where the machine is taught to decide the general sentiment of a body of text.
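A very small version of sentiment analysis can be sketched as follows. The training sentences are invented for illustration, and the technique shown (a naive Bayes word-count classifier with add-one smoothing) is just one simple choice; the article does not say which method any real product uses.

```python
from collections import Counter
from math import log

# Invented, hand-labeled training sentences (not real data).
TRAINING = [
    ("the plan is great and people love it", "positive"),
    ("a wonderful speech with strong ideas", "positive"),
    ("great leadership and a strong record", "positive"),
    ("the plan is terrible and people hate it", "negative"),
    ("a weak speech with awful ideas", "negative"),
    ("terrible leadership and a weak record", "negative"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Pick the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # skip words never seen during training
                # Add-one smoothing avoids log(0) for unseen word/label pairs.
                score += log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(predict(model, "a great speech with wonderful ideas"))
print(predict(model, "an awful plan and a terrible record"))
```

Words such as "great" and "wonderful" appear only in the positive examples, so they pull the first sentence toward "positive", while "awful" and "terrible" pull the second toward "negative".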
There are many other ways that Machine Learning can be used in natural language. For example, IBM’s Watson can understand the English language with very high proficiency. Another example is Google Translate. These all use Machine Learning for natural language understanding and natural language generation.
Interestingly, the same Machine Learning algorithms that do really well at image recognition, like the cats and dogs example, also work really well for natural language understanding.
“Natural Language” is just another way of saying a “human language.”
More specifically, “Natural Language” is a human language that was created “naturally.” It is important to distinguish what is not a natural language. There are some languages that are “created in a lab,” so to speak. A few of these “synthetic” languages would be Klingon (from Star Trek), Elvish (from J.R.R. Tolkien), and also computer programming languages. Natural languages are languages that evolved naturally over time so that humans could communicate with each other. These natural languages include English, German, Spanish, and most spoken languages.
American Sign Language or “ASL” is a natural language.
Sign language was created naturally. Some people believe sign language started in France in the 1700s, where people with hearing disabilities practiced an intricate collection of signs to communicate with each other. Sign language is a bona fide language.
“Natural Language Processing” is when a computer is able to do “something” using real, natural, human language skills.
A recent project, described by our guest and software engineer, KJ Price, involved teaching a computer to understand sign language.
“I worked on the project for my thesis during my graduate degree. My team decided to try to have a computer understand sign language. This had been tried many times before, most notably with haptic gloves that recognize the exact positions of the hands and fingers, with the machine deciding the correct sign from the orientation of the hands.
We found out that with machine learning we could do the same thing with image recognition. At first we wanted the Mercedes Benz of sign language. We wanted it to have all the bells and whistles. We wanted a model to understand and translate full “sentences” from American Sign Language (ASL) into English.
When we first got started, I knew nothing about ASL and thought this could just be a direct translation. It was actually incredibly difficult to do.
There are many challenges in teaching a machine to recognize ASL.
In the first couple of months of working with ASL, we quickly started to understand that we were in way over our heads. We came to understand that ASL is absolutely nothing like English. In particular, the “grammar” of the two languages is completely different.
Our understanding changed as we came to recognize there are two distinct perspectives when it comes to grappling with language differences.
People in general, the “average Joe,” focus more on how words differ between languages, or on the etymology of words.
Linguists, on the other hand, are actually much more interested in grammatical differences, or the syntax of a language. Linguists can tell how similar a language is to another language based on the grammar, or how the language structures its words or symbols to make a complete thought. It’s not just about word-for-word exchanges.
Because of this, we did not realize the size of the task of teaching a machine to translate ASL, and we were quickly overwhelmed as we realized how different English is from ASL.
There are dozens of examples of the “syntax” differences with ASL I could give, but here is just one scenario. In English you might say “I am going to the store,” but in ASL you would probably sign this as “Me store go” or “Me go store.” You can tell right away that there is no distinction between the pronouns “I” and “me” in ASL. Also, articles like “the” are not used in ASL. Also, the subject, in this case “store,” can be placed in different positions in the sentence that the signer is presenting.
Being able to switch the grammar is just one tiny part of fully translating the language. After syntax (or “grammar”), we really care about the semantics of a language, or the real underlying meaning of the phrase. This is a monumental task for any language, and I think this is particularly true for ASL.
It turned out that just being able to translate a single sign into its English equivalent was really difficult, and that is as far as we got. We ended up training a model on 30,000 images of ASL signs to map them to their English word equivalents. We weren’t even close to getting into syntax or semantics. But what we were able to get is 100% accuracy and precision on matching an individual sign to the English word. This is a bit technical, but for people who care, we got these metrics from the unseen testing data, so we know that our model generalizes well to data that it hasn’t seen yet.
Even the simplest task of recognizing single signs turned out to be difficult. For example, the sign for the letter “I” is made by just putting your pinky up in the air. The letter “J” is the exact same as the letter “I” except it has a little flick of the wrist. This makes our machine learning task much more difficult. We couldn’t just look at individual images to identify a sign. The sequence of the images was important. This would require using an algorithm that is able to “remember” what the last image was that it was given.
Another consideration with sign language is that not all hands are the same. Men’s hands are shaped differently from women’s hands. The dataset we used was of a man, so the machine learning algorithm could recognize the signs I performed pretty easily. My partner in the project was not able to get the same results for the simple fact that the machine learning model could not recognize her hand as well. This issue became even more pronounced for different skin tones.
Overall, we were really proud of what we achieved. It took a lot of work. We published the paper, which will hopefully help people get to where we got pretty easily. We also documented some of the complexities of ASL and Machine Learning.”
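For readers curious about the metrics KJ mentions, here is a rough sketch of how accuracy and precision are typically measured on data held back from training. It is not the team's actual code: the function names, the 20% test fraction, and the sign labels and predictions are all assumptions made up for illustration.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples, then hold some out as unseen test data."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def accuracy(true_labels, predicted_labels):
    """Fraction of all predictions that match the true label."""
    pairs = list(zip(true_labels, predicted_labels))
    return sum(t == p for t, p in pairs) / len(pairs)

def precision(true_labels, predicted_labels, target):
    """Of everything predicted as `target`, the fraction that truly was."""
    truths = [t for t, p in zip(true_labels, predicted_labels) if p == target]
    if not truths:
        return 0.0  # convention chosen here when `target` is never predicted
    return sum(t == target for t in truths) / len(truths)

# Ten invented (image_id, sign) pairs standing in for a labeled dataset.
dataset = [(i, sign) for i, sign in enumerate("IJIJAIJAAI")]
train_set, test_set = train_test_split(dataset)
print(len(train_set), len(test_set))

# Separately, invented predictions for five hypothetical held-out signs,
# where one "J" is mistaken for an "I".
true_signs = ["I", "J", "I", "A", "J"]
predicted_signs = ["I", "I", "I", "A", "J"]
print(accuracy(true_signs, predicted_signs))
print(precision(true_signs, predicted_signs, "I"))
```

In this made-up case the single "J" mistaken for an "I" lowers both the overall accuracy and the precision for "I". A model scoring 100% on both, as described in the interview, would make no such mistakes on its held-out images.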
To discover more of KJ’s Data Science work and knowledge of language studies, listen to the full podcast, Tech Talk: Language, ASL, & Machine Learning.
Here are some of the topics we discuss in the episode.
3:10 What are some theories about how humans acquire language skills?
5:55 How do machines acquire language skills?
7:00 Isn’t machine learning already going on when we use Siri?
8:00 What is an example of a machine learning language model in use?
10:00 What does that term “natural language processing” mean?
11:07 Tell us about the project you worked on for ASL (American Sign Language) and Machine Learning.
18:00 How far did you get with your Machine Learning project and ASL translation?
25:20 How do linguists categorize language and how have languages changed over time?
26:00 What family of language does English belong to?
27:40 What are some other language families?
31:30 In your opinion, how has the English language changed over time?
35:10 What do experts say is the evolution of our current modern English Language?