
Translating Lost Languages Using Machine Learning

  • October 23, 2020
  • relay

Recent research suggests that most languages that have ever existed are no longer spoken. Dozens of these dead languages are also considered to be lost, or “undeciphered” — that is, we don’t know enough about their grammar, vocabulary, or syntax to be able to actually understand their texts.

Lost languages are more than a mere academic curiosity; without them, we miss an entire body of knowledge about the people who spoke them. Unfortunately, most of them have such minimal records that scientists can’t decipher them by using machine-translation algorithms like Google Translate. Some don’t have a well-researched “relative” language to compare against, and many lack traditional dividers like white space and punctuation. (To illustrate, imaginetryingtodecipheraforeignlanguagewrittenlikethis.)

However, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) recently made a major advance in this area: a new system that has been shown to automatically decipher a lost language without needing advance knowledge of its relation to other languages. They also showed that the system can itself determine relationships between languages, and they used it to corroborate recent scholarship suggesting that Iberian is not actually related to Basque.

The team’s ultimate goal is for the system to be able to decipher lost languages that have eluded linguists for decades, using just a few thousand words.

Spearheaded by MIT Professor Regina Barzilay, the system relies on several principles grounded in insights from historical linguistics, such as the fact that languages generally only evolve in certain predictable ways. For instance, while a given language rarely adds or deletes an entire sound, certain sound substitutions are likely to occur. A word with a “p” in the parent language may change into a “b” in the descendant language, but changing to a “k” is less likely due to the significant pronunciation gap.
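The idea that some sound changes are more plausible than others can be illustrated with a weighted edit distance. The following is a minimal sketch, not the team's model: the feature table, the weights, and the names `FEATURES`, `substitution_cost`, and `weighted_edit_distance` are all assumptions made for illustration. Substituting a nearby sound is cheap, substituting a distant one is expensive, and adding or deleting a sound costs the most.

```python
# Hypothetical feature table: (place of articulation, voiced).
# Place: 0 = lips, 1 = tongue tip, 2 = back of the tongue. Values are illustrative.
FEATURES = {
    "p": (0, 0), "b": (0, 1),
    "t": (1, 0), "d": (1, 1),
    "k": (2, 0), "g": (2, 1),
}
PLACE_WEIGHT = 1.0    # assumed cost per step of place-of-articulation change
VOICE_WEIGHT = 0.5    # assumed cost of a voicing change
UNKNOWN_COST = 2.0    # assumed cost when a sound is missing from the table
INDEL_COST = 3.0      # sounds are rarely added or deleted, so this is high


def substitution_cost(a, b):
    """Cost of replacing sound a with sound b; smaller means more plausible."""
    if a == b:
        return 0.0
    if a not in FEATURES or b not in FEATURES:
        return UNKNOWN_COST
    place_a, voice_a = FEATURES[a]
    place_b, voice_b = FEATURES[b]
    return PLACE_WEIGHT * abs(place_a - place_b) + VOICE_WEIGHT * abs(voice_a - voice_b)


def weighted_edit_distance(src, tgt):
    """Standard dynamic-programming edit distance with the costs defined above."""
    rows, cols = len(src) + 1, len(tgt) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * INDEL_COST
    for j in range(1, cols):
        dist[0][j] = j * INDEL_COST
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + INDEL_COST,  # delete a sound
                dist[i][j - 1] + INDEL_COST,  # insert a sound
                dist[i - 1][j - 1] + substitution_cost(src[i - 1], tgt[j - 1]),
            )
    return dist[-1][-1]


# "p" to "b" is a small change; "p" to "k" is a larger one.
print(weighted_edit_distance("pat", "bat"))
print(weighted_edit_distance("pat", "kat"))
```

Under these assumed weights, a parent word that differs from a candidate descendant only by "p" versus "b" scores as a closer match than one that differs by "p" versus "k".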


By incorporating these and other linguistic constraints, Barzilay and MIT PhD student Jiaming Luo developed a decipherment algorithm that can handle the vast space of possible transformations and the scarcity of a guiding signal in the input. The algorithm learns to embed language sounds into a multidimensional space where differences in pronunciation are reflected in the distance between corresponding vectors. This design makes it possible to capture pertinent patterns of language change and express them as computational constraints. The resulting model can segment words in an ancient language and map them to counterparts in a related language.
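A toy version of these two ideas, sound vectors and segmentation with mapping, might look like the sketch below. It is an illustration under stated assumptions rather than the published algorithm: the vectors in `SOUND_VECTORS` are set by hand instead of learned, the words are invented strings, each segment is assumed to have the same length as its counterpart (substitutions only), and the function names are hypothetical.

```python
import math

# Hypothetical 2-D sound vectors, set by hand for illustration.
# The system described above learns its vectors; these values are assumptions.
SOUND_VECTORS = {
    "p": (0.0, 0.0), "b": (0.0, 0.5),
    "t": (1.0, 0.0), "d": (1.0, 0.5),
    "k": (2.0, 0.0), "g": (2.0, 0.5),
    "a": (5.0, 5.0), "i": (5.0, 6.0), "u": (5.0, 7.0),
}


def sound_distance(a, b):
    """Euclidean distance between two sound vectors."""
    return math.dist(SOUND_VECTORS[a], SOUND_VECTORS[b])


def word_distance(lost, known):
    """Sum of per-sound distances; assumes both words have the same length."""
    return sum(sound_distance(x, y) for x, y in zip(lost, known))


def segment(text, vocabulary):
    """Split unsegmented text and map each piece to a known-language word.

    best[i] holds (total cost, list of (piece, known word)) for text[:i].
    Returns (inf, []) if no full segmentation exists.
    """
    best = [(0.0, [])] + [(math.inf, [])] * len(text)
    for end in range(1, len(text) + 1):
        for word in vocabulary:
            start = end - len(word)
            if start < 0 or best[start][0] == math.inf:
                continue
            piece = text[start:end]
            cost = best[start][0] + word_distance(piece, word)
            if cost < best[end][0]:
                best[end] = (cost, best[start][1] + [(piece, word)])
    return best[len(text)]


# Invented strings, not real words from any language.
cost, pairs = segment("batagitu", ["pata", "kidu", "ta"])
print(cost, pairs)
```

The dynamic program considers every way of cutting the text into vocabulary-length pieces and keeps the cut whose pieces sit closest to known words in the vector space.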

The project builds on a paper Barzilay and Luo wrote last year that deciphered the dead languages of Ugaritic and Linear B, the latter of which had previously taken decades for humans to decode. However, a key difference with that project was that the team knew that these languages were related to early forms of Hebrew and Greek, respectively.

With the new system, the relationship between languages is inferred by the algorithm. Identifying that relationship is one of the biggest challenges in decipherment. In the case of Linear B, it took several decades to discover the correct known descendant. For Iberian, scholars still cannot agree on the related language: some argue for Basque, while others reject this hypothesis and claim that Iberian is not related to any known language.

The proposed algorithm can assess the proximity between two languages; in fact, when tested on known languages, it can even accurately identify language families. The team applied their algorithm to Iberian, considering Basque as well as less likely candidates from the Romance, Germanic, Turkic, and Uralic families. While Basque and Latin were closer to Iberian than other languages, they were still too different to be considered related.
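To make the notion of proximity between languages concrete, here is a rough sketch that ranks candidate languages by how well a lost vocabulary matches each one. It is a stand-in, not the team's measure: it uses plain normalized edit distance, the word lists are invented strings, and the names `proximity` and `rank_candidates` are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def proximity(lost_words, known_words):
    """Mean normalized distance from each lost word to its closest known word.

    Lower scores mean the two vocabularies are closer.
    """
    total = 0.0
    for lost in lost_words:
        total += min(
            edit_distance(lost, known) / max(len(lost), len(known))
            for known in known_words
        )
    return total / len(lost_words)


def rank_candidates(lost_words, candidates):
    """Return (language name, score) pairs sorted from closest to farthest."""
    scores = {name: proximity(lost_words, words) for name, words in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Invented strings standing in for vocabularies; not real linguistic data.
lost = ["bata", "gitu", "nabi"]
candidates = {
    "candidate_A": ["pata", "kidu", "napi"],
    "candidate_B": ["solem", "ruxo", "wefy"],
}
for name, score in rank_candidates(lost, candidates):
    print(name, round(score, 3))
```

A ranking alone only says which candidate is closest. Deciding whether the closest candidate is close enough to count as related would need a further criterion, which this sketch does not attempt.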


In future work, the team hopes to move beyond connecting texts to related words in a known language, an approach referred to as “cognate-based decipherment.” This paradigm assumes that such a known language exists, but the example of Iberian shows that this is not always the case. The team’s new approach would involve identifying the semantic meaning of words, even if the researchers don’t know how to read them.

“For instance, we may identify all the references to people or locations in the document which can then be further investigated in light of the known historical evidence,” says Barzilay. “These methods of ‘entity recognition’ are commonly used in various text processing applications today and are highly accurate, but the key research question is whether the task is feasible without any training data in the ancient language.”

The project was supported, in part, by the Intelligence Advanced Research Projects Activity (IARPA).

 

By Adam Conner-Simons, MIT CSAIL

Source https://www.csail.mit.edu/news/translating-lost-languages-using-machine-learning
