Dynatask, A New Paradigm Of AI Benchmarking Is Now Available For The AI Community

  • October 1, 2021
  • liwaiwai.com

It’s been one year since Facebook AI launched Dynabench, a first-of-its-kind platform that radically rethinks benchmarking in AI. Starting today, we’re unlocking Dynabench’s full capabilities for the AI community — AI researchers can now create their own custom tasks to better evaluate the performance of natural language processing (NLP) models in more flexible, dynamic, and realistic settings for free.

This new feature, called Dynatask, makes it easy for researchers to leverage human annotators to actively fool NLP models and identify weaknesses through natural interactions. This dynamic approach arguably better reflects the way people behave and react as compared with previous benchmarks, which test against fixed data points and are prone to saturation. Researchers can also use our evaluation-as-a-service capabilities and compare models on our dynamic leaderboard, which goes beyond just accuracy and explores a more holistic measurement of fairness, robustness, compute, and memory.
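
As a rough illustration of what a leaderboard that goes beyond accuracy alone can look like, here is a minimal Python sketch that folds several normalized metrics into a single ranking score using a plain weighted average. The weights and the aggregation scheme are assumptions made for this sketch, not the aggregation Dynabench itself uses.

# Simplified illustration of ranking models on more than accuracy.
# The weights and the weighted-average scheme are assumptions for this
# sketch, not the leaderboard's actual aggregation method.

def leaderboard_score(metrics, weights):
    """Both dicts map metric name -> value in [0, 1]; higher is better."""
    total_weight = sum(weights.values())
    return sum(weights[m] * metrics[m] for m in weights) / total_weight

model_metrics = {
    "accuracy": 0.88, "fairness": 0.75, "robustness": 0.70,
    "compute": 0.60, "memory": 0.65,   # efficiency scores, already normalized
}
metric_weights = {"accuracy": 4, "fairness": 1, "robustness": 1, "compute": 1, "memory": 1}
print(f"leaderboard score = {leaderboard_score(model_metrics, metric_weights):.3f}")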


Dynabench initially launched with four tasks: natural language inference (created by Yixin Nie and Mohit Bansal of UNC Chapel Hill), question answering (created by Max Bartolo, Pontus Stenetorp, and Sebastian Riedel of UCL), sentiment analysis (created by Atticus Geiger and Chris Potts of Stanford), and hate speech detection (created by Bertie Vidgen of the Turing Institute and Zeerak Waseem Talat of the University of Sheffield/Simon Fraser University).

Over the past year, we’ve launched a visual question answering task and low-resource machine translation tasks. We also powered the multilingual translation challenge at the Workshop on Machine Translation. Cumulatively, these dynamic data collection efforts have so far resulted in eight published papers, 400K raw examples, and four open-source large-scale data sets.

“Dynatask opens up a world of possibilities for task creators. They can set up their own tasks with little coding experience, easily customize annotation interfaces, and enable interactions with models hosted on Dynabench. This makes dynamic adversarial data collection considerably more accessible to the research community,” said Max Bartolo of University College London.

Now, we hope that by enabling custom NLP tasks for the entire AI community, we’ll empower the field to explore entirely new research directions. High-quality and holistic model evaluation is critical to the long-term success of AI, and we believe that Dynabench, as a collaborative effort, will play an important role in the future of benchmarking.

How to use Dynatask

Dynatask is highly flexible and customizable. A single task can have one or more owners, who define the settings of that task. For example, owners can choose which existing data sets they want to use in the evaluation-as-a-service framework. They can select from a wide variety of evaluation metrics to measure model performance, including not only accuracy but also robustness, fairness, compute, and memory. Anyone can upload models to a task’s evaluation cloud, where scores and other metrics are computed on the selected data sets. Once those models have been uploaded and evaluated, they can be placed in the loop for dynamic data collection and human-in-the-loop evaluation. Task owners can collect data either via the web interface on dynabench.org or with crowdsourced annotators (for example, via Mechanical Turk).
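
To make the task owner’s choices more concrete, here is a minimal sketch that models them as a plain Python structure. The field names and data set names are hypothetical and chosen for readability; on Dynabench itself this configuration is done through the web dashboard rather than in code.

# Hypothetical sketch of the choices a task owner makes on the dashboard.
# Field and data set names are illustrative only, not Dynabench's schema.

from dataclasses import dataclass
from typing import List

@dataclass
class TaskConfig:
    name: str                        # task shown on dynabench.org
    owners: List[str]                # one or more task owners
    eval_datasets: List[str]         # existing data sets used for scoring
    metrics: List[str]               # accuracy plus holistic metrics
    collect_adversarial_data: bool   # enable human-in-the-loop rounds

nli_task = TaskConfig(
    name="natural-language-inference",
    owners=["task-owner@example.org"],
    eval_datasets=["snli-test", "mnli-mismatched"],   # illustrative names
    metrics=["accuracy", "robustness", "fairness", "compute", "memory"],
    collect_adversarial_data=True,
)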

Let’s walk through a concrete example that illustrates the different components. Suppose there were no Natural Language Inference tasks yet, and you wanted to start one.

  • Step 1: Log into your Dynabench account and fill out the “Request new task” form on your profile page.

  • Step 2: Once approved, you will have a dedicated task page and a corresponding admin dashboard that you control as the task owner.

  • Step 3: On the dashboard, choose the existing datasets that you want to evaluate models on when they are uploaded, along with the metrics you want to use for evaluation.

  • Step 4: Next, submit baseline models, or ask the community to submit them.

  • Step 5: If you then want to collect a new round of dynamic adversarial data, where annotators are asked to create examples that fool the model, you can upload new contexts to the system and start collecting data through the task owner interface (a minimal sketch of this loop follows after these steps).

  • Step 6: Once you have enough data and find that training on the data helps improve the system, you can upload better models and then put those in the data collection loop to build even stronger ones.
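
To make the dynamic adversarial loop in steps 5 and 6 concrete, here is a minimal sketch of the underlying idea: an annotator writes an example meant to fool the current model, and the example is kept when the model’s prediction disagrees with the annotator’s gold label. The get_annotator_example and model_predict callables are hypothetical placeholders standing in for the annotation interface and the hosted model endpoint, not Dynabench APIs.

# Minimal sketch of dynamic adversarial data collection (steps 5 and 6).
# get_annotator_example() and model_predict() are hypothetical placeholders
# for the annotation interface and the hosted model endpoint.

def collect_adversarial_round(contexts, get_annotator_example, model_predict,
                              target_size=1000):
    collected = []
    for context in contexts:
        # The annotator writes an example and its gold label for this context.
        example, gold_label = get_annotator_example(context)
        prediction = model_predict(context, example)
        if prediction != gold_label:
            # The model was fooled: keep the example for the new round.
            collected.append({
                "context": context,
                "example": example,
                "label": gold_label,
                "model_prediction": prediction,
            })
        if len(collected) >= target_size:
            break
    # Step 6: train a stronger model on `collected`, redeploy it, and repeat.
    return collected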

We used the same basic process to construct several dynamic data sets, like Adversarial Natural Language Inference. Now, with our tools available to the broader AI community, anyone can construct data sets with humans and models in the loop.

Get started with Dynatask now

Dynabench is centered on community. We want to empower the AI community to explore better, more holistic, and more reproducible approaches to model evaluation. Our goal is to make it easy for anyone to construct high-quality human-and-model-in-the-loop data sets. With Dynatask, you can move beyond accuracy-only leaderboards toward a more holistic evaluation of AI models, one that is more closely aligned with the expectations and needs of the people who interact with them.

At Facebook AI, we believe in collaborative open science, scientific rigor, and responsible innovation. Of course, the platform will continue to evolve and change as the community grows, befitting its dynamic nature. We invite you to join our Dynabench community.

Create new examples that fool existing models, upload new models for evaluation, or request your own Dynabench task now.

Go to Tasks under your user profile to start creating a task today.

By Tristan Thrush, Research Associate | Adina Williams, Research Scientist | Douwe Kiela, Research Scientist
Source: Facebook AI Research


Related Topics
  • Dynatask
  • Facebook AI
  • Facebook AI Research