Large language models — natural language processing (NLP) systems with more than 100 billion parameters — have transformed NLP and AI research over the last few years. Trained on a massive and varied volume of text, they show surprising new capabilities to generate creative text, solve basic math problems, answer reading comprehension questions, and more. While in some cases the public can interact with these models through paid APIs, full research access is still limited to only a few highly resourced labs. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues such as bias and toxicity.
In line with Meta AI’s commitment to open science, we are sharing Open Pretrained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available data sets, to allow for more community engagement in understanding this foundational new technology. For the first time for a language technology system of this size, the release includes both the pretrained models and the code needed to train and use them. To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license to focus on research use cases. Access to the model will be granted to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world.
We believe the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular, given their centrality in many downstream language applications. A much broader segment of the AI community needs access to these models in order to conduct reproducible research and collectively drive the field forward. With the release of OPT-175B and smaller-scale baselines, we hope to increase the diversity of voices defining the ethical considerations of such technologies.
Responsible publication with OPT-175B
Following the publication guidelines for researchers generated by the Partnership on AI, along with the governance guidance outlined by NIST in March 2022 (section 3.4), we are releasing all our notes documenting the development process, including the full logbook detailing the day-to-day training process, so other researchers can more easily build on our work. Furthermore, these details disclose how much compute was used to train OPT-175B and the human overhead required when underlying infrastructure or the training process itself becomes unstable at scale.
We are sharing OPT-175B, along with the codebase used to train and deploy the model using only 16 NVIDIA V100 GPUs, in order to increase the accessibility of these models specifically for research purposes and to provide a foundation for analyzing potential harms rooted in quantifiable metrics on a common, shared model. We are also fully releasing a suite of smaller-scale baseline models, trained on the same data set and using similar settings as OPT-175B, to enable researchers to study the effect of scale alone. These smaller-scale models come in seven sizes: 125 million, 350 million, 1.3 billion, 2.7 billion, 6.7 billion, 13 billion, and 30 billion parameters, with a 66 billion parameter model to be released soon.
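As a rough sanity check on why 16 V100s can serve a 175-billion-parameter model, the sketch below works through the weight-memory arithmetic. The assumptions (half-precision weights, 32 GB per V100, no allowance for activations or attention caches) are ours, not figures from the release itself.

```python
# Back-of-the-envelope memory estimate for serving a 175B-parameter model
# sharded across 16 GPUs. Assumptions (not from the announcement):
# fp16 weights (2 bytes/parameter), 32 GB V100s, and weights only --
# activations and caches would add further overhead.

PARAMS = 175e9          # model parameters
BYTES_PER_PARAM = 2     # fp16 weight storage
NUM_GPUS = 16
GPU_MEM_GB = 32         # memory per 32 GB V100

weight_gb = PARAMS * BYTES_PER_PARAM / 1e9   # total weight memory in GB
per_gpu_gb = weight_gb / NUM_GPUS            # share per GPU if split evenly

print(f"total weights: {weight_gb:.0f} GB")                      # 350 GB
print(f"per GPU: {per_gpu_gb:.1f} GB of {GPU_MEM_GB} GB")        # 21.9 GB of 32 GB
```

Under these assumptions the weights alone fit with room to spare per card, which is consistent with a 16-GPU deployment being feasible for inference.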
Recent developments in AI research have consumed an extraordinary amount of compute power. While industry labs have started to report the carbon footprint of these models, most do not include the computational cost associated with the R&D phases of experimentation, which in some cases can be an order of magnitude more resource-intensive than training the final model.
We developed OPT-175B with energy efficiency in mind, successfully training a model of this size with only 1/7th the carbon footprint of GPT-3. This was achieved by combining Meta’s open source Fully Sharded Data Parallel (FSDP) API and NVIDIA’s tensor parallel abstraction within Megatron-LM. We achieved ~147 TFLOP/s/GPU utilization on NVIDIA’s 80 GB A100 GPUs, roughly 17 percent higher than the figures published by NVIDIA researchers on similar hardware.
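To put the ~147 TFLOP/s/GPU figure in perspective, the sketch below applies the common 6·N·D rule of thumb (roughly 6 FLOPs per parameter per training token). The 180-billion-token count is an assumption drawn from the accompanying OPT paper rather than this post, so treat the result as an order-of-magnitude estimate only.

```python
# Order-of-magnitude training-cost estimate using the 6 * N * D rule
# (~6 FLOPs per parameter per token). The parameter count and per-GPU
# throughput come from the post; the token count is an assumption.

N = 175e9                 # parameters (from the post)
D = 180e9                 # training tokens (assumed, not stated here)
TFLOPS_PER_GPU = 147e12   # achieved throughput per 80 GB A100 (from the post)

total_flops = 6 * N * D                        # total training FLOPs
gpu_seconds = total_flops / TFLOPS_PER_GPU     # at sustained throughput
gpu_days = gpu_seconds / 86400

print(f"total FLOPs: {total_flops:.2e}")   # 1.89e+23
print(f"GPU-days: {gpu_days:,.0f}")        # ~14,881
```

Spread over roughly a thousand GPUs, that corresponds to a few weeks of sustained training time, which is why small differences in per-GPU utilization translate into large differences in energy use.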
By sharing these baselines along with the codebase to train a 175B model efficiently, we have an opportunity to reduce our collective environmental footprint while also allowing new results and progress in the field to be measurable in a consistent manner.
Propelling research forward through open collaboration
For AI research to advance, the broader scientific community must be able to work together with cutting-edge models to explore their potential while also probing for their vulnerabilities. As with our previous open-science initiatives, such as the Image Similarity Challenge, the Deepfake Detection Challenge, and the Hateful Memes Challenge, Meta AI believes that collaboration across research organizations is critical to the responsible development of AI technologies.
While there are many exciting developments in the space of large language models, the limitations and risks these models pose are still not well understood. Without direct access to these models, researchers are limited in their ability to design detection and mitigation strategies for possible harm, leaving that work in the hands of only those with sufficient capital to access models of this scale. We hope that OPT-175B will bring more voices to the frontier of large language model creation, help the community collectively design responsible release strategies, and add an unprecedented level of transparency and openness to the development of large language models in the field.
Pretrained models are all licensed under the OPT-175B License Agreement.
This work on large-scale pretraining is being undertaken by a multidisciplinary team that includes Stephen Roller, Naman Goyal, Anjali Sridhar, Punit Singh Koura, Moya Chen, Kurt Shuster, Mikel Artetxe, Daniel Simig, and Tianlu Wang. Advisers for responsible AI conduct also include Adina Williams, Eric Smith, Emily Dinan, Y-Lan Boureau, Melanie Kambadur, and Joelle Pineau.
By Susan Zhang, Research Engineer | Mona Diab, Research Scientist | Luke Zettlemoyer, Research Director
Source: Meta AI