Wu Dao
Wu Dao is a multimodal artificial intelligence developed by the Beijing Academy of Artificial Intelligence (BAAI). Wu Dao 1.0 was first announced on January 11, 2021; an improved version, Wu Dao 2.0, was announced on May 31. It has been compared to GPT-3 and is built on a similar architecture; in comparison, GPT-3 has 175 billion parameters (the variables and inputs within the machine learning model) while Wu Dao has 1.75 trillion. Wu Dao was trained on 4.9 terabytes of images and texts, while GPT-3 was trained on 45 terabytes of text data. However, a growing body of work highlights the importance of scaling both data and parameters. The chairman of BAAI said that Wu Dao was an attempt to "create the biggest, most powerful AI model possible". Wu Dao 2.0 was called "the biggest language A.I. system yet" and was interpreted by commenters as an attempt to "compete with the United States".

Notably, Wu Dao 2.0 uses a mixture-of-experts (MoE) architecture, unlike GPT-3, which is a "dense" model: while MoE models require much less computational power to train than dense models with the same number of parameters, trillion-parameter MoE models have shown performance comparable to that of models hundreds of times smaller.
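The parameter-count caveat above can be made concrete with a toy sparse MoE layer. This is an illustrative sketch only, not Wu Dao's actual architecture: it uses made-up dimensions, top-1 routing, and NumPy for clarity. The point is that the layer *stores* the parameters of all experts, but each token *uses* only one expert's worth, which is why MoE parameter counts are not directly comparable to dense ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, tiny dimensions chosen for illustration.
d_model, d_ff, n_experts = 8, 32, 4

# Each "expert" is an ordinary two-matrix feed-forward block.
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(n_experts)
]
gate = rng.standard_normal((d_model, n_experts))  # router weights

def moe_layer(x):
    """Route each token to its top-1 expert (simplified sparse MoE)."""
    out = np.empty_like(x)
    scores = x @ gate                  # (n_tokens, n_experts) routing scores
    choice = scores.argmax(axis=1)     # top-1 routing decision per token
    for i, e in enumerate(choice):
        w1, w2 = experts[e]
        out[i] = np.maximum(x[i] @ w1, 0.0) @ w2  # ReLU feed-forward
    return out

tokens = rng.standard_normal((5, d_model))
y = moe_layer(tokens)

# Stored parameters cover all experts; each token activates only one.
total_params = n_experts * (d_model * d_ff + d_ff * d_model)
active_params = d_model * d_ff + d_ff * d_model
print(total_params, active_params)  # 2048 512
```

Here the layer holds 4x more parameters than a single dense feed-forward block of the same width, yet the per-token compute is unchanged, which mirrors why trillion-parameter MoE models can cost far less to train than equally sized dense models.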
Wu Dao's creators demonstrated its ability to perform natural language processing and image recognition, in addition to generating text and images. The model can not only write essays, poems, and couplets in traditional Chinese; it can also generate alt text from a static image and produce nearly photorealistic images from natural-language descriptions. Wu Dao also showed off its ability to power virtual idols and to predict the 3D structures of proteins, as AlphaFold does.
History
Wu Dao's development began in October 2020, several months after the May 2020 release of GPT-3. The first iteration of the model, Wu Dao 1.0, "initiated large-scale research projects" via four related models.
- Wu Dao – Wen Yuan, a 2.6-billion-parameter pretrained language model, was designed for tasks like open-domain answering, sentiment analysis, and grammar correction.
- Wu Dao – Wen Lan, a 1-billion-parameter multimodal graphic model, was trained on 50 million image–text pairs to perform image captioning.
- Wu Dao – Wen Hui, an 11.3-billion-parameter generative language model, was designed for "essential problems in general artificial intelligence from a cognitive perspective"; Synced says that it can "generate poetry, make videos, draw pictures, retrieve text, perform complex reasoning, etc".
- Wu Dao – Wen Su, based on Google's BERT language model and trained on the 100-gigabyte UniParc database, was designed for biomolecular structure prediction and protein-folding tasks.
WuDao Corpora