Google DeepMind
DeepMind Technologies Limited, trading as Google DeepMind or simply DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Alphabet Inc. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is headquartered in London, with research centres in the United States, Canada, France, Germany, and Switzerland.
In 2014, DeepMind introduced neural Turing machines. The company has created many neural network models trained with reinforcement learning to play video games and board games. It made headlines in 2016 after its AlphaGo program beat Lee Sedol, a Go world champion, in a five-game match, which was later featured in the documentary AlphaGo. A more general program, AlphaZero, beat the most powerful programs playing Go, chess and shogi after a few days of play against itself using reinforcement learning. DeepMind has since trained models for game-playing, for geometry, and for algorithm discovery.
In 2020, DeepMind made significant advances in the problem of protein folding with AlphaFold, which achieved state-of-the-art results on benchmark tests for protein structure prediction. In July 2022, it was announced that over 200 million predicted protein structures, representing virtually all known proteins, would be released on the AlphaFold database.
Google DeepMind has become responsible for the development of Gemini and other generative AI tools, such as the text-to-image model Imagen, the text-to-video model Veo, and the text-to-music model Lyria.
History
The start-up was founded by Demis Hassabis, Shane Legg and Mustafa Suleyman in November 2010. Hassabis and Legg first met at the Gatsby Computational Neuroscience Unit at University College London. Hassabis has said that the start-up began working on artificial intelligence technology by teaching it to play old games from the seventies and eighties, which are relatively primitive compared with those available today. Those games included Breakout, Pong, and Space Invaders. The AI was introduced to one game at a time, without any prior knowledge of its rules; after spending some time learning the game, it would eventually become an expert in it. "The cognitive processes which the AI goes through are said to be very like those of a human who had never seen the game would use to understand and attempt to master it." The founders' goal is to create a general-purpose AI that can be useful and effective for almost anything.
Major venture capital firms Horizons Ventures and Founders Fund invested in the company, as did entrepreneurs Scott Banister, Peter Thiel, and Elon Musk. Jaan Tallinn was an early investor and an adviser to the company. On 26 January 2014, Google confirmed its acquisition of DeepMind Technologies for a price reportedly between $400 million and $650 million. The sale to Google took place after Facebook reportedly ended negotiations with DeepMind Technologies in 2013. The company was afterwards renamed Google DeepMind and kept that name for about two years.
In 2014, DeepMind received the "Company of the Year" award from the Cambridge Computer Laboratory.
In September 2015, DeepMind and the Royal Free NHS Trust signed their initial information sharing agreement to co-develop a clinical task management app, Streams.
After Google's acquisition, the company established an artificial intelligence ethics board. The board's membership remains undisclosed, with both Google and DeepMind declining to reveal who sits on it. In October 2017, DeepMind launched DeepMind Ethics and Society, a new research unit focused on the ethical and societal questions raised by artificial intelligence, with the philosopher Nick Bostrom as an advisor.
In December 2019, co-founder Suleyman announced he would be leaving DeepMind to join Google, working in a policy role. In March 2024, Microsoft appointed him as the EVP and CEO of its newly created consumer AI unit, Microsoft AI.
In April 2023, DeepMind merged with Google AI's Google Brain division to form Google DeepMind, as part of the company's continued efforts to accelerate work on AI in response to OpenAI's ChatGPT. This marked the end of a years-long struggle by DeepMind executives to secure greater autonomy from Google.
Products and technologies
As of 2020, DeepMind had published over a thousand papers, including thirteen accepted by Nature or Science. DeepMind received media attention during the AlphaGo period; according to a LexisNexis search, 1,842 published news stories mentioned DeepMind in 2016, declining to 1,363 in 2019.
Games
Unlike earlier AIs, such as IBM's Deep Blue or Watson, which were developed for a pre-defined purpose and only function within that scope, DeepMind's initial algorithms were intended to be general. They used reinforcement learning, in which an agent learns from experience, with only raw pixels as data input. Their initial approach used deep Q-learning with a convolutional neural network. They tested the system on video games, notably early arcade games such as Space Invaders and Breakout. Without altering the code, the same AI was able to play certain games more proficiently than any human ever could. In July 2018, researchers from DeepMind trained one of its systems to play the computer game Quake III Arena.
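The core of this approach can be made concrete with a short sketch. The following is a minimal illustration of deep Q-learning with a convolutional network, written in PyTorch; the layer sizes, the frame-stack preprocessing, and the td_update helper are illustrative assumptions, not DeepMind's published code.

```python
# Minimal deep Q-learning sketch in PyTorch. The architecture loosely follows
# the approach described above (convolutions over raw pixels, one Q-value per
# action); all layer sizes and hyperparameters are illustrative, not
# DeepMind's published values.
import torch
import torch.nn as nn

class DQN(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        # Input: a stack of 4 grayscale 84x84 frames (a common preprocessing).
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),  # one Q-value per possible action
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def td_update(policy_net, target_net, optimizer, batch, gamma=0.99):
    """One temporal-difference update on a batch of (s, a, r, s', done)."""
    states, actions, rewards, next_states, dones = batch
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target: r + gamma * max_a' Q(s', a'), zeroed at episode end.
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * next_q * (1 - dones)
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The separate target network shown here is a common stabilising device in this family of methods: it holds the bootstrapped targets fixed between periodic copies of the learned weights.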
In 2013, DeepMind published research on an AI system that surpassed human abilities in games such as Pong, Breakout and Enduro, while surpassing state-of-the-art performance on Seaquest, Beamrider, and Q*bert. This work reportedly led to the company's acquisition by Google. DeepMind's AI had been applied to video games made in the 1970s and 1980s; work was ongoing for more complex 3D games such as Quake, which first appeared in the 1990s.
In 2020, DeepMind published Agent57, an AI agent that surpassed human-level performance on all 57 games of the Atari 2600 suite. In July 2022, DeepMind announced the development of DeepNash, a model-free multi-agent reinforcement learning system capable of playing the board game Stratego at the level of a human expert.
AlphaGo and successors
In October 2015, a computer Go program called AlphaGo, developed by DeepMind, beat the European Go champion Fan Hui, a 2 dan professional, five to zero. This was the first time an artificial intelligence defeated a professional Go player; previously, computers were only known to have played Go at amateur level. Go is considered much more difficult for computers to win than games such as chess, due to the much larger number of possibilities, making it prohibitively difficult for traditional AI methods such as brute-force search (a rough sense of the scale is sketched below).
In March 2016, AlphaGo beat Lee Sedol, a 9 dan professional, four games to one in a five-game match. At the 2017 Future of Go Summit, AlphaGo won a three-game match against Ke Jie, who had been the world's highest-ranked player for two years. In 2017, an improved version, AlphaGo Zero, defeated AlphaGo one hundred games to zero. Later that year, AlphaZero, a modified version of AlphaGo Zero, gained superhuman abilities at chess and shogi. In 2019, DeepMind released a new model named MuZero that mastered the domains of Go, chess, shogi, and Atari 2600 games without human data, domain knowledge, or known rules.
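To give a rough sense of that scale gap, commonly cited estimates put chess at around 35 legal moves per position over a game of roughly 80 plies, against roughly 250 moves over 150 plies for Go. The figures below are approximations that vary by source; the back-of-the-envelope comparison only illustrates why exhaustive search is hopeless.

```python
import math

# Commonly cited rough estimates: average branching factor b, game length d
# (in plies). Game-tree size is roughly b**d, so compare on a log10 scale.
chess_log10 = 80 * math.log10(35)    # ~123 -> tree of ~10^123 positions
go_log10 = 150 * math.log10(250)     # ~359 -> tree of ~10^359 positions

print(f"chess game tree ~ 10^{chess_log10:.0f}")
print(f"go game tree    ~ 10^{go_log10:.0f}")
```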
AlphaGo's technology was based on deep reinforcement learning, distinguishing it from the AI technologies then on the market. The data fed into the AlphaGo algorithm consisted of moves drawn from historical tournament games, increased gradually until over 30 million of them had been processed. The aim was first to have the system mimic human play, as represented by the input data, and eventually to surpass it: AlphaGo played against itself and learned from the outcomes, improving over time and increasing its winning rate as a result.
AlphaGo used two deep neural networks: a policy network to evaluate move probabilities and a value network to assess positions. The policy network was trained via supervised learning, and was subsequently refined by policy-gradient reinforcement learning. The value network learned to predict the winners of games played by the policy network against itself. After training, these networks were combined with a lookahead Monte Carlo tree search, with the policy network identifying candidate high-probability moves while the value network evaluated tree positions.
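How the two networks combine in the search can be sketched briefly. The selection rule below is a PUCT-style rule in the spirit of the published description; the Node class, the constant c_puct, and all interfaces are illustrative assumptions, not DeepMind's implementation.

```python
import math

# Sketch of one selection step in AlphaGo-style Monte Carlo tree search.
# The policy network supplies prior probabilities P(s, a) for candidate
# moves; the value network scores leaf positions when they are expanded.

class Node:
    def __init__(self, prior: float):
        self.prior = prior      # P(s, a) from the policy network
        self.visits = 0         # N(s, a): simulations through this edge
        self.value_sum = 0.0    # W(s, a): accumulated value-network scores
        self.children = {}      # action -> Node

    def q(self) -> float:
        # Mean value Q(s, a) of simulations through this edge.
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node: Node, c_puct: float = 1.5):
    """PUCT rule: pick the child maximising Q + U, where U is an exploration
    bonus proportional to the policy prior, decaying with visit count."""
    total = sum(child.visits for child in node.children.values())
    def score(child: Node) -> float:
        u = c_puct * child.prior * math.sqrt(total) / (1 + child.visits)
        return child.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))
```

The effect is that rarely visited moves with high policy priors keep receiving exploration bonuses, while well-explored moves are judged mainly by the value estimates accumulated beneath them.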
In contrast, AlphaGo Zero was trained without data from human-played games. Instead, it generated its own data by playing millions of games against itself. It used a single neural network, rather than separate policy and value networks, and its simplified tree search relied on this network to evaluate positions and sample moves. A new reinforcement learning algorithm incorporated lookahead search inside the training loop (a schematic of this loop follows below). The AlphaGo Zero project involved around 15 people and millions of dollars' worth of computing resources, yet the final system needed much less computing power than AlphaGo, running on four specialized AI processors instead of AlphaGo's 48. It also required less training time, beating its predecessor after just three days, compared with the months required by the original AlphaGo. AlphaZero similarly learned via self-play.
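The self-play loop can be sketched schematically as follows. Everything here is a placeholder: game, network, and mcts_policy are hypothetical names standing in for the environment, the single policy-and-value network, and the lookahead search.

```python
# Schematic of an AlphaGo Zero-style self-play loop: the current network
# guides a tree search, and the search's move probabilities plus the final
# game outcome become training targets for that same network. All interfaces
# below are hypothetical placeholders, not DeepMind's code.

def self_play_game(game, network, mcts_policy):
    """Play one game against itself; return (state, search_probs, outcome) triples."""
    history = []
    state = game.initial_state()
    while not game.is_terminal(state):
        # Lookahead search, guided by the network, produces move probabilities
        # that improve on the network's raw policy output.
        probs = mcts_policy(state, network)
        history.append((state, probs))
        state = game.play(state, game.sample_move(probs))
    z = game.outcome(state)  # +1 / -1 / 0 from the first player's view
    # Each position is labelled with the search probabilities (policy target)
    # and the final result from the player-to-move's perspective (value target).
    return [(s, p, z if i % 2 == 0 else -z) for i, (s, p) in enumerate(history)]
```

Training then regresses the network's value head toward these outcomes and its policy head toward the search probabilities, so each generation of the network makes the next round of search, and hence the next batch of training data, slightly stronger.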
Researchers applied MuZero to the real-world challenge of video compression, where video must be encoded within a limited number of bits; streamed video accounts for a large share of Internet traffic on sites such as YouTube, Twitch, and Google Meet. The goal was to compress the video optimally, so that quality is maintained while the amount of data is reduced. The final result using MuZero was a 6.28% average reduction in bitrate.
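One way to picture the setup is to frame rate control as a sequential decision problem: for each frame, an agent chooses an encoding parameter such as a quantisation level, trading quality against bits spent. The toy sketch below illustrates only this framing; the agent, the encode_frame call, and the reward shape are hypothetical, not DeepMind's system.

```python
# Toy framing of per-frame rate control as a sequential decision problem,
# in the spirit of the compression work described above. The encode_frame
# function and the reward shape are hypothetical illustrations.

def rate_control_episode(frames, agent, bit_budget, encode_frame):
    """Choose a quantisation level per frame; reward quality, penalise overruns."""
    bits_used = 0.0
    total_reward = 0.0
    for frame in frames:
        # The agent observes the frame and the remaining budget, then picks a
        # quantisation level: lower q -> higher quality but more bits.
        q = agent.act(frame, bit_budget - bits_used)
        quality, bits = encode_frame(frame, q)  # hypothetical encoder call
        bits_used += bits
        # Reward trades off perceptual quality against exceeding the budget.
        overrun = max(0.0, bits_used - bit_budget)
        total_reward += quality - 0.01 * overrun
    return total_reward, bits_used
```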