DBRX


DBRX is a large language model developed by Mosaic under its parent company Databricks, released on March 27, 2024 under the Databricks Open Model License. It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion are active for each token. The model was released in two versions: a base foundation model and an instruction-tuned variant.
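A top-k routed mixture-of-experts layer keeps most expert parameters idle for any given token, which is how the active parameter count (36 billion) can be far below the total (132 billion). The sketch below illustrates the general technique with a simple top-k softmax router; the layer sizes, expert count, and routing details are illustrative assumptions, not DBRX's actual implementation.

```python
# Minimal sketch of top-k mixture-of-experts routing (illustrative only).
# Only the top_k experts chosen by the router run for each token, so most
# expert parameters are inactive per token.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int, top_k: int):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
             for _ in range(n_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Score every expert per token, keep the top_k.
        scores = self.router(x)                       # (tokens, n_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)          # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e          # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out


# Toy usage: 8 tokens routed through a small MoE layer (hypothetical sizes).
layer = MoELayer(d_model=64, d_ff=256, n_experts=16, top_k=4)
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64])
```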
At the time of its release, DBRX outperformed other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok-1 on several benchmarks spanning language understanding, programming ability, and mathematics.
It was trained over 2.5 months on 3,072 Nvidia H100 GPUs connected by 3.2 terabytes per second of network bandwidth, at a reported training cost of US$10 million.
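Taken at face value, these figures imply an effective rate on the order of two dollars per GPU-hour. The back-of-the-envelope check below is a rough sanity check under stated assumptions (30-day months, and that the US$10 million covers compute only), not an official cost breakdown.

```python
# Rough sanity check on the reported training figures (assumed values).
hours = 2.5 * 30 * 24                 # ~2.5 months expressed in hours
gpu_hours = 3072 * hours              # total GPU-hours across 3,072 H100s
cost_per_gpu_hour = 10_000_000 / gpu_hours
print(f"{gpu_hours:,.0f} GPU-hours, ~${cost_per_gpu_hour:.2f} per GPU-hour")
# ~5.5 million GPU-hours, roughly $1.81 per GPU-hour
```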