Date: 2023-08-31

An AI group linked to Abu Dhabi’s rulers has released a top-notch Arabic AI tool, as the United Arab Emirates (UAE) strives to pioneer the Gulf’s generative AI movement.

The Jais model, developed jointly by the UAE’s technology holding company G42, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and California-based Cerebras, is open-source and bilingual, and is designed for the more than 400 million Arabic speakers worldwide.

The release comes as the UAE and Saudi Arabia buy large quantities of the Nvidia chips that are crucial for AI software, amid a worldwide race to secure supplies for AI growth.

The UAE had previously developed another open-source model, Falcon, using 300+ Nvidia chips. This year, Cerebras signed a $100 million contract to supply G42 with nine supercomputers, marking one of the biggest deals of its kind with a potential Nvidia competitor.

Andrew Jackson of G42’s Inception noted that most large language models (LLMs) focus on English, despite Arabic being one of the world’s major languages. He asked why the Arabic-speaking community should not have an LLM.

While existing advanced LLMs like OpenAI’s ChatGPT, Google’s PaLM, and Meta’s LLaMA can understand and generate Arabic text, Jackson argued that the Arabic component in these models is heavily diluted.

According to its developers, Jais outperforms Falcon and other open-source models such as LLaMA in Arabic accuracy. Unlike most US-centric models, Jais is also designed to understand the region’s culture and context more accurately, said MBZUAI’s acting provost, Professor Timothy Baldwin.

Baldwin added that measures were taken to ensure Jais respects cultural and religious sensitivities. Extensive testing was conducted to remove harmful, sensitive, offensive, or inappropriate content that does not align with the values of the organizations involved in its development.

Named after the UAE’s highest peak, Jais was trained for 21 days on part of Cerebras’s Condor Galaxy 1 AI supercomputer by a team in Abu Dhabi. G42 has collaborated with other Abu Dhabi entities, including Abu Dhabi National Oil Company, Mubadala, and Etihad Airways, as launch partners to use the technology.

Training the model was challenging due to the lack of high-quality Arabic language data online compared to English. Jais addresses this by using both modern standard Arabic, understood across the Middle East, and the region’s diverse spoken dialects, sourced from media, social media, and code.

Baldwin concluded that, across a range of tasks, Jais is clearly superior to existing models in Arabic and comparable or even slightly better in English.