This is an AI-translated post.
Apple's OpenELM / MS's Phi-3 / Meta's Llama 3 Released
- Writing language: Korean
- Base country: All countries
- Information Technology
Summarized by durumis AI
- Major tech companies including Apple, Microsoft, and Meta have recently released new large language models, driving significant change in the AI industry.
- Each company is differentiating its models by shrinking model size, optimizing data and algorithms, or improving contextual understanding.
- Notably, Apple's OpenELM is designed for small devices, while Meta's Llama 3 delivers strong performance despite its small size thanks to an efficient model structure.
Recent Notable Releases of Large Language Models
Over the past week, major tech companies including Apple, Microsoft, and Meta have unveiled new large language models in quick succession, causing a significant stir in the AI industry. Let's take a closer look at the key features and significance of these newly released models.
Apple's OpenELM
On April 25th, Apple unveiled its own OpenELM language model suite, consisting of four models of varying sizes: 0.27B, 0.45B, 1.08B, and 3.04B parameters. Even the largest model has only about 3 billion parameters. Considering that most current large language models run to tens of billions of parameters, OpenELM is quite compact.
This is because Apple developed OpenELM primarily for use on small devices. In the past, increasing the parameter count was the main route to higher performance, but the trend has recently shifted toward smaller, lighter models. Apple has also emphasized openness by releasing the entire package, including model weights, inference code, datasets, and frameworks.
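To see why a 3B-parameter ceiling matters for on-device use, a back-of-the-envelope memory estimate helps. The parameter counts below come from Apple's release; the bytes-per-parameter figures for fp16 and 4-bit quantization are standard rules of thumb, and the exact precision Apple targets on-device is an assumption here, not something the announcement specifies.

```python
# Rough memory footprint of the OpenELM weights at two precisions.
PARAM_COUNTS_B = {
    "OpenELM-270M": 0.27,
    "OpenELM-450M": 0.45,
    "OpenELM-1.1B": 1.08,
    "OpenELM-3B": 3.04,
}

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory (GiB) needed just to hold the weights."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

for name, n in PARAM_COUNTS_B.items():
    fp16 = weight_memory_gb(n, 2)    # 16-bit floats: 2 bytes per parameter
    int4 = weight_memory_gb(n, 0.5)  # 4-bit quantized: 0.5 bytes per parameter
    print(f"{name}: {fp16:.2f} GiB (fp16), {int4:.2f} GiB (int4)")
```

Even the largest model fits in a few gigabytes at fp16, and well under 2 GiB when 4-bit quantized, which is what makes phone- and laptop-class deployment plausible.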
MS's Phi-3 Series
Microsoft released the Phi-3 Mini model (3.8B parameters) on April 23rd, with Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters) to follow. Phi-3 Mini is an open model that anyone can use for commercial purposes free of charge, and all new Phi-3 series models will be offered through Microsoft's cloud service, Azure.
Meta's Llama 3
Meta (formerly Facebook) released the 8B and 70B versions of Llama 3 on April 18th, with a larger 400B model planned for release in the summer. The 8B model in particular has been praised by the developer community for its outstanding performance despite its small size.
This is believed to be the result of Meta investing in a vast amount of training data on top of an efficient model structure. Rather than increasing the parameter count, Meta focused on data and algorithm optimization.
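The data-over-parameters trade-off can be made concrete with the common training-compute approximation FLOPs ≈ 6·N·D (N = parameters, D = training tokens). The roughly 15-trillion-token figure for Llama 3 below is an assumption based on Meta's public statements, not something stated in this article; the comparison point of about 20 tokens per parameter is the Chinchilla-style rule of thumb.

```python
# Back-of-the-envelope training compute, using the standard
# approximation FLOPs ~= 6 * N * D.
def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

# Small model, very large dataset (15T tokens is an assumed figure):
llama3_8b = training_flops(8e9, 15e12)

# Larger model trained at ~20 tokens/parameter for comparison:
compute_optimal_70b = training_flops(70e9, 1.4e12)

print(f"8B on 15T tokens:   ~{llama3_8b:.1e} FLOPs")
print(f"70B on 1.4T tokens: ~{compute_optimal_70b:.1e} FLOPs")
```

Under these assumptions, the small 8B model absorbs more training compute than a far larger model trained on a "compute-optimal" token budget, which is one way to explain strong performance at a small parameter count.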
xAI's Grok 1.5
The Grok 1.5 model, announced on March 28th, can process contexts of up to 128K tokens, enabling complex and lengthy prompts. While language model development has largely focused on simply increasing parameter counts, Grok 1.5 presents a new direction of enhancing long-context understanding.
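To put a 128K-token window in everyday terms, a rough conversion to words and pages is useful. The ~0.75 words-per-token ratio is a common rule of thumb for English text and the 500 words-per-page density is an assumption; actual tokenizer behavior varies by model.

```python
# What roughly fits in a 128K-token context window.
CONTEXT_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75  # assumption: typical English tokenization
WORDS_PER_PAGE = 500    # assumption: dense single-spaced page

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE
print(f"~{words:,.0f} words, roughly {pages:.0f} pages of text")
```

By this estimate the window holds on the order of a short book, which is what makes whole-document and multi-file prompting practical.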
As such, the release of new large language models by leading companies like Apple, Microsoft, and Meta over the past week has diversified the directions of AI technology development. New approaches are being explored on several fronts, including smaller and lighter models, data and algorithm optimization, and enhanced long-context understanding. It remains to be seen how the AI ecosystem will evolve from here.