News Focus:
By integrating Kleidi technology into PyTorch and ExecuTorch, Arm extends key AI performance advantages from the edge to the cloud, enabling next-generation applications to run large language models on Arm CPUs.
Continued investment in making ML workloads broadly accessible lets developers at any level of the technology stack immediately gain significant inference performance improvements on the latest generative AI models.
Arm is expanding its partnerships with cloud service providers and leading ML independent software vendors (ISVs) to further empower AI developers worldwide.
Arm Holdings plc (NASDAQ: ARM, hereinafter "Arm") recently announced the integration of Arm Kleidi technology into PyTorch and ExecuTorch, enabling next-generation applications to run large language models (LLMs) on Arm CPUs. Kleidi combines the latest developer enablement technologies with key resources to drive technical collaboration and innovation across the machine learning (ML) technology stack. With these advances, Arm aims to provide a smoother experience for developers working at any level of the ML stack.
Alex Spinelli, Vice President of Developer Technology, Arm Strategy and Ecosystem, said, "Arm is working closely with leading cloud service providers and framework designers to create a convenient development environment in which software developers can easily accelerate artificial intelligence (AI) and ML workloads on Arm-based hardware. Since its launch four months ago, Kleidi has accelerated development on Arm CPUs and significantly improved key AI performance. The close collaboration between Arm and the PyTorch community shows that this technology can greatly reduce the effort developers need to take advantage of efficient AI."
Integrate with leading frameworks to achieve significant cloud advantages
In the cloud, Kleidi builds on the work already done to enhance PyTorch with the Arm Compute Library (ACL), establishing a blueprint for developers everywhere to optimize AI on Arm. By removing unnecessary engineering work for developers, Arm aims to become the preferred platform for running their critical ML workloads. As a key step toward this vision, Arm is working directly with PyTorch and TensorFlow to integrate the Arm Kleidi Libraries, which means building essential Arm software libraries directly into these leading frameworks.
Importantly, this means that when a new framework version is released, application developers automatically benefit from its performance improvements without any extra steps to build on Arm. This investment is already paying off in partnerships:
Arm's chatbot demonstration, powered by the Meta Llama 3 LLM and running on Amazon Web Services (AWS) Graviton processors, delivers real-time chat responses in mainline PyTorch for the first time.
According to measurements on AWS Graviton4, integrating Kleidi technology into the open-source PyTorch codebase makes time-to-first-token 2.5 times faster.
By optimizing torch.compile to make full use of the Kleidi technology provided through ACL, measurements on AWS Graviton3 show inference performance improvements of 1.35 to 2 times across a range of Hugging Face models.
These are only a few examples from the cloud, but they represent the kind of performance acceleration that becomes possible when ML workloads are made broadly accessible on Arm. Arm will continue to invest so that developers' AI applications run at their best on its technology from cloud to edge, including making new features forward compatible so that developers can benefit from them immediately.
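As a rough illustration of how a figure such as the 2.5 times faster time-to-first-token could be measured, the minimal Python sketch below times the arrival of the first token from any token stream and compares two runs. The function names and the stand-in stream are hypothetical and are not part of PyTorch or Kleidi; a real measurement would wrap an actual model's streaming output instead.

```python
import time
from typing import Callable, Iterable, Iterator


def time_to_first_token(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return the seconds elapsed until the stream yields its first token."""
    start = clock()
    for _ in stream:
        # Stop timing as soon as the first token arrives.
        return clock() - start
    raise ValueError("stream produced no tokens")


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if optimized_s <= 0:
        raise ValueError("optimized_s must be positive")
    return baseline_s / optimized_s


def fake_stream(delay_s: float) -> Iterator[str]:
    """Hypothetical stand-in for a model's streaming output (assumption)."""
    time.sleep(delay_s)  # simulates the work done before the first token
    yield "Hello"
    yield " world"


if __name__ == "__main__":
    baseline = time_to_first_token(fake_stream(0.05))
    optimized = time_to_first_token(fake_stream(0.02))
    print(f"time-to-first-token speedup: {speedup(baseline, optimized):.2f}x")
```

The clock is injectable so the measurement logic can be checked deterministically, and a speedup of 2.5 simply means the optimized latency is 40% of the baseline latency.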
Collaboration helps developers keep pace with the development of generative AI
With new language model versions emerging rapidly, generative AI has set off a wave of AI innovation. Arm continues to work closely with every key part of the ML technology stack, partnering with cloud service providers such as AWS and Google and with the fast-growing community of ML independent software vendors (ISVs), such as VNet, to help developers stay at the forefront of the technology.
Nirav Mehta, Senior Director of Product Management at Google Cloud Compute, said, "Arm and Google Cloud are committed to making AI more accessible and agile for developers, and Kleidi represents an important step forward in meeting AI needs through hardware and software co-optimization. As our customers actively adopt Axion, our custom Arm-based CPU, we look forward to bringing them a smoother integration experience across the entire ML technology stack."
Lin Yuan, a software engineer at VNet, said, "Enterprises that use the VNet Data Intelligence Platform for AI and ML workflows will benefit from the performance optimizations that Arm Kleidi integrations bring across the ML software stack. With Arm-based AWS Graviton processors supported by VNet ML Runtime clusters, enterprises can benefit from speedups across a wide range of ML software libraries while reducing their cloud service provider costs."
Helping developers apply the resources Arm provides to practical use cases is crucial. To this end, Arm has created sample software stacks and learning resources that show developers how to build AI workloads on Arm CPUs. These resources are driving rapid adoption of Arm systems and shortening developers' time to deployment on them. The first use case is accelerating a chatbot implementation with Kleidi technology; MLOps and retrieval-augmented generation (RAG) will be added later this year, with more planned for 2025.
Continuing to improve performance at the edge
Building on Kleidi's momentum at the edge, KleidiAI will also be integrated into ExecuTorch, PyTorch's new on-device inference runtime. The integration is expected to be completed in October 2024 and to bring significant performance improvements to edge applications that are currently in production testing or rolling out on ExecuTorch. KleidiAI integrations completed so far include Google's XNNPACK and MediaPipe, as well as Tencent's Hunyuan large model, with significant gains on real-world workloads.
Kleidi will continue to be integrated into upcoming versions of PyTorch and ExecuTorch, as well as other major AI frameworks. From cloud data centers to edge devices, developers can now run high-performance AI workloads efficiently across a range of Arm-based devices. Arm will continue to contribute enhancements to the PyTorch community, focusing on quantization optimizations for various integer formats to further improve performance and enable Arm CPUs to run next-generation AI experiences seamlessly and at scale.
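To make the idea of quantizing to integer formats more concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It illustrates the general technique only, under the assumption of a single scale per tensor; it is not the scheme that KleidiAI, PyTorch, or ExecuTorch actually implements, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_symmetric(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats to signed integers of the given width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when every value is zero (or the list is empty).
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integers and their scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.3, -0.7, 1.0]
    for bits in (8, 4):
        q, scale = quantize_symmetric(weights, bits=bits)
        restored = dequantize(q, scale)
        worst = max(abs(w - r) for w, r in zip(weights, restored))
        print(f"int{bits}: q={q} worst-case error={worst:.4f}")
```

Narrower integer formats reduce memory traffic and allow more values per vector instruction, at the cost of a larger rounding error, which is why the choice of format is typically tuned per model.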
Empowering developers to achieve more
PyTorch is driving innovation in ML development. Arm recently joined the PyTorch Foundation as a Premier member, an important milestone in Arm's AI journey. Arm will continue working to enable developers around the world to unleash the full potential of end-to-end AI on Arm, shaping cutting-edge AI and application capabilities.
About Kleidi
Kleidi (Greek for "key") is built on three key pillars:
Open Arm technology integrated directly into key frameworks, so that developers get Arm CPU performance seamlessly and without additional work. Arm will ensure that new technologies remain forward compatible so that developers can benefit from them immediately.
Developer enablement through a range of resources, including usage guides, learning materials, and technical demonstrations.
A vibrant ecosystem of ML software vendors, frameworks, and open-source projects with access to the latest AI features on Arm, making Arm the preferred platform for developers building solutions.
About Arm
Arm is the industry's highest-performing and most power-efficient compute platform, with unmatched scale that reaches 100% of the world's connected population. To meet the insatiable demand for compute, Arm delivers advanced solutions that enable the world's leading technology companies to unleash unprecedented AI experiences and capabilities. Together with the world's largest computing ecosystem and 20 million software developers, Arm is building the future of AI on Arm.
This report is provided by Top Components, a supplier of electronic components to the semiconductor industry.
Top Components is committed to supplying customers around the world with in-demand, obsolete, licensed, and hard-to-find parts.
Media Relations
Name: John Chen
Email: salesdept@topcomponents.ru