Gemma 4 on Arm: Accessible, immediate, optimized on-device AI to accelerate the mobile app experience

The launch of Gemma 4 on Arm represents a significant advancement in on-device AI, offering developers a powerful tool to enhance mobile app experiences without relying on cloud infrastructure. Google’s Gemma 4 model, optimized for Arm-based devices, enables real-time, privacy-preserving, and power-efficient AI capabilities that align with the growing demand for instant, intelligent interactions on smartphones. This shift underscores the importance of Arm’s compute architecture in scaling on-device AI across the Android ecosystem, where performance, efficiency, and security are critical for delivering seamless user experiences.

Gemma 4 introduces improved performance and efficiency, expanding support for multimodal applications such as reasoning, agentic workflows, and vision-and-audio integration. These enhancements allow developers to create more responsive, context-aware interactions directly on-device, without increasing memory usage. The model’s broader language support and foundation for real-time assistive experiences further position it as a versatile tool for developers aiming to integrate AI into everyday apps.

Arm’s role in enabling Gemma 4’s capabilities is central to its success. Early engineering tests on Arm CPUs, particularly those with the SME2 architecture, demonstrate significant performance gains for running Gemma 4 E2B (Effective 2 Billion) workloads. Initial tests on the Gemma 4 2B model show an average 5.5x speedup in prefill operations (processing user input) and up to 1.6x faster decode (generating responses). These results highlight the potential of Armv9 CPU innovations, such as the Scalable Matrix Extension 2 (SME2), which accelerates matrix-heavy AI tasks within the power constraints of modern smartphones.

#arm #gemma_4 #sandeep_patil #kleidi_ai #envision
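The two reported speedups affect different phases of a response: prefill (processing the prompt) and decode (generating tokens). How much they matter end to end depends on how a request splits its time between the two phases. Here is a minimal arithmetic sketch: only the 5.5x and 1.6x factors come from the article; the baseline timings (0.8 s prefill, 3.2 s decode) are assumed for illustration.

```python
# Sketch: combining per-phase speedups into end-to-end latency.
# Only the 5.5x prefill and 1.6x decode factors are from the article;
# the baseline phase timings below are hypothetical.

def end_to_end_latency(prefill_s, decode_s,
                       prefill_speedup=1.0, decode_speedup=1.0):
    """Total response latency given per-phase times and speedup factors."""
    return prefill_s / prefill_speedup + decode_s / decode_speedup

# Assumed baseline: 0.8 s to process the prompt, 3.2 s to generate output.
baseline = end_to_end_latency(0.8, 3.2)
with_sme2 = end_to_end_latency(0.8, 3.2,
                               prefill_speedup=5.5, decode_speedup=1.6)

print(f"baseline: {baseline:.2f} s")          # 4.00 s
print(f"with SME2: {with_sme2:.2f} s")        # ~2.15 s
print(f"overall: {baseline / with_sme2:.2f}x faster")  # ~1.86x
```

Because decode usually dominates total time for long responses, the overall gain sits closer to the 1.6x decode figure than the 5.5x prefill figure; prompt-heavy workloads (e.g. summarization) would see more of the prefill benefit.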

Arm Launches In-House CPU, Meta Signs as First Customer

Arm has unveiled its first-ever in-house central processing unit (CPU), the AGI CPU, designed specifically for AI inference in data centers. The chip marks a significant shift in the company’s business model, transitioning from licensing its architecture to manufacturing physical silicon. Meta, the social media giant, has become the first official customer, with additional commitments from OpenAI, Cloudflare, and SAP.

For over three decades, Arm has operated as a fabless semiconductor company, licensing its instruction sets to major chipmakers like Apple, Nvidia, Amazon, and Google. This new venture represents a bold move into direct competition with its former clients. Arm CEO Rene Haas introduced the AGI CPU at a San Francisco event, emphasizing its role in reshaping the AI chip landscape.

Meta’s involvement underscores the chip’s potential. The company is expanding its AI data centers and plans to invest up to $135 billion in capital expenditures this year. Meta software engineer Paul Saab, who has worked on the Arm project since 2023, highlighted the strategic value of the partnership. “This adds yet another player to the ecosystem for us,” Saab said, noting the flexibility the deal provides for Meta’s software stack and supply chain.

The agreement’s terms remain undisclosed, but analysts predict it could be a major revenue boost for Arm. Chip analyst Patrick Moorhead estimated that even a fraction of Meta’s future spending on the AGI CPU could significantly impact Arm’s top line. “Let’s say they get 5% of Meta’s $115 to $135 billion capex,” Moorhead said, calling it a “game changer.” The move also signals a broader trend in the CPU market.

#meta #openai #cloudflare #sap #arm