Woofun AI reports that global smartphone shipments are projected to fall by 12.2% year-on-year to 1.093 billion units in 2026, even as the total market value climbs by 6.1%. This divergence signals a structural shift in which declining volume is offset by a sharp rise in the average selling price, which is forecast to jump from $467 in 2025 to $565 in 2026. This represents a 21% increase, or an additional $98 per unit, establishing new all-time highs for the industry. The primary catalyst for this pricing surge is the escalating cost of memory components. In the first quarter of 2026 alone, average prices for DRAM and NAND flash memory surged by over 80% month-on-month. This spike is driven by demand for high-bandwidth memory (HBM) from AI servers, which has absorbed a significant portion of memory manufacturers' production capacity and left consumer electronics with tighter supply. Omdia predicts that even if memory price growth moderates to single digits in the second half of the year, component costs will remain elevated, compelling manufacturers to pass these expenses directly on to retail prices. Consequently, the market structure is fracturing: lower-end models face pressure to reduce prices despite rising costs, higher-end models capture increased market share, and the secondary market for refurbished devices expands to fill the gap.
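The relationship between the three headline figures can be checked with a few lines of arithmetic. The sketch below assumes that market value is simply units shipped multiplied by the average selling price; the function and variable names are illustrative and not taken from any of the cited reports.

```python
# Figures quoted in the paragraph above.
shipment_change = -0.122      # shipments fall 12.2% year-on-year
asp_2025 = 467.0              # average selling price in 2025, USD
asp_2026 = 565.0              # forecast average selling price in 2026, USD
shipments_2026 = 1.093e9      # forecast 2026 shipments, units


def implied_value_change(volume_change: float, old_price: float, new_price: float) -> float:
    """Relative change in market value, assuming value = units * average price."""
    return (1.0 + volume_change) * (new_price / old_price) - 1.0


asp_change = asp_2026 / asp_2025 - 1.0
value_change = implied_value_change(shipment_change, asp_2025, asp_2026)
market_value_2026 = shipments_2026 * asp_2026

print(f"ASP change: {asp_change:.1%} (+${asp_2026 - asp_2025:.0f} per unit)")
print(f"Implied market value change: {value_change:+.1%}")
print(f"Implied 2026 market value: ${market_value_2026 / 1e9:.0f} billion")
```

Under this simple model the implied value change comes out at about 6.2%, within rounding distance of the reported 6.1%, so the shipment, price, and value figures are mutually consistent.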
In direct response to these economic pressures, the Chinese government has implemented demand-side subsidies to stimulate consumption. On June 18th, eight departments, including the Ministry of Commerce, released the "Implementation Opinions on Accelerating the Development of 'Artificial Intelligence + Consumption'," explicitly backing the purchase of AI phones, smart computers, and AI glasses through financial interest subsidies for personal consumption loans. Simultaneously, as hardware costs rise, AI phones have been designated as a critical category within the national consumption strategy. This policy intervention aims to counteract the friction caused by higher entry prices for advanced devices. The result is a bifurcated market where mid-range phone prices are declining while high-end models continue to ascend. At recent press conferences, manufacturers have ceased highlighting processor speed as a primary differentiator. Flagship devices launched in the summer of 2026 almost universally utilize the same Snapdragon 8 Gen 5 processor. Instead, Xiaomi emphasizes its 7000 mAh battery, vivo highlights folding screen technology, iQOO focuses on gaming-specific cooling systems, and moto boasts the ability to fit a 6000 mAh battery into foldable designs. Within the Android ecosystem, discussions regarding SoC performance have faded, mirroring the dynamic of the 4G era where processor chips became a commodity rather than a competitive edge.
Caught in this squeeze, phone manufacturers are forced to seek alternative avenues for product differentiation, leading to an intense arms race over edge AI parameter counts. According to a Counterpoint report, smartphones with GenAI capabilities will account for 45% of global shipments in 2026.
However, analysts note a significant disconnect between devices capable of running GenAI and those where users actually engage these functions in daily life. While hardware capabilities have advanced, user behavior has not evolved in tandem. MediaTek and vivo jointly demonstrated that the Dimensity 9300 can process 33B-parameter large models, while Huawei claimed its Kirin flagship processors can perform local inference on sparse models with hundreds of billions of parameters. Xiaomi likewise stated that its flagship models can handle MoE models with hundreds of billions of parameters. These figures have escalated every few months: 7B, 13B, 33B, 100B. Yet vivo subsequently decided to revert its main edge AI model to 3B. This strategic pivot was driven not by technical limitations but by efficiency metrics: the 3B model consumes only 2GB of memory and draws approximately 750mA of current while processing long texts of up to 128K characters. In practical usage, the 3B model delivers performance comparable to the 7B model without causing overheating or significantly degrading battery life. What users ultimately require is an AI system that improves daily experience through quick responses, thermal stability, low power consumption, and practical assistance across various scenarios.
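To put the 750mA figure in perspective, the following rough sketch estimates how long a battery would last under that draw. It assumes the current is drawn continuously and is the only load on the battery, and it ignores voltage conversion losses; the helper names are hypothetical.

```python
def hours_at_constant_draw(capacity_mah: float, draw_ma: float) -> float:
    """Ideal runtime in hours: capacity (mAh) divided by current draw (mA)."""
    return capacity_mah / draw_ma


def capacity_share_used(capacity_mah: float, draw_ma: float, minutes: float) -> float:
    """Fraction of the battery consumed by a session of the given length."""
    return draw_ma * (minutes / 60.0) / capacity_mah


DRAW_MA = 750.0  # current draw quoted for the 3B model

for capacity in (6000.0, 7000.0):  # battery sizes mentioned earlier in the article
    hours = hours_at_constant_draw(capacity, DRAW_MA)
    share = capacity_share_used(capacity, DRAW_MA, minutes=10)
    print(f"{capacity:.0f} mAh: {hours:.1f} h of continuous inference, "
          f"{share:.1%} of the battery per 10-minute session")
```

Under these idealized assumptions, a ten-minute session costs roughly 2% of a 6000 to 7000 mAh battery, which helps explain why a fixed current budget matters more to daily use than a larger parameter count.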
Underpinning this shift is an industry fact often overlooked: massive edge AI parameter counts, such as 10 billion or 100 billion, rely fundamentally on sparse MoE architecture. Although the total parameter count is vast, only a few billion parameters are activated during any single inference task. After quantization and compression to the INT4 format, the actual computational load is close to that of a 7B dense model. The figure of 'hundreds of billions' reflects the total number of parameters held in storage, not the number actually used per inference. As a result, smartphone AI capabilities are determined by LPDDR5X memory capacity, NPU compute throughput, and power consumption budgets. In practice, the upper limit on practical model size appears to be around 7B. A 7B model requires approximately 4GB of memory after INT4 quantization, fitting comfortably within the 12-16GB LPDDR5X range found in flagship phones. MediaTek asserts that the APU 790 in the Dimensity 9300 can process 7B models at roughly 20 tokens per second, while OPPO has deployed 7B edge AI models for over 100 AI functions. Although Qualcomm does not disclose specific parameter values, its AI engines demonstrate similar performance levels. Exceeding this threshold would impose memory and cooling requirements that surpass the actual capabilities of most flagship devices.
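The memory figures above follow from simple arithmetic on parameter count and bits per weight. The sketch below counts weights only, ignoring the KV cache and runtime buffers; the MoE configuration (100B stored, 7B active) is a hypothetical illustration rather than any vendor's specification.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# Dense models mentioned in the article, quantized to INT4 (4 bits per weight).
for size_b in (3, 7, 33):
    print(f"{size_b}B dense, INT4: {weight_memory_gb(size_b, 4):.1f} GB of weights")

# The same 7B model without quantization (FP16, 16 bits per weight).
print(f"7B dense, FP16: {weight_memory_gb(7, 16):.1f} GB of weights")

# Hypothetical sparse MoE: 100B parameters stored, 7B active per token.
# Both numbers are illustrative assumptions, not vendor specifications.
moe_total_b, moe_active_b = 100, 7
print(f"MoE stored weights, INT4: {weight_memory_gb(moe_total_b, 4):.1f} GB")
print(f"MoE weights touched per token, INT4: {weight_memory_gb(moe_active_b, 4):.1f} GB")
```

The weights-only result for a 7B model at INT4 is 3.5GB, which is in line with the roughly 4GB quoted once some working memory is added. A 33B dense model at INT4 needs about 16.5GB for weights alone, which already exceeds the 12-16GB found in flagship phones.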
This evolution fundamentally alters the metrics used to evaluate the chip industry. Historically, NPU performance was judged by peak TOPS, where higher values indicated superiority. As manufacturers shift from large to smaller models, the challenge for NPU makers is ensuring stable performance for long-context inference tasks within a strict 750mA current budget, rather than chasing peak scores. Metrics such as SRAM allocation for the KV cache, memory bandwidth scheduling efficiency, and native support for low-precision formats like INT4/FP8 are now more critical to the user experience than raw TOPS numbers. The bottleneck in inference performance extends beyond NPU compute to the storage system's ability to deliver model weights in time. Sequential read speed directly affects model loading time and KV cache refresh efficiency, and therefore the responsiveness of AI interactions. Memory manufacturers have acknowledged this shift. Samsung's UFS 5.0 solution, released on June 23rd, offers a sequential read speed of 10.8GB/s, more than double that of the previous-generation UFS 4.1, with overall energy efficiency improved by over 40%. Samsung describes this product as "the core infrastructure for edge AI." However, mass production of UFS 5.0 will not commence until the fourth quarter of this year, meaning it will appear in next year's flagship phones rather than at current press conferences.
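The effect of storage speed on model loading can be estimated as model size divided by sequential read speed. This is a best-case sketch that ignores file system overhead and decompression. The previous-generation speed of 5.4GB/s is an assumed upper bound inferred from the phrase "more than double", not a published figure.

```python
def load_time_seconds(model_gb: float, read_gb_per_s: float) -> float:
    """Best-case time to stream the model weights from flash storage."""
    return model_gb / read_gb_per_s


MODEL_GB = 4.0            # approximate INT4 footprint of a 7B model, per the article
UFS5_READ = 10.8          # GB/s, sequential read quoted for UFS 5.0
PREVIOUS_GEN_READ = 5.4   # GB/s, assumed upper bound implied by "more than double"

for label, speed in (
    ("UFS 5.0", UFS5_READ),
    ("previous generation (assumed bound)", PREVIOUS_GEN_READ),
):
    seconds = load_time_seconds(MODEL_GB, speed)
    print(f"{label}: {seconds:.2f} s to read {MODEL_GB:.0f} GB")
```

Under these assumptions a 7B model loads in well under half a second on UFS 5.0, against at least three quarters of a second on the older storage, a difference users would notice each time a model is swapped in.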
Counterpoint's analysis indicates that storage constraints are a primary driver keeping GenAI phone prices above $400. While UFS 5.0 promises significant performance gains, the initial cost will be high, and the pattern of higher-end models benefiting first will persist in the short term. The competitive focus in smartphone AI is shifting from the hardware itself to the AI models running on them. Counterpoint research shows that in the high-end market, Google Gemini is becoming the central battleground. Gemini underpins Apple's redesigned Siri, forms the basis of Samsung's Galaxy AI, and enhances the AI capabilities of major Chinese smartphone brands in overseas markets. OEM manufacturers are now responsible for organizing the logic, user experience, and ecosystem integration around these models, marking the true arena for the next phase of competition. The competitive logic of edge AI has transformed, yet one constant remains: there is no room for differentiation among flagship phone processors. Two distinct phones may use the same SoC, but they cannot promote identical features at press conferences. Differentiation must now be sought outside the SoC's scope, in imaging algorithms, gaming experiences, and battery life management, areas not naturally covered by general SoC design.
For smartphone manufacturers, the strategic option is to develop their own chips to excel where the SoC falls short. Since Apple established a performance gap with Android using its A series chips, "developing one's own SoC" has been the industry's ultimate goal. Many companies claim to pursue this, but developing a flagship SoC to compete with Qualcomm and MediaTek is rarely cost-effective. Manufacturers have realized they do not need to replace Qualcomm's processors but can instead develop smaller chips for specific tasks. iQOO's Q2 gaming chip exemplifies this approach. It works alongside the CPU, GPU, and NPU rather than replacing them, focusing solely on improving gaming graphics quality and frame rates. While the Adreno GPU in the Snapdragon 8 Gen 5 can perform these tasks, it must also handle system graphics rendering and UI composition, leading to suboptimal performance and power consumption. By dedicating a chip to these tasks, iQOO achieves better results while freeing resources on the main SoC to maintain stable frame rates. Xiaomi's self-developed imaging chip follows similar logic. It does not replace Qualcomm's ISP but takes over tasks like computational photography, multi-frame synthesis, and telephoto image optimization after basic processing. This division of labor yields higher efficiency and less heat generation than a single-chip solution. The cost-effectiveness of this approach far exceeds that of developing a full SoC. Coprocessors have clear functional boundaries and shorter development cycles, and they use mature 12/16/28nm manufacturing processes, resulting in significantly lower production costs than advanced nodes. They also avoid the need for complete compiler and driver ecosystems. A gaming chip can be developed and mass-produced between two SoC generations, one to two years faster than waiting for Qualcomm's next GPU update.
This trend exerts a dual impact on the chip industry. Demand for specialized chips using mature manufacturing processes is rising, boosting utilization rates for 12/16/28nm production lines. Simultaneously, Qualcomm and MediaTek are forced to adapt: for coprocessors to function smoothly with SoCs, the SoC vendors must expose more low-level interfaces, shifting the cooperation model from 'selling a single chip' to 'providing a collaborative platform.' OpenAI plans to launch an AI-powered smartphone in 2028, with Qualcomm and MediaTek collaborating on its chip development. This choice is significant: when the world's largest AI company enters the smartphone market, it opts for existing platforms rather than developing its own SoC. This reinforces that the SoC is not the key focus; capturing market share in the AI model layer, currently dominated by Google Gemini, is the true objective. This aligns with current industry trends in which SoCs become infrastructure and real competition spreads across the AI model layer, the coprocessor layer, and the application layer. In the model competition, the question is which vendor's edge AI models run longest within a 750mA current budget and offer the best reasoning. In the coprocessor race, the focus is on optimizing gaming graphics and imaging for specific use cases. In the application layer, the contest is about which vendor can genuinely change daily user habits through edge AI.
On one hand, demand for smartphone chips is driven by the need for greater efficiency via AI technologies; on the other, cost considerations push manufacturers toward integrated solutions. Both factors reduce the exclusive value of flagship SoCs while opening opportunities for mature manufacturing processes and local companies. In the future, price differences among smartphones will stem from manufacturers' innovation capabilities and the speed of AI technology implementation. These are the key factors that will determine real breakthroughs. This marks a definitive end to the era where processor specifications alone dictated market leadership, replacing it with a complex ecosystem of specialized hardware and software integration.