What is an NPU? Let’s explore the chip designed to handle Artificial Intelligence directly on our devices, without relying on the external resources of the ‘cloud’: a feature that makes it both particularly energy-efficient and respectful of privacy, as no sensitive data is transmitted. Happy reading!

In recent years, the world of hardware has welcomed a remarkable new player: the NPU, short for ‘Neural Processing Unit’. The name may sound technical, but it essentially refers to a highly specialized chip, designed to perform the mathematical calculations that underpin artificial neural networks, themselves the engine driving Artificial Intelligence (*1). Unlike more familiar hardware components such as the central processor or the graphics card, the NPU is not a general-purpose tool capable of handling any kind of task: it is, rather, a high-level specialist, optimized exclusively for the specific type of processing that AI requires. As one might expect, this extreme focus is limiting in some respects, but it also allows the NPU to perform certain tasks far more quickly and efficiently than any non-dedicated solution.
Originally confined to large data centers and research laboratories, the NPU has gradually made its way into everyday devices: first smartphones, then tablets, and more recently Personal Computers. A journey that has transformed what was once a niche technology into a component set to become an integral part of every ‘intelligent’ device.
*1: IBM, “What is an NPU?”.


When talking about NPUs, it is easy to come across a unit of measurement that is unfamiliar to non-specialists: TOPS, short for ‘Trillions of Operations Per Second’ (also expanded as ‘Tera Operations Per Second’). An intimidating figure, no doubt, but it is the standard benchmark used to assess the computational power of an AI-dedicated chip: 1 TOPS corresponds to one trillion (10^12) operations per second, so the higher the value, the more operations the chip can execute each second.
A few concrete examples: the Neural Engine in Apple’s M4 chip reaches 38 TOPS (*1), while the Qualcomm Snapdragon X Elite processor delivers up to 45 TOPS (*2). Microsoft, for its part, has set 40 TOPS as the minimum threshold required for its Copilot+ PCs to access advanced Artificial Intelligence features (*3). Figures that, not so long ago, were the exclusive domain of the most powerful data centers. Today, we find them in the laptops we carry in our bags every day!
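To make these figures easier to compare, here is a minimal Python sketch that converts the TOPS ratings quoted above into plain operations per second and checks them against the 40 TOPS threshold. The function names are hypothetical and chosen only for illustration. The check is purely numerical: the Copilot+ requirement applies to Windows PCs, so the Apple chip is listed only as a point of reference.

```python
# TOPS figures quoted in the text (trillions of operations per second).
NPU_TOPS = {
    "Apple M4 Neural Engine": 38,
    "Qualcomm Snapdragon X Elite": 45,
}
COPILOT_PLUS_MIN_TOPS = 40  # minimum stated by Microsoft for Copilot+ PCs


def operations_per_second(tops: float) -> float:
    """Convert a TOPS rating into plain operations per second."""
    return tops * 1e12


def meets_threshold(tops: float, minimum: float = COPILOT_PLUS_MIN_TOPS) -> bool:
    """Return True when a TOPS rating reaches the given minimum."""
    return tops >= minimum


for name, tops in NPU_TOPS.items():
    status = "reaches" if meets_threshold(tops) else "is below"
    print(f"{name}: {operations_per_second(tops):.1e} ops/s, "
          f"{status} the {COPILOT_PLUS_MIN_TOPS} TOPS threshold")
```

A TOPS rating is a peak value declared by the manufacturer, so a comparison like this says nothing about how a specific application will actually perform.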
Notes:
*1: Apple, “M4 chip”;
*2: Qualcomm, “Snapdragon X Elite Product Brief”;
*3: Microsoft, “Windows 11 Specifications”.

To begin truly understanding the specific role of an NPU, it’s useful to place it within the broader context of the hardware that ‘drives’ our Personal Computers. In almost all of them, the work is distributed among multiple components, each designed to excel in a well-defined area.
The ‘CPU’, or central processing unit, is the ‘all-purpose’ processor that, among its various tasks, manages the operating system, runs programs and coordinates the machine’s activities. It is precise and capable of handling almost any type of operation, working primarily in a sequential manner, processing instructions one after another. The ‘GPU’, or graphics processing unit, on the other hand, is the chip born to accelerate graphics rendering in video games, which has proven unexpectedly valuable for the demanding workloads associated with AI as well, thanks to its architecture designed for ‘parallel’ computation, made up of thousands of cores operating simultaneously.
In this landscape, the ‘NPU’ has found its own complementary role: not the raw power of the GPU, nor the versatility of the CPU, but the ability to perform the calculations specific to neural networks in a fast, efficient manner and above all with limited energy consumption (*1).
Three chips, three specializations, one team.
Note:
*1: IBM, “NPU, GPU and CPU compared”.

Unlike the CPU, the ‘nerve center’ of our Personal Computers, which manages the many different tasks a computer needs in order to operate, an ‘NPU’ is optimized for just one, tied to a specific type of mathematical computation: the ‘MAC operation’, short for ‘Multiply-Accumulate’. In essence, the NPU takes two numbers, multiplies them together and adds the result to a running total. The step appears straightforward, but it must be repeated billions of times per second across enormous data matrices, and it is essential to the functioning of artificial neural networks.
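The multiply-accumulate step can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of how an NPU is implemented: the function names (`mac`, `dot_product`, `layer_output`) and the sample numbers are hypothetical, and real hardware carries out many of these steps at the same time rather than one after another.

```python
def mac(accumulator: float, a: float, b: float) -> float:
    """One multiply-accumulate step: multiply two numbers, add to the running total."""
    return accumulator + a * b


def dot_product(inputs: list[float], weights: list[float]) -> float:
    """Weighted sum of the inputs, built from repeated MAC steps."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    total = 0.0
    for x, w in zip(inputs, weights):
        total = mac(total, x, w)
    return total


def layer_output(weight_rows: list[list[float]], inputs: list[float]) -> list[float]:
    """One weighted sum per 'neuron': a matrix-vector product made only of MACs."""
    return [dot_product(inputs, row) for row in weight_rows]


inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 2.0],  # first neuron
    [1.0, 1.0, 1.0],   # second neuron
]
print(layer_output(weights, inputs))  # [4.5, 6.0]
```

Even this tiny layer needs 2 × 3 = 6 MAC steps. With thousands of inputs and thousands of neurons per layer, the count quickly reaches the billions mentioned above.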
To meet this requirement in the most efficient way possible, the NPU was designed around a particular architecture: a vast number of elementary computational units working simultaneously, using reduced arithmetic precision (typically 8 or 16 bits) that lightens the computational load with little or no loss in the quality of results.
It’s also equipped with a high-speed internal memory that allows it to ‘process’ data while avoiding continuous transfers to the RAM (the main memory) of the device (*1).
The result is a chip capable of handling intensive and sustained processing workloads with an energy efficiency that no traditional CPU or GPU could guarantee when performing the same operations.
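To give an idea of why reduced precision costs so little, the sketch below converts a few numbers to 8-bit integers, performs the multiply-accumulate on integers only, and then converts the total back. It assumes one simple scheme (a single scale factor per list, values clamped to the range -128 to 127), which is not necessarily what any particular NPU does. The function names and the sample values are hypothetical.

```python
def quantize(values: list[float], scale: float) -> list[int]:
    """Map floats to 8-bit signed integers: round(value / scale), clamped to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(q_inputs: list[int], q_weights: list[int]) -> int:
    """Integer multiply-accumulate; the total is kept in a wider integer."""
    total = 0
    for x, w in zip(q_inputs, q_weights):
        total += x * w
    return total


inputs = [0.12, -0.50, 0.33, 0.90]
weights = [0.40, 0.25, -0.70, 0.10]

# One scale per list, chosen so that the largest magnitude maps to 127.
input_scale = max(abs(v) for v in inputs) / 127
weight_scale = max(abs(v) for v in weights) / 127

q_inputs = quantize(inputs, input_scale)
q_weights = quantize(weights, weight_scale)

# Undo both scale factors to bring the integer total back to real units.
approx = int8_dot(q_inputs, q_weights) * input_scale * weight_scale
exact = sum(x * w for x, w in zip(inputs, weights))

print(f"full precision: {exact:.4f}")
print(f"8-bit estimate: {approx:.4f}")
```

The two results differ only slightly, while each 8-bit value takes a quarter of the space of a 32-bit one. Less space means less data to move, which is a large part of where the energy savings come from.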
Note:
*1: IBM, “How do NPUs work?”.

It was September 2017 when Apple introduced the iPhone X, equipped with the A11 Bionic chip, one of the very first portable devices in the world to integrate a component dedicated exclusively to Artificial Intelligence: the ‘Neural Engine’ (*1). With its two cores and a capacity of 0.6 TOPS, it was designed to handle tasks that were entirely new for a smartphone at the time, such as facial recognition for Face ID and the augmented reality filters behind Animoji.
Performance was, of course, far from what today’s processors deliver, but the principle behind the new hardware was undeniably revolutionary: dedicating specific components to AI, rather than relying on the CPU or GPU already present in the phone.
Since then, Apple has never stopped investing in this technology: the Neural Engine in the M4 chip, launched in 2024, reaches 38 TOPS (*2), a more than sixtyfold increase (38 / 0.6 ≈ 63) in less than a decade.
A journey that has driven the entire industry forward, pushing Intel, Qualcomm and AMD to follow the same path in their Personal Computer products.
Notes:
*1: Apple, iPhone X press release.
*2: Apple, “M4 chip”.

Among all the undeniable benefits of using NPUs, one deserves particular attention: privacy.
When an Artificial Intelligence task is carried out locally by the new chip, the data does not necessarily have to ‘leave’ the device. It’s therefore not sent to remote servers, does not travel through external networks, nor is it shared with third parties. This applies, for example, to photos processed by AI, to transcripts of private conversations, and to any other type of sensitive material.
In today’s climate, where awareness of personal information protection is steadily growing, this represents a concrete and significant advantage. From this perspective, the NPU is not merely a useful tool that boosts performance: it’s, above all, an instrument of control over what must remain private.

While understanding how an NPU works is undeniably useful, knowing what this technology can concretely do for us in everyday life may be even more so. It should be noted that not all devices currently in circulation are equipped with one, although it is being integrated into a growing number of newer models. What is perhaps most surprising is that many users benefit from its capabilities on a daily basis without even knowing it: the NPU is, in fact, the ‘magic mechanism’ behind features like unlocking a smartphone through facial recognition, softening the background during a video call, or transcribing speech in real time as it is spoken.
Moving on to the latest-generation Personal Computers equipped with an NPU, those that Microsoft calls ‘Copilot+ PCs’, the possibilities are equally tangible. A clear example is Windows 11, which offers ‘Live Captions with automatic translation’: any audio playing on the computer, such as a foreign-language video or an online meeting, can be transcribed and translated in real time (at launch, from more than 40 languages into English), without an internet connection and without sending anything to the cloud (*1).
Those who want to give their photos a fresh look can use the ‘Restyle Image’ feature in the ‘Photos’ app, transforming them into artistic images by applying styles such as watercolor (*2). Paint, the app everyone has known since the days of Windows 95, has been enriched with ‘Cocreator’: simply describe what you want to create, and the NPU generates original illustrations directly on the machine (*3).
The range of possibilities is continuously expanding, but the common thread remains the same: thanks to the NPU, operations that until recently required remote servers can now be completed locally, quickly and above all in full respect of the user’s privacy.
Notes:
*1: Microsoft, “Introducing Copilot+ PCs”;
*2: Microsoft, “Introducing Copilot+ PCs”;
*3: Microsoft, “Cocreator in Paint”.

Until recently the exclusive domain of high-end devices, the NPU is rapidly becoming a fixture in the modern computing landscape. Consider that Intel has already shipped nearly 100 million processors with an integrated NPU (*1), while AMD and Qualcomm offer comparable solutions in their own latest-generation chips. Microsoft has accelerated this transition with Copilot+ PCs, setting a minimum threshold of 40 TOPS as a requirement to access the advanced AI features of Windows 11 (*2). The NPU, in short, is following the same path as SSDs (*3): from an optional extra to an essential component.
*1: PC Gamer, “Intel says it has shipped nearly 100 million AI PC processors”, January 2026;
*2: Microsoft, “Windows 11 Specifications”;
*3: ‘Solid State Drives’, the successors to traditional mechanical hard drives.

Within the heated debate on the environmental impact of Artificial Intelligence, the NPU introduces a positive and still underappreciated factor. A chip of this kind requires between 5 and 10 watts to handle a ‘lightweight’ (*1) AI task (*2), a figure considerably lower than the 30-40 watts a ‘local’ GPU would need for the same operation, not to mention the energy consumed by data centers when such processing takes place in the cloud. This means, in essence, that every time ‘advanced’ features are carried out directly on one’s own device, thanks to an NPU, rather than on remote servers, overall consumption is reduced. A detail worth noting, and one that points toward the future of this new technology.
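The comparison can be turned into a rough calculation: energy is power multiplied by time. The sketch below uses the wattage ranges quoted above and assumes, purely for illustration, that the task lasts 60 seconds on both chips. In practice the two would probably not take the same time, so the result is an order of magnitude rather than a measurement. The names used are hypothetical.

```python
def energy_wh(power_watts: float, duration_seconds: float) -> float:
    """Energy in watt-hours for a constant power draw over a given duration."""
    return power_watts * duration_seconds / 3600


DURATION_S = 60  # assumed task length, identical on both chips (hypothetical)

NPU_POWER_W = (5, 10)   # range quoted in the text
GPU_POWER_W = (30, 40)  # range quoted in the text

npu_wh = tuple(energy_wh(p, DURATION_S) for p in NPU_POWER_W)
gpu_wh = tuple(energy_wh(p, DURATION_S) for p in GPU_POWER_W)

# Smallest and largest advantage for the NPU within the quoted ranges.
min_ratio = GPU_POWER_W[0] / NPU_POWER_W[1]  # 30 W against 10 W
max_ratio = GPU_POWER_W[1] / NPU_POWER_W[0]  # 40 W against 5 W

print(f"NPU: {npu_wh[0]:.3f} to {npu_wh[1]:.3f} Wh")
print(f"GPU: {gpu_wh[0]:.3f} to {gpu_wh[1]:.3f} Wh")
print(f"The GPU uses between {min_ratio:.0f}x and {max_ratio:.0f}x more energy")
```

Under these assumptions, the same task costs between three and eight times less energy on the NPU, before even counting the network and data center consumption of a cloud-based alternative.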
*1: A ‘lightweight’ task refers to operations such as facial recognition, speech transcription, or background blur during video calls.
*2: HP, “Why Everyone’s Talking About NPUs”.

The NPU is an exceptionally effective technology and, not by coincidence, one that is evolving rapidly. In terms of processing speed, the current benchmark stands at 40 TOPS for Copilot+ PCs, yet the latest-generation chips already surpass it: Intel Core Ultra Series 3 reaches up to 50 TOPS (*1), AMD Ryzen AI 400 up to 60 (*2) and Qualcomm Snapdragon X2 up to 80 (*3).
Alongside this hardware transformation, AI models are progressively shrinking in size, increasingly optimized to run directly on devices (PCs, smartphones, etc.) without relying on the cloud.
The path ahead is therefore clear: NPUs will become ever more powerful, software will make better use of them, and local AI will cease to be a promise and become an established reality.
Notes:
*1: Intel, “CES 2026: Intel Core Ultra Series 3”;
*2: AMD, “CES 2026 Press Release”;
*3: Newegg Insider, “AI PC News from CES 2026”.
The images on this page were created using generative Artificial Intelligence tools.