AMD Radeon 8040S Graphics


Radeon 8040S Graphics: Ryzen AI Max without Max Graphics

AMD Radeon 8040S Graphics is the strangest graphics configuration in the Ryzen AI Max family. On paper, it falls under the same platform as the higher-tier Radeon 8050S, 8060S, and 8065S, but in essence, it's a completely different product.

The higher Ryzen AI Max models are interesting because they attempt to replace an entry-level discrete graphics card with powerful integrated graphics. They feature many compute units (CUs), a wide memory bus, and a shared pool of RAM that can be used for graphics, work tasks, and local AI scenarios. The Radeon 8040S breaks with this idea: it has only 16 RDNA 3.5 compute units and 128-bit LPDDR5X memory.

Therefore, it's difficult to consider the Radeon 8040S separately from the processor. It is not a standalone graphics card or even a mass-market integrated GPU (iGPU) that AMD typically integrates into various Ryzen processors. Currently, it is effectively tied to the Ryzen AI Max PRO 380. The standard Ryzen AI Max 300 series does not include it, and in the newer Ryzen AI Max PRO 400, the lower-end models already feature Radeon 8050S.

If the Radeon 8040S were positioned within a hypothetical Ryzen 7, it would seem quite appropriate: modern RDNA 3.5 graphics, 16 CUs, clock speeds up to 2800 MHz, a good level for an integrated solution. However, within the Ryzen AI Max, it appears questionable, since the Max series promises more in terms of graphics.

What is Officially Known

AMD specifies that the Radeon 8040S Graphics has 16 compute units and a clock speed of up to 2800 MHz. It is used in the Ryzen AI Max PRO 380, the entry-level chip in the Ryzen AI Max PRO 300 series. That chip has 6 CPU cores, 12 threads, 128-bit LPDDR5X memory, and supports a maximum of 64 GB of RAM.

This is more important than it seems. The Radeon 8040S is not only limited in graphics but also in the overall level of the platform. The higher Ryzen AI Max models have more graphics blocks and wider memory, thus better fulfilling the idea of a “compact workstation without a discrete GPU.”

The Radeon 8040S represents a different scenario. Here, the main argument is no longer the graphics but rather the CPU, NPU, PRO capabilities, compactness, and sufficient integrated graphics. In essence, it is more like a working platform without a discrete graphics card than an attempt to replace a discrete graphics card with an integrated GPU.

Where Radeon 8040S Stands in the Ryzen AI Max Graphics Hierarchy

AMD does not delineate 8040S / 8050S / 8060S / 8065S as a separate family of graphics cards, but it is convenient to compare them within the Ryzen AI Max series.

Graphics | CUs | Clock Speed | Position
Radeon 8040S | 16 | up to 2800 MHz | junior configuration
Radeon 8050S | 32 | up to 2800 MHz | middle option
Radeon 8060S | 40 | up to 2900 MHz | senior graphics, Ryzen AI Max 300
Radeon 8065S | 40 | up to 3000 MHz | senior graphics, Ryzen AI Max PRO 400

The difference with the Radeon 8050S is very straightforward: 16 CUs versus 32 CUs. This means that the Radeon 8040S has half the number of graphics blocks. The gap widens even more with the Radeon 8060S and 8065S: 16 CUs versus 40 CUs.
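The scale gap can be put into numbers with a minimal sketch that uses only the CU counts and peak clocks from the table above. The product of CUs and clock is used here as a crude proxy for peak shader throughput; that proxy is an assumption of this sketch and ignores memory bandwidth, power limits, and sustained clocks.

```python
# CU counts and peak clocks (MHz) as listed in the hierarchy table.
TIERS = {
    "Radeon 8040S": (16, 2800),
    "Radeon 8050S": (32, 2800),
    "Radeon 8060S": (40, 2900),
    "Radeon 8065S": (40, 3000),
}


def relative_scale(name, baseline="Radeon 8040S"):
    """Return (CU ratio, CU x clock ratio) of `name` versus `baseline`."""
    cus, mhz = TIERS[name]
    base_cus, base_mhz = TIERS[baseline]
    return cus / base_cus, (cus * mhz) / (base_cus * base_mhz)


if __name__ == "__main__":
    for name in TIERS:
        cu_ratio, proxy_ratio = relative_scale(name)
        print(f"{name}: {cu_ratio:.2f}x CUs, {proxy_ratio:.2f}x CU*clock")
```

On this rough proxy the 8050S is twice the 8040S, and the 40-CU parts are more than two and a half times larger, before any difference in memory width is taken into account.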

This is why the 8040S cannot be perceived as “almost Ryzen AI Max, just cheaper.” The architecture is the same, but the graphics scale is different. It is not a lower version of a powerful graphics solution, but rather a strong standard integrated graphics option within a more expensive professional platform.

What Makes Sense for Radeon 8040S

The rationale behind the Radeon 8040S becomes clear only when viewed as part of a CPU-first system. Such a chip can make sense in a compact laptop or mini-PC where a discrete graphics card is not desired, but a modern AMD processor with NPU, PRO features, and decent integrated graphics is needed.

In this scenario, the Radeon 8040S does not need to be a second-class 8050S. Its task is simpler: to provide sufficient performance for everyday graphics, photo editing, light video editing, 3D viewing, CAD viewport usage, and light gaming. For a working machine without a discrete GPU, this can be adequate.

However, the competitiveness of such a configuration heavily depends on the price. If a device with the Radeon 8040S is significantly cheaper than models with the 8050S, there is logic in that: the user is opting for the Ryzen AI Max platform for its CPU, NPU, compactness, and corporate features. If the price is close to versions with the Radeon 8050S, the logic quickly diminishes. In that case, the buyer pays for Ryzen AI Max but receives graphics that do not realize the main idea of Max.

How to Compare Radeon 8040S

The Radeon 8040S is closer to strong standard iGPUs than to the higher-end graphics blocks of Ryzen AI Max. In terms of CUs, it resembles the Radeon 880M/890M: it is the same order of graphics scale, rather than the level of the 8050S or 8060S.

GPU | Class | How to Perceive
Radeon 840M | junior standard iGPU | significantly lower than Radeon 8040S
Radeon 880M / 890M | strong standard iGPU | a close reference for graphics class
Radeon 8040S | 16 CU RDNA 3.5 | junior graphics of Ryzen AI Max
Radeon 8050S | 32 CU RDNA 3.5 | significantly faster than the 8040S
GeForce RTX 3050 Laptop | junior discrete GPU | more convenient for gaming, CUDA, and NVIDIA software

The main takeaway: the Radeon 8040S cannot be assessed based on the reputation of the higher Ryzen AI Max models. While the 8050S and 8060S can be discussed as alternatives to lower discrete graphics, the 8040S is more like good integrated graphics for a working platform where the GPU is not the main star.

Estimated Benchmarks of Radeon 8040S

There are fewer Radeon 8040S results than for the Radeon 8050S, so the numbers should be seen as a guideline rather than a guaranteed performance level for any system.

Test | Radeon 8040S
Geekbench 6 OpenCL | around 36,000-40,000
PassMark G3D | around 10,400
Blender GPU | around 376

These results do not show exceptional power; they illustrate the real standing of the 8040S: above basic integrated GPUs but noticeably below the Radeon 8050S. In compute and graphics tests, it sits at the level of a good modern iGPU rather than the high-end graphics of the Ryzen AI Max.

The practical conclusion is straightforward: the Radeon 8040S is suitable for photo processing, light video editing, 3D viewing, basic graphics tasks, and light gaming. If the task genuinely hinges on the GPU, the higher-end Radeon 8050S / 8060S will be much more interesting.

Gaming: Possible to Play, Not Worth Buying for Games

In gaming, the Radeon 8040S should be seen as strong integrated graphics rather than a replacement for a discrete graphics card. Esports titles and lighter games at 1080p are within its capabilities, but in modern AAA games, settings will often need to be turned down.

The primary scenario is 1080p at low or medium settings, depending on the game. In well-optimized titles, frame rates can be comfortable, but heavy games with high-resolution textures, ray tracing, and significant VRAM consumption will quickly reveal the limitations of 16 CUs and shared system memory.

Compared to the Radeon 8050S, the gap will be noticeable. The 8050S has twice as many CUs and sits on the wider 256-bit memory configuration of the higher models. Therefore, if gaming performance is critical, the Radeon 8040S is not the best option within the Ryzen AI Max.

Work Tasks and AI

The Radeon 8040S will be suitable for photo processing, simple video editing, and viewing 3D scenes in Blender or CAD applications. However, for heavy rendering, large 3D scenes, and prolonged GPU loads, 16 CUs are already insufficient.

In AI, it may be considered for local inference and experimentation: PyTorch via ROCm where specific builds and drivers support it, ONNX Runtime with DirectML on Windows, or llama.cpp via Vulkan/HIP. A realistic scenario would involve smaller or quantized models, rather than training or rapid image generation.
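To make "smaller or quantized models" more concrete, here is a rough sizing sketch. It relies on a common rule of thumb that is an assumption here, not a measurement: during token generation every weight is read about once per token, so memory bandwidth divided by the weight size gives an optimistic ceiling on tokens per second. The 128 GB/s default is the bandwidth figure from the spec sheet on this page; the model sizes are hypothetical, and KV cache, activations, and runtime overhead are ignored.

```python
def weight_footprint_gb(params_billion, bits_per_weight):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def decode_ceiling_tokens_per_s(params_billion, bits_per_weight,
                                bandwidth_gb_s=128.0):
    """Optimistic upper bound: each generated token reads all weights once."""
    return bandwidth_gb_s / weight_footprint_gb(params_billion, bits_per_weight)


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only for illustration.
    for params, bits in [(8, 16), (8, 4), (30, 4)]:
        size = weight_footprint_gb(params, bits)
        ceiling = decode_ceiling_tokens_per_s(params, bits)
        print(f"{params}B @ {bits}-bit: ~{size:.1f} GB weights, "
              f"ceiling ~{ceiling:.1f} tokens/s")
```

Real throughput will be lower than this ceiling, but the estimate shows why quantization matters on a 128-bit memory bus: shrinking the weights by a factor of four raises the bandwidth-bound ceiling by the same factor.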

If work tasks genuinely depend on GPU power, it's better to look at least at Radeon 8050S / 8060S or a discrete NVIDIA solution.

What Matters Before Purchase

Choosing the Radeon 8040S should not only be based on the association with Ryzen AI Max. The specific configuration is crucial: this is the junior Ryzen AI Max PRO 380, not the senior Max models with robust graphics.

The main limitations include:

  • only 16 CUs versus 32 CUs of the Radeon 8050S;
  • 128-bit memory of the platform instead of the 256-bit of the higher configurations;
  • a maximum of 64 GB of memory in the Ryzen AI Max PRO 380;
  • lack of dedicated video memory;
  • no CUDA support, so software built around CUDA is less predictable than on NVIDIA;
  • strong dependence on the price of the finished device.

The Radeon 8040S makes sense when the device is purchased as a compact working system without a discrete graphics card, where CPU, NPU, PRO features, and energy efficiency are prioritized over graphical power. However, if the price approaches models with the Radeon 8050S, it’s better to opt for the 8050S immediately: it aligns much better with the core idea of Ryzen AI Max.

Conclusion

Radeon 8040S Graphics represents Ryzen AI Max without the full Max graphics capabilities. In a standard Ryzen setup, such integrated graphics would look good: 16 RDNA 3.5 CUs, modern architecture, high clock speeds, and adequate performance for everyday graphics. However, within the Ryzen AI Max, the expectations are higher.

This configuration is not useless, but its purpose is narrow: compact working systems without discrete graphics cards, where CPU, NPU, PRO features, quiet operation, and energy efficiency are more important than graphic power.

If a strong GPU is expected from the Ryzen AI Max, the Radeon 8040S is too limited. For gaming, heavy 3D tasks, GPU rendering, and serious local AI tasks, it’s best to look at least at the Radeon 8050S. Otherwise, one ends up with an expensive platform that lacks the very element that gives Ryzen AI Max its true “Max” identity.

Basic

Label Name
AMD
Platform
Integrated
Launch Date
January 2025
Model Name
AMD Radeon 8040S Graphics
Generation
Radeon 8000S
Boost Clock
2800 MHz
Bus Interface
Integrated
RT Cores
16
Compute Units
16
Tensor Cores
Tensor Cores are specialized processing units designed for deep learning; they deliver higher training and inference throughput than running the same work in FP32 on general-purpose shader cores. They enable rapid computations in areas such as computer vision, natural language processing, speech recognition, text-to-speech conversion, and personalized recommendations. The two most notable applications of Tensor Cores are DLSS (Deep Learning Super Sampling) and AI Denoiser for noise reduction.
No
TMUs
Texture Mapping Units (TMUs) are GPU components that rotate, scale, and distort bitmap images and place them as textures onto any plane of a given 3D model. This process is called texture mapping.
64
Foundry
TSMC
Process Size
4 nm
Architecture
RDNA 3.5

Memory Specifications

Memory Size
System Shared
Memory Type
System Shared LPDDR5x
Memory Bus
The memory bus width is the number of bits the memory interface can move in a single transfer. The wider the bus, the more data is moved per transfer, which makes it one of the key memory parameters. Memory bandwidth is calculated as: Memory Bandwidth = Effective Memory Data Rate x Memory Bus Width / 8. Therefore, at similar data rates, the bus width determines the memory bandwidth.
128-bit
Memory Clock
LPDDR5x-8000
Bandwidth
Memory bandwidth is the data transfer rate between the graphics chip and its memory. It is measured in bytes per second and calculated as: memory bandwidth = effective data rate × memory bus width / 8 (dividing by 8 converts bits to bytes).
128 GB/s
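The listed 128 GB/s follows directly from the two figures above it. A minimal sketch of the arithmetic, assuming "LPDDR5x-8000" means 8000 MT/s; the 256-bit line is a hypothetical comparison at the same data rate, not a figure taken from this spec sheet:

```python
def memory_bandwidth_gb_s(data_rate_mt_s, bus_width_bits):
    """Peak bandwidth in GB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


if __name__ == "__main__":
    print(memory_bandwidth_gb_s(8000, 128))  # the 128-bit bus listed here
    print(memory_bandwidth_gb_s(8000, 256))  # hypothetical 256-bit bus, same rate
```

Doubling the bus width at the same data rate doubles the peak bandwidth, which is why the 128-bit interface matters as much as the CU count.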

Theoretical Performance

Pixel Rate
Pixel fill rate refers to the number of pixels a graphics processing unit (GPU) can render per second, measured in MPixels/s (million pixels per second) or GPixels/s (billion pixels per second). It is the most commonly used metric to evaluate the pixel processing performance of a graphics card.
90 GPixel/s
Texture Rate
Texture fill rate refers to the number of texture map elements (texels) that a GPU can map to pixels in a single second.
179 GTexel/s
FP16 (half)
An important metric for measuring GPU performance is floating-point computing capability. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy.
11.47 TFLOPS
FP64 (double)
An important metric for measuring GPU performance is floating-point computing capability. Double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy, while single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.
179.2 GFLOPS
FP32 (float)
An important metric for measuring GPU performance is floating-point computing capability. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.
5.734 TFLOPS
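The theoretical figures in this section can be reproduced from the unit counts and boost clock listed elsewhere on this sheet (1024 shading units, 64 TMUs, 32 ROPs, 2800 MHz). The sketch below assumes 2 FLOPs per shading unit per clock (one fused multiply-add), FP16 at twice the FP32 rate, and FP64 at 1/32 of it; these ratios are inferred from the listed numbers rather than quoted from AMD.

```python
def theoretical_rates(boost_ghz=2.8, shading_units=1024, tmus=64, rops=32):
    """Peak rates derived from unit counts and boost clock."""
    fp32_tflops = shading_units * 2 * boost_ghz / 1000  # 2 FLOPs per FMA
    return {
        "pixel_rate_gpixel_s": rops * boost_ghz,     # one pixel per ROP per clock
        "texture_rate_gtexel_s": tmus * boost_ghz,   # one texel per TMU per clock
        "fp32_tflops": fp32_tflops,
        "fp16_tflops": fp32_tflops * 2,              # assumed 2:1 versus FP32
        "fp64_gflops": fp32_tflops * 1000 / 32,      # assumed 1:32 versus FP32
    }


if __name__ == "__main__":
    for metric, value in theoretical_rates().items():
        print(f"{metric}: {value:.2f}")
```

These are peak values at the maximum boost clock; sustained performance in a thin laptop or mini-PC depends on power and thermal limits.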

Miscellaneous

Shading Units
The most fundamental processing unit is the Streaming Processor (SP), where specific instructions and tasks are executed. GPUs perform parallel computing, which means multiple SPs work simultaneously to process tasks.
1024
OpenCL Version
2.1
OpenGL
4.6
CUDA
No
DirectX
12
Power Connectors
None
ROPs
Raster Operations Pipelines (ROPs) handle the final stage of rendering: they perform depth and stencil tests, blending, and anti-aliasing (AA) resolve, then write the finished pixels to the framebuffer. The higher the resolution and the heavier the anti-aliasing and blending effects in a game, the more ROP throughput is required; otherwise, the frame rate may drop sharply.
32
Shader Model
6.8

Benchmarks

FP32 (float)
Score
5.734 TFLOPS
Blender
Score
376.14
Vulkan
Score
56877
OpenCL
Score
40471

Compared to Other GPUs

FP32 (float) / TFLOPS
6.232 +8.7%
5.951 +3.8%
5.59 -2.5%
5.432 -5.3%
Blender
1466 +289.7%
403 +7.1%
45.58 -87.9%
Vulkan
117697 +106.9%
84769 +49%
34145 -40%
13903 -75.6%
OpenCL
84493 +108.8%
63654 +57.3%
23294 -42.4%
11854 -70.7%
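The percentages in these lists are relative differences against the Radeon 8040S scores from the Benchmarks section. The names of the compared GPUs are not present in the lists, so the sketch below only reproduces the arithmetic; `percent_delta` is an illustrative helper name, not part of any tool.

```python
def percent_delta(other_score, reference_score):
    """Relative difference of another GPU's score versus the reference, in %."""
    return (other_score - reference_score) / reference_score * 100


if __name__ == "__main__":
    # Reference scores of the Radeon 8040S from the Benchmarks section.
    reference = {"FP32": 5.734, "Blender": 376.14,
                 "Vulkan": 56877, "OpenCL": 40471}
    # First comparison entry from each list above.
    first_entries = {"FP32": 6.232, "Blender": 1466,
                     "Vulkan": 117697, "OpenCL": 84493}
    for test, score in first_entries.items():
        print(f"{test}: {percent_delta(score, reference[test]):+.1f}%")
```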