
AMD Radeon 8040S Graphics

Name: AMD Radeon 8040S Graphics
Brand: AMD

AMD Radeon 8040S Graphics review

Radeon 8040S Graphics: Ryzen AI Max without Max Graphics

AMD Radeon 8040S Graphics is the strangest graphics configuration in the Ryzen AI Max family. On paper, it falls under the same platform as the higher-tier Radeon 8050S, 8060S, and 8065S, but in essence, it's a completely different product.

The higher Ryzen AI Max models are interesting because they attempt to replace an entry-level discrete graphics card with powerful integrated graphics. They feature many compute units (CUs), a wide memory bus, and a shared pool of RAM that can be used for graphics, work tasks, and local AI scenarios. The Radeon 8040S breaks with this idea: it has only 16 RDNA 3.5 compute units and 128-bit LPDDR5X memory.

Therefore, it's difficult to consider the Radeon 8040S separately from the processor. It is not a standalone graphics card, nor is it a mass-market integrated GPU (iGPU) of the kind AMD ships across many Ryzen processors. Currently, it is effectively tied to the Ryzen AI Max PRO 380. The standard Ryzen AI Max 300 series does not include it, and in the newer Ryzen AI Max PRO 400 series, even the lower-end models feature the Radeon 8050S.

If the Radeon 8040S were part of a hypothetical Ryzen 7, it would seem quite appropriate: modern RDNA 3.5 graphics, 16 CUs, and clock speeds of up to 2800 MHz are a good level for an integrated solution. Within the Ryzen AI Max family, however, it looks questionable, since the Max series promises more in terms of graphics.

What is Officially Known

AMD specifies that the Radeon 8040S Graphics has 16 compute units and a clock speed of up to 2800 MHz. It is used in the Ryzen AI Max PRO 380, the entry-level chip in the Ryzen AI Max PRO 300 series. This processor has 6 CPU cores and 12 threads, uses 128-bit LPDDR5X memory, and supports a maximum of 64 GB of RAM.

This is more important than it seems. The Radeon 8040S is limited not only in graphics but also in the overall level of the platform. The higher Ryzen AI Max models have more compute units and a wider memory bus, so they better fulfill the idea of a “compact workstation without a discrete GPU.”

The Radeon 8040S represents a different scenario. Here, the main argument is no longer the graphics but the CPU, NPU, PRO capabilities, compactness, and good-enough integrated graphics. In essence, it is a work platform that does without a discrete graphics card rather than an attempt to replace one with an integrated GPU.

Where Radeon 8040S Stands in the Ryzen AI Max Graphics Hierarchy

AMD does not present the 8040S, 8050S, 8060S, and 8065S as a separate family of graphics cards, but it is convenient to compare them within the Ryzen AI Max series.

Graphics	CUs	Clock Speed	Position
Radeon 8040S	16	up to 2800 MHz	entry-level configuration
Radeon 8050S	32	up to 2800 MHz	mid-range option
Radeon 8060S	40	up to 2900 MHz	top graphics option in Ryzen AI Max 300
Radeon 8065S	40	up to 3000 MHz	top graphics option in Ryzen AI Max PRO 400

The difference with the Radeon 8050S is very straightforward: 16 CUs versus 32 CUs. This means that the Radeon 8040S has half the number of graphics blocks. The gap widens even more with the Radeon 8060S and 8065S: 16 CUs versus 40 CUs.
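The scale difference can be sketched numerically. The snippet below is a rough illustration, assuming that peak shader throughput scales linearly with CU count times maximum clock; it ignores memory bandwidth, power limits, and drivers, so it says nothing definite about real frame rates. The `relative_throughput` helper is a hypothetical name used only for this example.

```python
# CU counts and maximum clocks (MHz) taken from the table above.
GPUS = {
    "Radeon 8040S": (16, 2800),
    "Radeon 8050S": (32, 2800),
    "Radeon 8060S": (40, 2900),
    "Radeon 8065S": (40, 3000),
}


def relative_throughput(name: str, baseline: str = "Radeon 8040S") -> float:
    """Theoretical shader throughput of `name` relative to `baseline`.

    Assumption: throughput is proportional to CUs x clock.
    """
    cus, mhz = GPUS[name]
    base_cus, base_mhz = GPUS[baseline]
    return (cus * mhz) / (base_cus * base_mhz)


for gpu in GPUS:
    print(f"{gpu}: {relative_throughput(gpu):.2f}x")
```

Under this simplified model, the 8050S lands at exactly twice the 8040S, and the 40 CU parts at roughly 2.6 to 2.7 times.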

This is why the 8040S cannot be perceived as “almost Ryzen AI Max, just cheaper.” The architecture is the same, but the graphics scale is different. It is not a cut-down version of a powerful graphics solution, but rather a strong standard integrated graphics option inside a more expensive professional platform.

Where Radeon 8040S Makes Sense

The rationale behind the Radeon 8040S becomes clear only when viewed as part of a CPU-first system. Such a chip can make sense in a compact laptop or mini-PC where a discrete graphics card is not desired, but a modern AMD processor with NPU, PRO features, and decent integrated graphics is needed.

In this scenario, the Radeon 8040S does not need to be a cut-down 8050S. Its task is simpler: to provide sufficient performance for everyday graphics, photo editing, light video editing, 3D viewing, CAD viewport usage, and light gaming. For a work machine without a discrete GPU, this can be adequate.

However, the competitiveness of such a configuration depends heavily on the price. If a device with the Radeon 8040S is significantly cheaper than models with the 8050S, there is logic in that: the user is choosing the Ryzen AI Max platform for its CPU, NPU, compactness, and corporate features. If the price is close to versions with the Radeon 8050S, the logic quickly falls apart. In that case, the buyer pays for Ryzen AI Max but receives graphics that do not deliver on the main idea of Max.

How to Compare Radeon 8040S

The Radeon 8040S is closer to strong standard iGPUs than to the higher-end graphics blocks of Ryzen AI Max. In terms of CUs, it resembles the Radeon 880M/890M: it is the same order of graphics scale, rather than the level of the 8050S or 8060S.

GPU	Class	How to Read It
Radeon 840M	entry-level standard iGPU	significantly lower than Radeon 8040S
Radeon 880M / 890M	strong standard iGPU	a close reference point for graphics class
Radeon 8040S	16 CU RDNA 3.5	entry-level graphics of Ryzen AI Max
Radeon 8050S	32 CU RDNA 3.5	significantly faster than the 8040S
GeForce RTX 3050 Laptop	entry-level discrete GPU	more convenient for gaming, CUDA, and NVIDIA software

The main takeaway: the Radeon 8040S cannot be assessed based on the reputation of the higher Ryzen AI Max models. While the 8050S and 8060S can be discussed as alternatives to entry-level discrete graphics, the 8040S is more like good integrated graphics for a work platform where the GPU is not the main star.

Estimated Benchmarks of Radeon 8040S

There are fewer Radeon 8040S results than for the Radeon 8050S, so the numbers should be seen as a guideline rather than a guaranteed performance level for any system.

Test	Radeon 8040S
Geekbench 6 OpenCL	around 36-40k
PassMark G3D	around 10400
Blender GPU	around 376

These results do not showcase exceptional power but rather illustrate the real standing of the 8040S: above basic integrated GPUs but noticeably below the Radeon 8050S. In compute and graphics tests, it sits at the level of a good modern iGPU rather than the high-end graphics of the Ryzen AI Max.

The practical conclusion is straightforward: the Radeon 8040S is suitable for photo processing, light video editing, 3D viewing, basic graphics tasks, and light gaming. If the task genuinely hinges on the GPU, the higher-end Radeon 8050S / 8060S will be much more interesting.

Gaming: Possible to Play, Not Worth Buying for Games

In gaming, the Radeon 8040S should be seen as strong integrated graphics rather than a replacement for a discrete graphics card. Esports titles and lighter games at 1080p are within its capabilities, but in modern AAA games, settings will often need to be dialed down.

The primary scenario is 1080p at low or medium settings, depending on the game. In well-optimized games, frame rates can be comfortable, but heavy titles with high-resolution textures, ray tracing, and significant VRAM consumption will quickly reveal the limitations of 16 CUs and shared system memory.

Compared to the Radeon 8050S, the gap will be noticeable. The 8050S has twice as many CUs and typically a wider memory bus. Therefore, if gaming performance is critical, the Radeon 8040S is not the best option within the Ryzen AI Max family.

Work Tasks and AI

The Radeon 8040S is suitable for photo processing, simple video editing, and viewing 3D scenes in Blender or CAD applications. However, for heavy rendering, large 3D scenes, and prolonged GPU loads, 16 CUs are not enough.

In AI, it may be considered for local inference and experimentation: PyTorch via ROCm where specific builds and drivers support it, ONNX Runtime with DirectML on Windows, or llama.cpp via Vulkan/HIP. A realistic scenario would involve smaller or quantized models, rather than training or rapid image generation.
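To see why smaller or quantized models are the realistic target, a back-of-the-envelope estimate helps. The sketch below assumes that token generation is limited by memory bandwidth, so that each generated token requires reading all model weights once; it uses the 128 GB/s figure listed in the specifications. The function names and the 7B, 4-bit example model are hypothetical and chosen only for illustration. The estimate ignores the KV cache, activations, and runtime overhead, so real throughput will be lower.

```python
def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * bits_per_weight / 8


def decode_tokens_per_s_upper_bound(model_gb: float,
                                    bandwidth_gb_s: float = 128.0) -> float:
    """Optimistic ceiling on tokens/s for bandwidth-bound decoding.

    Assumption: every token reads all weights from memory exactly once.
    """
    return bandwidth_gb_s / model_gb


# Hypothetical example: a 7B-parameter model quantized to 4 bits per weight.
size = model_size_gb(7, 4)
print(f"weights: {size:.1f} GB")
print(f"ceiling: {decode_tokens_per_s_upper_bound(size):.1f} tokens/s")

# The same model at 16 bits per weight is four times larger,
# so its bandwidth-bound ceiling is four times lower.
size_fp16 = model_size_gb(7, 16)
print(f"fp16 ceiling: {decode_tokens_per_s_upper_bound(size_fp16):.1f} tokens/s")
```

The point of the estimate is the trend rather than the exact numbers: on a 128-bit memory bus, model size directly caps generation speed, which is why quantization matters more here than on the 256-bit configurations.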

If work tasks genuinely depend on GPU power, it's better to look at least at Radeon 8050S / 8060S or a discrete NVIDIA solution.

What Matters Before Purchase

Choosing the Radeon 8040S should not be based only on its association with Ryzen AI Max. The specific configuration is crucial: this is the entry-level Ryzen AI Max PRO 380, not one of the top Max models with powerful graphics.

The main limitations include:

only 16 CUs versus 32 CUs of the Radeon 8050S;
128-bit memory of the platform instead of the 256-bit of the higher configurations;
a maximum of 64 GB of memory in the Ryzen AI Max PRO 380;
lack of dedicated video memory;
no native CUDA support, so CUDA-dependent software is less predictable than on NVIDIA hardware;
strong dependence on the price of the finished device.

The Radeon 8040S makes sense when the device is purchased as a compact work system without a discrete graphics card, where CPU, NPU, PRO features, and energy efficiency are prioritized over graphics power. However, if the price approaches that of models with the Radeon 8050S, it’s better to choose the 8050S right away: it aligns much better with the core idea of Ryzen AI Max.

Conclusion

Radeon 8040S Graphics represents Ryzen AI Max without the full Max graphics capabilities. In a standard Ryzen setup, such integrated graphics would look good: 16 RDNA 3.5 CUs, a modern architecture, high clock speeds, and adequate performance for everyday graphics. However, within the Ryzen AI Max family, expectations are higher.

This configuration is not useless, but its purpose is narrow: compact work systems without discrete graphics cards, where CPU, NPU, PRO features, quiet operation, and energy efficiency are more important than graphics power.

If a strong GPU is expected from the Ryzen AI Max, the Radeon 8040S is too limited. For gaming, heavy 3D tasks, GPU rendering, and serious local AI tasks, it’s best to look at least at the Radeon 8050S. Otherwise, one ends up with an expensive platform that lacks the very element that gives Ryzen AI Max its true “Max” identity.

Basic

Label Name

AMD

Platform

Integrated

Launch Date

January 2025

Model Name

AMD Radeon 8040S Graphics

Generation

Radeon 8000S

Boost Clock

2800 MHz

Bus Interface

Integrated

RT Cores

Compute Units

16

Tensor Cores

Tensor Cores are specialized processing units in NVIDIA GPUs designed for deep learning. They accelerate the matrix operations used in training and inference, delivering higher throughput than standard FP32 shader math. They enable rapid computations in areas such as computer vision, natural language processing, speech recognition, text-to-speech conversion, and personalized recommendations. The two most notable applications of Tensor Cores are DLSS (Deep Learning Super Sampling) and the AI Denoiser for noise reduction.

TMUs

Texture Mapping Units (TMUs) are GPU components that can rotate, scale, and distort bitmap images and then place them as textures onto any plane of a given 3D model. This process is called texture mapping.

Foundry

TSMC

Process Size

4 nm

Architecture

RDNA 3.5

Memory Specifications

Memory Size

System Shared

Memory Type

System Shared LPDDR5x

Memory Bus

The memory bus width is the number of bits of data that the video memory can transfer in a single clock cycle. The wider the bus, the more data can be transferred per cycle, making it one of the crucial parameters of video memory. Memory bandwidth is calculated as: Memory Bandwidth = Memory Frequency (effective data rate) x Memory Bus Width / 8. Therefore, when memory frequencies are similar, the bus width determines the memory bandwidth.

128-bit

Memory Clock

LPDDR5x-8000

Bandwidth

Memory bandwidth is the data transfer rate between the graphics chip and the video memory. It is measured in bytes per second and calculated as: memory bandwidth = effective memory frequency × memory bus width / 8, where the division by 8 converts bits to bytes.

128 GB/s
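The 128 GB/s figure follows directly from the formula above. A minimal sketch, assuming that "LPDDR5x-8000" denotes an effective rate of 8000 MT/s and that 1 GB is counted as 1000 MB; the function name is hypothetical. The 256-bit line assumes the same transfer rate on the wider bus of the higher configurations, which the listing itself does not state.

```python
def memory_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in GB/s."""
    bytes_per_transfer = bus_width_bits / 8  # convert bits to bytes
    return transfer_rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


print(memory_bandwidth_gb_s(8000, 128))  # 128.0, matches the listed value
print(memory_bandwidth_gb_s(8000, 256))  # 256.0, assumed same rate on a 256-bit bus
```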

Theoretical Performance

Pixel Rate

Pixel fill rate refers to the number of pixels a graphics processing unit (GPU) can render per second, measured in MPixels/s (million pixels per second) or GPixels/s (billion pixels per second). It is the most commonly used metric to evaluate the pixel processing performance of a graphics card.

90 GPixel/s

Texture Rate

Texture fill rate refers to the number of texture map elements (texels) that a GPU can map to pixels in a single second.

179 GTexel/s

FP16 (half)

An important metric for measuring GPU performance is floating-point computing capability. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy.

11.47 TFLOPS

FP64 (double)

An important metric for measuring GPU performance is floating-point computing capability. Double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy, while single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.

179.2 GFLOPS

FP32 (float)

An important metric for measuring GPU performance is floating-point computing capability. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.

5.734 TFLOPS
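The three floating-point figures are consistent with a simple peak-throughput calculation. The sketch below assumes 2 floating-point operations per shading unit per clock (one fused multiply-add), FP16 at twice the FP32 rate, and FP64 at 1/32 of the FP32 rate. These ratios are inferred from the listed numbers rather than taken from AMD documentation, and the function name is hypothetical. The shading unit count is the 1024 listed under Miscellaneous.

```python
SHADING_UNITS = 1024      # from the Miscellaneous section
BOOST_CLOCK_GHZ = 2.8     # 2800 MHz boost clock


def peak_fp32_tflops(units: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Peak FP32 throughput; assumes one FMA (2 FLOPs) per unit per clock."""
    return units * flops_per_clock * clock_ghz / 1000


fp32 = peak_fp32_tflops(SHADING_UNITS, BOOST_CLOCK_GHZ)
fp16 = fp32 * 2                  # assumed 2:1 FP16 rate
fp64_gflops = fp32 * 1000 / 32   # assumed 1:32 FP64 rate

print(f"FP32: {fp32:.3f} TFLOPS")
print(f"FP16: {fp16:.2f} TFLOPS")
print(f"FP64: {fp64_gflops:.1f} GFLOPS")
```

These are theoretical peaks at the boost clock; sustained performance depends on power limits and memory bandwidth.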

Miscellaneous

Shading Units

The most fundamental processing unit is the Streaming Processor (SP), where specific instructions and tasks are executed. GPUs perform parallel computing, which means multiple SPs work simultaneously to process tasks.

1024

OpenCL Version

2.1

OpenGL

4.6

CUDA

DirectX

Power Connectors

None

ROPs

Raster Operations Pipelines (ROPs) handle the final stage of rendering: they perform depth and stencil tests, blend pixel colors, resolve anti-aliasing (AA), and write the finished pixels to the framebuffer. The higher the resolution and the more demanding the anti-aliasing and blending-heavy effects such as smoke and fire, the greater the load on the ROPs; if they become a bottleneck, the frame rate can drop sharply.

Shader Model

6.8

Benchmarks

FP32 (float)

Score

5.734 TFLOPS

Blender

Score

376.14

Vulkan

Score

56877

OpenCL

Score

40471

Compared to Other GPUs

FP32 (float) / TFLOPS

GeForce GTX 980 Ti

6.181 +7.8%

Radeon RX 580 OEM

5.951 +3.8%

Radeon 8040S Graphics

5.734

Tesla P4

5.59 -2.5%

Arc Pro A60M

5.432 -5.3%

Blender

Quadro RTX 5000

1408.56 +274.5%

Tesla P40

802 +113.2%

P106 100

391 +4%

Radeon 8040S Graphics

376.14

GeForce GT 1030

45.58 -87.9%

Vulkan

Radeon RX 6750 XT

113016 +98.7%

RTX 2000 Ada Generation

84494 +48.6%

Radeon 8040S Graphics

56877

34563 -39.2%

GeForce GTX 1050

17379 -69.4%

OpenCL

Radeon RX 6600 XT

80858 +99.8%

TITAN X Pascal

62379 +54.1%

Radeon 8040S Graphics

40471

FirePro W7100

25000 -38.2%

GeForce MX350

12811 -68.3%
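The percentages in the comparison lists appear to be relative differences against the Radeon 8040S score. A minimal sketch that reproduces the FP32 column under that assumption; `percent_delta` is a hypothetical helper name.

```python
def percent_delta(other: float, reference: float) -> float:
    """Relative difference of `other` versus `reference`, in percent."""
    return (other - reference) / reference * 100


REFERENCE_TFLOPS = 5.734  # Radeon 8040S Graphics, FP32

fp32_scores = [
    ("GeForce GTX 980 Ti", 6.181),
    ("Radeon RX 580 OEM", 5.951),
    ("Tesla P4", 5.59),
    ("Arc Pro A60M", 5.432),
]

for name, tflops in fp32_scores:
    print(f"{name}: {percent_delta(tflops, REFERENCE_TFLOPS):+.1f}%")
```

The same calculation applies to the Blender, Vulkan, and OpenCL lists with their respective 8040S scores as the reference.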
