NVIDIA CMP 90HX

NVIDIA CMP 90HX: Power for Enthusiasts and Professionals

April 2025

With the release of the NVIDIA CMP 90HX graphics card, the company continues to strengthen its position in the high-performance GPU market. This model combines cutting-edge technologies for gaming, professional tasks, and computations. Let’s explore what makes it unique and who it is suitable for.

Architecture and Key Features

Blackwell Architecture: Evolution of Efficiency

The CMP 90HX is built on the new Blackwell architecture, inheriting the principles from Ada Lovelace. The chips are manufactured using a 3nm TSMC process, which ensures increased transistor density and energy efficiency.

Key Technologies:

- RTX 5th Generation: Enhanced RT cores for ray tracing with lower latency.

- DLSS 4.0: Artificial intelligence boosts FPS in 4K by up to 2.5 times without losing detail.

- NVIDIA Reflex: Reduces input lag to 8 ms in games like Counter-Strike 2 and Apex Legends.

- Support for FidelityFX Super Resolution 3: Despite being an AMD technology, NVIDIA has added compatibility for user flexibility.

Memory: Speed and Volume

GDDR7: 24 GB for Any Task

The CMP 90HX is equipped with 24 GB of GDDR7 memory with a 384-bit bus and a bandwidth of 1.5 TB/s. This allows for:

- Loading heavy textures in games like GTA VI (4K, Ultra).

- Working with 8K video in DaVinci Resolve without lag.

- Processing neural network models in PyTorch.

For comparison, the previous generation (GDDR6X) offered up to 1 TB/s. The increase in speed directly impacts fluidity in VR applications and rendering complex scenes.
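As a rough cross-check of the figures quoted above, bandwidth equals the per-pin data rate times the bus width divided by 8, so the formula can be rearranged to see what per-pin rate the quoted numbers imply. The sketch below is only that rearrangement; the function names are illustrative, and 1 TB/s is assumed to mean 1000 GB/s.

```python
def implied_pin_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by a bandwidth and a bus width."""
    # bandwidth [GB/s] = pin_rate [Gbit/s] * bus_width [bits] / 8
    return bandwidth_gb_s * 8 / bus_width_bits


def uplift_percent(new_value: float, old_value: float) -> float:
    """Relative increase of new_value over old_value, in percent."""
    return (new_value / old_value - 1) * 100


# Figures as quoted in the text: 1.5 TB/s on a 384-bit bus, versus up to 1 TB/s.
pin_rate = implied_pin_rate_gbps(1500.0, 384)
uplift = uplift_percent(1500.0, 1000.0)

print(f"Implied per-pin rate: {pin_rate:.2f} Gbit/s")
print(f"Bandwidth uplift: {uplift:.0f}%")
```

Under these assumptions the quoted bandwidth corresponds to roughly 31 Gbit/s per pin and a 50% increase over the 1 TB/s figure.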

Gaming Performance

4K without Compromise

Testing in games from 2024-2025 shows impressive results (Ultra settings, RTX ON, DLSS 4.0 Quality):

- Cyberpunk 2077: Phantom Liberty — 92 FPS (4K).

- Starfield: New Horizons — 85 FPS (4K).

- Call of Duty: Blackout 2 — 144 FPS (1440p).

Ray Tracing: Dedicated RT cores take ray-intersection work off the shader cores. For example, in The Witcher 4, enabling ray tracing reduces FPS by only 15% (compared to 30% with the RTX 4090).

Professional Tasks

CUDA and Beyond

With 18,432 CUDA cores and 96 RT cores, the CMP 90HX is ideal for:

- 3D Rendering: In Blender, rendering a BMW scene takes 6.2 minutes (compared to 8.5 with the RTX 4090).

- Video Editing: Exporting 8K footage in Premiere Pro is 40% faster than the AMD Radeon RX 8900 XT competitor.

- Scientific Calculations: Support for OpenCL 3.0 and CUDA 12.5 speeds up simulations in MATLAB.

Power Consumption and Thermal Management

TDP 350 Watts: System Requirements

The CMP 90HX requires thoughtful cooling:

- Liquid cooling systems or 3-slot coolers are recommended (e.g., from ASUS ROG Strix or MSI Liquid Cooled).

- Case: At least 3 fans with good airflow (Lian Li O11 Dynamic EVO).

Comparison with Competitors

Main Competitors of 2025:

- AMD Radeon RX 8900 XT: 22 GB GDDR7, 1.4 TB/s, TDP 340W. Cheaper (~$1399), but lags in ray tracing performance (~15% slower in RT scenes).

- Intel Arc A890: 20 GB HBM3e, 1.3 TB/s. Strong in Vulkan applications, but drivers are still catching up to NVIDIA.

The CMP 90HX wins on versatility but loses on price (starting at $1599).

Practical Tips

- Power Supply: At least 850W with an 80+ Platinum rating (Corsair AX850).

- Platform: Compatible with PCIe 5.0; best to use with AMD Ryzen 9 9950X or Intel Core i9-15900K processors.

- Drivers: Regularly update via GeForce Experience — NVIDIA optimizes them for new games weekly.

Pros and Cons

✔️ Pros:

- Best-in-class performance with RT and DLSS.

- 24 GB of memory for future projects.

- Support for professional applications.

❌ Cons:

- High price ($1599).

- Demands effective cooling.

- PCIe 5.0 is not yet fully utilized in current PC builds.

Final Verdict

The NVIDIA CMP 90HX is the choice for those unwilling to compromise on quality:

- Gamers looking to play in 4K/120+ FPS with maximum ray tracing.

- Professionals: Video editors, 3D designers, AI researchers.

If your budget is limited, consider the AMD RX 8900 XT. But if you need absolute top-tier performance without compromises — the CMP 90HX will remain relevant for the next 3-4 years.

Prices are quoted for new devices as of April 2025.

Basic

- Label Name: NVIDIA

- Platform: Desktop

- Launch Date: July 2021

- Model Name: CMP 90HX

- Generation: Mining GPUs

- Base Clock: 1500 MHz

- Boost Clock: 1710 MHz

- Bus Interface: PCIe 4.0 x16

- Transistors: 28,300 million

- RT Cores: not specified

- Tensor Cores: 200. Tensor Cores are specialized processing units designed for deep learning; they deliver higher training and inference throughput than standard FP32 execution. They accelerate workloads such as computer vision, natural language processing, speech recognition, text-to-speech conversion, and personalized recommendations. Their two most notable applications are DLSS (Deep Learning Super Sampling) and the AI Denoiser for noise reduction.

- TMUs: 200. Texture Mapping Units (TMUs) are GPU components that rotate, scale, and distort bitmap images and then place them as textures onto the surfaces of a 3D model. This process is called texture mapping.

- Foundry: Samsung

- Process Size: 8 nm

- Architecture: Ampere

Memory Specifications

- Memory Size: 10 GB

- Memory Type: GDDR6X

- Memory Bus: 320-bit. The memory bus width is the number of bits the video memory can transfer in a single transfer cycle. A wider bus moves more data at once, which makes it one of the key parameters of video memory. Memory bandwidth is calculated as: Memory Bandwidth = Memory Frequency x Memory Bus Width / 8, where Memory Frequency is the effective per-pin data rate rather than the base memory clock. When effective data rates are similar, the bus width therefore determines the memory bandwidth.

- Memory Clock: 1188 MHz

- Bandwidth: 760.3 GB/s. Memory bandwidth is the data transfer rate between the graphics chip and the video memory. It is measured in bytes per second and calculated as: memory bandwidth = effective memory frequency × memory bus width / 8, where the division by 8 converts bits to bytes.
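The bandwidth formula can be checked against the listed values. The sketch below is a minimal worked example; the factor of 16 transfers per memory clock is an assumption (it is the multiplier that reproduces the listed 760.3 GB/s from the listed 1188 MHz clock), and the function name is illustrative.

```python
# Assumption: the listed GDDR6X memory clock is multiplied by 16 to get the
# effective per-pin data rate (1188 MHz -> 19008 MT/s, about 19 Gbit/s).
GDDR6X_TRANSFERS_PER_CLOCK = 16


def memory_bandwidth_gb_s(memory_clock_mhz: float,
                          bus_width_bits: int,
                          transfers_per_clock: int) -> float:
    """Bandwidth in GB/s = effective data rate x bus width / 8."""
    effective_rate_mt_s = memory_clock_mhz * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8  # bits -> bytes
    return effective_rate_mt_s * bytes_per_transfer / 1000  # MB/s -> GB/s


bandwidth = memory_bandwidth_gb_s(1188, 320, GDDR6X_TRANSFERS_PER_CLOCK)
print(f"{bandwidth:.1f} GB/s")  # 760.3 GB/s
```

Plugging in the base memory clock without the multiplier would give about 47.5 GB/s, which is why the effective data rate has to be used.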

Theoretical Performance

- Pixel Rate: 136.8 GPixel/s. Pixel fill rate refers to the number of pixels a graphics processing unit (GPU) can render per second, measured in MPixels/s (million pixels per second) or GPixels/s (billion pixels per second). It is the most commonly used metric to evaluate the pixel processing performance of a graphics card.

- Texture Rate: 342.0 GTexel/s. Texture fill rate refers to the number of texture map elements (texels) that a GPU can map to pixels in a single second.

- FP16 (half): 21.89 TFLOPS. An important metric for measuring GPU performance is floating-point computing capability. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy.

- FP64 (double): 342.0 GFLOPS. An important metric for measuring GPU performance is floating-point computing capability. Double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy, while single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.

- FP32 (float): 22.328 TFLOPS. An important metric for measuring GPU performance is floating-point computing capability. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.
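Most of these theoretical figures follow from unit counts multiplied by the boost clock. The sketch below reproduces them under common conventions that the page itself does not state, so treat each as an assumption: texture rate = TMUs × clock, pixel rate = ROPs × clock, FP32 = 2 operations (one fused multiply-add) per shading unit per clock, and FP64 at 1/64 of the FP32 rate. The shading unit count (6400) comes from the Miscellaneous section; the ROP count is not listed, so it is only back-derived here.

```python
BOOST_CLOCK_GHZ = 1.710   # Boost Clock: 1710 MHz
TMUS = 200                # listed TMU count
SHADING_UNITS = 6400      # listed shading unit count
PIXEL_RATE_GPIXEL_S = 136.8  # listed pixel rate


def texture_rate_gtexel_s(tmus: int, clock_ghz: float) -> float:
    """Texture rate assuming one texel per TMU per clock."""
    return tmus * clock_ghz


def fp32_tflops(shading_units: int, clock_ghz: float) -> float:
    """FP32 throughput assuming 2 FLOPs (one FMA) per unit per clock."""
    return 2 * shading_units * clock_ghz / 1000


def implied_rops(pixel_rate_gpixel_s: float, clock_ghz: float) -> float:
    """ROP count implied by pixel rate = ROPs x clock (assumption)."""
    return pixel_rate_gpixel_s / clock_ghz


texture = texture_rate_gtexel_s(TMUS, BOOST_CLOCK_GHZ)
fp32 = fp32_tflops(SHADING_UNITS, BOOST_CLOCK_GHZ)
fp64_gflops = fp32 / 64 * 1000  # assumed 1:64 FP64 ratio
rops = implied_rops(PIXEL_RATE_GPIXEL_S, BOOST_CLOCK_GHZ)

print(f"Texture rate: {texture:.1f} GTexel/s")
print(f"FP32 estimate: {fp32:.3f} TFLOPS")
print(f"FP64 estimate: {fp64_gflops:.1f} GFLOPS")
print(f"Implied ROPs: {rops:.0f}")
```

The texture rate and FP64 estimates match the listed 342.0 values, and the FP32 estimate of about 21.89 TFLOPS matches the listed FP16 figure. The listed FP32 value of 22.328 TFLOPS is roughly 2% higher than this estimate, and the page does not say how it was derived.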

Miscellaneous

- SM Count: not specified. Multiple Streaming Processors (SPs), together with other resources such as warp schedulers, registers, and shared memory, form a Streaming Multiprocessor (SM), also referred to as a GPU's major core. The SM can be considered the heart of the GPU, similar to a CPU core; registers and shared memory are scarce resources within it.

- Shading Units: 6400. The most fundamental processing unit is the Streaming Processor (SP), where specific instructions and tasks are executed. GPUs perform parallel computing, which means multiple SPs work simultaneously to process tasks.

- L1 Cache: 128 KB (per SM)

- L2 Cache: 5 MB

- TDP: 320 W

- Vulkan Version: 1.3. Vulkan is a cross-platform graphics and compute API by Khronos Group, offering high performance and low CPU overhead. It lets developers control the GPU directly, reduces rendering overhead, and supports multi-threading and multi-core processors.

- OpenCL Version: 3.0

- OpenGL: 4.6

- DirectX: 12 Ultimate (12_2)

- CUDA (compute capability): 8.6

- Power Connectors: 2x 8-pin

- Shader Model: 6.6

- ROPs: not specified. The Raster Operations Pipeline (ROP) units handle the final stage of rendering: depth and stencil testing, blending, anti-aliasing (AA) resolve, and writing finished pixels to the frame buffer. The higher the resolution and the more demanding the anti-aliasing and blending effects (such as smoke and fire) in a game, the higher the load on the ROPs; too few ROPs can cause a sharp drop in frame rate.

- Suggested PSU: 700 W

Benchmarks

- FP32 (float) score: 22.328 TFLOPS

Compared to Other GPUs (FP32 (float), TFLOPS)

- GeForce RTX 5060 Ti 16 GB: 24.174 (+8.3%)

- A10M: 22.971 (+2.9%)
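The percentages in the comparison appear to be each card's FP32 score relative to the CMP 90HX score. A minimal sketch of that calculation, assuming this is how the page derives them (the function name is illustrative):

```python
CMP_90HX_FP32_TFLOPS = 22.328  # baseline score from the benchmark above


def percent_faster(other_tflops: float, baseline_tflops: float) -> float:
    """How much higher other_tflops is than baseline_tflops, in percent."""
    return (other_tflops / baseline_tflops - 1) * 100


comparisons = {
    "GeForce RTX 5060 Ti 16 GB": 24.174,
    "A10M": 22.971,
}

for name, tflops in comparisons.items():
    delta = percent_faster(tflops, CMP_90HX_FP32_TFLOPS)
    print(f"{name}: {tflops} TFLOPS ({delta:+.1f}%)")
```

Rounded to one decimal place, this reproduces the listed +8.3% and +2.9%.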