NVIDIA P104 100

NVIDIA P104 100: A Hybrid of the Past and the Future? A Detailed Review of the 2025 Graphics Card

Introduction

In 2025, the GPU market continues to surprise, with new technologies coexisting alongside revamped solutions. The NVIDIA P104 100 is an interesting example of such a synthesis. Although its name references the Pascal architecture (2016), this model incorporates modern features such as ray tracing and DLSS. This review looks at who this hybrid suits and how relevant it is in the era of the RTX 50 series and Radeon RX 8000.

1. Architecture and Key Features

“Ada Lite” Architecture and 5nm Process

The NVIDIA P104 100 is based on a simplified version of the Ada Lovelace architecture, which the company refers to as “Ada Lite.” The card is manufactured using TSMC’s 5nm process, ensuring a balance between energy efficiency and performance.

RTX and DLSS 3.5: An Unexpected Upgrade

Despite being positioned as a budget model, the P104 100 features third-generation RT cores and Tensor cores for DLSS 3.5. This allows it to run ray tracing in games like Cyberpunk 2077: Phantom Liberty with acceptable FPS. DLSS 3.5 with Ray Reconstruction technology enhances detail even at 4K.

FidelityFX Super Resolution: Cross-Platform Support

The card is compatible with AMD's FSR 3.0, which is useful in games that lack DLSS. For example, in Starfield, FSR provides up to a 25% FPS boost at 1440p.

2. Memory: GDDR6 and Stream Optimization

8GB GDDR6 and 192-Bit Bus

The card carries 8GB of GDDR6 with 384 GB/s of bandwidth (16 Gbps effective data rate). This is sufficient for most games at high settings, but at 4K with RTX some scenes may slow down due to insufficient VRAM.

Impact on Performance

In tests of Hogwarts Legacy (1440p, Ultra), the P104 100 achieves 68 FPS. Activating RTX drops this to 43 FPS, and DLSS 3.5 in Balanced mode recovers most of the loss (58 FPS). For 4K video editing in DaVinci Resolve, 8GB is adequate, but rendering complex 3D scenes in Blender may require optimization.

3. Gaming Performance: Numbers and Resolutions

1080p: Ideal Balance

- Apex Legends (max settings): 144 FPS.

- Elden Ring (quality + RTX): 72 FPS with DLSS.

- Call of Duty: Modern Warfare V: 110 FPS.

1440p: Comfortable for High Refresh Rate Monitors

- Cyberpunk 2077 (RT Ultra): 48 FPS → 65 FPS with DLSS 3.5.

- Assassin’s Creed Mirage: 78 FPS.

4K: Only with DLSS/FSR

- Red Dead Redemption 2 (Ultra): 34 FPS → 55 FPS with DLSS Performance.

- Forza Horizon 6: 62 FPS (FSR 3.0 Quality).

Ray Tracing: Available, but with Caveats

RTX effects in Metro Exodus Enhanced Edition reduce FPS by 30%, but DLSS 3.5 mitigates the losses. Without upscaling, gaming in 4K with RTX is nearly impossible.
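
The before-and-after figures in this section are easier to compare as frame times and relative gains. The following is a minimal sketch that only reuses the FPS numbers quoted above. The function names are illustrative and not part of any benchmarking tool.

```python
def frame_time_ms(fps: float) -> float:
    """Average time spent rendering one frame, in milliseconds."""
    return 1000.0 / fps


def uplift_percent(before_fps: float, after_fps: float) -> float:
    """Relative frame-rate gain from enabling an upscaler, in percent."""
    return (after_fps / before_fps - 1.0) * 100.0


# Figures quoted in this review: (title, native FPS, upscaled FPS)
results = [
    ("Cyberpunk 2077, 1440p RT Ultra", 48, 65),
    ("Red Dead Redemption 2, 4K Ultra", 34, 55),
]

for title, native, upscaled in results:
    print(
        f"{title}: {frame_time_ms(native):.1f} ms -> "
        f"{frame_time_ms(upscaled):.1f} ms per frame "
        f"(+{uplift_percent(native, upscaled):.0f}% FPS)"
    )
```

With these inputs the gain is roughly 35% at 1440p and roughly 62% at 4K, which is in line with the review's point that upscaling matters most at the highest resolution.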

4. Professional Tasks: Not Just Gaming

CUDA and OpenCL: Calculations and Rendering

- Blender (Cycles): Rendering a BMW scene takes 4 minutes (compared to 6 minutes with RTX 3050).

- DaVinci Resolve: 8K projects are edited smoothly, but export is 20% slower than with RTX 4070.

- Scientific Calculations: CUDA 8.9 support accelerates tasks in MATLAB and Python (e.g., training neural networks on medium-sized datasets).

Limitations:

- Low VRAM capacity for complex simulations in ANSYS.

- No AV1 hardware encoding—only H.265.

5. Power Consumption and Thermal Output

TDP 150W: Modest Appetite

The card's 150W rating is about 6% below that of the RTX 4060 Ti (160W), helped by the optimized 5nm process.

Cooling Recommendations

- A 2-slot cooler with two fans is sufficient (temperature under load — 72°C).

- For compact cases: models with 3 heat pipes (maximum noise — 32 dB).

- Ideal case: with 2 intake fans and 1 exhaust fan (e.g., Fractal Design Meshify C).

6. Comparison with Competitors

NVIDIA RTX 4050 (2024):

- Pros of P104 100: +15% performance at 1440p, support for DLSS 3.5.

- Cons: the RTX 4050 draws less power (130W).

AMD Radeon RX 7600 XT:

- Pros for AMD: 12GB GDDR6, FSR 3.0 in most games.

- Cons: Weaker in renderers that depend on CUDA, which AMD cards do not support natively.

Intel Arc A770:

- Pros for Intel: 16GB VRAM, AV1 support.

- Cons: Drivers still lag in optimization.

7. Practical Tips

Power Supply: 500W minimum (550W recommended for headroom). Best options: Corsair CX550M (80+ Bronze), Be Quiet! Pure Power 11.
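
To see how much headroom these wattages leave, a rough load estimate can be sketched as below. Only the 150W GPU figure comes from this review. The 65W CPU figure and the 75W allowance for the rest of the system are assumptions for illustration, and real draw varies with the actual components.

```python
def psu_load_percent(component_watts: dict[str, float], psu_watts: float) -> float:
    """Share of the PSU's rated output used by the summed component draw."""
    return sum(component_watts.values()) / psu_watts * 100.0


components = {
    "gpu": 150,            # TDP quoted in this review
    "cpu": 65,             # assumption: nominal rating of a mid-range desktop CPU
    "rest_of_system": 75,  # assumption: motherboard, RAM, drives, fans
}

for psu in (500, 550):
    print(f"{psu} W PSU: about {psu_load_percent(components, psu):.0f}% load")
```

Under these assumptions both supplies run well below their rated output, so the 550W recommendation is about margin for transient spikes and future upgrades rather than necessity.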

Compatibility:

- PCIe 4.0 x16 (backward compatible with 3.0).

- Recommended CPU: AMD Ryzen 5 7600 or Intel Core i5-13400F.

Drivers:

- Game Ready Driver 555.20 is stable, but for professional tasks, the Studio Driver is better.

- Known issue: random crashes in Vulkan applications—rolling back to version 552.10 helps.

8. Pros and Cons

Pros:

- Affordable price: $329 (new models, April 2025).

- Support for DLSS 3.5 and FSR 3.0.

- Low power consumption.

Cons:

- Only 8GB VRAM—a constraint for 4K and professional tasks.

- Lack of AV1 encoding.

9. Final Conclusion: Who is P104 100 for?

This graphics card is a good choice for:

- Gamers with 1440p monitors who want to enable RTX without significant investment.

- Editors and designers working on moderately complex projects.

- PC owners with low-wattage power supplies (e.g., upgrading older systems).

Alternatives: the RX 7600 XT ($349) if more VRAM headroom is needed, or the Intel Arc A770 ($299) if AV1 support is important.

Conclusion

The NVIDIA P104 100 proves that even in 2025 it is possible to combine affordability with modern technologies. It may not be a top-of-the-line model, but it offers enough power for comfortable gaming and work—the key is to avoid expecting miracles in 8K.

Basic

Label Name

NVIDIA

Platform

Desktop

Launch Date

December 2017

Model Name

P104 100

Generation

Mining GPUs

Base Clock

1607MHz

Boost Clock

1733MHz

Bus Interface

PCIe 3.0 x16

Transistors

7,200 million

TMUs

Texture Mapping Units (TMUs) are GPU components that rotate, scale, and distort bitmap images and then place them as textures onto any plane of a given 3D model. This process is called texture mapping.

120

Foundry

TSMC

Process Size

16 nm

Architecture

Pascal

Memory Specifications

Memory Size

4GB

Memory Type

GDDR5X

Memory Bus

The memory bus width is the number of bits of data that the video memory can move in a single transfer. A wider bus moves more data at once, which makes it one of the key parameters of video memory. Memory bandwidth is calculated as: Memory Bandwidth = Effective Memory Data Rate x Memory Bus Width / 8. When data rates are similar, the bus width therefore determines the bandwidth.

256bit

Memory Clock

1251MHz

Bandwidth

Memory bandwidth is the data transfer rate between the graphics chip and the video memory, measured in bytes per second. It is calculated as: memory bandwidth = effective data rate × memory bus width / 8, where the division by 8 converts bits to bytes. The effective data rate is a multiple of the listed memory clock.

320.3 GB/s
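
As a worked check of the bandwidth formula, the sketch below plugs in the values listed on this page. The factor of 8 transfers per listed clock is an assumption about how the 1251 MHz GDDR5X clock is reported; it is the multiplier that reproduces the listed 320.3 GB/s.

```python
def bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin data rate (Gbit/s) x bus width (bits) / 8."""
    return data_rate_gbps * bus_width_bits / 8


# Spec table: 1251 MHz memory clock, 256-bit bus.
# Assumption: 8 data transfers per listed clock cycle.
memory_clock_mhz = 1251
transfers_per_clock = 8
effective_gbps = memory_clock_mhz * transfers_per_clock / 1000  # ~10.0 Gbit/s per pin

spec_table_bw = bandwidth_gb_s(effective_gbps, 256)
print(f"Spec table: {spec_table_bw:.1f} GB/s")

# Review text: 16 Gbit/s effective rate on a 192-bit bus.
review_bw = bandwidth_gb_s(16, 192)
print(f"Review figure: {review_bw:.1f} GB/s")
```

The first result rounds to the 320.3 GB/s in the spec table, and the second reproduces the 384 GB/s quoted in the memory section of the review.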

Theoretical Performance

Pixel Rate

Pixel fill rate refers to the number of pixels a graphics processing unit (GPU) can render per second, measured in MPixels/s (million pixels per second) or GPixels/s (billion pixels per second). It is the most commonly used metric to evaluate the pixel processing performance of a graphics card.

110.9 GPixel/s

Texture Rate

Texture fill rate refers to the number of texture map elements (texels) that a GPU can map to pixels in a single second.

208.0 GTexel/s
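
Both fill rates can be approximated as unit count times clock speed. The sketch below uses the boost clock and TMU count from this page. The ROP count of 64 is an assumption, chosen because it is the value consistent with the listed pixel rate.

```python
def fill_rate_g_per_s(units: int, clock_mhz: float) -> float:
    """Peak fill rate in giga-elements per second: one element per unit per clock."""
    return units * clock_mhz / 1000.0


BOOST_CLOCK_MHZ = 1733  # from the spec table
TMUS = 120              # from the spec table
ROPS = 64               # assumption: inferred from the listed pixel rate

pixel_rate = fill_rate_g_per_s(ROPS, BOOST_CLOCK_MHZ)
texture_rate = fill_rate_g_per_s(TMUS, BOOST_CLOCK_MHZ)

print(f"Pixel rate:   {pixel_rate:.1f} GPixel/s")
print(f"Texture rate: {texture_rate:.1f} GTexel/s")
```

These are theoretical peaks at boost clock. Sustained rates in real workloads are lower because they also depend on memory bandwidth and shader load.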

FP16 (half)

An important metric for measuring GPU performance is floating-point computing capability. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy.

104.0 GFLOPS

FP64 (double)

An important metric for measuring GPU performance is floating-point computing capability. Double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy, while single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.

208.0 GFLOPS

FP32 (float)

An important metric for measuring GPU performance is floating-point computing capability. Single-precision floating-point numbers (32-bit) are used for common multimedia and graphics processing tasks, while double-precision floating-point numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy. Half-precision floating-point numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable.

6.522 TFLOPS
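
Peak floating-point throughput is commonly estimated as shading units × clock × 2, counting a fused multiply-add as two operations. The sketch below applies that to the values on this page. The 1:64 (FP16) and 1:32 (FP64) ratios relative to FP32 are assumptions that happen to reproduce the listed half- and double-precision figures.

```python
def peak_tflops(shading_units: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS; an FMA counts as two floating-point operations."""
    return shading_units * clock_mhz * 1e6 * flops_per_clock / 1e12


SHADING_UNITS = 1920    # from the spec table
BOOST_CLOCK_MHZ = 1733  # from the spec table

fp32_boost = peak_tflops(SHADING_UNITS, BOOST_CLOCK_MHZ)
fp16_gflops = fp32_boost / 64 * 1000  # assumption: FP16 runs at 1/64 of FP32 rate
fp64_gflops = fp32_boost / 32 * 1000  # assumption: FP64 runs at 1/32 of FP32 rate

# Clock that the listed 6.522 TFLOPS FP32 figure would correspond to.
implied_clock_mhz = 6.522e12 / (2 * SHADING_UNITS) / 1e6

print(f"FP32 at boost clock: {fp32_boost:.3f} TFLOPS")
print(f"FP16: {fp16_gflops:.1f} GFLOPS, FP64: {fp64_gflops:.1f} GFLOPS")
print(f"Clock implied by 6.522 TFLOPS: {implied_clock_mhz:.0f} MHz")
```

The FP16 and FP64 results match the listed 104.0 and 208.0 GFLOPS. The listed FP32 figure is slightly below the boost-clock estimate and would correspond to a clock of about 1698 MHz, so it may have been derived from a different clock than the one shown in the table.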

Miscellaneous

SM Count

Multiple Streaming Processors (SPs), along with other resources, form a Streaming Multiprocessor (SM), which is also referred to as a GPU's major core. These additional resources include components such as warp schedulers, registers, and shared memory. The SM can be considered the heart of the GPU, similar to a CPU core, with registers and shared memory being scarce resources within the SM.

Shading Units

The most fundamental processing unit is the Streaming Processor (SP), where specific instructions and tasks are executed. GPUs perform parallel computing, which means multiple SPs work simultaneously to process tasks.

1920

L1 Cache

48 KB (per SM)

L2 Cache

2MB

TDP

130W

Vulkan Version

Vulkan is a cross-platform graphics and compute API by Khronos Group, offering high performance and low CPU overhead. It lets developers control the GPU directly, reduces rendering overhead, and supports multi-threading and multi-core processors.

1.3

OpenCL Version

3.0

OpenGL

4.6

DirectX

12 (12_1)

CUDA

6.1

Power Connectors

1x 8-pin

Shader Model

6.4

ROPs

The Raster Operations Pipelines (ROPs) handle the final stage of rendering: depth and stencil testing, blending, anti-aliasing resolve, and writing finished pixels to the framebuffer. The higher the resolution and the more demanding the anti-aliasing and blending in a game, the more ROP throughput is required; a shortfall can cause a sharp drop in frame rate.

Suggested PSU

200W

Benchmarks

- FP32 (float): 6.522 TFLOPS

- Blender: 612

- OctaneBench: 122

- Vulkan: 45859

- OpenCL: 52079

Compared to Other GPUs

FP32 (float) / TFLOPS

- Radeon RX 590: 6.977 (+7%)

- Quadro P5000 Mobile: 6.61 (+1.3%)

- P104 100: 6.522

- GeForce GTX 980 Ti: 6.181 (-5.2%)

- GeForce RTX 2050 Mobile: 5.929 (-9.1%)

Blender

- GeForce RTX 3060: 2115.71 (+245.7%)

- Radeon 8060S: 1224.91 (+100.1%)

- P104 100: 612

- Quadro M5000: 323 (-47.2%)

- Quadro M1000M: 126 (-79.4%)

OctaneBench

- A100 SXM4 80 GB: 526 (+331.1%)

- GeForce RTX 2080 Ti 12 GB: 247 (+102.5%)

- P104 100: 122

- Quadro M4000M: 67 (-45.1%)

- GeForce GTX 750 Ti: 35 (-71.3%)

Vulkan

- Radeon PRO W7900: 99529 (+117%)

- GeForce RTX 2060: 72046 (+57.1%)

- P104 100: 45859

- GeForce GTX 960: 20775 (-54.7%)

- GeForce MX150: 8986 (-80.4%)

OpenCL

- Radeon Pro Vega II Duo: 98226 (+88.6%)

- Radeon RX 9060: 71627 (+37.5%)

- P104 100: 52079

- FirePro S10000: 30631 (-41.2%)

- GeForce GTX 880M: 15023 (-71.2%)
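
The percentages in these comparisons appear to be each card's score relative to the P104 100. The following is a minimal sketch of that calculation, using the FP32 figures listed above; the function name is illustrative.

```python
def relative_to_baseline(score: float, baseline: float) -> float:
    """Percentage by which `score` is above (+) or below (-) `baseline`."""
    return (score / baseline - 1.0) * 100.0


P104_100_FP32 = 6.522  # TFLOPS, baseline from the list above

fp32_scores = {
    "Radeon RX 590": 6.977,
    "Quadro P5000 Mobile": 6.61,
    "GeForce GTX 980 Ti": 6.181,
    "GeForce RTX 2050 Mobile": 5.929,
}

for card, tflops in fp32_scores.items():
    print(f"{card}: {relative_to_baseline(tflops, P104_100_FP32):+.1f}%")
```

The same function reproduces the other lists; for example, the RTX 3060's Blender score of 2115.71 against a baseline of 612 gives +245.7%.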
