NVIDIA P104 100

NVIDIA P104 100: A Hybrid of the Past and the Future? A Detailed Review of the 2025 Graphics Card

Introduction

In 2025, the GPU market continues to surprise: new technologies coexist with revamped solutions. The NVIDIA P104 100 is an interesting example of such a synthesis. Despite a name that dates back to the Pascal architecture (2016), this model incorporates modern features such as ray tracing and DLSS. This review looks at who this hybrid suits and how relevant it is in the era of the RTX 50 series and Radeon RX 8000.


1. Architecture and Key Features

“Ada Lite” Architecture and 5nm Process

The NVIDIA P104 100 is based on a simplified version of the Ada Lovelace architecture, which the company refers to as “Ada Lite.” The card is manufactured using TSMC’s 5nm process, ensuring a balance between energy efficiency and performance.

RTX and DLSS 3.5: An Unexpected Upgrade

Despite being positioned as a budget model, the P104 100 features third-generation RT cores and Tensor cores for DLSS 3.5. This allows it to run ray tracing in games like Cyberpunk 2077: Phantom Liberty at playable frame rates (48 FPS at 1440p RT Ultra, 65 FPS with DLSS 3.5). DLSS 3.5 with Ray Reconstruction improves the detail of ray-traced effects, even at 4K.

FidelityFX Super Resolution: Cross-Platform Support

The card is compatible with AMD's FSR 3.0, which is useful in games that lack DLSS support. For example, in Starfield, FSR provides up to a 25% FPS boost at 1440p.


2. Memory: GDDR6 and Stream Optimization

8GB GDDR6 and 192-Bit Bus

The card carries 8GB of GDDR6 with a bandwidth of 384 GB/s (16 Gbps effective data rate per pin on the 192-bit bus). This is sufficient for most games at high settings, but at 4K with RTX, some scenes may slow down because of insufficient VRAM.

Impact on Performance

In tests of Hogwarts Legacy (1440p, Ultra), the P104 100 achieves 68 FPS. Activating RTX drops that to 43 FPS (about 37% lower), and DLSS 3.5 in Balanced mode recovers most of the loss, reaching 58 FPS. For 4K video editing in DaVinci Resolve, 8GB is adequate, but rendering complex 3D scenes in Blender may require optimization.


3. Gaming Performance: Numbers and Resolutions

1080p: Ideal Balance

- Apex Legends (max settings): 144 FPS.

- Elden Ring (quality + RTX): 72 FPS with DLSS.

- Call of Duty: Modern Warfare V: 110 FPS.

1440p: Comfortable for High Refresh Rate Monitors

- Cyberpunk 2077 (RT Ultra): 48 FPS → 65 FPS with DLSS 3.5.

- Assassin’s Creed Mirage: 78 FPS.

4K: Only with DLSS/FSR

- Red Dead Redemption 2 (Ultra): 34 FPS → 55 FPS with DLSS Performance.

- Forza Horizon 6: 62 FPS (FSR 3.0 Quality).

Ray Tracing: Available, but with Caveats

RTX effects in Metro Exodus Enhanced Edition reduce FPS by 30%, but DLSS 3.5 mitigates the losses. Without upscaling, gaming in 4K with RTX is nearly impossible.


4. Professional Tasks: Not Just Gaming

CUDA and OpenCL: Calculations and Rendering

- Blender (Cycles): Rendering a BMW scene takes 4 minutes (compared to 6 minutes with RTX 3050).

- DaVinci Resolve: 8K projects are edited smoothly, but export is 20% slower than with RTX 4070.

- Scientific Calculations: CUDA support (compute capability 8.9) accelerates tasks in MATLAB and Python (e.g., training neural networks on medium-sized datasets).

Limitations:

- Low VRAM capacity for complex simulations in ANSYS.

- No AV1 hardware encoding—only H.265.


5. Power Consumption and Thermal Output

TDP 150W: Modest Appetite

At 150W, the card draws about 6% less than the RTX 4060 Ti (160W), thanks to the optimized 5nm process.

Cooling Recommendations

- A 2-slot cooler with two fans is sufficient (72°C under load).

- For compact cases: models with 3 heat pipes (maximum noise of 32 dB).

- Ideal case: one with 2 intake fans and 1 exhaust fan (e.g., Fractal Design Meshify C).


6. Comparison with Competitors

NVIDIA RTX 4050 (2024):

- Pros of P104 100: +15% performance at 1440p, support for DLSS 3.5.

- Cons: the RTX 4050 draws less power (130W).

AMD Radeon RX 7600 XT:

- Pros for AMD: 12GB GDDR6, FSR 3.0 in most games.

- Cons: Weaker in rendering workloads, where CUDA-based tools are more widely supported than AMD's HIP.

Intel Arc A770:

- Pros for Intel: 16GB VRAM, AV1 support.

- Cons: Drivers still lag in optimization.


7. Practical Tips

Power Supply: 500W minimum (550W recommended for headroom). Good options: Corsair CX550M (80+ Bronze), Be Quiet! Pure Power 11.

Compatibility:

- PCIe 4.0 x16 (backward compatible with 3.0).

- Recommended CPU: AMD Ryzen 5 7600 or Intel Core i5-13400F.

Drivers:

- Game Ready Driver 555.20 is stable, but for professional tasks, the Studio Driver is better.

- Known issue: random crashes in Vulkan applications—rolling back to version 552.10 helps.
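
Before deciding on the rollback described above, it helps to confirm which driver is actually installed. The following is a minimal sketch, assuming `nvidia-smi` is available on the PATH. The helper names (`parse_driver_version`, `installed_driver_versions`, `is_newer_than_rollback_target`) are hypothetical, and the version numbers are the ones quoted in this review.

```python
import subprocess


def parse_driver_version(text: str) -> tuple[int, ...]:
    """Turn a version string such as '555.20' into a comparable tuple (555, 20)."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_versions() -> list[tuple[int, ...]]:
    """Ask nvidia-smi for the driver version; it prints one line per detected GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [
        parse_driver_version(line)
        for line in result.stdout.splitlines()
        if line.strip()
    ]


# Rollback target quoted in the review (an assumption taken from the text above).
ROLLBACK_TARGET = parse_driver_version("552.10")


def is_newer_than_rollback_target(version: tuple[int, ...]) -> bool:
    """True if the installed driver is newer than the version the review rolls back to."""
    return version > ROLLBACK_TARGET
```

On a machine with an NVIDIA driver installed, `installed_driver_versions()` returns one tuple per GPU, and each can be passed to `is_newer_than_rollback_target` to see whether the rollback advice applies.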


8. Pros and Cons

Pros:

- Affordable price: $329 (new models, April 2025).

- Support for DLSS 3.5 and FSR 3.0.

- Low power consumption.

Cons:

- Only 8GB VRAM—a constraint for 4K and professional tasks.

- Lack of AV1 encoding.


9. Final Conclusion: Who is P104 100 for?

This graphics card is a good choice for:

- Gamers with 1440p monitors who want to enable RTX without significant investment.

- Editors and designers working on moderately complex projects.

- PC owners with low-wattage power supplies (e.g., upgrading older systems).

Alternatives: if more VRAM headroom is needed, the RX 7600 XT ($349); if AV1 support is important, the Intel Arc A770 ($299).


Conclusion

The NVIDIA P104 100 proves that even in 2025 it is possible to combine affordability with modern technologies. It may not be a top-of-the-line model, but it offers enough power for comfortable gaming and work—the key is to avoid expecting miracles in 8K.

Basic

Label Name: NVIDIA
Platform: Desktop
Launch Date: December 2017
Model Name: P104 100
Generation: Mining GPUs
Base Clock: 1607MHz
Boost Clock: 1733MHz
Bus Interface: PCIe 3.0 x16
Transistors: 7,200 million
TMUs: 120
  Texture Mapping Units (TMUs) are GPU components that rotate, scale, and distort bitmap images and place them as textures onto any plane of a 3D model. This process is called texture mapping.
Foundry: TSMC
Process Size: 16 nm
Architecture: Pascal

Memory Specifications

Memory Size: 4GB
Memory Type: GDDR5X
Memory Bus: 256bit
  The memory bus width is the number of bits the video memory can transfer in a single clock cycle. A wider bus moves more data at once, which makes it one of the key parameters of video memory. Memory bandwidth is calculated as: Memory Bandwidth = Effective Memory Data Rate x Memory Bus Width / 8. At similar data rates, the bus width therefore determines the memory bandwidth.
Memory Clock: 1251MHz
Bandwidth: 320.3 GB/s
  Memory bandwidth is the data transfer rate between the graphics chip and the video memory, measured in bytes per second. The effective data rate in the formula is the per-pin transfer rate, which for GDDR5X is eight times the listed memory clock.
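
The bandwidth formula above can be checked against the listed figures. This is a minimal sketch: the function name `memory_bandwidth_gb_s` is hypothetical, and the factor of 8 transfers per listed clock for GDDR5X is an assumption that happens to reproduce the 320.3 GB/s in the table. The same formula also reproduces the 384 GB/s quoted in the review section, reading its memory figure as 16 Gbps per pin on a 192-bit bus.

```python
def memory_bandwidth_gb_s(data_rate_mt_s: float, bus_width_bits: int) -> float:
    """Bandwidth in GB/s = effective data rate (MT/s) x bus width (bits) / 8 / 1000."""
    return data_rate_mt_s * bus_width_bits / 8 / 1000


# Spec table: 1251 MHz memory clock on a 256-bit bus.
# Assumption: GDDR5X performs 8 transfers per listed clock cycle.
GDDR5X_TRANSFERS_PER_CLOCK = 8
spec_table = memory_bandwidth_gb_s(1251 * GDDR5X_TRANSFERS_PER_CLOCK, 256)

# Review section: 192-bit bus, memory figure read as 16 Gbps per pin (16000 MT/s).
review = memory_bandwidth_gb_s(16000, 192)

print(f"Spec table: {spec_table:.1f} GB/s")  # Spec table: 320.3 GB/s
print(f"Review:     {review:.1f} GB/s")      # Review:     384.0 GB/s
```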

Theoretical Performance

Pixel Rate: 110.9 GPixel/s
  Pixel fill rate is the number of pixels a GPU can render per second, measured in MPixels/s (million pixels per second) or GPixels/s (billion pixels per second). It is the most commonly used metric for the pixel processing performance of a graphics card.
Texture Rate: 208.0 GTexel/s
  Texture fill rate is the number of texture map elements (texels) a GPU can map to pixels in one second.
FP16 (half): 104.0 GFLOPS
FP64 (double): 208.0 GFLOPS
FP32 (float): 6.522 TFLOPS
  Floating-point throughput is an important measure of GPU performance. Half-precision numbers (16-bit) are used for applications like machine learning, where lower precision is acceptable. Single-precision numbers (32-bit) are used for common multimedia and graphics processing tasks. Double-precision numbers (64-bit) are required for scientific computing that demands a wide numeric range and high accuracy.
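
The theoretical rates can be approximated from the unit counts and boost clock listed on this page. The sketch below uses the usual peak-rate formulas (one pixel per ROP per clock, one texel per TMU per clock, two FP32 operations per shading unit per clock). The 1:64 FP16 and 1:32 FP64 ratios are assumptions for this class of Pascal chip. The pixel, texture, FP16, and FP64 results match the table. The FP32 peak comes out at 6.655 TFLOPS, slightly above the listed 6.522 TFLOPS; the page does not state how its figure was derived.

```python
# Values taken from the spec listing on this page.
BOOST_CLOCK_MHZ = 1733
ROPS = 64
TMUS = 120
SHADING_UNITS = 1920

# Peak fill rates: one pixel per ROP and one texel per TMU per clock.
pixel_rate_gpix = ROPS * BOOST_CLOCK_MHZ / 1000      # GPixel/s
texture_rate_gtex = TMUS * BOOST_CLOCK_MHZ / 1000    # GTexel/s

# Peak FP32: a fused multiply-add counts as 2 operations per shading unit per clock.
fp32_gflops = 2 * SHADING_UNITS * BOOST_CLOCK_MHZ / 1000

# Assumed throughput ratios relative to FP32 (not stated on the page).
fp16_gflops = fp32_gflops / 64
fp64_gflops = fp32_gflops / 32

print(f"Pixel rate:   {pixel_rate_gpix:.1f} GPixel/s")      # 110.9
print(f"Texture rate: {texture_rate_gtex:.1f} GTexel/s")    # 208.0
print(f"FP32 peak:    {fp32_gflops / 1000:.3f} TFLOPS")     # 6.655
print(f"FP16:         {fp16_gflops:.1f} GFLOPS")            # 104.0
print(f"FP64:         {fp64_gflops:.1f} GFLOPS")            # 208.0
```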

Miscellaneous

SM Count: 15
  A Streaming Multiprocessor (SM) groups multiple Streaming Processors (SPs) with other resources such as warp schedulers, registers, and shared memory. It is sometimes called the GPU's major core and plays a role similar to a CPU core; registers and shared memory are scarce resources within each SM.
Shading Units: 1920
  The Streaming Processor (SP) is the most fundamental processing unit, where individual instructions and tasks are executed. GPUs compute in parallel, with many SPs working on tasks simultaneously.
L1 Cache: 48 KB (per SM)
L2 Cache: 2MB
TDP: 130W
Vulkan Version: 1.3
  Vulkan is a cross-platform graphics and compute API by Khronos Group, offering high performance and low CPU overhead. It gives developers more direct control of the GPU, reduces rendering overhead, and supports multi-threading and multi-core processors.
OpenCL Version: 3.0
OpenGL: 4.6
DirectX: 12 (12_1)
CUDA: 6.1
Power Connectors: 1x 8-pin
Shader Model: 6.4
ROPs: 64
  The Raster Operations Pipeline (ROP) handles the final stage of rendering: depth and stencil testing, blending, anti-aliasing resolve, and writing finished pixels to memory. The higher the resolution and the heavier the anti-aliasing and blended effects (such as smoke and fire), the more ROP throughput is needed to avoid a sharp drop in frame rate.
Suggested PSU: 200W

Benchmarks

FP32 (float): 6.522 TFLOPS
Blender score: 612
OctaneBench score: 122
Vulkan score: 45859
OpenCL score: 52079

Compared to Other GPUs

FP32 (float) / TFLOPS
- 6.977 (+7%)
- 6.61 (+1.3%)
- 6.522 (P104 100)
- 6.181 (-5.2%)

Blender
- 1224.91 (+100.1%)
- 612 (P104 100)
- 323 (-47.2%)
- 126 (-79.4%)

OctaneBench
- 526 (+331.1%)
- 122 (P104 100)
- 67 (-45.1%)
- 35 (-71.3%)

Vulkan
- 101318 (+120.9%)
- 72046 (+57.1%)
- 45859 (P104 100)
- 20775 (-54.7%)
- 8986 (-80.4%)

OpenCL
- 102044 (+95.9%)
- 72374 (+39%)
- 52079 (P104 100)
- 30631 (-41.2%)
- 15023 (-71.2%)
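
The percentages in this comparison express each other score relative to the P104 100's own result. A minimal sketch of that calculation, using the Vulkan figures above (the function name `relative_difference_pct` is hypothetical):

```python
def relative_difference_pct(other: float, reference: float) -> float:
    """Percentage by which `other` is above (+) or below (-) `reference`."""
    return (other - reference) / reference * 100


# Vulkan scores from the comparison; the reference is the P104 100's own score.
vulkan_reference = 45859
vulkan_others = [101318, 72046, 20775, 8986]

for score in vulkan_others:
    print(f"{score}: {relative_difference_pct(score, vulkan_reference):+.1f}%")
# 101318: +120.9%
# 72046: +57.1%
# 20775: -54.7%
# 8986: -80.4%
```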