
Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM-NVFP4 (UNCENSORED)
NVFP4 Quantized Version for NVIDIA Blackwell GPUs
Model Description
This is the NVFP4 (FP4) quantized version of Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM, created using NVIDIA TensorRT Model Optimizer.
Model Size: ~18 GB (4-bit weights)
Requirements
NVFP4 format is optimized for:
- NVIDIA Blackwell GPUs (B100, B200, GB200)
- TensorRT-LLM v0.17+
This format is NOT compatible with:
- Consumer GPUs (RTX 3000/4000/5000 series)
- llama.cpp
- Standard PyTorch inference
Usage
With TensorRT-LLM
For setup and serving instructions, see NVIDIA's guide "Deploying NVIDIA Nemotron-3-Nano with TensorRT-LLM".
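As a minimal sketch of what serving this checkpoint might look like, assuming TensorRT-LLM v0.17+ is installed on a supported Blackwell GPU (the exact flags you need may differ; consult the TensorRT-LLM docs):

```shell
# Sketch only: trtllm-serve exposes an OpenAI-compatible HTTP endpoint
# for the quantized checkpoint. Assumes TensorRT-LLM v0.17+ and a
# supported Blackwell GPU.
trtllm-serve Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM-NVFP4

# Query the server once it is up (default port 8000):
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM-NVFP4", "prompt": "Hello", "max_tokens": 32}'
```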
Quantization Details
- Method: NVIDIA ModelOpt NVFP4_DEFAULT_CFG
- Calibration: 512 samples from WikiText-2
- Format: HuggingFace checkpoint with quantized weights
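For intuition about what NVFP4 stores, here is a small NumPy sketch of the round-trip for one weight block: 4-bit E2M1 values (1 sign, 2 exponent, 1 mantissa bit) sharing one scale per block of 16. The scale is kept in FP32 here for clarity; the actual format stores block scales in FP8 (E4M3) and applies an additional per-tensor scale.

```python
import numpy as np

# The 8 non-negative magnitudes representable in FP4 E2M1.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_block_nvfp4(block: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize one block of 16 values to E2M1 with a shared scale.

    Returns the dequantized values and the block scale. Illustrative
    only: the real format stores the scale itself in FP8 (E4M3).
    """
    amax = np.abs(block).max()
    scale = amax / 6.0 if amax > 0 else 1.0  # 6.0 = largest E2M1 magnitude
    scaled = block / scale
    # Round each magnitude to the nearest representable E2M1 value.
    idx = np.abs(np.abs(scaled)[:, None] - E2M1_VALUES[None, :]).argmin(axis=1)
    quant = np.sign(scaled) * E2M1_VALUES[idx]
    return quant * scale, scale

block = np.array([0.07, -0.51, 0.33, 1.2, -0.9, 0.0, 0.25, 0.6,
                  -0.1, 0.45, 0.8, -1.5, 0.05, 0.7, -0.3, 0.15])
deq, scale = quantize_block_nvfp4(block)
```

Because the block's largest magnitude is mapped onto the largest E2M1 value (6.0), that element round-trips exactly, and every other element lands within one scaled quantization step of its original value.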
Related Models
- Ex0bit/Elbaz-NVIDIA-Nemotron-3-Nano-30B-A3B-PRISM - Parent model (GGUF quantizations available)
- nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 - Original base model
Author
Eric Elbaz (Ex0bit)
License
NVIDIA Open Model License