Llama 3.2 3B Quantized (q4_k_m, GGUF)
This repository provides the quantized Llama 3.2 3B model in GGUF format (q4_k_m) for efficient deployment in resource-constrained environments, including mobile devices.
This model is part of the research published in Springer LNNS (ICT4SD 2025) and openly available on arXiv.
- Springer DOI: https://doi.org/10.1007/978-3-032-06697-8_33
- arXiv: https://arxiv.org/abs/2512.06490
This model is a quantized version of Meta's original Llama 3.2 3B model. Please refer to the original model card for full details on its capabilities and limitations.
- Base Model: Llama 3.2 3B (Meta AI)
- Quantization: 4-bit Post-Training Quantization (nf4 → q4_k_m; see the sketch below)
- Format: GGUF (compatible with llama.cpp and Ollama)
- Model File: llama_3.2_3b_q4_k_m.gguf
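For reference, a GGUF file like this is typically produced with llama.cpp's own conversion and quantization tools. A minimal sketch, assuming a recent llama.cpp checkout (script and binary names vary across versions, and all paths here are placeholders):

# Convert the Hugging Face checkpoint to an FP16 GGUF, then quantize to q4_k_m.
# Older llama.cpp versions named these tools convert.py and ./quantize.
python convert_hf_to_gguf.py /path/to/Llama-3.2-3B --outfile llama_3.2_3b_f16.gguf --outtype f16
./llama-quantize llama_3.2_3b_f16.gguf llama_3.2_3b_q4_k_m.gguf Q4_K_M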
Usage with llama.cpp
# Clone and build llama.cpp (recent releases use CMake; older builds used `make` and `./main`)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
# Run with the quantized model
./build/bin/llama-cli -m ./llama_3.2_3b_q4_k_m.gguf -p "Hello, how are you?"
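The same file also works with Ollama, as noted above. A minimal sketch (the model name llama3.2-3b-q4 is illustrative):

# Create a local Ollama model from the GGUF via a one-line Modelfile, then chat with it
echo "FROM ./llama_3.2_3b_q4_k_m.gguf" > Modelfile
ollama create llama3.2-3b-q4 -f Modelfile
ollama run llama3.2-3b-q4 "Hello, how are you?"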
Download
You can download this model directly via:
git lfs install
git clone https://huggingface.co/Cap4ainN3m0/llama-3.2-3b-q4-k-m
Or programmatically:
from huggingface_hub import snapshot_download
snapshot_download(repo_id="Cap4ainN3m0/llama-3.2-3b-q4-k-m", local_dir="models/llama3-quantized")
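Once downloaded, the file can be sanity-checked from Python with llama-cpp-python (an assumption here, not part of this repository; install with pip install llama-cpp-python):

from llama_cpp import Llama

# Load the quantized GGUF; n_ctx sets the context window and can be lowered on small devices
llm = Llama(
    model_path="models/llama3-quantized/llama_3.2_3b_q4_k_m.gguf",
    n_ctx=2048,
)
out = llm("Hello, how are you?", max_tokens=64)
print(out["choices"][0]["text"])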
Project and Code
The full research workflow, Colab notebook, results, and mobile deployment guide are available here:
GitHub Repository
Citation
If you use this model or workflow, please cite:
Yadav, A., & Bhargavi, R. C. (2025).
Optimizing LLMs Using Quantization for Mobile Execution.
Presented at ICT4SD 2025 (Goa); Springer LNNS. https://doi.org/10.1007/978-3-032-06697-8_33