NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction
Paper: 2506.00975
Authors: Qichao Wang*, Ziqiao Meng*, Wenqian Cui, Yifei Zhang, Pengcheng Wu, Bingzhe Wu, Irwin King, Liang Chen, Peilin Zhao†
Key features:
git clone https://github.com/Chaos96/NTPP.git
cd NTPP
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
pip install -r requirements.txt
python pretrain.py --input_data path/to/single_channel_data
python finetune.py --input_data path/to/double_channel_data
python inference.py --input_audio path/to/input.wav
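Since pretraining takes single-channel data and fine-tuning takes double-channel data, it can help to check a recording's channel count before passing it to either script. The following is a minimal sketch, assuming the audio is stored as standard PCM WAV files (the repository may expect a different format). The helper names `channel_count` and `is_dual_channel` are hypothetical and are not part of NTPP.

```python
import wave


def channel_count(path):
    """Return the number of audio channels in a PCM WAV file."""
    # The standard-library wave module reads the channel count from the header.
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels()


def is_dual_channel(path):
    """Return True if the WAV file has exactly two channels.

    Assumption: a dual-channel dialogue recording stores one speaker
    per channel in a single stereo file.
    """
    return channel_count(path) == 2
```

A file for which `is_dual_channel` returns False would, under this assumption, belong with the single-channel pretraining data rather than the fine-tuning data.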