SparseLLM/BlockFFN-Medium

@@ -1,19 +1,49 @@
 ---
-license: apache-2.0
 language:
 - en
 - zh
 pipeline_tag: text-generation
 ---
 # BlockFFN-Medium
 This is the original 0.5B BlockFFN checkpoint used in the paper *BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity* for acceleration tests.
-You can load and use this model simply by using `AutoTokenizer` and `AutoModelForCausalLM`.
 Links: [[Paper](https://arxiv.org/pdf/2507.08771)] [[Codes](https://github.com/thunlp/BlockFFN)]
-### Citation
 If you find our work useful for your research, please kindly cite our paper as follows:
@@ -25,4 +55,4 @@ If you find our work useful for your research, please kindly cite our paper as f
       year={2025},
       url={https://arxiv.org/pdf/2507.08771},
 }
-```

 ---
 language:
 - en
 - zh
+license: apache-2.0
 pipeline_tag: text-generation
+library_name: transformers
 ---
 # BlockFFN-Medium
 This is the original 0.5B BlockFFN checkpoint used in the paper *BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity* for acceleration tests.
 Links: [[Paper](https://arxiv.org/pdf/2507.08771)] [[Codes](https://github.com/thunlp/BlockFFN)]
+## Usage
+Load this model with `AutoTokenizer` and `AutoModelForCausalLM` from the `transformers` library; the example below passes `trust_remote_code=True` to both.
+```python
+from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
+import torch
+# Assuming the model ID is "SparseLLM/BlockFFN-Medium"
+model_id = "SparseLLM/BlockFFN-Medium"
+# Load tokenizer and model
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+# device_map="auto" is applied at load time (requires the `accelerate` package)
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)
+# Create a text generation pipeline around the already-loaded model
+pipe = pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+)
+# Example usage
+prompt = "The quick brown fox jumps over the lazy"
+result = pipe(prompt, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.7)
+print(result[0]["generated_text"])
+```
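The usage snippet above goes through `pipeline`. As a minimal sketch, and assuming the checkpoint's remote code supports the standard `generate` method, the same tokenizer and model could also be called directly. The helper names `load_model` and `generate_continuation` are hypothetical and not part of the repository; the loading arguments mirror the snippet above.

```python
def load_model(model_id="SparseLLM/BlockFFN-Medium"):
    """Load tokenizer and model with the same arguments as the usage snippet."""
    # Imported here so the generation helper below has no hard dependency
    # on torch/transformers being importable.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    return tokenizer, model


def generate_continuation(model, tokenizer, prompt, max_new_tokens=50):
    """Greedy-decode a continuation and return only the newly generated text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # do_sample=False selects greedy decoding (an assumption for this sketch;
    # the pipeline example above samples instead).
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # The output contains the prompt tokens followed by the new tokens.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

A call would look like `tokenizer, model = load_model()` followed by `print(generate_continuation(model, tokenizer, "The quick brown fox jumps over the lazy"))`. Slicing off the prompt tokens means only the continuation is returned, unlike the pipeline's `generated_text`, which by default includes the prompt.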
+## Citation
 If you find our work useful for your research, please kindly cite our paper as follows:
       year={2025},
       url={https://arxiv.org/pdf/2507.08771},
 }
+```