HowRU-KoELECTRA-Emotion-Classifier

Model Description

An emotion classification model for Korean text, based on KoELECTRA and aimed in particular at diary entries and counseling-style writing.
It recognizes eight emotions: Joy (기쁨), Excitement (설렘), Neutral (평범함), Surprise (놀라움), Disgust (불쾌함), Fear (두려움), Sadness (슬픔), and Anger (분노).

Emotion Classes

The model classifies the dominant emotion of an input Korean sentence into one of the eight classes below.

Emotion (Korean)   Emotion (EN)
기쁨               Joy
설렘               Excitement
평범함             Neutral
놀라움             Surprise
불쾌함             Disgust
두려움             Fear
슬픔               Sadness
분노               Anger
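
The label strings above are exactly what the model emits. The id-to-label ordering ships in the model config, so it is safer to read it than to assume it; a minimal check:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("LimYeri/HowRU-KoELECTRA-Emotion-Classifier")
print(config.id2label)  # the actual {id: label} mapping shipped with this model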

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# 1) Load Model & Tokenizer
MODEL_NAME = "LimYeri/HowRU-KoELECTRA-Emotion-Classifier"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# Move the model to GPU automatically when one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Emotion label mapping (id2label)
id2label = model.config.id2label


# 2) Inference Function
def predict_emotion(text: str):
    """
    Returns:
        - top1_pred: 예측된 감정 라벨
        - probs_sorted: 감정별 확률(내림차순)
        - top2_pred: 상위 두 개의 감정
    """

    # Tokenize
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=512
    ).to(device)

    # Run inference
    with torch.no_grad():
        logits = model(**inputs).logits
        probs = F.softmax(logits, dim=-1)[0]

    # Per-emotion probabilities, sorted in descending order
    probs_sorted = sorted(
        [(id2label[i], float(probs[i])) for i in range(len(probs))],
        key=lambda x: x[1],
        reverse=True
    )

    top1_pred = probs_sorted[0]
    top2_pred = probs_sorted[:2]

    return {
        "text": text,
        "top1_emotion": top1_pred,
        "top2_emotions": top2_pred,
        "all_probabilities": probs_sorted,
    }


# 3) Example
result = predict_emotion("오늘 정말 기분이 좋고 행복한 하루였어!")
print(result)
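
If you need to score many diary entries at once, the same steps batch naturally. A minimal sketch (not from the original card) that reuses the tokenizer, model, device, and id2label loaded above; the batch size is an arbitrary choice:

# 4) Batched inference (sketch)
def predict_emotions_batch(texts, batch_size=32):
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=512
        ).to(device)
        with torch.no_grad():
            probs = F.softmax(model(**inputs).logits, dim=-1)
        for text, p in zip(batch, probs):
            top_id = int(p.argmax())
            results.append((text, id2label[top_id], float(p[top_id])))
    return results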

Pipeline Usage

The same model can also be used through the transformers text-classification pipeline:

from transformers import pipeline

MODEL_NAME = "LimYeri/HowRU-KoELECTRA-Emotion-Classifier"

classifier = pipeline(
    "text-classification",
    model=MODEL_NAME,
    tokenizer=MODEL_NAME,
    top_k=None   # return probabilities for all emotions
)

# Predict
text = "오늘 정말 기분이 좋고 행복한 하루였어!"
result = classifier(text)

# With top_k=None the pipeline returns one list of scores per input; take the first
result = result[0]

print("입력 문장:", text)
print("\nTop-1 감정:", result[0]['label'], f"({result[0]['score']:.4f})")
print("\n전체 감정 분포:")
for r in result:
    print(f"  {r['label']}: {r['score']:.4f}")

Training Details

Training Data

  1. LimYeri/kor-diary-emotion_v2
  2. qowlsdud/CounselGPT
  • Total (split 8:2): 50,000 rows
  • Train: 40,000 rows
  • Validation: 10,000 rows
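
A minimal sketch of rebuilding this split with the datasets library. How the two sources are merged and which columns they share is an assumption; the card only states the totals:

from datasets import load_dataset, concatenate_datasets

# Hypothetical reconstruction -- the card names only the sources and the 8:2 split
diary = load_dataset("LimYeri/kor-diary-emotion_v2", split="train")
counsel = load_dataset("qowlsdud/CounselGPT", split="train")

combined = concatenate_datasets([diary, counsel])  # assumes matching column schemas
split = combined.train_test_split(test_size=0.2, seed=42)  # seed is an arbitrary choice
train_ds, eval_ds = split["train"], split["test"]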

Training Procedure

Training Hyperparameters

  • num_train_epochs: 3
  • learning_rate: 3e-5
  • weight_decay: 0.02
  • warmup_ratio: 0.15
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 64
  • max_grad_norm: 1.0
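
These map one-to-one onto Hugging Face TrainingArguments. A sketch assuming the standard Trainer was used; output_dir and anything not listed above are assumptions:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="howru-koelectra-emotion",  # hypothetical path
    num_train_epochs=3,
    learning_rate=3e-5,
    weight_decay=0.02,
    warmup_ratio=0.15,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    max_grad_norm=1.0,
)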

Performance

Metric          Score
Eval Accuracy   0.95
Eval F1 Macro   0.95
Eval Loss       0.16
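
Accuracy and macro F1 match the standard scikit-learn metrics. One plausible compute_metrics for the Trainer, offered as an assumption since the card does not show the evaluation code:

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }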

Model Architecture

1) ELECTRA Encoder (Base-size)

  • Hidden size: 768
  • Layers: 12 Transformer blocks
  • Attention heads: 12
  • MLP intermediate size: 3072
  • Activation: GELU
  • Dropout: 0.1

2) Classification Head

An additional classification head predicts the eight emotion classes:

  • Dense Layer: 768 → 768
  • Activation: GELU
  • Dropout: 0.1
  • Output Projection: 768 → 8
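
In PyTorch terms the head looks roughly like the sketch below, mirroring the shape of the ElectraForSequenceClassification head in transformers; the library's own implementation is authoritative:

import torch.nn as nn

class EmotionClassificationHead(nn.Module):
    """Dense 768 -> 768, GELU, dropout 0.1, projection 768 -> 8 (shapes per this card)."""

    def __init__(self, hidden_size=768, num_labels=8, dropout=0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.GELU()
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):
        x = hidden_states[:, 0, :]   # hidden state at the [CLS] position
        x = self.dropout(x)
        x = self.dense(x)
        x = self.activation(x)
        x = self.dropout(x)
        return self.out_proj(x)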

Citation

@misc{HowRUEmotion2025,
  title={HowRU KoELECTRA Emotion Classifier},
  author={Lim, Yeri},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/LimYeri/HowRU-KoELECTRA-Emotion-Classifier}}
}