Researchers developed the Construction Safety Attitude Framework (CSAF) and validated an LLM-based classifier to measure construction workers' safety attitudes from Reddit discourse, achieving high agreement (κ=0.90) with human expert coders across 10,000+ posts.
Researchers from the University of Technology Sydney compare fine-tuned transformers (DistilBERT, RoBERTa) against zero-shot LLMs (Llama variants, Claude, Gemini) for classifying misinformation responses on Reddit, finding that fine-tuned RoBERTa achieves a macro-F1 of 0.62 versus 0.50 for the best zero-shot model. The study shows that task-specific fine-tuning outperforms larger generalist models, particularly for detecting belief propagation, and that safety-alignment artifacts in frontier models can degrade performance.
This paper proposes a validation framework for using large language models (LLMs) to extract causal relations from social media posts during disasters. It evaluates how well LLMs identify cause-effect relationships and compares the extracted relations against expert-grounded reference graphs to assess reliability and risks.
This paper presents an empirical evaluation of LLM-guided semi-supervised learning for classifying social media crisis data. It demonstrates that LG-CoTrain outperforms classical baselines in low-resource settings and highlights the potential of transferring knowledge from LLMs to smaller, deployable models for disaster response.