bert

#bert

BamiBERT: A New BERT-based Language Model for Vietnamese

arXiv cs.CL ↗ · yesterday Cached

BamiBERT is a new BERT-based pre-trained language model for Vietnamese that addresses limitations of PhoBERT: it supports longer context and operates without word segmentation, and it achieves state-of-the-art results on multiple Vietnamese benchmarks.

@TheTuringPost: A great source to understand or refresh Transformer architecture It explains how transformers process text token by tok…

X AI KOLs Timeline ↗ · yesterday Cached

The post promotes an educational resource explaining the Transformer architecture, covering token embeddings, self-attention, residual connections, and how these relate to GPT and BERT.

Hate Speech Detection in Turkish and Arabic Languages: A Comprehensive Study

arXiv cs.CL ↗ · 2d ago Cached

Introduces a comprehensive hate speech dataset for Turkish and Arabic, and develops state-of-the-art BERT-based models for hate speech analysis including classification, intensity prediction, target identification, and span detection.

Majority Vote Silences Minority Values: Annotator Disagreement at the Hate/Offensive Boundary in HateXplain

arXiv cs.CL ↗ · 4d ago Cached

This paper finds that 42.6% of annotator disagreement in HateXplain concentrates at the hate/offensive boundary, demonstrating that majority vote silences minority values and leads to models being wrong but highly confident on contested inputs.

Legal Domain Adaptation of Modern BERT Models

arXiv cs.CL ↗ · 4d ago Cached

This paper explores domain adaptation of ModernBERT models in the legal domain by further pre-training on US court opinions, achieving significant improvements over the vanilla model and releasing the checkpoints publicly.

[NEW MODEL] - SupraSafety-18M · Tiny Content-Moderation Model

Reddit r/LocalLLaMA ↗ · 2026-06-27

SupraLabs released SupraSafety-18M, a tiny 18M-parameter BERT-style content moderation model trained on NVIDIA's Nemotron-3.5 dataset. It achieves 81.2% accuracy and runs efficiently on edge devices.

Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News

arXiv cs.CL ↗ · 2026-06-26 Cached

This paper compares fine-tuned BERT (gbert-large) with few-shot LLM prompting (Llama 4 Maverick) for detecting threat and solution framing in German climate news sentences. BERT achieves higher F1 scores (0.83 vs 0.78), and an ablation study shows that providing preceding sentence context improves performance.

Aspect-Based Sentiment Evolution and its Correlation with Review Rounds in Multi-Round Peer Reviews: A Deep Learning Approach

arXiv cs.CL ↗ · 2026-06-24 Cached

This paper investigates the distribution and evolution of aspect-level sentiments in multi-round peer reviews from Nature Communications, using a deep learning approach (LCF-BERT-CDM) to achieve 82.65% Macro-F1, and finds that positive sentiment increases while negative sentiment decreases with more review rounds.

Multilingual Coreference Resolution via Cycle-Consistent Machine Translation

arXiv cs.CL ↗ · 2026-06-05 Cached

This paper proposes a novel pipeline for multilingual coreference resolution that uses cycle-consistent machine translation from English to low-resource languages to generate training data, validated by back-translation and BERT similarity. Experiments on four low-resource languages show significant performance gains, enabling accurate coreference resolution where no prior corpora existed.

Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning

arXiv cs.CL ↗ · 2026-06-05 Cached

The paper proposes a hybrid pre-training objective combining JEPA latent-space prediction with MLM reconstruction for language models, showing improved embedding uniformity and semantic-lexical balance.

Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings

arXiv cs.CL ↗ · 2026-06-04 Cached

This paper introduces a text-based causal inference methodology using an enhanced CausalBERT to disentangle the effects of individual aspects (e.g., school administration, academic performance) on overall online review ratings, validated on 600K+ U.S. K-12 school reviews. Key improvements include temperature scaling, hyperparameter optimization, and interpretability methods to reduce confounding bias.

The Word and the Way: Strategies for Domain-Specific BERT Pre-Training in German Medical NLP

arXiv cs.CL ↗ · 2026-06-03 Cached

This paper introduces ChristBERT, a family of domain-specific RoBERTa-based language models for German clinical NLP, and evaluates three domain adaptation strategies (continued pre-training, pre-training from scratch, and vocabulary adaptation) on medical named entity recognition and text classification tasks, achieving state-of-the-art results.

A Fine-Tuned BERT Classifier for Personal-Letter Titles in Late-Ming and Early-Qing Collected Works

arXiv cs.CL ↗ · 2026-05-25 Cached

This paper presents Lepton, a fine-tuned BERT classifier that predicts whether a title in the table of contents of a Classical Chinese wenji (collected works) is a personal letter or a preface, leveraging 5,438 hand-labeled titles from late-Ming and early-Qing literati.

Leveraging Large Language Models for Sentiment Analysis: Multi-Modal Analysis of Decentraland's MANA Token

arXiv cs.CL ↗ · 2026-05-21 Cached

This paper uses a BERT-based large language model for sentiment analysis of Decentraland's Discord community to enhance MANA token price prediction, demonstrating that a multi-modal LSTM incorporating sentiment, trading volume, and market capitalization outperforms a price-only baseline.

Shortcut Solutions Learned by Transformers Impair Continual Compositional Reasoning

arXiv cs.LG ↗ · 2026-05-08 Cached

This research paper investigates how shortcut solutions learned by Transformer models, specifically BERT, impair their ability to perform continual compositional reasoning. It contrasts BERT with ALBERT, finding that ALBERT's recurrent nature offers better inductive bias for continual learning tasks.

I trained a NER model on 33,000 Indian Supreme Court judgments (1950–2024) CASE_CITATION hits 97.76% F1, +17 points over the only prior baseline [P]

Reddit r/MachineLearning ↗ · 2026-05-07

The author released en_legal_ner_ind_trf v0.1, an InLegalBERT model fine-tuned on 33,000 Indian Supreme Court judgments. It achieves a 97.76% F1 score on case citations, 17 points above the only prior baseline.

Foundational Study on Authorship Attribution of Japanese Web Reviews for Actor Analysis

arXiv cs.CL ↗ · 2026-04-21

This foundational study applies stylometric authorship attribution to threat intelligence, using Japanese Rakuten reviews to compare TF-IDF+LR, BERT embedding, BERT fine-tuning (BERT-FT), and metric learning methods. BERT-FT performed best overall, but TF-IDF+LR proved more stable and efficient when scaling to hundreds of authors.

The Prose of Proteins - A Lesson in Taste and Vision through the Work of Brian Hie

ML at Berkeley ↗ · 2024-04-11

This article profiles researcher Brian Hie, highlighting how his unique background in literature and computer science informed the development of ESM, a BERT-like model for protein sequences.
