The Future of NLP may not be at NLP Conferences: Scholarly Migration Patterns in Natural Language Processing

arXiv cs.CL Papers

Summary

A study analyzing 142K NLP papers from 2010–2026 finds that both established and new NLP authors are increasingly publishing in general ML venues like NeurIPS and ICLR rather than core NLP conferences like ACL, with a significant citation premium favoring ML venues.

arXiv:2607.02416v1 Announce Type: new Abstract: Natural Language Processing (NLP) has traditionally been published in its core disciplinary venues like ACL. However, advances in Large Language Models (LLMs) have led to a blurring of the disciplinary lines between NLP and general Machine Learning (ML), with authors regularly publishing in venues from both fields. Here, we ask whether the disciplinary center of gravity is shifting. Using NLP research published from 2010 to 2026 and studies of both established and new authors, we find that a migration is taking place. First, comparing the pre- and post-LLM eras, established authors lost 19.2pp of share at flagship \*ACL main-conference tracks while gaining 14.8pp in the newer Findings tracks, and general ML venues rose 8.6pp, even when adjusting for parallel growth in the fields. Second, among newer authors who debut with at least three first-author NLP-topic papers, the share whose work appears mostly at \*ACL venues fell from 84% (2019) to 74% (2024), while the share appearing mostly at general ML venues rose from 5% to 21%. Using causal inference techniques, we estimate that these general ML venues confer a significant citation premium, which influences venue selection. Together, these results point to a significant shift in where NLP research is published.

Cached at: 07/03/26, 05:43 AM

# The Future of NLP may not be at NLP Conferences: Scholarly Migration Patterns in Natural Language Processing
Source: [https://arxiv.org/html/2607.02416](https://arxiv.org/html/2607.02416)
###### Abstract

Natural Language Processing (NLP) has traditionally been published in its core disciplinary venues like ACL. However, advances in Large Language Models (LLMs) have led to a blurring of the disciplinary lines between NLP and general Machine Learning (ML), with authors regularly publishing in venues from both fields. Here, we ask whether the disciplinary center of gravity is shifting. Using NLP research published from 2010 to 2026 and studies of both established and new authors, we find that a migration is taking place. First, comparing the pre- and post-LLM eras, established authors lost 19.2 pp of share at flagship \*ACL main-conference tracks while *gaining* 14.8 pp in the newer Findings tracks, and general ML venues rose 8.6 pp, even when adjusting for parallel growth in the fields. Second, among newer authors who debut with at least three first-author NLP-topic papers, the share whose work appears mostly at \*ACL venues fell from 84% (2019) to 74% (2024), while the share appearing mostly at general ML venues rose from 5% to 21%. Using causal inference techniques, we estimate that these general ML venues confer a significant citation premium, which influences venue selection. Together, these results point to a significant shift in where NLP research is published.


David Jurgens, School of Information, University of Michigan, jurgens@umich.edu

## 1 Introduction

The field of Natural Language Processing (NLP) has undergone a notable shift in the past decade due to the emergence of Large Language Models (LLMs). These new models offer powerful language understanding in many contexts for building applications, as well as for understanding how language is processed. These new abilities have led to dramatically increased growth in the field at its main conferences (ACL, NAACL, EMNLP; we refer to this general family of venues as \*ACL): annual \*ACL output grew roughly nine-fold over the past decade, from about 975 papers in 2015 to over 8,700 in 2025. However, many of the papers introducing foundational models and techniques were not published in NLP venues and instead appeared at general Machine Learning (ML) venues; examples include GPT-3 (Brown et al., [2020](https://arxiv.org/html/2607.02416#bib.bib5)), InstructGPT (Ouyang et al., [2022](https://arxiv.org/html/2607.02416#bib.bib6)), Chain of Thought prompting (Wei et al., [2022b](https://arxiv.org/html/2607.02416#bib.bib7)), and Chinchilla scaling laws (Hoffmann et al., [2022](https://arxiv.org/html/2607.02416#bib.bib10)), which were published at NeurIPS, and LoRA (Hu et al., [2022](https://arxiv.org/html/2607.02416#bib.bib4)), FLAN (Wei et al., [2022a](https://arxiv.org/html/2607.02416#bib.bib3)), and the ReAct agent pattern (Yao et al., [2023](https://arxiv.org/html/2607.02416#bib.bib2)), which were published at ICLR. Further, while NLP has been growing, these general ML conferences have been growing faster: over the same decade their NLP-relevant output grew roughly twenty-fold. With this growth has come a sizable number of papers on NLP topics. As a result, NLP researchers have been anecdotally said to be submitting to ICLR, NeurIPS, and ICML rather than the \*ACL family of conferences, leading some to wonder whether \*ACL is being “left behind.” Here, we test this anecdote quantitatively to measure whether scholars are migrating.

To test whether NLP researchers are migrating, this paper offers the following four contributions. First, through a large-scale analysis of 142K NLP-topic papers from 2010–2026 across 23 NLP, ML, and AI venues, we show that the migration is real. Among established NLP authors, publishing share at General-ML venues rose by 8.6 percentage points after the rise of LLMs while their share at \*ACL venues fell by a comparable margin, even after adjusting for the rapid parallel growth of both fields. Second, using Oaxaca–Blinder decompositions, we show that this movement is due more to venue convention than to researchers changing topics. Third, we demonstrate that new entrants into NLP research (e.g., PhD students) are increasingly likely to publish in General-ML venues, even when controlling for their advisor’s venue preferences. Fourth, we show that one motivation for this behavior could be the citation premium; using paper matching to generate counterfactual submissions, we show that a paper appearing in a General-ML venue is likely to receive over double the citations it would have received if published in an \*ACL venue. Together, these results point to significant future changes to where the heart of NLP is and where major advancements are likely to appear.

## 2 Related Work

Science is an evolving process where the development of new techniques has led to new fields or mergers of fields. While this case study is primarily focused on one discipline, NLP, the question of scholarly migration relates to multiple lines of work.

Scholarly Incentives. Credit in science accrues cumulatively, a pattern also known as the Matthew effect, in which recognition flows disproportionately to already-prominent work and authors (Merton, [1968](https://arxiv.org/html/2607.02416#bib.bib54); de Solla Price, [1965](https://arxiv.org/html/2607.02416#bib.bib55); Azoulay et al., [2014](https://arxiv.org/html/2607.02416#bib.bib56)), so a venue that confers a citation advantage can become self-reinforcing. Closest to our setting, science-of-science work treats a researcher’s field as something that moves: scientists increasingly switch topics over a career (Zeng et al., [2019](https://arxiv.org/html/2607.02416#bib.bib52)), and interests evolve in measurable, heavy-tailed patterns (Jia et al., [2017](https://arxiv.org/html/2607.02416#bib.bib53)). We read the NLP-to-ML venue shift as one consequential axis of this mobility. These works establish that venue and impact patterns are heavily author-specific and shaped by incentives, which motivates our reading of venue migration as a potential response to where the field’s rewards have moved.

Differential citation rates across venues are well documented, but their interpretation is contested. Reviews of citing behavior and of citation indicators catalog the many non-scholarly factors at play in who gets credited for their work (Bornmann and Daniel, [2008](https://arxiv.org/html/2607.02416#bib.bib22); Tahamtan et al., [2016](https://arxiv.org/html/2607.02416#bib.bib81); Waltman, [2016](https://arxiv.org/html/2607.02416#bib.bib80)). These factors contribute to long-running critiques warning against reading venue-level averages as paper-level quality, such as impact factors being poor proxies for individual papers’ citation counts (Seglen, [1997](https://arxiv.org/html/2607.02416#bib.bib78); Garfield, [2006](https://arxiv.org/html/2607.02416#bib.bib79)) and preprint posting reshaping when and how citations accrue (Ginsparg, [2011](https://arxiv.org/html/2607.02416#bib.bib83); Larivière et al., [2014](https://arxiv.org/html/2607.02416#bib.bib82)).

Bibliometrics of NLP and ML. A long line of work has used the ACL Anthology to study the structure and evolution of the NLP community. Multiple works have introduced new resources for studying behavior such as the Anthology Reference Corpus and the ACL Anthology Network, which turned the proceedings into a citation- and collaboration-graph resource (Bird et al., [2008](https://arxiv.org/html/2607.02416#bib.bib15); Radev et al., [2013](https://arxiv.org/html/2607.02416#bib.bib16)), and the NLP Scholar dataset and explorer (Mohammad, [2020b](https://arxiv.org/html/2607.02416#bib.bib19), [c](https://arxiv.org/html/2607.02416#bib.bib18)), which focus more on the scholars. These resources have been used to chart how the field’s topical composition shifted across research epochs (Anderson et al., [2012](https://arxiv.org/html/2607.02416#bib.bib1)), though in the pre-LLM era.

Building on these resources, diachronic studies have profiled productivity and impact dynamics within NLP such as the structural glass ceiling in the mentor–mentee network (Schluter, [2018](https://arxiv.org/html/2607.02416#bib.bib17)), geographic citation gaps (Rungta et al., [2022](https://arxiv.org/html/2607.02416#bib.bib61)), the concentration of industry labs in the field (Abdalla et al., [2023](https://arxiv.org/html/2607.02416#bib.bib62)), and how NLP cites and is cited by neighboring disciplines (Wahle et al., [2023](https://arxiv.org/html/2607.02416#bib.bib63); Mohammad, [2020a](https://arxiv.org/html/2607.02416#bib.bib57)). Others trace the field’s paradigm shifts and self-image directly (Jurgens et al., [2018](https://arxiv.org/html/2607.02416#bib.bib14); Pramanick et al., [2023](https://arxiv.org/html/2607.02416#bib.bib66); Michael et al., [2023](https://arxiv.org/html/2607.02416#bib.bib64); Bollmann and Elliott, [2020](https://arxiv.org/html/2607.02416#bib.bib65)). These studies look inward at the Anthology for how NLP authors behave within \*ACL venues; we, instead, track where Anthology authors publish *outside* it, and we identify a PhD-debut cohort that separates how much of the shift reflects *who* is publishing from *what* they work on.

Research practices in NLP and ML. A parallel literature scrutinizes how NLP and ML conduct and report research, often with concern about pace outrunning rigor (Sculley et al., [2018](https://arxiv.org/html/2607.02416#bib.bib71); Lipton and Steinhardt, [2019](https://arxiv.org/html/2607.02416#bib.bib70)). It documents under-powered comparisons (Card et al., [2020](https://arxiv.org/html/2607.02416#bib.bib60)), incomplete reporting of experimental results (Dodge et al., [2019](https://arxiv.org/html/2607.02416#bib.bib73); Hou et al., [2019](https://arxiv.org/html/2607.02416#bib.bib67)), gaps in reproducibility (Gundersen and Kjensmo, [2018](https://arxiv.org/html/2607.02416#bib.bib72)), and the value commitments embedded in what the field chooses to study (Birhane et al., [2022](https://arxiv.org/html/2607.02416#bib.bib74); Blodgett et al., [2020](https://arxiv.org/html/2607.02416#bib.bib68); Rogers et al., [2021](https://arxiv.org/html/2607.02416#bib.bib69)). While not all papers in NLP and ML have such deficits, these issues contribute to the perception of what matters to reviewers in the field and what research is considered publishable.

LLM-era NLP–ML convergence. The rise of pretrained language models (PLMs; e.g., BERT), LLMs, and other foundation models has both accelerated the growth of the field and blurred the NLP/ML boundary (Bommasani et al., [2021](https://arxiv.org/html/2607.02416#bib.bib75)), against a backdrop of exponentially expanding AI literature (Frank et al., [2019](https://arxiv.org/html/2607.02416#bib.bib76); Krenn et al., [2023](https://arxiv.org/html/2607.02416#bib.bib77)). Since 2020, NLP and general ML have visibly converged around large pretrained language models, a shift commentators have framed both methodologically and critically (Bender and Koller, [2020](https://arxiv.org/html/2607.02416#bib.bib25); Bender et al., [2021](https://arxiv.org/html/2607.02416#bib.bib26)). The institutional response is equally visible in new venues that straddle the boundary, including the launch of Transactions on Machine Learning Research (Transactions on Machine Learning Research, [2022](https://arxiv.org/html/2607.02416#bib.bib28)) and the Conference on Language Modeling (Conference on Language Modeling, [2024](https://arxiv.org/html/2607.02416#bib.bib27)). This convergence motivates our 2010–2026 window, which spans the pre-PLM/LLM era up to the present; our analysis rests on a unified, cross-linked corpus built from open scholarly infrastructure (Lo et al., [2020](https://arxiv.org/html/2607.02416#bib.bib31); Kinney et al., [2023](https://arxiv.org/html/2607.02416#bib.bib32); Priem et al., [2022](https://arxiv.org/html/2607.02416#bib.bib33)).

## 3 Data

To model potential scholar migration, we first create a longitudinal corpus of NLP\-topic papers across relevant venues with canonical cross\-linked author identifiers, as described next\.

### 3.1 Venue taxonomy and paper coverage

We collect data from publication venues in three categories central to the NLP→ML question: \*ACL (e.g., ACL, EMNLP, TACL), General-ML (e.g., NeurIPS, ICLR, ICML), and AI-Broad (AAAI, IJCAI). \*ACL itself has a diverse ecosystem, and the addition of Findings papers starting in 2020 potentially also influences the scholar migration. Therefore, we consider four tiers: NLP-Main for all conferences’ main proceedings, NLP-Findings, NLP-Workshop/Resource, which includes WMT and LREC, and NLP-Journal for TACL and Computational Linguistics. (We note that WMT is now itself a conference and LREC has always been a conference. However, both have different norms from NLP-Main, with shared tasks for WMT and a higher acceptance rate for LREC, both of which are closer to the norms of workshops.)

Paper metadata comes from a union of six sources: (i) the ACL Anthology dump, (ii) Semantic Scholar venue queries, (iii) OpenAlex’s primary\_location, (iv) public OpenReview submissions, (v) PMLR proceedings, and (vi) the DBLP bulk export. We deduplicate across these sources by normalizing titles within each (venue, year) and by matching on all identifiers associated with each paper (e.g., its DOI). We additionally retrieve each paper’s citations, title, and abstract from Semantic Scholar and OpenAlex. In total, we begin with 141,710 distinct papers across the three venue families, which are later classified as NLP-topic or not; Table [2](https://arxiv.org/html/2607.02416#S3.T2) reports NLP-topic coverage within these families. Appendix [A.1](https://arxiv.org/html/2607.02416#A1.SS1) contains additional details on data composition.
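As an illustration of the deduplication step, the following is a minimal sketch, not the authors' pipeline. The record fields (`title`, `venue`, `year`, `doi`, `source`), the normalization rules, and the function names are all assumptions made for this example; the text only states that titles are normalized within (venue, year) and that shared identifiers such as DOIs are used.

```python
import re
import unicodedata

def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace (assumed rules)."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()

def deduplicate(records):
    """Merge records that share a DOI or a (venue, year, normalized title) key.

    Each record is a dict with 'title', 'venue', 'year', and optionally
    'doi' and 'source'. Hypothetical schema, for illustration only.
    """
    merged = []   # one dict per distinct paper
    by_key = {}   # any matching key -> index into merged
    for rec in records:
        keys = [("title", rec["venue"], rec["year"], normalize_title(rec["title"]))]
        if rec.get("doi"):
            keys.append(("doi", rec["doi"].lower()))
        # Reuse the first existing paper that any key points to.
        hit = next((by_key[k] for k in keys if k in by_key), None)
        if hit is None:
            hit = len(merged)
            merged.append({**rec, "sources": []})
        paper = merged[hit]
        paper["sources"].append(rec.get("source", "unknown"))
        if rec.get("doi") and not paper.get("doi"):
            paper["doi"] = rec["doi"]   # keep identifiers learned from later sources
        for k in keys:
            by_key[k] = hit
    return merged
```

This single-pass version does not merge two already-distinct papers that a later record links together; a production pipeline would likely need a transitive merge for that case.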

### 3.2 Author identifier resolution

Depending on its origin, each paper carries Semantic Scholar, OpenAlex, ACL Anthology, DBLP key, and OpenReview-id metadata where available. We cross-link author identities into a single author\_uid via union-find over (a) exact-match shared IDs, (b) OpenReview profile-stated DBLP/ORCID/Google Scholar handles, and (c) name + coauthor block uniqueness when two records share an unambiguous canonical name. This results in 320,775 distinct authors. Appendix Table [6](https://arxiv.org/html/2607.02416#A1.T6) reports the percent of authors with each external identifier present.
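The union-find linking can be sketched as follows. This is a hedged illustration covering only the shared-identifier rules (a) and (b); the name + coauthor rule (c) is omitted. The input format, the `link_authors` function, and the identifier strings are hypothetical.

```python
class UnionFind:
    """Disjoint-set structure over arbitrary hashable nodes."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def link_authors(records):
    """Map each author record to a single integer author_uid.

    records: dict record_id -> iterable of external identifiers
    (e.g. 'dblp:...', 'orcid:...'). Records sharing any identifier,
    directly or transitively, receive the same uid.
    """
    uf = UnionFind()
    for rec_id, ext_ids in records.items():
        uf.find(("rec", rec_id))               # register records with no IDs too
        for ext in ext_ids:
            uf.union(("rec", rec_id), ("id", ext))
    uid_of_root, out = {}, {}
    for rec_id in records:
        root = uf.find(("rec", rec_id))
        out[rec_id] = uid_of_root.setdefault(root, len(uid_of_root))
    return out
```

Treating identifiers as graph nodes alongside records lets transitive links (record A shares a DBLP key with B, and B shares an ORCID with C) collapse into one component without pairwise comparison.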

### 3.3 Identifying NLP-topic Papers

Not every paper published in General-ML or AI-Broad is on the topic of NLP, so we developed a pipeline to label papers. A Gemma-4-26B-A4B-it judge is prompted with the title and abstract of the paper and provided a description of possible NLP papers grounded in the calls for papers from ACL venues aggregated over multiple years (details in §[A.4](https://arxiv.org/html/2607.02416#A1.SS4)). The prompt was refined across multiple iterations with manual evaluation; the final version contains 2.3K tokens, with descriptions of each NLP subarea/topic, descriptions of out-of-scope topics, decision rules, and 16 examples. The model is asked to generate a terse rationale (up to 12 words), and then assign a YES/NO label of whether the paper is on an NLP topic. We note that the venue identity is masked at inference time, so the label is not collapsible to “appeared at \*ACL”; this prevents confounding the NLP-topic label with venue in later experiments.

To evaluate, we sample and label 402 titles and abstracts, equally balanced across \*ACL, General-ML, and AI-Broad, and stratified across the Gemma 4 model’s YES/NO predictions. Table [1](https://arxiv.org/html/2607.02416#S3.T1) shows the pool-reweighted results. Performance is highest (0.96 F1) for papers in \*ACL and lowest for General-ML (0.85 F1). The residual errors are largely false positives at General-ML and AI-Broad, on papers whose contribution is a generic ML method with an LLM as the application (e.g., quantization or decoding efficiency); genuine NLP tasks, applications, and agents at those venues are recovered at high recall. The total number of NLP-topic papers by venue is shown in Table [2](https://arxiv.org/html/2607.02416#S3.T2).

Table 1: Human-validated performance of the Gemma-4-26B-A4B-it judge (NLP = positive class), broken down by venue.

Table 2: NLP-topic label coverage by paper venue. The non-\*ACL NLP-topic papers are the migrating-paper pool the experiments analyze.

## 4 Are Authors Migrating?

Anecdotal observations suggest that NLP researchers have increasingly published at general ML venues since the LLM era began. Here, we empirically test this observation to assess whether there is a measurable migration, if and when it began, and whether all sub-populations of NLP authors migrate. We quantify potential migration in two ways: (1) a per-author yearly trajectory of venue-category shares, and (2) a baseline-vs-post regression of the per-author $\Delta$share broken out by author seniority. We formalize the study around three research questions (RQs). RQ1: When did the migration begin (if ever), and is there a pre-LLM-era trend? RQ2: How large is the shift for established NLP authors when comparing 2015–2020 to 2021–2026? RQ3: Does the shift differ by author seniority?

### 4.1 Experimental setup

Cohort. To measure migration, we first establish the cohort of authors eligible to migrate. We apply three sequential filters to our candidate pool of 320,775 authors (§[3](https://arxiv.org/html/2607.02416#S3)): (1) we restrict to authors with at least one paper at a tracked-family venue (\*ACL, General-ML, or AI-Broad) in the 2015–2020 baseline window (50,552 remain); (2) we further require two joint criteria, at least 3 NLP-topic papers and an NLP-topic share $\geq 50\%$ of their papers, yielding a pre-attrition pool of 5,403 authors; and (3) we keep only “research-active” authors, requiring at least one NLP-topic paper in 2021–2026, removing those who exited research entirely. Our final cohort consists of $N = 4{,}181$ unique authors (Appendix Table [11](https://arxiv.org/html/2607.02416#A1.T11)).
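The three sequential filters can be sketched as below. This is a minimal illustration under stated assumptions: the input layout and the `select_cohort` name are hypothetical, and the sketch assumes the paper-count and share criteria in filter (2) are evaluated on baseline-window papers, which the text does not state explicitly.

```python
def select_cohort(papers_by_author, baseline=(2015, 2020), post=(2021, 2026),
                  min_nlp=3, min_share=0.5):
    """Return authors passing the three cohort filters, in input order.

    papers_by_author: dict author -> list of (year, is_nlp_topic) tuples,
    already restricted to tracked-family venues (assumed layout).
    """
    cohort = []
    for author, papers in papers_by_author.items():
        base = [nlp for year, nlp in papers if baseline[0] <= year <= baseline[1]]
        # Filter 1: at least one tracked-venue paper in the baseline window.
        if not base:
            continue
        # Filter 2: at least min_nlp NLP-topic papers AND NLP share >= min_share
        # (assumption: both computed over the baseline window).
        n_nlp = sum(base)
        if n_nlp < min_nlp or n_nlp / len(base) < min_share:
            continue
        # Filter 3: still research-active, i.e. an NLP-topic paper in the post window.
        if not any(nlp for year, nlp in papers if post[0] <= year <= post[1]):
            continue
        cohort.append(author)
    return cohort
```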

Time windows. We treat 2015–2020 as our baseline window; these six years span the rise of large pretrained NLP models from early sequence-to-sequence models through GPT-3; 2021–2026 is the LLM era window. We also extend the trajectory analysis back to 2010 to bracket any pre-LLM trend in NLP-venue adherence.

Estimator. Our primary estimator is a single *stacked* ordinary-least-squares regression over the (author, venue category) panel,

$$\Delta\text{share}_{ic} \;\sim\; 0 + C(\text{category})\;[\,+\,C(\text{category})\!:\!C(\text{stratum})\,],$$

fit with one row per author $i$ and category $c$ and cluster-robust standard errors by author. The outcome $\Delta\text{share}_{ic}$ is the change in author $i$’s share of papers at category $c$ from the 2015–20 baseline to the 2021–26 post window. In this two-period collapse, the author fixed effect is period-invariant and cancels, so each $C(\text{category})$ coefficient is the mean *within-author* pre→post change in that category’s venue-mix share. Stacking the categories into one regression yields a single joint Wald test of the null “venue mix unchanged” and shares the author-clustered covariance across an author’s category rows. We fit two variants of the model: (1) the venue-mix share for authors regardless of their position in the author order and (2) the venue-mix share only for an author’s first-or-last-author papers. The latter variant uses a smaller set of papers (54,257 vs. 90,400 author–paper records) but likely focuses the mix on papers where the author had greater influence over where a paper was published.

The two model terms answer our two estimation questions directly. RQ2 (aggregate shift) is read off the $C(\text{category})$ *main effects*, i.e., the average within-author share change in each venue category, pooled over all established authors (Table [13](https://arxiv.org/html/2607.02416#A1.T13)). RQ3 (heterogeneity) adds the $C(\text{category})\!:\!C(\text{stratum})$ *interaction* terms, which let each category’s shift vary by author stratum (Appendix Table [15](https://arxiv.org/html/2607.02416#A1.T15)).
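To make the main-effects model concrete, here is a small sketch of why the category coefficients reduce to per-category means. It is not the authors' code: the function name and input layout are hypothetical, every author is assumed to have exactly one row per category, and the standard errors use the plain CR0 cluster sandwich with no small-sample correction, which may differ from the correction used in the paper.

```python
from math import sqrt

def stacked_category_means(delta):
    """Fit delta_share ~ 0 + C(category) on a balanced author-by-category panel.

    delta: dict author -> dict category -> (post share - pre share).
    Returns (coef, se), each a dict keyed by category.
    """
    authors = list(delta)
    categories = sorted(next(iter(delta.values())))
    n = len(authors)
    # With only category dummies and no intercept, X'X is n times the identity,
    # so each OLS coefficient is the mean of delta within that category.
    coef = {c: sum(delta[a][c] for a in authors) / n for c in categories}
    se = {}
    for c in categories:
        # Cluster sandwich: bread is 1/n per category; each author cluster
        # contributes one row to category c, so the diagonal of the meat is
        # the sum of squared residuals for that category.
        meat = sum((delta[a][c] - coef[c]) ** 2 for a in authors)
        se[c] = sqrt(meat) / n
    return coef, se
```

Clustering by author matters mostly for the off-diagonal covariance between categories, which a joint Wald test would need; this sketch returns only the per-category standard errors. Because each author's shares sum to one in both windows, the coefficients sum to zero across categories.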

Strata. To test for heterogeneity in behavior, we stratify authors in four ways: (a) career age at 2020 (junior $\leq 3$y / mid 4–8y / senior 9–15y); (b) paper-count quartile (computed on baseline papers); (c) $h$-index quartile from S2 / OpenAlex histories; (d) hybrid (career age $\times$ role inferred from OpenReview history when available, otherwise affiliation heuristics).

### 4.2 Results: Yearly trajectory (RQ1)

![Refer to caption](https://arxiv.org/html/2607.02416v1/x1.png)

Figure 1: Per-year mean share of papers at each of the three tracked venue families per cohort author, renormalized to sum to one (topic-active cohort, $N = 4{,}181$). The cohort’s \*ACL share holds near 76–80% through 2020 and then declines to 63% by 2025; General-ML rises monotonically from $\sim$4% in 2015 to 31% in 2025, overtaking AI-Broad (which falls from 19% to 7%) around 2022–2023.

Among research-active NLP researchers, a large shift is underway. As seen in Figure [1](https://arxiv.org/html/2607.02416#S4.F1), in the pre-LLM era, \*ACL venues had a relatively stable share of $\sim$80% from 2015 through 2020, which then declines steadily to 63% by 2025. Over the same window, General-ML rises monotonically from $\approx$4% to 31%, overtaking AI-Broad (which also falls from 19% to 7%) around 2022–2023. This shift in destinations is accelerating, pointing to a future in which General-ML venues may become the predominant source of NLP-topic papers. The cutoff year of 2020 was motivated by the release of the initial LLMs. We also performed a cutoff-sensitivity analysis (Appendix [A.2](https://arxiv.org/html/2607.02416#A1.SS2)), which places the inflection robustly between 2020 and 2022: sliding the pre/post boundary across candidate split years from 2017 to 2022 leaves the \*ACL decline and General-ML rise significant at each one (Table [14](https://arxiv.org/html/2607.02416#A1.T14)), so the migration is not an artifact of the specific 2020 cutoff.

### 4.3 Result: Per-author $\Delta$share (RQ2)

![Refer to caption](https://arxiv.org/html/2607.02416v1/x2.png)

Figure 2: Mean *within-author* change in renormalized venue-mix share (percentage points) from 2015–20 to 2021–26, per venue family, with 95% cluster-robust CIs; any-authorship vs. first-or-last-author. Full coefficients in Appendix Table [13](https://arxiv.org/html/2607.02416#A1.T13).

Both NLP and ML-general venues have seen a surge in submissions and publications; ML-general conferences are growing faster, so a potential explanation for the overall trend in Figure [1](https://arxiv.org/html/2607.02416#S4.F1) is that there are simply more ML-general NLP-topic papers being produced. To rule this out, we fit the stacked regression, which isolates individual authors’ behavior. Figure [2](https://arxiv.org/html/2607.02416#S4.F2) shows the forest plot for both model specifications (coefficients in Appendix Table [13](https://arxiv.org/html/2607.02416#A1.T13)), revealing sharp publication preferences by authors. The shift away from NLP venues is most pronounced when restricting to an author’s first/last-authored papers, suggesting strategic behavior.

### 4.4 Results: Intra-NLP Shifts

Within \*ACL venues, authors have different tiers of publishing, with the ACL Organization adding a “Findings” venue to collect papers that were publishable but not at the level of the associated Main venue. It could be that the growth of General-ML is due to authors receiving a Findings decision when committing a paper to a conference (which is binding) and, instead, withdrawing to improve the paper and resubmitting it to a General-ML venue. Disaggregating \*ACL into Main, Findings, Workshop/Resource, and Journal, seen in Table [3](https://arxiv.org/html/2607.02416#S4.T3), we see this is not the case. Instead, the disaggregation shows a stark picture for \*ACL: the substantial growth of Findings papers accounts for the bulk of NLP-topic papers in \*ACL venues. Without Findings papers, General-ML would account for an even larger share of papers published on NLP topics.

Table 3: NLP-tier split: within-author mean $\Delta$share (2015–20 → 2021–26), in percentage points, for the topic-active cohort ($N = 4{,}181$). The NLP category is disaggregated into Main / Findings / Workshop / Journal tiers; the denominator is the tracked venue universe. Stars use Holm–Bonferroni FWER-adjusted $p$ (within each authorship column): $^{*}p < 0.05$, $^{**}p < 0.01$, $^{***}p < 0.001$ (HC1).

### 4.5 Results: Author Heterogeneity (RQ3)

Do all types of authors migrate equally? We re-fit the stacked regression separately within author strata; the two most informative, career age and $h$-index quartile, are reported in Appendix Table [15](https://arxiv.org/html/2607.02416#A1.T15) with Holm–Bonferroni-corrected $p$-values. The within-author \*ACL decline and General-ML gain hold in every stratum, so the migration is broad rather than confined to one group. It is also essentially uniform across career age: no career-age deviation survives correction. The one substantial exception is by citation impact: the most-cited authors are the least likely to leave \*ACL. Relative to the lowest $h$-index quartile, top-quartile authors show a $+8.4$ pp smaller \*ACL decline ($p < 0.001$) and shed AI-Broad $6.3$ pp faster ($p < 0.001$), while their General-ML gain is unchanged.
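For readers unfamiliar with the Holm–Bonferroni correction used for these stratum comparisons, the standard step-down adjustment can be sketched as follows. The function name and the example p-values in the usage note are illustrative only and are not taken from the paper.

```python
def holm_bonferroni(pvalues):
    """Return Holm step-down adjusted p-values, in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * pvalues[i])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

For three hypothetical tests with raw p-values 0.01, 0.04, and 0.03, the smallest is multiplied by 3, the next by 2, and the largest by 1, after which the running maximum keeps the adjusted values ordered. A deviation "survives correction" at level 0.05 when its adjusted value stays below 0.05.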

## 5 Potential Mechanism of Migration

While authors may have multiple motivations for changing their primary venue, here we examine two potential mechanisms driving the migration: topic and author. As NLP and ML increasingly intersect, new opportunities to combine previously-disparate ideas may lead authors to pursue novel research directions (Uzzi et al., [2013](https://arxiv.org/html/2607.02416#bib.bib40)). Under the *topic-led* hypothesis (H1), the topical content of NLP research has shifted toward areas (e.g., LLM scaling, RLHF, alignment) whose natural venue differs from traditional \*ACL: it is not the authors who moved, but the topics they study, and the venue change follows. Under the *author-led* hypothesis (H2), the same authors working on the same topics increasingly choose ML-general venues over NLP venues: the topical mix between venues is stable, but the venue selection is what changed.

### 5.1 Experimental setup

We split each cohort venue-share change into a composition part (the cohort changed which topics it works on) and a convention part (the field changed where a given topic is published) using an Oaxaca–Blinder decomposition (Oaxaca, [1973](https://arxiv.org/html/2607.02416#bib.bib11); Blinder, [1973](https://arxiv.org/html/2607.02416#bib.bib12)), which is designed to disentangle gaps seen between two groups (here, the pre/post-LLM eras). For each tracked category $c$ and topic $t$, let $w_t^{\text{pre}}$ and $w_t^{\text{post}}$ be the fraction of cohort papers in topic $t$, and $s_{c,t}^{\text{pre}}$ and $s_{c,t}^{\text{post}}$ be the share of topic-$t$ papers landing in category $c$ in each period. The observed cohort $\Delta$share for $c$ decomposes as

$$
\begin{aligned}
\Delta s_c \;=\;& \underbrace{\sum_t \left(w_t^{\text{post}} - w_t^{\text{pre}}\right) s_{c,t}^{\text{pre}}}_{\text{composition (H1)}} \\
&+ \underbrace{\sum_t w_t^{\text{pre}} \left(s_{c,t}^{\text{post}} - s_{c,t}^{\text{pre}}\right)}_{\text{convention (H2)}} \\
&+ \underbrace{\sum_t (\Delta w_t)(\Delta s_{c,t})}_{\text{interaction}}.
\end{aligned}
$$

We report both pre-weight and post-weight references and average the two for stability. Convention measures within-topic venue substitution (supports H2 when large and signed correctly); composition measures the contribution of changes in topic mix (supports H1).
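The three-term decomposition can be checked numerically with a short sketch. It implements only the pre-period-reference form written in the equation; the averaging of pre-weight and post-weight references mentioned in the text is not reproduced here. The function name, the dict-based inputs, and the topic labels in the usage note are assumptions for illustration.

```python
def decompose_share_change(w_pre, w_post, s_pre, s_post):
    """Three-fold decomposition of the change in one venue category's share.

    w_pre, w_post: dict topic -> fraction of cohort papers in that topic.
    s_pre, s_post: dict topic -> share of that topic's papers in category c.
    Returns (composition, convention, interaction); their sum equals
    sum_t w_post[t] * s_post[t] - sum_t w_pre[t] * s_pre[t].
    """
    topics = set(w_pre) | set(w_post)
    composition = convention = interaction = 0.0
    for t in topics:
        dw = w_post.get(t, 0.0) - w_pre.get(t, 0.0)   # change in topic weight
        ds = s_post.get(t, 0.0) - s_pre.get(t, 0.0)   # change in within-topic share
        composition += dw * s_pre.get(t, 0.0)         # topic mix moved, venues held fixed
        convention += w_pre.get(t, 0.0) * ds          # venues moved, topic mix held fixed
        interaction += dw * ds
    return composition, convention, interaction
```

With two made-up topics whose weights move from 0.6/0.4 to 0.3/0.7 and whose category shares move from 0.1/0.2 to 0.1/0.5, the three terms are 0.03, 0.12, and 0.09, which add up to the total share change of 0.24. The identity follows from expanding $(w + \Delta w)(s + \Delta s) - ws$ topic by topic.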

Topic taxonomies. We run the decomposition on two granularities of topic: (i) the OpenAlex primary\_subfield for each paper ($\sim$250 labels, which acts as a population-wide subfield assignment), and (ii) data-driven categories where we embed each title/abstract using SPECTER2 (Singh et al., [2023](https://arxiv.org/html/2607.02416#bib.bib13)) and cluster the embeddings using $k$-means ($k = 50$). The two taxonomies are sensitivity checks on each other: they should agree on which component dominates.

Migrator vs. stayer test. As an independent confirmation of H1 vs. H2, we partition the cohort into migrators ($\Delta\text{share}_{\text{NLP},i} \leq -10\,\text{pp}$) and stayers ($|\Delta\text{share}_{\text{NLP},i}| \leq 5\,\text{pp}$). For each author we build a baseline and a post topic-mix vector over the chosen taxonomy and compute the cosine similarity between the two. Under H1, migrators should have lower self-similarity (their topics changed); under H2, similarities should be comparable.
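A minimal sketch of the grouping rule and the per-author self-similarity follows. The thresholds are the ones stated above; the function names and the sparse dict representation of topic-mix vectors are assumptions for this example.

```python
from math import sqrt

def classify(delta_nlp_share_pp):
    """Assign an author to a group from their change in NLP share (percentage points)."""
    if delta_nlp_share_pp <= -10:
        return "migrator"
    if abs(delta_nlp_share_pp) <= 5:
        return "stayer"
    return None   # authors between the thresholds, or gaining NLP share, are in neither group

def cosine(u, v):
    """Cosine similarity between two sparse topic-mix vectors (dict topic -> weight)."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    # Return 0.0 when either vector is empty or all-zero (a convention chosen here).
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Each author contributes one similarity value, computed between their baseline and post topic mixes; group medians of those values are then compared between migrators and stayers.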

### 5.2 Decomposition results

![Refer to caption](https://arxiv.org/html/2607.02416v1/x3.png)

Figure 3: Results from the Oaxaca–Blinder decompositions of cohort $\Delta$share into its components for composition (topic-mix, H1; shown in red) and convention (within-topic venue, H2; shown in blue). The top uses OpenAlex subfield and bottom uses the $k = 50$ clusters. Note that the NLP-share movement magnitude ($-0.014$) is small because it is paper-weighted and dominated by the heaviest NLP publishers; the per-author within-author NLP shift in §[4](https://arxiv.org/html/2607.02416#S4) is much larger because it weights authors equally.

Using the OpenAlex subfield taxonomy, the cohort \*ACL share change of $-0.014$ reflects two opposing effects: the cohort’s topic mix shifted *away* from NLP\-friendly subfields \(the larger, composition term\), while within those subfields authors leaned slightly back *toward* NLP venues, leaving a small net decline\. The General\-ML gain partly reflects the topic mix moving into ML\-suited subfields, but the majority comes from authors changing their within\-topic venue convention toward General\-ML venues\. The AI\-Broad loss is almost entirely convention: authors did not stop working on AAAI/IJCAI topics; they stopped sending those topics to AAAI/IJCAI\.

Both cross\-family movements are convention\-driven, reflecting within\-topic venue substitution rather than topic change\. The General\-ML gain is majority convention \(58%\) and the AI\-Broad loss is essentially all convention \($\approx$100%\): on the same topics, the cohort increasingly chose General\-ML venues over AAAI/IJCAI\. The \*ACL decline is the exception: it is composition\-driven \(the topic mix drifted out of NLP\-heavy subfields\), with convention pulling slightly the other way\.

A similar result is seen with the bottom\-up topic clustering\. The AI\-Broad loss continues to be convention\-driven\. The \*ACL decline is almost entirely *composition*, indicating that the cohort moved into content clusters that inherently under\-publish at \*ACL venues\. Together, the results suggest the cohort largely kept its topic mix and shifted venue convention, with the finer topic resolution pointing to a smaller topic\-mix shift as well\.

### 5\.3 Migrators vs\. stayers

![Refer to caption](https://arxiv.org/html/2607.02416v1/x4.png) Figure 4: Median per\-author cosine similarity of pre\- vs\. post\-LLM\-era topic mix, stratified by migration group; error bars show bootstrapped confidence intervals and \* denotes $p<0.01$\. At subfield resolution, migrators and stayers are nearly identical \(consistent with H2: the typical migrator kept their general subfield while switching venues\); with the topically finer\-grained $k$=50 clusters, migrators’ topic mixes drift noticeably more, supporting H1, that authors are now working on topics more aligned with General\-ML venues\.

We find partial support for H1 \(topic\-led change\) when examining the topic mixes of migrators and stayers, shown in Figure [4](https://arxiv.org/html/2607.02416#S5.F4)\. With the OpenAlex subfield taxonomy, both groups have a median cosine similarity of $\approx$0\.95 to their earlier subfield mix, whereas with the bottom\-up paper clusters the migrator–stayer gap is large and highly significant in the H1 direction \($p=4\times 10^{-11}$\)\. These differences reflect the granularity of the taxonomies: OpenAlex subfields are coarse \(e\.g\., $\sim$74% of papers fall under the single “Artificial Intelligence” subfield\), while the bottom\-up clusters capture finer, more track\-like groupings\. At that finer level, migrators’ topic mixes drifted more than stayers’, consistent with a topic\-level component to the migration\. Appendix Figure [6](https://arxiv.org/html/2607.02416#A1.F6) shows how the most common clusters’ \*ACL shares changed over time, with declines for clusters covering LLMs, reasoning, and LLM efficiency\.

## 6 Where do new NLP Authors Debut?

The previous study focused on established NLP researchers\. However, new researchers regularly enter the community, and their venue preferences also drive the movement of the field\. As a complementary analysis, we ask whether these new entrants show the same pattern: when a researcher publishes their first three NLP\-topic papers, where are these likely to appear, and has that destination changed over 2019–2024?

### 6\.1 Experimental Setup

##### Identifying New NLP Entrants\.

Identifying new entrants is itself a measurement choice\. We report two cohorts and hold the statistical design fixed across them\. \(1\) The *publication\-record* cohort \($N$=3,568\) includes every author with at least three first\-author NLP\-topic papers at a tracked venue whose first such paper falls in the study window; the author’s entry year is the year of that first NLP\-topic first\-author paper\. \(2\) The *declared\-PhD* cohort \($N$=1,228\) includes researchers whose OpenReview, ORCID, or DBLP\-thesis profile parses to a doctoral start year in the window and who likewise have at least three first\-author NLP\-topic papers; entry year is the declared PhD start\. Requiring at least three first\-author NLP\-topic papers in both cohorts ensures we study where committed NLP newcomers *choose* to publish, not whether they research in NLP, and it fixes the number of analyzed papers at three per author\. Full details of the cohort construction process are in Appendix [A\.1\.1](https://arxiv.org/html/2607.02416#A1.SS1.SSS1)\.

The two cohorts each have their own trade\-offs\. The declared\-PhD cohort has relatively precise entry markers, access to richer metadata through OpenReview, and, because these are PhD students, likely reflects the individuals who will make up the future composition of the field\. However, the declared\-PhD cohort inherits the coverage biases of those platforms: OpenReview was primarily a General\-ML platform prior to its adoption by ACL in 2023, while ORCID is more common among European researchers; both introduce selection biases\. In contrast, the publication\-record cohort avoids platform biases\. However, it includes authors who may not continue in NLP \(e\.g\., undergraduate and master’s students\) and therefore may be less likely to generalize\.

##### Controlling for Advisor Influence\.

Advisors likely shape the path of early\-career researchers and therefore influence which venue a student submits to\. Within our data, a student’s advisor is recovered only from the advisor field of an OpenReview profile, which we resolve to that advisor’s publication record\. To control for the advisor’s venue preference, we include the advisor’s share of NLP\-topic papers in the five years before the student’s entry as a control variable \(together with an indicator for whether any advisor record was found\)\. A computable advisor record is available for 374 of the 1,228 declared\-PhD students \($\approx$30%\)\.

##### Unit of analysis and dependent variable\.

We fit separate student\-level logit models \(one per cohort\), each row a researcher, sharing the *same* dependent variable: $Y_{i}=1$ if a majority of researcher $i$’s first three NLP\-topic first\-author papers are at \*ACL venues, and 0 otherwise\. Both models use the entry cohort year as the regressor of interest\. Because every author contributes exactly three papers, no paper\-count control is needed\. The declared\-PhD model additionally includes the advisor controls\. Because the two models share the dependent variable, the cohort\-year regressor, and the estimator, their cohort\-year coefficients are directly comparable\.
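The dependent variable and a single\-regressor logit can be sketched in plain Python\. This is a minimal illustration under assumptions, not the authors' pipeline: the venue labels in `ACL_FAMILY`, the helper names, and the toy data are hypothetical, and the Newton\-Raphson fit includes only an intercept and the cohort\-year term \(no advisor controls\)\.

```python
import math

# Hypothetical venue labels standing in for the paper's *ACL venue family.
ACL_FAMILY = {"ACL", "EMNLP", "NAACL", "EACL", "Findings"}


def debut_at_acl(first_three_venues):
    """Y_i = 1 if a majority (2 or 3) of the three debut papers are at *ACL venues."""
    if len(first_three_venues) != 3:
        raise ValueError("expected exactly three debut papers")
    hits = sum(v in ACL_FAMILY for v in first_three_venues)
    return 1 if hits >= 2 else 0


def fit_logit(x, y, iters=25):
    """Newton-Raphson fit of P(y=1) = sigmoid(a + b*x); returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of X'WX (negative Hessian)
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Toy data: entry-year offset (0 or 1) and the debut indicator for 20 researchers.
x = [0] * 10 + [1] * 10
y = [1] * 8 + [0] * 2 + [1] * 6 + [0] * 4
a, b = fit_logit(x, y)
print(debut_at_acl(["ACL", "NeurIPS", "EMNLP"]), round(b, 3))
```

With a binary regressor the fitted slope equals the difference in empirical log\-odds between the two groups, which makes the toy fit easy to verify by hand; a negative slope corresponds to later cohorts being less likely to debut mostly at \*ACL venues\.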

### 6\.2 Results

Table 4: Student\-level logit models for whether the majority of a student’s first three NLP\-topic first\-author papers are at \*ACL venues\. Both models show the same significant decline in NLP\-venue debut across cohort years, and in the declared\-PhD model that decline is essentially unchanged when the advisor controls are dropped \($-0.125$\)\. ${}^{*}p<0.05$, ${}^{**}p<0.01$, ${}^{***}p<0.001$\.

The results from both cohort definitions, seen in Table [4](https://arxiv.org/html/2607.02416#S6.T4), indicate a consistent decline in the odds that the majority of a new entrant’s first three NLP\-topic papers appear in \*ACL venues\. In the publication\-record cohort, the share debuting mostly at \*ACL falls from 84% \(2019\) to 74% \(2024\), while the General\-ML share rises from 5% to 21%\. In the declared\-PhD cohort, this effect survives controlling for the advisor’s prior \*ACL paper distribution: even though having an established \*ACL\-publishing advisor makes a student more likely to publish in those venues, there is still a general trend towards General\-ML\. The advisor’s NLP rate is a strong positive predictor \($\beta=+1.38$, $p<10^{-5}$\): net of the cohort\-year decline, students advised by predominantly\-NLP faculty are markedly more likely to debut at NLP venues, indicating that the generational shift operates on top of, not merely through, advisor topic inheritance\.
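As a rough consistency check on the effect size \(endpoint arithmetic on the shares quoted above, not a re\-estimate of the fitted models\), the publication\-record shares can be converted to log\-odds:

$$
\operatorname{logit}(0.84)=\ln\frac{0.84}{0.16}\approx 1.658,\qquad
\operatorname{logit}(0.74)=\ln\frac{0.74}{0.26}\approx 1.046,\qquad
\frac{1.046-1.658}{2024-2019}\approx -0.122\ \text{per cohort year}.
$$

This is close in magnitude to the $-0.125$ quoted in the Table 4 caption for the declared\-PhD model without advisor controls, although that figure comes from a different cohort and from a fitted slope rather than two endpoints\. Assuming the coefficient is expressed in log\-odds per cohort year, $e^{-0.125}\approx 0.88$, i\.e\., roughly a 12% drop in the odds of a mostly\-\*ACL debut with each later entry year\.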

## 7 The Citation Premium

If multiple venues are a topical fit, why might a scholar move? Prior work suggests that authors seek status and reputation rewards in choosing where to publish, with one common driver being the impact factor of the venue \(Tenopir et al\., [2016](https://arxiv.org/html/2607.02416#bib.bib9); Salinas and Munch, [2015](https://arxiv.org/html/2607.02416#bib.bib8)\); indeed, the most\-cited authors in an area migrate the least topically \(Petersen et al\., [2014](https://arxiv.org/html/2607.02416#bib.bib47)\)\. One natural, observable reward to model is the number of citations a paper receives\. If publishing at a General\-ML venue produces meaningfully more citations than publishing the same paper at an \*ACL venue, the migration could be partly explained by a rational author response to differing citation rewards\. Testing this experimentally is nearly impossible, as two identical papers would need to be published in different venues to see which attracts more citations; instead, we draw on causal inference techniques to match similar papers across venues and estimate the citation premium by venue category, if any\.

### 7\.1 Experimental Setup

We fit two regression models in which the outcome is $\log(1+\text{cites})$, using Semantic Scholar citation counts\. To control for paper topic, we include a fixed effect for each paper’s cluster membership in the $k$=50 SPECTER2 topic clustering; we also account for overall growth in the field with publication\-year fixed effects\. We restrict our analysis to \*ACL and General\-ML first\- or last\-authored papers, and compare their citations *within cluster and year*\. The two citation\-premium models differ only in how matching is handled\. \(1\) *All papers*: we match each General\-ML paper to its three same\-year nearest \*ACL neighbors corpus\-wide \(median cosine similarity 0\.92\), pool the matched papers \(controls down\-weighted by 1/3\), and regress on a General\-ML indicator with year fixed effects\. The nearest\-neighbor match serves as a content control, i\.e\., a counterfactual of how the General\-ML paper would have done if published in an \*ACL venue\. \(2\) *Within\-author*: for authors with publications in both \*ACL and General\-ML venues, we match each author’s General\-ML paper to that same author’s own nearest \*ACL paper, and add author fixed effects\. This model is fit only on the papers of the 4,015 authors who publish at both venue families\. Because a single same\-author match is looser than the corpus\-wide one, we also add $k$=50 SPECTER2 cluster and year fixed effects to absorb residual topic differences\.

The two regressions bound the role of author selection\. One potential explanation for a premium is that researchers who publish at General\-ML venues are systematically higher\-ceiling to begin with \(more coauthors, bigger followings, more cited in general\), so their papers get more citations regardless of venue\. The within\-author model tests this by estimating the premium with author identity held fixed; if the premium were due to higher\- or lower\-impact authors sorting by venue, the gap would collapse in this model\.
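The "all papers" matching step can be sketched as follows\. This is a simplified illustration under stated assumptions, not the authors' code: the record fields \(`year`, `emb`, `cites`\), the function names, and the toy papers are hypothetical, and the estimate is a weighted difference in mean $\log(1+\text{cites})$ that omits the year and cluster fixed effects of the actual regressions\.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matched_premium(ml_papers, acl_papers, k=3):
    """Match each General-ML paper to its k same-year nearest *ACL papers and
    return the difference in mean log(1 + cites), treated minus matched controls.

    Each control carries weight 1/k, so with equal weights the weighted control
    mean reduces to the plain mean over all matched controls.
    """
    treated, controls = [], []
    for p in ml_papers:
        pool = [q for q in acl_papers if q["year"] == p["year"]]
        if len(pool) < k:
            continue  # not enough same-year candidates to form a match
        nearest = sorted(pool, key=lambda q: cosine(p["emb"], q["emb"]), reverse=True)[:k]
        treated.append(math.log1p(p["cites"]))
        controls.extend(math.log1p(q["cites"]) for q in nearest)
    if not treated:
        raise ValueError("no General-ML paper could be matched")
    return sum(treated) / len(treated) - sum(controls) / len(controls)


# Toy records (made-up embeddings and citation counts).
ml_papers = [{"year": 2020, "emb": [1.0, 0.0], "cites": 99}]
acl_papers = [
    {"year": 2020, "emb": [1.0, 0.1], "cites": 9},
    {"year": 2020, "emb": [1.0, 0.2], "cites": 9},
    {"year": 2020, "emb": [0.9, 0.3], "cites": 9},
    {"year": 2020, "emb": [0.0, 1.0], "cites": 999},   # same year, far in content
    {"year": 2019, "emb": [1.0, 0.0], "cites": 9999},  # close in content, wrong year
]
premium = matched_premium(ml_papers, acl_papers)
print(round(premium, 3))
```

In the toy data the far\-away and wrong\-year \*ACL papers are never selected as controls, so the estimate is simply $\log(100)-\log(10)$ log units\.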

Table 5: Fits of the two pooled citation\-premium regressions\. Publication\-year effects are relative to 2015; the $k$=50 cluster dummies are included in the models but only summarized here\. SEs are in parentheses; ${}^{*}p<0.05$, ${}^{**}p<0.01$, ${}^{***}p<0.001$\.

##### The premium is large, and author selection does not explain it\.

Table [5](https://arxiv.org/html/2607.02416#S7.T5) reports the pooled estimates\. Matched to their content\-nearest \*ACL papers, General\-ML papers earn $+0.557$ log units more citations \($p$ effectively zero; $\approx+75\%$\) across all papers\. When the comparison is held *within author* \(each switcher’s own General\-ML paper against their own nearest \*ACL paper\), the premium is $+0.777$ \($\approx+118\%$\)\. Holding author identity fixed does not shrink the premium; if anything it is larger, so the gap is not an artifact of higher\-reach authors sorting into General\-ML venues\. The citation premium is a content\-conditional venue effect, not a property of who publishes where\.
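Because the outcome is $\log(1+\text{cites})$, a coefficient $\beta$ on the General\-ML indicator corresponds to a multiplicative change of $e^{\beta}$ in $1+\text{cites}$\. The percentages quoted above follow from this conversion:

$$
e^{0.557}-1\approx 0.745\ (\approx +75\%),\qquad e^{0.777}-1\approx 1.175\ (\approx +118\%).
$$

Strictly, the ratio applies to $1+\text{cites}$ rather than to the raw citation count, so reading it as a percentage change in citations is an approximation that is tightest for well\-cited papers\.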

## 8 Conclusion

Where should NLP research be published? Over the past decade, authors working in NLP have gained an expanded set of venues to choose from as research on LLMs has blurred the lines between NLP, ML, and AI\. Across our analyses, we find that a major shift is underway: established NLP authors are increasingly moving their work from \*ACL venues to more general ML venues like ICLR and NeurIPS\. Decomposing this movement, we find it is driven primarily by *convention* \(the same authors, working on largely the same topics, increasingly choosing General\-ML venues\) rather than by authors leaving NLP topics behind; a smaller, finer\-grained shift toward content that is more common at ML venues accounts for the remainder\. The shift is also generational: new NLP researchers increasingly debut their work at General\-ML venues rather than \*ACL\. And it is plausibly reinforced by a citation premium at General\-ML venues, where content\-matched work attracts more citations\. Read through a science\-of\-science lens, the NLP\-to\-ML shift is one axis of the field\-level mobility that researchers display over their careers \(Zeng et al\., [2019](https://arxiv.org/html/2607.02416#bib.bib52); Jia et al\., [2017](https://arxiv.org/html/2607.02416#bib.bib53)\): not authors abandoning their topics so much as a community reassigning where those topics are published, with the next generation entering at the venues the rewards now favor\. Our findings inform open questions on the future of NLP: how \*ACL venues might respond to the shift; how generational replacement interacts with the venue prestige hierarchy; and whether the citation premium is durable or an artifact of the LLM era’s outsized attention to General\-ML venues\.

## Limitations

##### Topic\-judge precision is asymmetric across venues\.

The venue\-blind LLM judge \(Table [1](https://arxiv.org/html/2607.02416#S3.T1)\) is most accurate on \*ACL papers \(precision 0\.94, recall 0\.98\) and slightly weaker where NLP\-topic papers are rare, with precision and recall of 0\.91 and 0\.88 at AI\-broad venues and 0\.79 and 0\.92 at ML\-general venues\. The cohort filters in §[4\.1](https://arxiv.org/html/2607.02416#S4.SS1) and §[6](https://arxiv.org/html/2607.02416#S6) rely on baseline\-window labels, where high recall protects against dropping genuine NLP authors, but post\-period NLP\-topic counts in the rare\-positive outcome categories are inflated by the residual over\-firing \(lower precision\) at ML\-general venues\. We have not yet applied an imperfect\-classifier measurement\-error correction to the primary $\Delta$share estimates; this is the most important outstanding robustness check \(§[A\.2](https://arxiv.org/html/2607.02416#A1.SS2)\)\. Approximately 9% of the cohort papers lack an abstract and fall back to a title\-only judge call\.

##### Structural venue changes\.

Multiple venues with no pre\-LLM\-era equivalent appear in the post window\. Most notably, the \*ACL “Findings” venue accounts for a substantial volume of papers, and newer General\-ML venues such as “COLM” \(2024\) and “TMLR” \(2022\) also appear\. These are sources of asymmetric supply expansion within our NLP and ML\-general categories\. The Findings expansion biases the post\-2020 NLP supply upward \(more \*ACL slots, especially relevant for the trajectory in §[4\.2](https://arxiv.org/html/2607.02416#S4.SS2)\); the COLM and TMLR launches and the NeurIPS D&B track bias the post\-2022 ML\-general supply upward\. The sliding\-cutoff robustness check \(Appendix [A\.2](https://arxiv.org/html/2607.02416#A1.SS2)\) bounds the sensitivity to these structural changes\.

##### PhD Cohort Platform Bias\.

The PhD\-student cohort is constructed primarily from OpenReview profiles, which over\-represent authors who have submitted to OpenReview\-managed venues \(ICLR, NeurIPS, COLM, TMLR\)\. Students whose first first\-author submissions are exclusively to \*ACL prior to 2023 \(when ACL also switched to OpenReview\) or to non\-OpenReview\-managed venues are under\-represented in our cohort\. The DBLP and ORCID\-education supplement we include mitigates but does not eliminate this bias; the magnitude of the NLP\-debut shift may therefore be biased toward smaller values\.

##### Causal vs\. Descriptive Analysis\.

This paper documents empirical patterns of scholar behavior\. The mechanisms behind those behaviors are complex and not easily or precisely quantified through causal analysis\. While we use the agentic verb “migrate,” we are only observing venue\-share movement, not author intent\. Further, while we have adopted matching procedures from causal inference, we do not make a causal claim that publishing a paper in a General\-ML venue will increase citation counts\.

## Ethics

This work analyses aggregated bibliometric data about publication venues, authorship, and citation counts of academic papers\. All data is derived from public sources and we do not identify individual researchers in our claims\. Individual paper records were used solely for content\-matching \(SPECTER2 embeddings\) and aggregate\-statistics construction\.

All public data was collected from sources designed for open access through official releases or their API: ACL Anthology, Semantic Scholar \(under authenticated API terms\), OpenAlex, OpenReview public profile/paper records, PMLR proceedings, DBLP bulk export, and ORCID educations where the researcher elected to make them public\.

## Acknowledgments

The author would like to especially thank all the researchers and engineering teams behind the ACL Anthology, Semantic Scholar, OpenAlex, OpenReview, and DBLP\. Maintaining these systems is real work, especially with the growing number of papers and submissions, so I’m grateful for the effort to make this data open and enable studies like this one\.

## References

- M\. Abdalla, J\. P\. Wahle, T\. Ruas, A\. Névéol, F\. Ducel, S\. Mohammad, and K\. Fort \(2023\) The elephant in the room: analyzing the presence of big tech in natural language processing research\. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\), A\. Rogers, J\. Boyd\-Graber, and N\. Okazaki \(Eds\.\), Toronto, Canada, pp\. 13141–13160\. External Links: [Link](https://aclanthology.org/2023.acl-long.734/), [Document](https://dx.doi.org/10.18653/v1/2023.acl-long.734) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- A\. Anderson, D\. Jurafsky, and D\. A\. McFarland \(2012\) Towards a computational history of the ACL: 1980\-2008\. In Proceedings of the ACL\-2012 Special Workshop on Rediscovering 50 Years of Discoveries, R\. E\. Banchs \(Ed\.\), Jeju Island, Korea, pp\. 13–21\. External Links: [Link](https://aclanthology.org/W12-3202/) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p4.1)\.
- P\. Azoulay, T\. Stuart, and Y\. Wang \(2014\) Matthew: effect or fable? Management Science 60\(1\), pp\. 92–109\. External Links: [Document](https://dx.doi.org/10.1287/mnsc.2013.1755) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p2.1)\.
- E\. M\. Bender, T\. Gebru, A\. McMillan\-Major, and S\. Shmitchell \(2021\) On the dangers of stochastic parrots: can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency \(FAccT ’21\), New York, NY, USA, pp\. 610–623\. External Links: [Document](https://dx.doi.org/10.1145/3442188.3445922) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- E\. M\. Bender and A\. Koller \(2020\) Climbing towards NLU: On meaning, form, and understanding in the age of data\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D\. Jurafsky, J\. Chai, N\. Schluter, and J\. Tetreault \(Eds\.\), Online, pp\. 5185–5198\. External Links: [Link](https://aclanthology.org/2020.acl-main.463/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-main.463) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- S\. Bird, R\. Dale, B\. Dorr, B\. Gibson, M\. Joseph, M\. Kan, D\. Lee, B\. Powley, D\. Radev, and Y\. F\. Tan \(2008\) The ACL Anthology reference corpus: a reference dataset for bibliographic research in computational linguistics\. In Proceedings of the Sixth International Conference on Language Resources and Evaluation \(LREC’08\), N\. Calzolari, K\. Choukri, B\. Maegaard, J\. Mariani, J\. Odijk, S\. Piperidis, and D\. Tapias \(Eds\.\), Marrakech, Morocco\. External Links: [Link](https://aclanthology.org/L08-1005/) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p4.1)\.
- A\. Birhane, P\. Kalluri, D\. Card, W\. Agnew, R\. Dotan, and M\. Bao \(2022\) The values encoded in machine learning research\. In 2022 ACM Conference on Fairness, Accountability, and Transparency \(FAccT ’22\), pp\. 173–184\. External Links: [Document](https://dx.doi.org/10.1145/3531146.3533083) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- A\. S\. Blinder \(1973\) Wage discrimination: reduced form and structural estimates\. Journal of Human Resources, pp\. 436–455\. Cited by: [§5\.1](https://arxiv.org/html/2607.02416#S5.SS1.p1.11)\.
- S\. L\. Blodgett, S\. Barocas, H\. Daumé III, and H\. Wallach \(2020\) Language \(technology\) is power: a critical survey of “bias” in NLP\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D\. Jurafsky, J\. Chai, N\. Schluter, and J\. Tetreault \(Eds\.\), Online, pp\. 5454–5476\. External Links: [Link](https://aclanthology.org/2020.acl-main.485/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-main.485) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- M\. Bollmann and D\. Elliott \(2020\) On forgetting to cite older papers: an analysis of the ACL Anthology\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D\. Jurafsky, J\. Chai, N\. Schluter, and J\. Tetreault \(Eds\.\), Online, pp\. 7819–7827\. External Links: [Link](https://aclanthology.org/2020.acl-main.699/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-main.699) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- R\. Bommasani, D\. A\. Hudson, E\. Adeli, R\. Altman, S\. Arora, S\. von Arx, et al\. \(2021\) On the opportunities and risks of foundation models\. arXiv preprint arXiv:2108\.07258\. External Links: [Document](https://dx.doi.org/10.48550/arXiv.2108.07258) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- L\. Bornmann and H\. Daniel \(2008\) What do citation counts measure? a review of studies on citing behavior\. Journal of Documentation 64\(1\), pp\. 45–80\. External Links: [Document](https://dx.doi.org/10.1108/00220410810844150) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- T\. Brown, B\. Mann, N\. Ryder, M\. Subbiah, J\. D\. Kaplan, P\. Dhariwal, A\. Neelakantan, P\. Shyam, G\. Sastry, A\. Askell, et al\. \(2020\) Language models are few\-shot learners\. Advances in Neural Information Processing Systems 33, pp\. 1877–1901\. Cited by: [§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- D\. Card, P\. Henderson, U\. Khandelwal, R\. Jia, K\. Mahowald, and D\. Jurafsky \(2020\) With little power comes great responsibility\. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing \(EMNLP\), B\. Webber, T\. Cohn, Y\. He, and Y\. Liu \(Eds\.\), Online, pp\. 9263–9274\. External Links: [Link](https://aclanthology.org/2020.emnlp-main.745/), [Document](https://dx.doi.org/10.18653/v1/2020.emnlp-main.745) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- Conference on Language Modeling \(2024\) COLM: the conference on language modeling\. Note: [https://colmweb\.org/](https://colmweb.org/) Inaugural conference, University of Pennsylvania, October 2024\. Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- D\. J\. de Solla Price \(1965\) Networks of scientific papers\. Science 149\(3683\), pp\. 510–515\. External Links: [Document](https://dx.doi.org/10.1126/science.149.3683.510) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p2.1)\.
- J\. Dodge, S\. Gururangan, D\. Card, R\. Schwartz, and N\. A\. Smith \(2019\) Show your work: improved reporting of experimental results\. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing \(EMNLP\-IJCNLP\), K\. Inui, J\. Jiang, V\. Ng, and X\. Wan \(Eds\.\), Hong Kong, China, pp\. 2185–2194\. External Links: [Link](https://aclanthology.org/D19-1224/), [Document](https://dx.doi.org/10.18653/v1/D19-1224) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- M\. R\. Frank, D\. Wang, M\. Cebrian, and I\. Rahwan \(2019\) The evolution of citation graphs in artificial intelligence research\. Nature Machine Intelligence 1\(2\), pp\. 79–85\. External Links: [Document](https://dx.doi.org/10.1038/s42256-019-0024-5) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- E\. Garfield \(2006\) The history and meaning of the journal impact factor\. JAMA 295\(1\), pp\. 90–93\. External Links: [Document](https://dx.doi.org/10.1001/jama.295.1.90) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- P\. Ginsparg \(2011\) ArXiv at 20\. Nature 476\(7359\), pp\. 145–147\. External Links: [Document](https://dx.doi.org/10.1038/476145a) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- O\. E\. Gundersen and S\. Kjensmo \(2018\) State of the art: reproducibility in artificial intelligence\. Proceedings of the AAAI Conference on Artificial Intelligence 32\(1\)\. External Links: [Document](https://dx.doi.org/10.1609/aaai.v32i1.11503) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- J\. Hoffmann, S\. Borgeaud, A\. Mensch, E\. Buchatskaya, T\. Cai, E\. Rutherford, D\. de Las Casas, L\. A\. Hendricks, J\. Welbl, A\. Clark, et al\. \(2022\) Training compute\-optimal large language models\. In Proceedings of the 36th International Conference on Neural Information Processing Systems, pp\. 30016–30030\. Cited by: [§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- Y\. Hou, C\. Jochim, M\. Gleize, F\. Bonin, and D\. Ganguly \(2019\) Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboards construction\. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, A\. Korhonen, D\. Traum, and L\. Màrquez \(Eds\.\), Florence, Italy, pp\. 5203–5213\. External Links: [Link](https://aclanthology.org/P19-1513/), [Document](https://dx.doi.org/10.18653/v1/P19-1513) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- E\. J\. Hu, Y\. Shen, P\. Wallis, Z\. Allen\-Zhu, Y\. Li, S\. Wang, L\. Wang, W\. Chen, et al\. \(2022\) LoRA: low\-rank adaptation of large language models\. ICLR 1\(2\), pp\. 3\. Cited by: [§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- T\. Jia, D\. Wang, and B\. K\. Szymanski \(2017\) Quantifying patterns of research\-interest evolution\. Nature Human Behaviour 1\(4\), pp\. 0078\. External Links: [Document](https://dx.doi.org/10.1038/s41562-017-0078) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p2.1), [§8](https://arxiv.org/html/2607.02416#S8.p1.1)\.
- D\. Jurgens, S\. Kumar, R\. Hoover, D\. McFarland, and D\. Jurafsky \(2018\) Measuring the evolution of a scientific field through citation frames\. Transactions of the Association for Computational Linguistics 6, pp\. 391–406\. Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- R\. Kinney, C\. Anastasiades, R\. Authur, I\. Beltagy, J\. Bragg, A\. Buraczynski, I\. Cachola, S\. Candra, Y\. Chandrasekhar, A\. Cohan, M\. Crawford, D\. Downey, J\. Dunkelberger, O\. Etzioni, R\. Evans, S\. Feldman, J\. Gorney, D\. Graham, F\. Hu, R\. Huff, D\. King, S\. Kohlmeier, B\. Kuehl, M\. Langan, D\. Lin, H\. Liu, K\. Lo, J\. Lochner, K\. MacMillan, T\. Murray, C\. Newell, S\. Rao, S\. Rohatgi, P\. Sayre, Z\. Shen, A\. Singh, L\. Soldaini, S\. Subramanian, A\. Tanaka, A\. D\. Wade, L\. Wagner, L\. L\. Wang, C\. Wilhelm, C\. Wu, J\. Yang, A\. Zamarron, M\. Van Zuylen, and D\. S\. Weld \(2023\) The Semantic Scholar open data platform\. arXiv preprint arXiv:2301\.10140\. External Links: [Document](https://dx.doi.org/10.48550/arXiv.2301.10140) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- M\. Krenn, L\. Buffoni, B\. Coutinho, S\. Eppel, J\. G\. Foster, A\. Gritsevskiy, H\. Lee, Y\. Lu, J\. P\. Moutinho, N\. Sanjabi, R\. Sonthalia, N\. M\. Tran, F\. Valente, Y\. Xie, R\. Yu, and M\. Kopp \(2023\) Forecasting the future of artificial intelligence with machine learning\-based link prediction in an exponentially growing knowledge network\. Nature Machine Intelligence 5\(11\), pp\. 1326–1335\. External Links: [Document](https://dx.doi.org/10.1038/s42256-023-00735-0) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- V\. Larivière, C\. R\. Sugimoto, B\. Macaluso, S\. Milojević, B\. Cronin, and M\. Thelwall \(2014\) ArXiv e\-prints and the journal of record: an analysis of roles and relationships\. Journal of the Association for Information Science and Technology 65\(6\), pp\. 1157–1169\. External Links: [Document](https://dx.doi.org/10.1002/asi.23044) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- Z\. C\. Lipton and J\. Steinhardt \(2019\) Troubling trends in machine learning scholarship\. Queue 17\(1\), pp\. 45–77\. External Links: [Document](https://dx.doi.org/10.1145/3317287.3328534) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- K\. Lo, L\. L\. Wang, M\. Neumann, R\. Kinney, and D\. Weld \(2020\) S2ORC: the Semantic Scholar open research corpus\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D\. Jurafsky, J\. Chai, N\. Schluter, and J\. Tetreault \(Eds\.\), Online, pp\. 4969–4983\. External Links: [Link](https://aclanthology.org/2020.acl-main.447/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-main.447) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- R\. K\. Merton \(1968\) The Matthew effect in science\. Science 159\(3810\), pp\. 56–63\. External Links: [Document](https://dx.doi.org/10.1126/science.159.3810.56) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p2.1)\.
- J\. Michael, A\. Holtzman, A\. Parrish, A\. Mueller, A\. Wang, A\. Chen, D\. Madaan, N\. Nangia, R\. Y\. Pang, J\. Phang, and S\. R\. Bowman \(2023\) What do NLP researchers believe? results of the NLP community metasurvey\. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics \(Volume 1: Long Papers\), A\. Rogers, J\. Boyd\-Graber, and N\. Okazaki \(Eds\.\), Toronto, Canada, pp\. 16334–16368\. External Links: [Link](https://aclanthology.org/2023.acl-long.903/), [Document](https://dx.doi.org/10.18653/v1/2023.acl-long.903) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- S\. M\. Mohammad \(2020a\) Examining citations of natural language processing literature\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, D\. Jurafsky, J\. Chai, N\. Schluter, and J\. Tetreault \(Eds\.\), Online, pp\. 5199–5209\. External Links: [Link](https://aclanthology.org/2020.acl-main.464/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-main.464) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- S\. M\. Mohammad \(2020b\) NLP scholar: a dataset for examining the state of NLP research\. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France, pp\. 868–877\. External Links: [Link](https://aclanthology.org/2020.lrec-1.109/), ISBN 979\-10\-95546\-34\-4 Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p4.1)\.
- S\. M\. Mohammad \(2020c\) NLP scholar: an interactive visual explorer for natural language processing literature\. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, A\. Celikyilmaz and T\. Wen \(Eds\.\), Online, pp\. 232–255\. External Links: [Link](https://aclanthology.org/2020.acl-demos.27/), [Document](https://dx.doi.org/10.18653/v1/2020.acl-demos.27) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p4.1)\.
- R\. Oaxaca \(1973\) Male\-female wage differentials in urban labor markets\. International Economic Review, pp\. 693–709\. Cited by: [§5\.1](https://arxiv.org/html/2607.02416#S5.SS1.p1.11)\.
- L\. Ouyang, J\. Wu, X\. Jiang, D\. Almeida, C\. Wainwright, P\. Mishkin, C\. Zhang, S\. Agarwal, K\. Slama, A\. Ray, et al\. \(2022\) Training language models to follow instructions with human feedback\. Advances in Neural Information Processing Systems 35, pp\. 27730–27744\. Cited by: [§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- A\. M\. Petersen, S\. Fortunato, R\. K\. Pan, K\. Kaski, O\. Penner, A\. Rungi, M\. Riccaboni, H\. E\. Stanley, and F\. Pammolli \(2014\) Reputation and impact in academic careers\. Proceedings of the National Academy of Sciences 111\(43\), pp\. 15316–15321\. External Links: [Document](https://dx.doi.org/10.1073/pnas.1323111111) Cited by: [§7](https://arxiv.org/html/2607.02416#S7.p1.1)\.
- A\. Pramanick, Y\. Hou, S\. Mohammad, and I\. Gurevych \(2023\) A diachronic analysis of paradigm shifts in NLP research: when, how, and why? In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, H\. Bouamor, J\. Pino, and K\. Bali \(Eds\.\), Singapore, pp\. 2312–2326\. External Links: [Link](https://aclanthology.org/2023.emnlp-main.142/), [Document](https://dx.doi.org/10.18653/v1/2023.emnlp-main.142) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- J\. Priem, H\. Piwowar, and R\. Orr \(2022\) OpenAlex: a fully\-open index of scholarly works, authors, venues, institutions, and concepts\. arXiv preprint arXiv:2205\.01833\. External Links: [Document](https://dx.doi.org/10.48550/arXiv.2205.01833) Cited by: [§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- D\. R\. Radev, P\. Muthukrishnan, V\. Qazvinian, and A\. Abu\-Jbara \(2013\)The ACL anthology network corpus\.Language Resources and Evaluation47\(4\),pp\. 919–944\.External Links:[Document](https://dx.doi.org/10.1007/s10579-012-9211-2)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p4.1)\.
- A\. Rogers, T\. Baldwin, and K\. Leins \(2021\)‘Just what do you think you’re doing, dave?’ a checklist for responsible data use in NLP\.InFindings of the Association for Computational Linguistics: EMNLP 2021,M\. Moens, X\. Huang, L\. Specia, and S\. W\. Yih \(Eds\.\),Punta Cana, Dominican Republic,pp\. 4821–4833\.External Links:[Link](https://aclanthology.org/2021.findings-emnlp.414/),[Document](https://dx.doi.org/10.18653/v1/2021.findings-emnlp.414)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- M\. Rungta, J\. Singh, S\. M\. Mohammad, and D\. Yang \(2022\)Geographic citation gaps in NLP research\.InProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing,Y\. Goldberg, Z\. Kozareva, and Y\. Zhang \(Eds\.\),Abu Dhabi, United Arab Emirates,pp\. 1371–1383\.External Links:[Link](https://aclanthology.org/2022.emnlp-main.89/),[Document](https://dx.doi.org/10.18653/v1/2022.emnlp-main.89)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- S\. Salinas and S\. B\. Munch \(2015\)Where should i send it? optimizing the submission decision process\.PLoS one10\(1\),pp\. e0115451\.Cited by:[§7](https://arxiv.org/html/2607.02416#S7.p1.1)\.
- N\. Schluter \(2018\)The glass ceiling in NLP\.InProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing,E\. Riloff, D\. Chiang, J\. Hockenmaier, and J\. Tsujii \(Eds\.\),Brussels, Belgium,pp\. 2793–2798\.External Links:[Link](https://aclanthology.org/D18-1301/),[Document](https://dx.doi.org/10.18653/v1/D18-1301)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- D\. Sculley, J\. Snoek, A\. Wiltschko, and A\. Rahimi \(2018\)Winner’s curse? on pace, progress, and empirical rigor\.In6th International Conference on Learning Representations, ICLR 2018, Workshop Track Proceedings,External Links:[Link](https://openreview.net/forum?id=rJWF0Fywf)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p6.1)\.
- P\. O\. Seglen \(1997\)Why the impact factor of journals should not be used for evaluating research\.BMJ314\(7079\),pp\. 498–502\.External Links:[Document](https://dx.doi.org/10.1136/bmj.314.7079.497)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- A\. Singh, M\. D’Arcy, A\. Cohan, D\. Downey, and S\. Feldman \(2023\)Scirepeval: a multi\-format benchmark for scientific document representations\.InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing,pp\. 5548–5566\.Cited by:[§5\.1](https://arxiv.org/html/2607.02416#S5.SS1.p2.3)\.
- I\. Tahamtan, A\. Safipour Afshar, and K\. Ahamdzadeh \(2016\)Factors affecting number of citations: a comprehensive review of the literature\.Scientometrics107\(3\),pp\. 1195–1225\.External Links:[Document](https://dx.doi.org/10.1007/s11192-016-1889-2)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- C\. Tenopir, E\. Dalton, A\. Fish, L\. Christian, M\. Jones, and M\. Smith \(2016\)What motivates authors of scholarly articles? the importance of journal attributes and potential audience on publication choice\.Publications4\(3\),pp\. 22\.Cited by:[§7](https://arxiv.org/html/2607.02416#S7.p1.1)\.
- Transactions on Machine Learning Research \(2022\)Announcing the Transactions on Machine Learning Research\.Note:[https://jmlr\.org/tmlr/news/2022/launch\.html](https://jmlr.org/tmlr/news/2022/launch.html)Journal launch announcement; ISSN 2835\-8856Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p7.1)\.
- B\. Uzzi, S\. Mukherjee, M\. Stringer, and B\. Jones \(2013\)Atypical combinations and scientific impact\.Science342\(6157\),pp\. 468–472\.External Links:[Document](https://dx.doi.org/10.1126/science.1240474)Cited by:[§5](https://arxiv.org/html/2607.02416#S5.p1.1)\.
- J\. P\. Wahle, T\. Ruas, M\. Abdalla, B\. Gipp, and S\. Mohammad \(2023\)We are who we cite: bridges of influence between natural language processing and other academic fields\.InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing,H\. Bouamor, J\. Pino, and K\. Bali \(Eds\.\),Singapore,pp\. 12896–12913\.External Links:[Link](https://aclanthology.org/2023.emnlp-main.797/),[Document](https://dx.doi.org/10.18653/v1/2023.emnlp-main.797)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p5.1)\.
- L\. Waltman \(2016\)A review of the literature on citation impact indicators\.Journal of Informetrics10\(2\),pp\. 365–391\.External Links:[Document](https://dx.doi.org/10.1016/j.joi.2016.02.007)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p3.1)\.
- J\. Wei, M\. Bosma, V\. Zhao, K\. Guu, A\. W\. Yu, B\. Lester, N\. Du, A\. M\. Dai, and Q\. V\. Le \(2022a\)Finetuned language models are zero\-shot learners\.InInternational Conference on Learning Representations \(ICLR\),Cited by:[§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- J\. Wei, X\. Wang, D\. Schuurmans, M\. Bosma, F\. Xia, E\. Chi, Q\. V\. Le, D\. Zhou,et al\.\(2022b\)Chain\-of\-thought prompting elicits reasoning in large language models\.Advances in neural information processing systems35,pp\. 24824–24837\.Cited by:[§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- S\. Yao, J\. Zhao, D\. Yu, N\. Du, I\. Shafran, K\. Narasimhan, and Y\. Cao \(2023\)ReAct: synergizing reasoning and acting in language models\.InInternational Conference on Learning Representations \(ICLR\),Cited by:[§1](https://arxiv.org/html/2607.02416#S1.p1.1)\.
- A\. Zeng, Z\. Shen, J\. Zhou, Y\. Fan, Z\. Di, Y\. Wang, H\. E\. Stanley, and S\. Havlin \(2019\)Increasing trend of scientists to switch between topics\.Nature Communications10\(1\),pp\. 3439\.External Links:[Document](https://dx.doi.org/10.1038/s41467-019-11401-8)Cited by:[§2](https://arxiv.org/html/2607.02416#S2.p2.1),[§8](https://arxiv.org/html/2607.02416#S8.p1.1)\.

## Appendix A Supplemental analyses

This appendix gathers the material referenced from the main text: further detail on the data \(§[A\.1](https://arxiv.org/html/2607.02416#A1.SS1)\), robustness checks on the migration result \(§[A\.2](https://arxiv.org/html/2607.02416#A1.SS2)\), and additional mechanism analyses \(§[A\.3](https://arxiv.org/html/2607.02416#A1.SS3)\)\.

### A\.1 Data Details

Because the cross\-linked author graph underpins every cohort, we report how often each external identifier resolves\. Table [6](https://arxiv.org/html/2607.02416#A1.T6) gives the share of authors carrying each handle: a Semantic Scholar identifier is present for nearly all authors, while OpenReview, DBLP, ORCID, and Google Scholar handles are progressively sparser\.

Table 6: Cross\-link coverage for authors of ≥ 1 NLP\-topic paper \(N = 134,891 of 320,775 total authors; venue\-blind LLM topic label\)\. This is the population the study analyzes; coverage rates exclude corpus co\-authors who never published an NLP\-topic paper\.

Table [7](https://arxiv.org/html/2607.02416#A1.T7) lists the full venue catalog and its assignment to the \*ACL, General\-ML, and AI\-Broad families, and Table [8](https://arxiv.org/html/2607.02416#A1.T8) reports per\-venue paper counts across the three windows used throughout \(2010–14, 2015–20, 2021–26\)\. Coverage is dense from 2015 onward for every tracked venue\.

Table 7: Venue taxonomy used throughout the paper\. The three tracked categories cover \*ACL \(NLP\), the main ML\-general venues \(NeurIPS, ICLR, ICML, COLM, TMLR\), and AAAI/IJCAI \(AI\-broad\)\. The *Years* column is each venue’s active span \(launch year to present, or to its final year\); the study analyzes papers published from 2015 onward\.

Table 8: Papers per venue, bucketed into three windows\. Counts derive from papers\_unified after the union of ACL\-Anthology, S2, OpenAlex, OpenReview, PMLR, and DBLP sources\.

![Refer to caption](https://arxiv.org/html/2607.02416v1/x5.png)

Figure 5: Per\-venue/year paper coverage on a log color scale\. Coverage is dense across the 2015–2026 window for all venues; the extension back to 2010 \(used by §[4](https://arxiv.org/html/2607.02416#S4) for the pre\-LLM\-era baseline\) is solid for AAAI, IJCAI, NeurIPS, ICML, and the \*ACL family\. ICLR begins in 2013 by construction; COLM in 2024\.

The citation\-premium analysis \(§[7](https://arxiv.org/html/2607.02416#S7)\) relies on paper abstracts \(for the SPECTER2 embeddings\) and Semantic Scholar citation counts; both are available for the large majority of papers in all three families \(Table [9](https://arxiv.org/html/2607.02416#A1.T9)\)\.

Table 9: Abstract and citation coverage by venue category\. Citation counts come from Semantic Scholar; abstracts from S2, OpenAlex \(inverted\-index reconstruction\), the OpenReview/PMLR/ACL Anthology raw shards, and a tiered Crossref\+publisher\-landing page scrape for IEEE/ACM/Springer DOIs\.

The pool of 141,710 distinct papers is assembled from complementary sources, deduplicated to a canonical paper identifier \(DOI > S2 > OpenAlex > Anthology > DBLP > arXiv > OpenReview\): the ACL Anthology \(49,342 papers; \*ACL ground truth\), DBLP venue extracts \(62,172; the primary source for AAAI/IJCAI and much of NeurIPS/ICLR where Anthology coverage is absent\), PMLR \(14,199; ICML\), OpenReview \(10,946; NeurIPS/ICLR/COLM/TMLR\), and Semantic Scholar’s venue\-paper endpoint as a backfill \(5,051\)\. Almost all papers carry a Semantic Scholar identifier, though DOIs are sparse for the ML\-general venues \(which are ingested from OpenReview/PMLR without DOIs\)\.
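The deduplication step can be illustrated with a minimal sketch. It assumes each source record is a dict of optional identifiers under hypothetical keys (`doi`, `s2`, `openalex`, `anthology`, `dblp`, `arxiv`, `openreview`); records that share any identifier are merged, and the merged paper is keyed by its highest-priority identifier in the order stated above. The record layout and the linking strategy are assumptions for illustration, not the authors' pipeline.

```python
# Priority order from the text: DOI > S2 > OpenAlex > Anthology > DBLP > arXiv > OpenReview.
# The dict keys are hypothetical names chosen for this sketch.
ID_PRIORITY = ["doi", "s2", "openalex", "anthology", "dblp", "arxiv", "openreview"]


def deduplicate(records):
    """Merge records that share any identifier; key each paper by its best identifier."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (kind, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for kind in ID_PRIORITY:
            value = rec.get(kind)
            if not value:
                continue
            key = (kind, value)
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    # Collect the identifiers of every merged group.
    groups = {}
    for i, rec in enumerate(records):
        merged = groups.setdefault(find(i), {})
        for kind in ID_PRIORITY:
            if rec.get(kind) and kind not in merged:
                merged[kind] = rec[kind]

    # The canonical key is the highest-priority identifier the group carries.
    canonical = {}
    for merged in groups.values():
        for kind in ID_PRIORITY:
            if kind in merged:
                canonical[(kind, merged[kind])] = merged
                break
    return canonical


records = [
    {"s2": "S1", "dblp": "D1"},       # e.g. a DBLP extract row
    {"doi": "10.1/x", "s2": "S1"},    # same paper seen via another source
    {"arxiv": "2401.00001"},          # a paper with only an arXiv id
]
papers = deduplicate(records)
```

In this toy input the first two records collapse into one paper keyed by its DOI, while the third stays separate and is keyed by its arXiv identifier.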

Table [10](https://arxiv.org/html/2607.02416#A1.T10) traces the established\-author cohort from the full author pool down to the 4,181 research\-active authors we study\.

Table 10: Established\-author cohort selection funnel, on the venue\-blind LLM topic label \(prompt P07\_v2\)\.

#### A\.1\.1 New Entrant Cohort Identification

Table 11: Cohort definitions and sizes\. Topic\-active is the primary established\-author cohort; the no\-activity\-floor variant is reported in the appendix as a robustness check\. The two new\-entrant cohorts \(publication\-record and declared\-PhD\) each restrict to new entrants with ≥ 3 first\-author NLP\-topic papers; the all\-PhD\-students row is the pre\-restriction declared\-PhD population\.

##### Publication\-record Cohort\.

For every author in our corpus we identify their first\-author \(position = 0\) NLP\-topic papers at the three tracked venue families\. The publication\-record cohort consists of every author with at least three such papers whose first one falls in \[2019, 2024\]; the entry \(cohort\) year is the year of that first NLP\-topic first\-author paper \(N = 3,568 after author disambiguation\)\. Widening the entry window to \[2017, 2024\] yields N = 4,259; Table [4](https://arxiv.org/html/2607.02416#S6.T4) reports the \[2019, 2024\] window\.

This definition removes the platform\-coverage selection that affects the PhD\-student cohort: every author who accumulates three first\-author NLP\-topic papers in the corpus is included, regardless of whether they have an OpenReview, ORCID, or DBLP\-thesis profile\.
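As a rough illustration of this rule, the sketch below selects the cohort from a flat list of paper-author rows. The field names (`author_uid`, `year`, `author_position`, `is_nlp_topic`, `tracked_venue`) are hypothetical, and author disambiguation is assumed to have happened upstream.

```python
from collections import defaultdict


def publication_record_cohort(rows, entry_window=(2019, 2024), min_papers=3):
    """Return {author_uid: entry_year} for authors meeting the cohort rule.

    rows: iterable of dicts with hypothetical keys 'author_uid', 'year',
    'author_position' (0 = first author), 'is_nlp_topic', 'tracked_venue'.
    """
    years_by_author = defaultdict(list)
    for row in rows:
        # Keep only first-author NLP-topic papers at a tracked venue family.
        if row["author_position"] == 0 and row["is_nlp_topic"] and row["tracked_venue"]:
            years_by_author[row["author_uid"]].append(row["year"])

    lo, hi = entry_window
    cohort = {}
    for author, years in years_by_author.items():
        entry_year = min(years)  # year of the first qualifying paper
        if len(years) >= min_papers and lo <= entry_year <= hi:
            cohort[author] = entry_year
    return cohort


def _row(author, year, position=0):
    return {"author_uid": author, "year": year, "author_position": position,
            "is_nlp_topic": True, "tracked_venue": True}


rows = (
    [_row("a", y) for y in (2019, 2020, 2021)]      # qualifies, entry year 2019
    + [_row("b", y) for y in (2018, 2020, 2021)]    # first paper before the window
    + [_row("c", y) for y in (2020, 2021)]          # only two qualifying papers
    + [_row("d", 2020), _row("d", 2021), _row("d", 2022, position=1)]  # one is not first-author
)
cohort = publication_record_cohort(rows)
```

Passing `entry_window=(2017, 2024)` corresponds to the widened window mentioned above.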

##### Declared\-PhD Cohort\.

A researcher enters the declared\-PhD cohort iff all four criteria below hold\. The criteria are applied in order; \(C1\)–\(C3\) build the *all\-PhD\-students* cohort and \(C4\) restricts it to the cohort used in experiments\.

1. \(C1\) A PhD\-equivalent education record\. Detected from a three\-source resolver \(OpenReview, ORCID, and DBLP\-thesis records\) applied in strict priority \(a higher source wins; a lower one only adds researchers not already matched\): *\(a\) OpenReview* through any content\.history or content\.education entry whose position string contains, case\-insensitively, one of \{phd, ph\.d, doctoral, doctorate\}; *\(b\) ORCID* through an educations entry whose role title matches \\b\(ph\\\.?d\|doctoral\|doctorate\)\\b or contains “phd student”/“phd candidate”; *\(c\) DBLP\-thesis* when the author has a registry <phdthesis\> record, which is treated as a supplemental backfill only\.
2. \(C2\) Resolvable to a unified author\. The profile links to an author\_uid in our author table \(OpenReview via openreview\_id, ORCID via orcid, DBLP via the thesis record’s author\_uid\); profiles that do not link are dropped\.
3. \(C3\) PhD start year in 2019–2024\. An integer start year parsed from the education entry \(OpenReview/ORCID: the self\-reported education start; DBLP: inferred as dissertation\_year − 5\) falls in \[2019, 2024\]\. When a researcher has several PhD entries \(which is very rare\), the qualifying one is used\.
4. \(C4\) At least three first\-author NLP\-topic papers\. The author has ≥ 3 papers as a first author on NLP topics at a tracked venue in 2019–2026; the first three by year are the analysis set\.

Note that the DBLP thesis records contain the *completion* year rather than a start year; we infer a likely starting date for the PhD using the national average degree time in China and North America of 5 years\. \(Footnote 3: Students from institutions in these regions account for the majority of the authors, though we recognize that European students often have a much shorter time to degree of ∼3 years\.\)
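The matching rules in \(C1\) and the start-year logic in \(C3\) can be sketched as follows. The keyword set, the regular expression, and the five-year offset come from the text; the function names are hypothetical, and applying the ORCID pattern case-insensitively is an assumption, since the text states case-insensitivity only for OpenReview.

```python
import re

# Keyword set quoted in (C1a); matched as case-insensitive substrings.
OPENREVIEW_KEYWORDS = ("phd", "ph.d", "doctoral", "doctorate")

# Pattern quoted in (C1b). IGNORECASE is an assumption of this sketch.
ORCID_ROLE_RE = re.compile(r"\b(ph\.?d|doctoral|doctorate)\b", re.IGNORECASE)


def openreview_is_phd(position):
    """(C1a): the position string contains one of the keywords."""
    text = position.lower()
    return any(keyword in text for keyword in OPENREVIEW_KEYWORDS)


def orcid_is_phd(role_title):
    """(C1b): the role title matches the pattern or names a PhD student/candidate."""
    text = role_title.lower()
    return (
        ORCID_ROLE_RE.search(role_title) is not None
        or "phd student" in text
        or "phd candidate" in text
    )


def dblp_start_year(dissertation_year, years_to_degree=5):
    """(C3), DBLP branch: infer the start year from the thesis completion year."""
    return dissertation_year - years_to_degree


def in_start_window(start_year, window=(2019, 2024)):
    """(C3): the start year must fall inside the inclusive window."""
    return window[0] <= start_year <= window[1]
```

For example, a thesis record dated 2027 maps to an inferred start of 2022 and passes \(C3\), while one dated 2023 maps to 2018 and does not.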

Criteria \(C1\)–\(C3\) yield the full PhD\-student cohort \(the all\-PhD\-students cohort\); adding \(C4\) yields the declared\-PhD cohort, which contains N = 1,228 unique students \(by resolving source: OpenReview 1,010, ORCID 150, DBLP\-thesis backfill 68\)\. Table [12](https://arxiv.org/html/2607.02416#A1.T12) shows the step\-by\-step funnel from population to the final cohort\.

Table 12: Declared\-PhD cohort selection funnel\. The first four rows describe the all\-PhD\-students population \(25,521 students with at least one first\-author paper\); the last row produces the declared\-PhD cohort \(1,228 students with ≥ 3 first\-author NLP\-topic papers\)\.

### A\.2 Robustness

Table [13](https://arxiv.org/html/2607.02416#A1.T13) gives the full coefficients behind the forest plot in §[4\.3](https://arxiv.org/html/2607.02416#S4.SS3): the mean within\-author change in each venue family’s share, under any\-authorship and under first\-or\-last\-authorship, with cluster\-robust standard errors and a joint Wald test against the null of an unchanged venue mix\.

Estimator: stacked OLS $\Delta\text{share}_{ic} \sim 0 + C(\text{category})$, one row per \(author, venue category\), with cluster\-robust SEs by author \($N = 4{,}181$ authors\)\. Each coefficient is the mean *within\-author* change in venue\-mix share from 2015–20 to 2021–26, in percentage points\. Left block: any\-authorship; right block: first\-or\-last\-author\. Joint Wald test of $H_0$ \(venue mix unchanged\): any\-authorship $p = 8.9 \times 10^{-128}$, first/last $p = 2.8 \times 10^{-99}$\. Per\-category $p_{\text{Holm}}$ is Holm–Bonferroni FWER\-adjusted within each authorship column\. Per\-stratum coefficients are in Appendix Table [15](https://arxiv.org/html/2607.02416#A1.T15)\.
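Because the regression has one indicator per category and no intercept, its point estimates equal the per-category means of the within-author share changes. The sketch below computes those means directly from per-author paper counts. The input layout (`{author: {category: count}}`) and the function names are hypothetical, and the cluster-robust standard errors and Wald test are not reproduced here.

```python
from collections import defaultdict


def venue_shares(counts):
    """Convert {category: paper_count} into {category: share of the author's papers}."""
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


def mean_delta_share(pre_counts, post_counts, categories):
    """Mean within-author change in venue-mix share, in percentage points.

    pre_counts / post_counts: {author: {category: paper_count}} for the
    baseline and post windows. Authors missing from either window are skipped.
    """
    sums = defaultdict(float)
    n_authors = 0
    for author, pre in pre_counts.items():
        pre_share = venue_shares(pre)
        post_share = venue_shares(post_counts.get(author, {}))
        if not pre_share or not post_share:
            continue
        n_authors += 1
        for cat in categories:
            sums[cat] += 100.0 * (post_share.get(cat, 0.0) - pre_share.get(cat, 0.0))
    if n_authors == 0:
        return {}
    return {cat: sums[cat] / n_authors for cat in categories}


categories = ["*ACL", "General-ML", "AI-Broad"]
pre = {"a": {"*ACL": 4}, "b": {"*ACL": 2, "General-ML": 2}}
post = {"a": {"*ACL": 2, "General-ML": 2}, "b": {"General-ML": 4}}
coefficients = mean_delta_share(pre, post, categories)
```

Since each author's shares sum to one in both windows, the per-category means sum to zero across the categories supplied.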

Table 13: Per\-author venue\-mix shift \(2015–20 → 2021–26\)\. Any\-authorship \(left\) and first\-or\-last\-author \(right\)\.

##### Is 2020 a reasonable pre/post cutoff?

We split the study window at 2020 because it brackets the arrival of the first large language models \(GPT\-3\), but the venue\-mix shift should not depend on that exact choice of year\. We therefore recompute the per\-author $\Delta\text{share}$ while sliding the split year $k$ from 2017 to 2022, each time pairing an equal three\-year baseline $[k-2, k]$ with a post window $[k+1, k+3]$ and rebuilding the cohort \(Table [14](https://arxiv.org/html/2607.02416#A1.T14)\)\. This range brackets the inflection, and we use a three\-year window \(rather than a five\-year one\) so that every split is fully covered by the 2010–2025 data\. The \*ACL decline and the General\-ML rise appear at *every* cutoff, so neither is an artifact of splitting at 2020\. The General\-ML gain also sharpens as the split moves later, growing monotonically from $+1.7$ pp at $k = 2017$ to $+9.0$ pp at $k = 2022$, while AI\-Broad stays negative throughout\. This mirrors the trajectory inflection in §[4\.2](https://arxiv.org/html/2607.02416#S4.SS2) and places the acceleration in the post\-2020 window, so 2020 is a reasonable boundary, if slightly conservative\.
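The window construction is simple enough to enumerate. The sketch below lists the baseline and post windows for each split year, using the three-year width and the 2017–2022 sweep from the text; the function name and return layout are hypothetical.

```python
def sliding_windows(first_k=2017, last_k=2022, width=3):
    """Return (k, baseline, post) with inclusive year ranges for each split year k."""
    rows = []
    for k in range(first_k, last_k + 1):
        baseline = (k - width + 1, k)   # [k-2, k] when width is 3
        post = (k + 1, k + width)       # [k+1, k+3] when width is 3
        rows.append((k, baseline, post))
    return rows


windows = sliding_windows()
```

The earliest split pairs 2015–2017 with 2018–2020, and the latest pairs 2020–2022 with 2023–2025, so every window stays inside the 2010–2025 range cited above.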

Table 14: Sliding pre/post\-cutoff sensitivity\. Each row rebuilds the cohort with an equal three\-year baseline $[k-2, k]$ and post window $[k+1, k+3]$ and reports the mean within\-author change in venue\-mix share \(percentage points\), renormalized within the three tracked families\. We sweep the split year $k$ across 2017–2022, the range that brackets the LLM inflection and that the 2010–2025 data fully support at this window width\. The \*ACL decline and General\-ML rise hold at every split year, and the General\-ML gain grows monotonically with $k$ while AI\-Broad stays negative throughout, so neither trend is an artifact of splitting at any particular year\. ∗ marks a 95% CI excluding zero\.

##### Author heterogeneity\.

Table [15](https://arxiv.org/html/2607.02416#A1.T15) re\-fits the venue\-mix shift within author strata \(career age and $h$\-index quartile\) as two separate stacked regressions with Holm–Bonferroni\-corrected $p$\-values\. The \*ACL decline and General\-ML gain persist in every stratum, so the migration is not confined to any one seniority or impact group\. Across career age no stratum deviation survives correction \(the shift is essentially uniform\); the one sizeable, correction\-surviving exception is by citation impact \(Panel B\), where the highest\-$h$\-index authors retain \*ACL much more than the lowest\. Paper\-count quartile mirrors the $h$\-index pattern more weakly and the hybrid career\-×\-role split is too sparse to interpret, so both are omitted\.
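For readers unfamiliar with the correction, the standard Holm–Bonferroni step-down adjustment can be sketched as below. This is the textbook procedure, not code from the study, and the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The (rank+1)-th smallest p-value is multiplied by (m - rank), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


adjusted = holm_adjust([0.01, 0.04, 0.03])
```

In the study the adjustment is applied within each panel and authorship column, so the family size `m` is the number of tests in that group.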

Table 15: Author heterogeneity of the venue\-mix shift \(2015–20 → 2021–26\), in percentage points\. Panels A \(top\) and B \(bottom\) are *two separate* stacked OLS regressions \($\Delta\text{share} \sim 0 + C(\text{category}) + C(\text{category}){:}C(\text{stratum})$\), each with its own reference stratum; the rows without × are the reference\-stratum category means and the remaining rows are stratum\-level deviations\. All $p$\-values are Holm–Bonferroni corrected within each panel and authorship column\. The \*ACL decline and General\-ML gain hold in every stratum; the only sizeable, correction\-surviving heterogeneity is by $h$\-index \(Panel B\): the highest\-impact authors retain more of their \*ACL share\.

### A\.3 Mechanism supplements

Table [16](https://arxiv.org/html/2607.02416#A1.T16) gives the full Oaxaca–Blinder decomposition summarized in §[5\.2](https://arxiv.org/html/2607.02416#S5.SS2), splitting each family’s cohort $\Delta\text{share}$ into a composition term \(the cohort changed which topics it works on\) and a convention term \(the field changed where a given topic is published\), under both the OpenAlex\-subfield and the $K = 50$ cluster taxonomies\. As in the main text, the General\-ML gain and AI\-Broad loss are convention\-dominated, while the \*ACL decline is carried by composition\. Figure [6](https://arxiv.org/html/2607.02416#A1.F6) shows this at the topic level as the within\-topic \*ACL share over time for the most common subfields, where a subfield whose share falls contributes negative convention\.
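One common way to write such a two-part decomposition, with the reference weights averaged over the pre and post periods, is sketched below. It treats a family's overall share as the sum over topics of topic weight times within-topic venue share. The function and argument names are hypothetical, and treating a topic absent from one period as having weight and share 0 is an assumption of this sketch rather than something the paper specifies.

```python
def decompose_share_change(w_pre, s_pre, w_post, s_post):
    """Split the change in sum_t w_t * s_t into (composition, convention).

    w_*: {topic: fraction of the cohort's papers on that topic}
    s_*: {topic: share of that topic's papers at the venue family}
    Both terms use the average of the pre and post values as the reference,
    so they add up exactly to the total change.
    """
    composition = 0.0
    convention = 0.0
    for topic in set(w_pre) | set(w_post):
        w0, w1 = w_pre.get(topic, 0.0), w_post.get(topic, 0.0)
        s0, s1 = s_pre.get(topic, 0.0), s_post.get(topic, 0.0)
        composition += (w1 - w0) * (s0 + s1) / 2.0  # topic-mix shift
        convention += (s1 - s0) * (w0 + w1) / 2.0   # within-topic venue shift
    return composition, convention


w_pre, s_pre = {"mt": 0.5, "llm": 0.5}, {"mt": 0.8, "llm": 0.6}
w_post, s_post = {"mt": 0.25, "llm": 0.75}, {"mt": 0.8, "llm": 0.4}
composition, convention = decompose_share_change(w_pre, s_pre, w_post, s_post)
```

In this toy example the family's share falls from 0.70 to 0.50; the composition term accounts for −0.075 of the change and the convention term for −0.125.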

Table 16: Oaxaca–Blinder decomposition of cohort $\Delta\text{share}$ \(post 2021–26 vs\. baseline 2015–20\) into composition \(topic\-mix shift, H1\) and convention \(within\-topic venue shift, H2\) under two taxonomies\. Average of pre/post weight references\.

![Refer to caption](https://arxiv.org/html/2607.02416v1/x6.png)

Figure 6: Share of each topic cluster’s papers that appear at \*ACL venues, pooled into two\-year periods \(2016–2025; the partial 2026 is dropped, and binning removes the even/odd sawtooth created by biennial and cyclic \*ACL venues such as LREC, EACL, NAACL, and COLING\)\. We highlight the ten largest clusters with a substantial \*ACL presence\. The two large language\-model clusters \(c45: language models/LLMs; c6: LLM evaluation\) are the only ones whose \*ACL share *falls* \(red\-ish colors, bold\); these are the topics migrating to General\-ML venues\. Classic NLP tasks such as machine translation, question answering, summarization, and information extraction hold or consolidate at \*ACL \(blue\-ish colors\)\. Grey lines are the remaining clusters\. Error bars are bootstrapped 95% CIs\.

##### Citation premium: observable controls and author fixed effects\.

We refit the citation premium two further ways: \(i\) an observable\-controls regression with a venue × OpenAlex\-field interaction, paper age, and a first/last\-author flag, and \(ii\) an author fixed\-effects spec that identifies off authors who publish in more than one venue category\. General\-ML venues lack OpenAlex topics \(they have no DOI match\), so we backfill their field labels from Semantic Scholar fields\-of\-study \(overwhelmingly Computer Science\), which makes the venue × field interaction identifiable\.

![Refer to caption](https://arxiv.org/html/2607.02416v1/x7.png)

Figure 7: Citation premium \(vs\. \*ACL/NLP\) by venue category under the observable\-controls and author\-fixed\-effects specifications\. The General\-ML premium is large and positive under both; the within\-author estimate is close to the between\-author one, indicating little author\-selection bias\.

For Computer Science papers \(the bulk of the cohort\), the observable\-controls General\-ML premium is $+0.47$ log units \(≈ 60%; $p < 10^{-200}$\), and the within\-author premium is $+0.54$ \(≈ 71%; 95% CI $[0.50, 0.58]$, $p < 10^{-100}$\)\. The within\-author estimate closely tracks the $+0.51$ matched\-pair estimate, so holding author identity fixed barely changes the premium\. As a result, author selection by venue explains little of the citation gap\.

### A\.4 NLP\-topic judge: full prompt

The NLP\-topic label \(§[3\.3](https://arxiv.org/html/2607.02416#S3.SS3)\) is produced by Gemma\-4\-26B\-A4B\-it under prompt variant P07\_v2, the production CFP\-grounded judge: a call\-for\-papers area union \(2015–2024\), hardened out\-of\-scope negatives and decision rules, and 16 few\-shot examples, designed to be applied venue\-blind without an external \*ACL\-membership override\. The prompt is *venue\-blind*: the model sees only the paper title and abstract, never the venue\. It is reproduced below verbatim: \{TITLE\} and \{ABSTRACT\} are substituted with the paper’s text at inference time, the model is queried at temperature 0, and a regular expression parses the trailing LABEL: YES/NO from its single\-line reply\. Em\- and en\-dashes are shown as \-\- / \- for display; the prompt is otherwise identical to the deployed template\.

TASK: Decide whether a paper’s PRIMARY research contribution falls within Natural Language Processing/Computational Linguistics, defined as the topical scope of the main ACL, EMNLP, and NAACL conferences \(the ACL Anthology core\), unioned across the 2015\-2024 Call\-for\-Papers area sets\.

Use the topic, not where it was published: a paper is in scope if it would naturally fit one of the ACL/EMNLP/NAACL area tracks below, whatever venue it actually appeared in\.

IN\-SCOPE AREAS \(ACL/EMNLP/NAACL Call\-for\-Papers areas, 2015\-2024 \-\- the set has grown over time; all of the following count, old and new\):

\- Syntax: tagging, chunking, parsing; phonology, morphology, word segmentation

\- Semantics \(lexical, sentence\-level, textual inference\); discourse, coreference, pragmatics

\- Machine translation; multilinguality and language diversity

\- Information extraction; text mining/IR as language understanding; question answering

\- Summarization; natural language generation

\- Dialogue and interactive systems

\- Sentiment analysis, stylistic analysis, argument mining

\- Language resources, datasets, evaluation and benchmarking for language tasks

\- Machine learning/statistical methods FOR NLP; representation learning and language models studied as language systems

\- Interpretability and analysis of NLP models; ethics, bias, fairness in NLP

\- Efficient/low\-resource methods for NLP

\- Multimodality and language grounding to vision/robotics; language and vision

\- Speech/spoken language understanding where language \(not just acoustics\) is central

\- Computational social science/cultural analytics via text; NLP for web and social media; linguistic, psycholinguistic and cognitive modeling of language

\- NLP applications \(clinical, legal, scientific, educational, \.\.\.\) where the contribution is the language method

\- Large\-language\-model work IS in scope when the contribution concerns language \(capabilities, behavior, training/evaluation, analysis, generation, multilingual or reasoning\-in\-language aspects\)

\- Instruction tuning, RLHF, alignment, agentic LLMs, tool use \-\- in scope when the contribution is a language capability

\- Inference/training efficiency for LLMs and language models \-\- in scope \(the model studied IS a language model\)

OUT OF SCOPE \(answer NO even if text, an LLM, or a linguistic term appears\):

\- A general machine\-learning/optimization/statistics/theory contribution whose novelty is the ML method itself, merely demonstrated on a text dataset, or that lists text as one of several application domains \(the typical NeurIPS/ICLR/ICML methods paper\)\.

\- Computer vision/image/video where text or captions are incidental, and tasks where natural language is only an instruction or control interface for a non\-language goal \(image or chart editing, GUI control, robot actuation\)\.

\- Information\-retrieval, recommender, database or data\-mining systems work centered on indexing, ranking, scalability, click\-through or efficiency rather than language understanding\.

\- SPEECH that is acoustic/signal processing: text\-to\-speech and speech synthesis, voice conversion, vocoders, speaker verification/identification/diarization, anti\-spoofing, audio enhancement \-\- the contribution there is audio, not language\. \(ASR, spoken\-language understanding, spoken QA/dialogue, speech translation, phonology, word segmentation, and spoken language identification ARE in scope: language is central there\.\)

\- A paper that USES a large language model or LLM agent merely as a tool or component, while its actual task and contribution is non\-linguistic \(recommendation, point\-of\-interest or trajectory prediction, forecasting, tabular prediction, numeric regression, planning, optimization, scientific discovery\)\.

\- Robotics, networks, security, HCI, bioinformatics, etc\., where language is not the central object\.

DECISION RULES:

1\. Ask "is the central novelty a language/linguistic capability or understanding \(YES\), or a general method/another field that merely uses text or an LLM \(NO\)?"

2\. A method whose contribution is APPLIED TO language is YES \(instruction tuning of an LLM, RL for reasoning expressed in language, efficiency for a language model, a dataset or benchmark for an NLP task\)\. But a generic method that merely runs on a text dataset, and a paper that uses an LLM to perform a non\-language task, are NO \-\- judge what the contribution IS, not which tools it uses\.

3\. Vision or speech with language: YES only if a language/linguistic capability is the central contribution; NO if language is just a conditioning signal or instruction interface, or if the contribution is acoustic, visual, or systems\.

4\. Multilingual, low\-resource, and applied NLP \(clinical, legal, educational, social\-media\) \-\- YES when the language method, linguistic analysis, or language resource is the contribution\.

5\. Do NOT default to YES when uncertain: if you cannot identify a concrete language/linguistic contribution, answer NO\. Judge ONLY from the title and abstract; if genuinely mixed, decide by the PRIMARY contribution\.

EXAMPLES:

Example 1:

Title: Neural Machine Translation by Jointly Learning to Align and Translate

Abstract: We propose a neural encoder\-decoder for translation with an attention mechanism that aligns source and target words\.

REASON: Machine translation, a core NLP area \| LABEL: YES

Example 2:

Title: Sharper Generalization Bounds for Stochastic Gradient Descent

Abstract: We derive new high\-probability generalization bounds for SGD on convex losses; experiments include a text classification benchmark\.

REASON: General ML theory \-\- text dataset is incidental \| LABEL: NO

Example 3:

Title: Visual Question Answering with Question\-Aware Image Features

Abstract: We study VQA, learning a multimodal representation that grounds natural\-language questions in images and generates a free\-form answer\.

REASON: Language grounding to vision \-\- multimodal NLP \| LABEL: YES

Example 4:

Title: Improving Naturalness in Unit\-Selection Text\-to\-Speech with Neural Acoustic Embeddings

Abstract: We learn acoustic unit embeddings that improve waveform concatenation, raising the naturalness mean\-opinion\-score of a text\-to\-speech system\.

REASON: Text\-to\-speech \-\- contribution is acoustic naturalness, not language \| LABEL: NO

Example 5:

Title: End\-to\-End Speech Recognition for Four Low\-Resource Bantu Languages

Abstract: We build an end\-to\-end ASR system for under\-resourced Bantu languages and study cross\-lingual transfer of lexical and acoustic units\.

REASON: Speech recognition/spoken\-language understanding \-\- language is central \| LABEL: YES

Example 6:

Title: Self\-Supervised Pretraining for Object Detection

Abstract: We pretrain a backbone on unlabeled images using contrastive learning and fine\-tune for object detection; image captions provide weak supervision\.

REASON: Vision contribution \-\- captions are incidental \| LABEL: NO

Example7:

Title:HowDoIn\-ContextExamplesAffectLLMReasoning?

Abstract:Weprobeinstruction\-tunedLLMsacrossreasoningbenchmarksandanalyzesensitivitytoexampleorder,demonstratingsystematiceffects\.

REASON:LLManalysis\-\-language\-modelbehavior\|LABEL:YES

Example 8:

Title: An LLM-Agent Framework for Next Point-of-Interest Recommendation

Abstract: We orchestrate an LLM agent over user check-in trajectories to predict the next point of interest, improving recommendation accuracy on mobility data.

REASON: LLM is only a tool -- the task is location recommendation, not language | LABEL: NO

Example 9:

Title: Reinforcement Learning from Human Feedback for Open-Ended Generation

Abstract: We train a reward model from pairwise human preferences and fine-tune an LLM via PPO, improving helpfulness and safety of generation.

REASON: RLHF for language generation | LABEL: YES

Example 10:

Title: Scalable Index Compression for Web-Scale Retrieval

Abstract: We propose a compressed inverted index with delta encoding that reduces storage 3x and improves query throughput on web search.

REASON: IR systems/indexing -- not language understanding | LABEL: NO

Example 11:

Title: Detecting Community Mental-Health Signals from Longitudinal Social-Media Posts

Abstract: We model the language of users’ social-media timelines to track community-level mental-health signals over time.

REASON: Computational social science via social-media text | LABEL: YES

Example 12:

Title: Robust Speaker Verification with Anti-Spoofing Countermeasures

Abstract: We propose acoustic countermeasures that make speaker verification robust to replay and synthetic-speech spoofing attacks.

REASON: Acoustic/biometric speech -- no language content | LABEL: NO

Example 13:

Title: A Multilingual Benchmark for Cross-Lingual Question Answering

Abstract: We release a 12-language QA benchmark with native-speaker annotations and evaluate state-of-the-art multilingual encoders.

REASON: Language resource/benchmark for NLP | LABEL: YES

Example 14:

Title: Refining Weak-Supervision Labeling Functions with Limited Labeled Data

Abstract: We propose a method that iteratively refines labeling functions for programmatic weak supervision, evaluated on text, tabular, and image benchmarks.

REASON: Generic ML method -- text is one of several application domains | LABEL: NO

Example 15:

Title: Extracting Medication Events from Clinical Notes with a Span-Based Tagger

Abstract: We present a span-based information-extraction model that identifies medications, dosages, and temporal relations in free-text clinical notes.

REASON: Clinical NLP -- information extraction is the language contribution | LABEL: YES

Example 16:

Title: Click-Through-Rate Prediction with Pretrained Text Features for E-Commerce

Abstract: We feed pretrained embeddings of product descriptions into a click-through-rate model, improving recommendation ranking.

REASON: Recommendation task -- text is an incidental input feature | LABEL: NO

Answer with EXACTLY one line, no extra text, in this format:

REASON: <=12 words | LABEL: YES or NO

Title: {TITLE}

Abstract: {ABSTRACT}
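The prompt ends with two template slots and a strict one-line answer format, so a small amount of glue code is needed to fill the slots and to read the label back. The following is a minimal sketch of how that might look, not the authors' implementation: the names `QUERY_TEMPLATE`, `build_query`, and `parse_label` are hypothetical, and the decision to return `None` for any reply that does not match the format is an assumption about how malformed outputs could be handled.

```python
import re

# Hypothetical names: QUERY_TEMPLATE, build_query, and parse_label are
# illustrative only and do not come from the paper.
QUERY_TEMPLATE = "Title: {TITLE}\n\nAbstract: {ABSTRACT}"

# Matches one line of the form "REASON: <text> | LABEL: YES" (or NO).
# Whitespace around the separators is tolerated.
LINE_RE = re.compile(
    r"^\s*REASON:\s*(?P<reason>.*?)\s*\|\s*LABEL:\s*(?P<label>YES|NO)\s*$"
)


def build_query(title: str, abstract: str) -> str:
    """Fill the {TITLE} and {ABSTRACT} slots at the end of the prompt."""
    slots = {"{TITLE}": title, "{ABSTRACT}": abstract}
    # A single regex pass, so slot-like text inside a title or abstract
    # is never substituted a second time.
    return re.sub(
        r"\{TITLE\}|\{ABSTRACT\}", lambda m: slots[m.group(0)], QUERY_TEMPLATE
    )


def parse_label(response: str):
    """Return (reason, is_nlp), or None if the reply breaks the format."""
    lines = [ln for ln in response.splitlines() if ln.strip()]
    if len(lines) != 1:  # the prompt asks for exactly one line
        return None
    match = LINE_RE.match(lines[0])
    if match is None:
        return None
    return match.group("reason"), match.group("label") == "YES"
```

For instance, `parse_label("REASON: RLHF for language generation | LABEL: YES")` yields the reason string paired with `True`, while a free-form reply yields `None` and could be retried or flagged for manual review.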
