A Computational Operationalisation of Competing Maturational Theories of Syntactic Development via Statistical Grammar Induction


Summary

This paper presents a computational framework to test competing maturational theories of syntactic development in children, specifically comparing bottom-up versus inward accounts using statistical grammar induction.

arXiv:2605.08476v1 (announce type: new)

# A computational operationalisation of competing maturational theories of syntactic development via statistical grammar induction
Source: [https://arxiv.org/html/2605.08476](https://arxiv.org/html/2605.08476)
Mila Marcheva\-Nash \(mmm67@cam\.ac\.uk\), Suchir Salhan \(sas2450@cam\.ac\.uk\), Weiwei Sun \(ws390@cam\.ac\.uk\)

Department of Computer Science & Technology, University of Cambridge

###### Abstract

This paper is concerned with what intermediate syntactic categories children acquire during first language development, and in what order\. Maturational theories make different predictions\. Bottom\-up accounts \(Growing\) propose that lexical and inflectional structure emerges first, while inward accounts \(Inward\) predict early access to discourse\-related categories\. We computationally operationalise these hypotheses of staged syntactic emergence using statistical grammar induction, asking what each proposed ordering makes learnable when input and learning algorithm are held constant\. Our framework makes category acquisition explicit and allows us to explore how different maturational orderings shape the structure that can be learned under identical conditions\. Based on this operationalisation, the Growing account significantly outperforms the Inward account across three evaluation metrics\.

Keywords: language acquisition; syntactic development; grammar induction; maturation

## Introduction

A central question in first language acquisition \(FLA\) is how children develop an adult\-like grammatical system \(?, ?\)\. Linguistic theories differ on whether grammatical categories are predetermined by biology or emerge gradually from experience\. Continuity approaches within the generative tradition assume that all categories are innate and available from birth, consistent with the idea of Universal Grammar \(UG\) \(?, ?, ?, ?\)\. Maturational accounts assume that certain syntactic categories are innate but become accessible only at specific points in development, shaping the order of acquisition \(?, ?, ?, ?\)\. By contrast, the emergentist perspective emphasises that categories emerge from patterns in the input under cognitive constraints \(?, ?, ?\)\. Functionalist, usage\-based, and constructivist theories are in line with emergentism \(?, ?, ?, ?, ?, ?, ?\)\. Maturational and emergentist accounts share a rejection of continuity: both agree that adult grammatical competence is not available from the outset \(?, ?, ?\)\. They disagree on the source of the staged development: under maturation, the order in which the categories appear is innately encoded, whereas under emergentism, the order is determined by the interplay of input and cognitive constraints\.

Within the maturational tradition, theories differ sharply on which categories appear first\. Bottom\-up proposals, such as the Growing Trees Hypothesis \(?, ?\), suggest that lexical and inflectional categories \(N, V, T\) emerge first, followed by discourse\-related functional categories\. Inward maturation proposals, like the Inward Growing Spine Hypothesis \(?, ?\), predict early access to higher discourse\-related categories, with lower categories appearing later\. These hypotheses formalise patterns observed in corpora of child and child\-directed speech\. Computational modelling can complement traditional methods by allowing us to explore whether the proposed developmental orderings make different grammatical structures recoverable from the same input\.

In this paper, we introduce a computational framework for testing competing theories of staged syntactic category acquisition under developmentally plausible constraints\. In grammar induction, the statistical learner receives strings as input and infers an explicit hierarchical structure \(a grammar\), making the order of category acquisition observable rather than stipulated \(?, ?\)\. By controlling the order in which categories become accessible, we simulate the developmental trajectories predicted by different maturational theories, while holding constant the input and the learning algorithm\. Comparing the resulting induced grammars across controlled conditions reveals how different maturational orderings constrain what is learnable from identical input and under identical conditions\.

We present an experimental suite to compare two maturational hypotheses, Growing and Inward\. We use morphemically tokenised child\-directed speech as developmentally plausible input, reflecting linguistic units to which children are known to be sensitive \(?, ?, ?\)\. Across a wide range of conditions, we find empirical advantages for staged category acquisition, particularly under the Growing curriculum, relative to the Continuity approach\. While both Growing and Inward curricula converge to comparable global performance, Growing significantly outperforms Inward on all queried metrics \(F1, Jensen\-Shannon divergence, and child speech sentence log\-likelihood\)\. At the level of the learning process, Growing favours earlier stabilisation of phrasal structure, whilst Inward yields lower divergence for certain clause\-level categories\. Together, these results align with the qualitative predictions of the corresponding maturational accounts and demonstrate the utility of our method for systematically exploring alternative trajectories of staged syntactic development\. The framework and the tested staged grammars are available on GitHub \([https://github\.com/milamarcheva/maturational\_grammar\_induction](https://github.com/milamarcheva/maturational_grammar_induction)\)\. While we pilot the framework on two maturational hypotheses, it could be extended to emergentist hypotheses, provided that appropriate staged curricula are developed\.

1. Prepare staged grammar
   - Oracle PCFG: symbolic component \(rules\) \+ probabilistic component
   - Split symbolic component into maturational stages
2. Initial stage
   - Set inter\-stage transfer parameters $s_{p}$, $s_{\ell}$, $\eta$; run VB on first stage
   - Output: estimated probabilities for first\-stage rules
   - Evaluation: $G_{1}$, grammar up to stage 1
3. Repeat for each following stage
   - Add newly available symbolic rules
   - Update priors: existing rules $P_{k} \mapsto \alpha_{k+1}$; new rules receive mass via $\eta$
   - Run VB re\-estimation for current staged grammar
   - Output: probabilities for all rules available so far
   - Evaluation: $G_{i}$, grammar up to stage $i$; then move to the next stage
4. After all stages
   - Output: probabilities for the entire symbolic grammar
   - Evaluation: $G$, final grammar

Figure 1: Pipeline for statistical learner modelling staged syntactic development, explained in detail in [Methodology](https://arxiv.org/html/2605.08476#Sx3)\.
## Background

### Syntactic development

A long tradition in FLA characterises syntactic development as staged, following a trajectory of single\-word utterances, then unmarked two\-word phrases, and ultimately phrases marked with increasingly complex morphosyntactic structure \(?, ?\)\. Generative accounts of syntactic acquisition centre on when and how functional categories are acquired\. There are two main schools of thought with regard to when functional categories, assumed to be part of Universal Grammar \(UG\), become available\. Continuity \(?, ?\) posits that all information in UG is available from the start\. Thus, the functional structure of children’s initial grammar is not significantly different from that of adults’ grammars\. Maturation \(?, ?, ?\) posits that grammatical categories and principles from UG are not fully available to children initially, but instead are incrementally accessed\. Thus, UG specifies not only the hierarchical structure and functional categories, but also the order in which they become available to the acquirer\. Under maturation the UG\-given hierarchical structure is fixed, but children gradually gain more access to parts of the structure\. All of the above hypotheses assume that functional categories have fixed granularity\. Thus, for completeness we must also mention neo\-emergentism, a hypothesis of syntactic acquisition which posits that increasing \(flexible\) granularity of the functional categories is a key aspect of syntactic acquisition \(?, ?, ?, ?\)\.

With respect to the order of acquisition, maturation can be further broken down into bottom\-up and inward orders \(relative to the UG\-predefined spine\)\. Bottom\-up maturation posits that the categories which become available first are the ones closer to the leaves\. The most recent bottom\-up approach, which also incorporates cartography, is the Growing Trees Hypothesis \(?, ?\)\. It distinguishes between three stages: in stage 1 only IP/TP and VP are available \(allowing for inflection and A\(rgument\)\-movement\); in stage 2 the lower left\-periphery \(e\.g\. allowing for wh\-questions\) becomes available; finally, in stage 3 the entire cartographic hierarchy becomes available, including topicalisations and embeddings\. Inward maturation, of which an example is the Inward Growing Spine Hypothesis \(?, ?\), postulates the early development of CP, which sits in the middle of the cartographic spine\.

Table 1: Syntactic stages are defined as cumulative sets of Penn Treebank categories\. Note that some tags that are preterminals \(PTs\) in the original PTB are treated here as non\-terminals \(NTs\) because of the morphemic tokenisation \(e\.g\. NNS, VBG\)\.
### Grammar induction

Grammar induction \(GI\) is the process of learning the hierarchical structure which is latent in language data\. A grammar is composed of a symbolic component \(the rules\) and a probabilistic component \(the probabilities assigned to the rules\)\. A subtype of GI is grammar reestimation \(?, ?\), where the symbolic component is provided, and the probabilistic component needs to be learned\. Foundational work in GI relies on statistical methods \(?, ?, ?, ?, ?, ?\)\. In recent years neural grammar induction has been put forward as a method which induces grammars with unprecedented F1 scores from raw data \(?, ?\)\. However, neural methods are less interpretable than the foundational statistical models\. Thus, for this paper we rely on statistical GI, specifically grammar reestimation, with a PCFG\.

Using GI as an approximation for language acquisition, and specifically syntax acquisition, is well motivated in the literature \(?, ?; examples include, i\.a\., ? \(?, ?, ?\)\)\. A key advantage of GI over traditional linguistic analyses, which often target isolated phenomena, is that it provides a unified account of all sentences in a corpus within a single model\. Note that GI remains an exploratory tool and does not capture the full complexity of first language acquisition, as it abstracts away from non\-language cues \(?, ?\)\.

#### Probabilistic Context\-Free Grammar \(PCFG\)

A PCFG is defined as $G = (NT, \Sigma, R, S, F)$, where $NT$ is the set of non\-terminals, $\Sigma$ the vocabulary, $R$ the rules, $S$ the start symbol, and $F$ the rule\-probability function\. We distinguish production rules from lexicalisations, since the former are the main object of staged syntactic access\.
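As an illustration of this definition, the following minimal sketch stores productions and lexicalisations separately and checks that the rule-probability function sums to one for every left-hand side. The class name, field names, and toy rules are assumptions made for this example and do not come from the released code.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PCFG:
    """G = (NT, Sigma, R, S, F); productions and lexicalisations are kept apart."""
    start: str
    productions: dict = field(default_factory=dict)      # (lhs, (rhs symbols...)) -> prob
    lexicalisations: dict = field(default_factory=dict)  # (lhs, word) -> prob

    def nonterminals(self):
        nts = {lhs for lhs, _ in self.productions}
        nts |= {sym for _, rhs in self.productions for sym in rhs}
        nts |= {lhs for lhs, _ in self.lexicalisations}
        return nts

    def vocabulary(self):
        return {word for _, word in self.lexicalisations}

    def is_normalised(self, tol=1e-9):
        """F must define a probability distribution over the expansions of each LHS."""
        totals = defaultdict(float)
        for (lhs, _), p in self.productions.items():
            totals[lhs] += p
        for (lhs, _), p in self.lexicalisations.items():
            totals[lhs] += p
        return all(abs(t - 1.0) <= tol for t in totals.values())


# Toy grammar (hypothetical rules, for illustration only).
toy = PCFG(
    start="S",
    productions={
        ("S", ("NP", "VP")): 1.0,
        ("VP", ("V", "NP")): 0.7,
        ("VP", ("V",)): 0.3,
    },
    lexicalisations={("NP", "dogs"): 0.5, ("NP", "cats"): 0.5, ("V", "chase"): 1.0},
)
print(toy.is_normalised(), sorted(toy.vocabulary()))
```

Keeping the two rule types in separate tables mirrors the distinction drawn above and makes it straightforward to apply different pseudocounts to each.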

#### Variational Bayes \(VB\)

Statistical GI standardly uses one of two estimation methods: Expectation Maximisation \(EM\) \(?, ?\) or Variational Bayes \(VB\) \(?, ?\)\. EM is a frequentist procedure that estimates rule probabilities solely from expected counts in the data, whereas VB is a Bayesian extension of EM that introduces prior distributions over rule probabilities, most commonly Dirichlet priors \(?, ?, ?\)\. In VB, each grammar rule is associated with a pseudocount parameter \($\alpha$\) that influences learning at every iteration, encoding prior confidence in a rule\. Higher values bias the model toward retaining a rule, while lower values allow unsupported rules to shrink in probability\. Pseudocounts allow probabilities learned in earlier stages to be carried forward as priors, while still permitting new rules to be learned\.
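The effect of the pseudocount can be illustrated by contrasting the two re-estimation steps for a single left-hand side. The sketch below uses the standard mean-field VB update for a Dirichlet prior, in which a rule weight is $\exp(\psi(c_r + \alpha_r) - \psi(\sum_{r'} (c_{r'} + \alpha_{r'})))$ with $\psi$ the digamma function. Whether the system used in this paper applies exactly this form is an assumption, and the expected counts and rule names are invented for illustration.

```python
import math


def digamma(x):
    """Digamma for x > 0: shift x up to >= 6 by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def em_update(counts):
    """EM M-step: relative frequency of expected counts for one LHS."""
    total = sum(counts.values())
    return {rule: c / total for rule, c in counts.items()}


def vb_update(counts, alpha):
    """Mean-field VB step: weights from expected counts plus Dirichlet pseudocounts."""
    total = sum(counts[rule] + alpha[rule] for rule in counts)
    return {
        rule: math.exp(digamma(counts[rule] + alpha[rule]) - digamma(total))
        for rule in counts
    }


# Hypothetical expected counts for the expansions of one non-terminal.
counts = {"VP -> V NP": 50.0, "VP -> V": 10.0, "VP -> V PP": 0.0}

em = em_update(counts)
low = vb_update(counts, {rule: 0.1 for rule in counts})
high = vb_update(counts, {"VP -> V NP": 0.1, "VP -> V": 0.1, "VP -> V PP": 5.0})

for name, weights in [("EM", em), ("VB alpha=0.1", low), ("VB alpha=5 on unseen", high)]:
    print(name, {rule: round(w, 6) for rule, w in weights.items()})
```

With a small pseudocount the unsupported rule keeps a tiny but non-zero weight, and with a large pseudocount it retains noticeably more mass, which is the behaviour described above. The VB weights need not sum to one.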

## Methodology

[Figure 1](https://arxiv.org/html/2605.08476#Sx1.F1) illustrates our computational operationalisation of staged syntactic development: maturational hypotheses are translated into curricula composed of rule subsets of a symbolic grammar, and acquisition is approximated as Bayesian grammar induction\.

### Acquisition\-inspired curricula

In order to translate the theoretical claims about maturation and the timing of CP into a curriculum\-based grammar induction problem, we define two curricula approximating the incremental access to syntactic categories stipulated by the Growing and Inward maturation hypotheses\. Previously, ? \(?\) have approached this problem using universal part\-of\-speech \(UPOS\) tags \(?, ?\); we instead rely on the Penn Treebank \(PTB\) tagset \(?, ?\), which is more detailed\. The PTB tagset encodes both phrase type \(e\.g\. VP, NP, PP, INTJ\) and the presence of functional material \(e\.g\. AUX, MD, TO, COP, complementisers, wh\-phrases\), which allows us to define more fine\-grained cognitively\-inspired curricula\. Stages are defined cumulatively over Penn Treebank tags in [Table 1](https://arxiv.org/html/2605.08476#Sx2.T1)\. Note that the stages we define are an approximation of the maturational theories, and alternative definitions of the stages can be substituted within the same pipeline\.

The two maturational curricula, constructed by ordering the stages from [Table 1](https://arxiv.org/html/2605.08476#Sx2.T1), are listed below, together with an explicit statement of the Continuity condition:

- Growing: baseGrowing, VP, TP, CP, INTJ
- Inward: baseInward, baseGrowing, CP, TP, VP
- Continuity: all rules are available from the start
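The cumulative nature of the curricula can be sketched as follows. The stage orderings are taken from the list above. The helper names and the `rules_by_block` mapping are hypothetical, and the placeholder rule identifiers deliberately do not reproduce Table 1.

```python
CURRICULA = {
    "Growing": ["baseGrowing", "VP", "TP", "CP", "INTJ"],
    "Inward": ["baseInward", "baseGrowing", "CP", "TP", "VP"],
}


def cumulative_stages(order):
    """For each stage i, list the blocks accessible up to and including stage i."""
    return [order[: i + 1] for i in range(len(order))]


def rules_available(order, stage_index, rules_by_block):
    """Union of the rules of every block unlocked by stage_index (0-based)."""
    available = set()
    for block in order[: stage_index + 1]:
        available |= rules_by_block.get(block, set())
    return available


# Placeholder rule identifiers; the real assignment of rules to blocks follows Table 1.
rules_by_block = {
    "baseGrowing": {"r1", "r2"},
    "VP": {"r3"},
    "TP": {"r4"},
    "CP": {"r5"},
    "INTJ": {"r6"},
}

for i, blocks in enumerate(cumulative_stages(CURRICULA["Growing"]), start=1):
    print(f"Growing stage {i}: {blocks}")
print(sorted(rules_available(CURRICULA["Growing"], 1, rules_by_block)))
```

Under this representation the Continuity condition corresponds to a single stage in which every block is unlocked at once.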

### Maturational syntactic development via VB

Maturational syntactic development posits that syntactic knowledge becomes accessible incrementally in distinct stages \(refer to [Table 1](https://arxiv.org/html/2605.08476#Sx2.T1) for a PTB approximation of these stages\)\. Using VB we can perform learning in stages, with more rules becoming available at each stage\. As new rules are introduced at every stage, pseudocounts allow probabilities learned in earlier stages to be carried forward as priors, while still permitting new rules to be learned\.

We formulate the pseudocount of existing and new rules in Eq\. [1](https://arxiv.org/html/2605.08476#Sx3.E1), which allows information to be carried across stages as the learning space expands\. After the completion of stage $k$, the posterior mean rule probabilities $\mathbb{P}^{k}$ are converted into a Dirichlet prior vector $\alpha^{k+1}$ for production rules $x \in X$ in stage $k+1$ according to:

$$
\alpha^{k+1}_{i,x} =
\begin{cases}
N^{k}\, s_{p}\, p^{k}_{i} + 0.1, & 1 \leq i \leq N^{k} \\[4pt]
\dfrac{N^{k}\, s_{p}\, \eta}{X^{k+1} - X^{k}} + 0.1, & N^{k} < i \leq N^{k+1}
\end{cases}
\tag{1}
$$

where $N^{k}$ is the number of sentences successfully parsed at stage $k$, $s_{p}$ is a scaling parameter for production rule priors carried forward from the previous stage, and $\eta$ is a dial for how much probability mass can be allotted to newly introduced rules in a stage\. $X^{k}$ is the number of rules in stage $k$ that share the same left\-hand\-side \(LHS\) NT as $x$, so the mass for new rules is distributed across the newly available expansions for that NT\. Lexicalisation priors are treated analogously, using a separate scaling parameter $s_{\ell}$; treating lexicalisations and productions with separate pseudocount values allows us to focus specifically on the production rules, as they are more pertinent to the topic of syntactic development\. Thus, $s_{p}$ and $s_{\ell}$ control the degree of inter\-stage memory, while $\eta$ controls how readily the learner assigns probability mass to newly available rules\. When $s = 0$, learning from the previous stage is ignored, whereas larger values increasingly bias the learner toward retaining previously learned distributions\.
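A minimal sketch of the pseudocount transfer in Eq. 1 is given below, under the assumption that stages are cumulative, so that $X^{k+1} - X^{k}$ equals the number of rules newly added for a given left-hand side. The function name, argument names, and toy rules are illustrative and are not taken from the released code.

```python
from collections import Counter


def next_stage_pseudocounts(prev_probs, next_rules, n_parsed, s, eta, floor=0.1):
    """Dirichlet pseudocounts for stage k+1, following the two cases of Eq. 1.

    prev_probs: {(lhs, rhs): posterior mean probability after stage k}
    next_rules: rules available at stage k+1 (assumed to be a superset of prev_probs)
    n_parsed:   N^k, number of sentences successfully parsed at stage k
    s:          scaling parameter (s_p for productions, s_l for lexicalisations)
    eta:        dial for the mass given to newly introduced rules
    floor:      the constant 0.1 added in both cases
    """
    next_rules = list(next_rules)
    # Number of new expansions per LHS, i.e. X^{k+1} - X^k under cumulative stages.
    new_per_lhs = Counter(rule[0] for rule in next_rules if rule not in prev_probs)
    alpha = {}
    for rule in next_rules:
        if rule in prev_probs:  # existing rule: carry the learned probability forward
            alpha[rule] = n_parsed * s * prev_probs[rule] + floor
        else:                   # new rule: share the eta-scaled mass within its LHS
            alpha[rule] = n_parsed * s * eta / new_per_lhs[rule[0]] + floor
    return alpha


# Toy example with hypothetical rules and values.
prev = {("VP", ("VB",)): 0.6, ("VP", ("VB", "NP")): 0.4}
nxt = list(prev) + [("VP", ("VB", "PP")), ("VP", ("VB", "NP", "PP"))]

alpha = next_stage_pseudocounts(prev, nxt, n_parsed=1000, s=0.1, eta=0.05)
no_memory = next_stage_pseudocounts(prev, nxt, n_parsed=1000, s=0.0, eta=0.05)
for rule, value in alpha.items():
    print(rule, round(value, 3))
```

In this toy setting the two existing rules receive pseudocounts of about 60.1 and 40.1 and each new rule about 2.6. With $s = 0$ every rule falls back to the constant 0.1, so nothing is carried over from the previous stage.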

![Refer to caption](https://arxiv.org/html/2605.08476v1/figs/growingvsinwards_f1_mll_jsd.png)

Figure 2: Growing vs Inward learning progression across the 5 learning stages, compared via F1, $\overline{\log p}_{\text{sent}}$, and JSD\. Growing consistently outperforms Inward\. For reference, the Oracle grammar baseline of $\overline{\log p}_{\text{sent}} = -6.1823$ is provided\. Higher F1 and $\overline{\log p}_{\text{sent}}$, and lower JSD, indicate better performance\.

## Experimental setup

### System

We use the “Inside\-Outside algorithm for estimating PCFGs from terminal strings” \(?, ?\), which iteratively estimates rule probabilities by optimising likelihood\-based objectives from unlabelled input strings\. Our setup is based on Mark Johnson’s implementation, available on his website \([https://web\.science\.mq\.edu\.au/~mjohnson/Software\.htm](https://web.science.mq.edu.au/~mjohnson/Software.htm)\)\. This is an example of a grammar reestimation system, which requires as input the symbolic component of a PCFG, i\.e\. the rules, each initialised with a weight and a pseudocount, as well as a list of sentences from which the probability distributions of the available rules are induced\. We manipulate the pseudocount as discussed in the Methodology section\. We run each experiment for 20 iterations because hyperparameter tuning shows that performance plateaus around this point\.

### Data

We use the child\-directed speech sentences and parses from the morphemically tokenised version of CHILDES\-TB \(?, ?, ?\), which we restrict to sentences of length greater than 1\. The dataset amounts to 126,152 sentences and their corresponding parses\. CHILDES\-TB \([https://sites\.socsci\.uci\.edu/~lpearl/CoLaLab/CHILDESTreebank/](https://sites.socsci.uci.edu/~lpearl/CoLaLab/CHILDESTreebank/)\) is based on the child\-directed speech from five corpora, covering 110 children aged from 6 months to 6 years\.

### Oracle grammar

We extract a PCFG from the morphemically tokenised CHILDES\-TB parses, which we use as an Oracle \(gold\-standard\) grammar in evaluation, whilst its symbolic component serves as the initial grammar for the GI system\. The Oracle grammar is a PTB\-style PCFG approximating adult competence\. We do not claim that this PCFG is an absolute representation of adult mental grammar; rather, it provides a linguistically interpretable target against which different staged access hypotheses can be compared\. We restrict production rules by minimum frequency $f_{m}$ and select the smallest grammar that preserves 100% parse coverage\. At $f_{m} = 7$, the Oracle grammar comprises 1,387 production rules and 8,563 lexicalisations over a vocabulary of 6,273 words\.

### Evaluation

In addition to using the CHILDES\-TB parses as a gold standard, based on which we calculate unlabelled F1 score, we also present distributional metrics which allow for the comparison of the distributions of the learned grammars to the distribution of the Oracle grammar\.

We use an induced grammar to parse 1,000 randomly selected sentences from CHILDES\-TB\. Then we calculate unlabelled F1, implemented in line with PARSEVAL \(?, ?\), on the parses from the induced grammar and the gold parses from CHILDES\-TB\.
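The following sketch shows one way to compute unlabelled bracket F1 over matched gold and predicted trees. It aggregates brackets over the whole sample and ignores preterminal nodes. These choices, the nested-tuple tree encoding, and the example trees are assumptions, since PARSEVAL implementations differ in such details.

```python
from collections import Counter


def spans(tree, start=0, out=None):
    """Collect (start, end) spans of phrasal nodes; returns (end_position, Counter)."""
    if out is None:
        out = Counter()
    if isinstance(tree, str):  # a token occupies one position
        return start + 1, out
    children = tree[1:]        # tree[0] is the label, ignored for unlabelled scoring
    end = start
    for child in children:
        end, _ = spans(child, end, out)
    is_preterminal = len(children) == 1 and isinstance(children[0], str)
    if not is_preterminal:
        out[(start, end)] += 1
    return end, out


def unlabelled_f1(gold_trees, pred_trees):
    """Corpus-level F1 over unlabelled brackets (multiset intersection per sentence)."""
    matched = n_gold = n_pred = 0
    for gold, pred in zip(gold_trees, pred_trees):
        _, gold_spans = spans(gold)
        _, pred_spans = spans(pred)
        matched += sum((gold_spans & pred_spans).values())
        n_gold += sum(gold_spans.values())
        n_pred += sum(pred_spans.values())
    if matched == 0:
        return 0.0
    precision, recall = matched / n_pred, matched / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold and predicted parses of "the dog chased the cat".
gold = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("VP", ("VBD", "chased"), ("NP", ("DT", "the"), ("NN", "cat"))))
pred = ("S",
        ("NP", ("DT", "the"), ("NN", "dog")),
        ("X", ("X", ("VBD", "chased"), ("DT", "the")), ("NN", "cat")))

score = unlabelled_f1([gold], [pred])
print(score)
```

Here three of the four brackets in each tree coincide, so precision, recall, and F1 are all 0.75.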

We evaluate the induced grammars as generative language models using mean length\-normalised marginal log\-likelihood \(see Eq\. [2](https://arxiv.org/html/2605.08476#Sx4.E2)\) over 91,901 child speech sentences from the corpora used in CHILDES\-TB, restricted to the lexicon of the Oracle grammar \(a larger child\-speech test set could be constructed without this lexical restriction; the age range and number of children match CHILDES\-TB\)\. Scores are computed only for utterances licensed by at least one complete parse; larger values are better\. For a PCFG $G$, let $\mathcal{T}(\mathbf{w}^{(i)})$ be the set of all parse trees for an utterance $\mathbf{w}^{(i)}$, and let $T_{i}$ be the length of that utterance:

$$
\overline{\log p}_{\text{sent}} = \frac{1}{N}\sum_{i=1}^{N}\frac{\ell_{i}}{T_{i}}, \qquad \ell_{i} = \ln \sum_{t \in \mathcal{T}(\mathbf{w}^{(i)})} P_{G}(t)
\tag{2}
$$
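Eq. 2 can be sketched as below. For readability, the marginal of each utterance is obtained by summing explicitly listed parse probabilities. In practice it would come from the inside algorithm, and both that simplification and the reading of $T_i$ as the utterance length in tokens are assumptions of this example. The probabilities are invented.

```python
import math


def log_marginal(parse_probs):
    """ln of the summed probabilities of all parses; None if there is no complete parse."""
    total = sum(parse_probs)
    return math.log(total) if total > 0 else None


def mean_length_normalised_loglik(sentence_scores):
    """Average of l_i / T_i over utterances that received at least one complete parse.

    sentence_scores: iterable of (log_marginal_or_None, length) pairs.
    """
    terms = [ll / length for ll, length in sentence_scores if ll is not None and length > 0]
    if not terms:
        raise ValueError("no utterance received a complete parse")
    return sum(terms) / len(terms)


# Hypothetical utterances: (probabilities of each parse, length in tokens).
utterances = [
    ([0.01, 0.03], 4),  # two competing parses
    ([], 3),            # unparsable, excluded from the mean
    ([0.002], 2),
]
scores = [(log_marginal(probs), length) for probs, length in utterances]
value = mean_length_normalised_loglik(scores)
print(value)
```

Unparsable utterances are dropped rather than penalised, matching the statement that scores are computed only for utterances licensed by at least one complete parse.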
We also compare the rule expansion distributions of the Oracle and induced grammars for each non\-terminal $A$ using Jensen\-Shannon divergence \(JSD\) \(?, ?\), as formulated in Eq\. [3](https://arxiv.org/html/2605.08476#Sx4.E3)\. A result of $\mathrm{JSD} = 0$ indicates identical distributions\.

$$
\mathrm{JSD}(p_{A}, q_{A}) = \tfrac{1}{2}\,\mathrm{KL}(p_{A} \,\|\, m_{A}) + \tfrac{1}{2}\,\mathrm{KL}(q_{A} \,\|\, m_{A}), \quad \text{where} \quad m_{A} = \tfrac{1}{2}(p_{A} + q_{A}).
\tag{3}
$$
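A direct transcription of Eq. 3 for rule-expansion distributions stored as dictionaries is sketched below. The natural logarithm is used, so the maximum value is $\ln 2$. The log base and the averaging over non-terminals shared by both grammars are assumptions, and the toy distributions are invented.

```python
import math


def kl(p, m):
    """KL(p || m) in nats; terms with p = 0 contribute nothing."""
    return sum(pi * math.log(pi / m[rule]) for rule, pi in p.items() if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two expansion distributions of one NT."""
    support = set(p) | set(q)
    p = {rule: p.get(rule, 0.0) for rule in support}
    q = {rule: q.get(rule, 0.0) for rule in support}
    m = {rule: 0.5 * (p[rule] + q[rule]) for rule in support}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_jsd(oracle, induced):
    """Average JSD over the non-terminals present in both grammars."""
    shared = oracle.keys() & induced.keys()
    return sum(jsd(oracle[nt], induced[nt]) for nt in shared) / len(shared)


# Hypothetical grammars: NT -> {expansion: probability}.
oracle = {"VP": {"V NP": 0.7, "V": 0.3}, "NP": {"DT NN": 1.0}}
induced = {"VP": {"V NP": 0.5, "V": 0.4, "V PP": 0.1}, "NP": {"DT NN": 1.0}}

print(jsd(oracle["VP"], induced["VP"]), mean_jsd(oracle, induced))
```

Because the mixture $m_A$ is non-zero wherever either distribution is non-zero, the divergence stays finite even when the induced grammar uses expansions that the Oracle never assigns mass to.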

## Results

### Continuity vs\. Maturation

Under the Continuity condition, all grammar rules are available from the outset\. In contrast, under maturation, rules are introduced in discrete stages, and the probabilities learned at each stage are carried forward to the next via pseudocount transfer, as defined in Eq\. [1](https://arxiv.org/html/2605.08476#Sx3.E1)\. [Table 2](https://arxiv.org/html/2605.08476#Sx5.T2) provides an overview of the F1 scores achieved by systems approximating the three syntactic development hypotheses we operationalised: Continuity, Growing, and Inward\.

Continuity experiments are run for 20 Variational Bayes iterations under a Dirichlet prior with production and lexicalisation pseudocounts $\alpha_{p}, \alpha_{\ell} \in \{0.1, 0.3, 0.5\}$\. Across these settings, performance is stable: mean F1 \(see also [Table 2](https://arxiv.org/html/2605.08476#Sx5.T2)\) is $0.799 \pm 0.001$\. We adopt $\alpha_{p} = \alpha_{\ell} = 0.1$ as the Continuity baseline for comparison with the maturation conditions, reflecting our choice of minimal constant pseudocounts \(Eq\. [1](https://arxiv.org/html/2605.08476#Sx3.E1)\)\. Under this setting, performance is $F1 = 0.8000$, $\overline{\log p}_{\text{sent}} = -6.0285$, and $\mathrm{JSD} = 0.1928$\. With appropriate hyperparameter settings, the Growing curriculum outperforms the Continuity condition \(see [Table 2](https://arxiv.org/html/2605.08476#Sx5.T2)\), whereas the Inward curriculum consistently underperforms\.

Table 2: F1 scores comparing Continuity and Maturation\. Neural GI baseline from Table 2 in Marcheva et al\. \(2025\) is provided for comparison\.

### Growing vs\. Inward

#### Final grammar comparison

We compare final grammars, after stage 5, induced under the Growing and Inward curricula\. We report the top 10 hyperparameter settings \(ranked by Growing F1\) in [Table 4](https://arxiv.org/html/2605.08476#Sx5.T4); in all cases, Growing also exceeds the Continuity baseline \($\alpha_{p} = \alpha_{\ell} = 0.1$, $F1 = 0.8000$\)\. Paired, one\-sided Wilcoxon signed\-rank tests over 72 matched hyperparameter configurations spanning $s_{\ell} \in \{0.001, 0.01, 0.1\}$, $s_{p} \in \{0.001, 0.005, 0.01, 0.05, 0.1, 0.2\}$, and $\eta \in \{0.001, 0.005, 0.01, 0.05\}$ show that Growing significantly outperforms Inward on all evaluation metrics \(see Table [3](https://arxiv.org/html/2605.08476#Sx5.T3)\)\. The underperformance of Inward may be explained either by the early introduction of INTJ, or by the very late introduction of VP\. As observed in [Figure 2](https://arxiv.org/html/2605.08476#Sx3.F2), the final stage for Inward, corresponding to the VP stage in [Table 1](https://arxiv.org/html/2605.08476#Sx2.T1), leads to the largest improvement in performance across all three metrics\. This makes it more likely that the underperformance of Inward is due to the late introduction of VP\. It also suggests that Growing better captures utterances requiring predicate, argument, and modifier structure, whereas any Inward advantage is restricted to interactional or clause\-level material\.
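The size of the hyperparameter grid follows directly from the three value sets: 3 × 6 × 4 = 72 configurations, each run once per curriculum and then paired. The sketch below enumerates the grid and builds the paired differences that a signed-rank test would consume. The variable and function names are illustrative, and no experimental results are reproduced here.

```python
from itertools import product

S_LEX = [0.001, 0.01, 0.1]                      # values of s_l
S_PROD = [0.001, 0.005, 0.01, 0.05, 0.1, 0.2]   # values of s_p
ETA = [0.001, 0.005, 0.01, 0.05]                # values of eta

# One configuration per combination; each is run under both curricula.
configs = list(product(S_LEX, S_PROD, ETA))
print(len(configs))


def paired_differences(growing, inward):
    """Growing minus Inward for every configuration scored under both curricula.

    growing, inward: {(s_l, s_p, eta): metric value}
    """
    shared = sorted(growing.keys() & inward.keys())
    return [growing[key] - inward[key] for key in shared]
```

The resulting difference vector could be passed to a signed-rank implementation such as `scipy.stats.wilcoxon` with `alternative="greater"` for metrics where higher is better. For JSD, where lower is better, the direction of the alternative would need to be reversed.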

Table 3: Results from a paired one\-sided Wilcoxon signed\-rank test comparing 72 pairs of final grammars induced under Growing and Inward\.

Table 4: Top\-performing hyperparameter setups for Growing and Inward curricula\. $s_{\ell} = 0.1$ for all of these\. Higher F1 and $\overline{\log p}_{\text{sent}}$, and lower JSD, indicate better performance\.
#### Learning progression

The learning progression through the stages of the two maturational curricula is illustrated in [Figure 2](https://arxiv.org/html/2605.08476#Sx3.F2)\. The F1 under Growing is consistently higher throughout the stages\. Furthermore, the learning progression of the Growing curriculum indicates monotonic improvement for all metrics, in contrast with Inward, where some stages decrease performance \(e\.g\. stages 1\-2 for $\overline{\log p}_{\text{sent}}$ or stages 2\-3 for mean JSD\)\. For Inward’s F1 score, the last stage is primarily responsible for the final F1 achieved\. This corresponds to the VP stage, which introduces argument movement\. The mean JSD reflects a similar pattern to F1\. The plot of the mean normalised sentence log\-likelihood, $\overline{\log p}_{\text{sent}}$, shows that both maturational conditions lead to improvement over the Oracle \(initial grammar\)\. This result affirms that the maturational hypotheses produce grammars more favourable to explaining child productions\.

#### JSD per phrase NT

![Refer to caption](https://arxiv.org/html/2605.08476v1/figs/individualNTs.png)

Figure 3: Mean JSD for instrumental phrase\-level NTs\. Growing is in blue and Inward is in orange; the different NTs are distinguished by figure markers and line styles\. Low JSD scores indicate similarity with the Oracle distribution\.

We examine the stage\-wise progression of JSD for selected phrase\-level NTs introduced at different points under the Growing and Inward curricula\. Tracking individual categories allows us to compare how alternative maturational orderings affect the stabilisation of specific components of the grammar\.

Under Growing, core phrasal categories such as NP and VP show gradual, largely monotonic reductions in JSD, with a similar pattern for VP\-stage modifier phrases \(PP, ADVP, ADJP\)\. Under Inward, these VP\-related categories, which are introduced only at the final stage, exhibit higher JSD relative to Growing, suggesting, as expected, that the more stages in which a category is active, the more refined its estimated distribution becomes\. For clause\-level categories, including INTJ and WH\-related phrases, the two curricula show more mixed trajectories, with no uniform advantage for either ordering\. By making the emergence of NTs explicit, we provide a quantitative tool for exploring how different developmental orderings shape learning dynamics\.

## Discussion

While our experiments showcase a flexible computational framework for comparing syntactic development theories under controlled conditions, several limitations of the present study should be acknowledged\. First, our evaluation relies on CHILDES\-TB, which provides manually annotated parses for English child\-directed speech\. Although the GI framework itself is not language\-specific, extending this approach to other languages requires appropriately annotated resources\. Second, the curricula may differ in the number of rules available at each developmental stage\. Future work could further control for this factor by testing more fine\-grained curricula\. Third, the type of GI our framework relies on is grammar reestimation\. The symbolic part of the grammar needs to be provided, and learning consists of estimating the probabilistic component of the grammar from the input, which consists of unlabelled sentences\. This design choice ensures a controlled comparison between maturational hypotheses but limits the model’s ability to discover novel structures\.

Although we operationalise two maturational hypotheses, Growing and Inward, our pipeline for staged syntactic development, illustrated in [Figure 1](https://arxiv.org/html/2605.08476#Sx1.F1), is not tied to maturation as the source of ordering in syntactic development\. Usage\-based and constructivist theories also reject continuity, i\.e\. the claim that children have adult grammatical knowledge from the outset \(?, ?, ?, ?\)\. The main difference between maturational and usage\-based accounts is the source of the order of syntactic development: under maturation, the order of categories is innate, whereas under usage\-based accounts it emerges, constrained by the input and the domain\-general cognitive capacities of the child\. Our pipeline requires that the stages of syntactic development are pre\-specified before training\. Thus, it can be used for non\-generativist accounts, if a curriculum is specified based on item\-specific frames or constructions\. In that way, our staged grammar\-induction paradigm provides a general method for comparing theories that reject the Continuity hypothesis, even if they disagree about why the learner’s hypothesis space changes over time\.

## Conclusion

We present a grammar induction framework for staged syntactic development hypotheses, which makes the emergence of syntactic categories explicit and enables a controlled comparison of alternative developmental orderings\. We operationalise two competing maturational hypotheses of syntactic development, Growing and Inward, and find that under our framework Growing significantly outperforms Inward\. This pilot comparison illustrates how our framework can serve as a principled tool for investigating alternative staged hypotheses of syntactic development\.

