IMLJD: A Computational Dataset for Indian Matrimonial Litigation Analysis

arXiv cs.CL 05/20/26, 04:00 AM Papers

dataset legal-nlp matrimonial india computational-linguistics litigation-analysis

Summary

The paper introduces IMLJD, an open dataset of 3,613 Indian court judgments on matrimonial disputes, with structured outcome labels, metadata-derived indicators, and a knowledge graph, intended to support natural language processing and legal analytics research.

arXiv:2605.19346v1 Announce Type: new

Abstract: We present IMLJD, an open dataset of 3,613 Indian court judgments covering matrimonial disputes under IPC Section 498A, the Protection of Women from Domestic Violence Act, and CrPC Section 482. The dataset covers the Supreme Court of India from 2000 to 2024 (1,474 cases) and the Karnataka High Court from 2018 to 2024 (2,139 cases), with structured outcome labels, metadata-derived indicators, and a knowledge graph. We find that 57.6% of quashing petitions succeed at the Supreme Court level compared to 39.7% at the Karnataka High Court level. On a matched 2018 to 2024 period, the SC quash rate is 59.3%, widening the differential to 19.6 percentage points and confirming the finding is robust to temporal adjustment. The dataset, code, and knowledge graph are released openly at https://github.com/joyboseroy/imljd and https://huggingface.co/datasets/joyboseroy/imljd.
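The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses just the numbers quoted above, the variable and function names are our own, and the 17.9-point full-period gap is derived here rather than stated by the authors.

```python
# Figures quoted in the abstract (names are illustrative, not from the paper's code).
SC_CASES = 1474          # Supreme Court of India, 2000 to 2024
KHC_CASES = 2139         # Karnataka High Court, 2018 to 2024
TOTAL_REPORTED = 3613    # total judgments stated in the abstract

SC_QUASH_FULL = 57.6     # SC quash rate (%), full 2000 to 2024 period
SC_QUASH_MATCHED = 59.3  # SC quash rate (%), matched 2018 to 2024 period
KHC_QUASH = 39.7         # Karnataka HC quash rate (%)


def gap_pp(rate_a: float, rate_b: float) -> float:
    """Difference between two percentages, in percentage points (1 decimal)."""
    return round(rate_a - rate_b, 1)


total_cases = SC_CASES + KHC_CASES
full_gap = gap_pp(SC_QUASH_FULL, KHC_QUASH)        # derived, not stated in the abstract
matched_gap = gap_pp(SC_QUASH_MATCHED, KHC_QUASH)  # the abstract reports 19.6

print(f"total cases: {total_cases} (reported: {TOTAL_REPORTED})")
print(f"full-period gap: {full_gap} pp")
print(f"matched-period gap: {matched_gap} pp")
```

The per-court counts sum to the reported total, and restricting the Supreme Court sample to the matched period widens the gap rather than narrowing it, which is consistent with the robustness claim in the abstract.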

Cached at: 05/20/26, 08:25 AM

# IMLJD: A Computational Dataset for Indian Matrimonial Litigation Analysis
Source: [https://arxiv.org/abs/2605.19346](https://arxiv.org/abs/2605.19346)

Similar Articles

@tom_doerr: Curated list of instruction and reasoning datasets for LLMs https://github.com/mlabonne/llm-datasets…

LAUKIN: A Multi-jurisdictional Common Law Contract Dataset

RTI-Bench: A Structured Dataset for Indian Right-to-Information Decision Analysis

BIASEDTALES-ML: A Multilingual Dataset for Analyzing Narrative Attribute Distributions in LLM-Generated Stories

We’ve been analyzing how people are using LLMs for legal and compliance tasks (GDPR, AI Act, etc.).
