Fine-tuning LLMs for Kashmiri→English translation, validated by native speakers through structured human evaluation.
KashmirAI Research is a pioneering open-source platform dedicated to Kashmiri language NLP. We specialize in Kashmiri→English machine translation, fine-tuning large language models with LoRA and validating their output through native-speaker evaluation.
Building a large-scale Kashmiri-English parallel corpus from multiple open sources with rigorous quality filtering.
Fine-tuning state-of-the-art large language models for Kashmiri→English translation tasks.
Native Kashmiri speakers evaluate translations via this platform, rating adequacy, fluency, and overall preference — contributing directly to ongoing research.
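As a rough illustration of how ratings like these might be summarized, the sketch below averages adequacy and fluency scores per translation system and counts preference votes. The 1-5 scale, the field names, and the system labels are assumptions made for the example, not the platform's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Rating:
    """One evaluator's judgment of one translation (hypothetical schema)."""
    system: str      # which model produced the translation
    adequacy: int    # assumed 1-5 scale: is the meaning preserved?
    fluency: int     # assumed 1-5 scale: does the English read naturally?
    preferred: bool  # did the evaluator prefer this output overall?


def summarize(ratings):
    """Return per-system mean adequacy, mean fluency, and preference votes."""
    by_system = defaultdict(list)
    for r in ratings:
        by_system[r.system].append(r)

    summary = {}
    for system, rs in by_system.items():
        summary[system] = {
            "adequacy": mean(r.adequacy for r in rs),
            "fluency": mean(r.fluency for r in rs),
            "preferred": sum(r.preferred for r in rs),  # True counts as 1
            "n": len(rs),
        }
    return summary


# Made-up ratings for two hypothetical systems.
example = [
    Rating("base", adequacy=2, fluency=3, preferred=False),
    Rating("lora", adequacy=4, fluency=4, preferred=True),
    Rating("base", adequacy=3, fluency=3, preferred=False),
    Rating("lora", adequacy=5, fluency=4, preferred=True),
]
example_summary = summarize(example)
```

Keeping the raw per-evaluator ratings, rather than only the averages, leaves room for later agreement analysis between evaluators.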
Discover every Kashmiri NLP dataset, research paper, and ML model in one place — the definitive directory for Kashmiri AI research.
Kashmiri (کٲشُر) is a low-resource, endangered language spoken by over 7 million people in the Kashmir Valley, written in the Perso-Arabic script. Despite its cultural significance, it has extremely limited NLP resources compared to other Indian languages.
This project investigates whether large language models can be effectively fine-tuned for Kashmiri translation. Our research is ongoing, and detailed findings will be published in an upcoming conference paper.
AI Researcher, Developer & Entrepreneur
Building Kashmir's pioneering NLP research infrastructure. Passionate about applying cutting-edge AI to preserve and empower low-resource languages.
Kashmir AI Research is the first dedicated Kashmiri language evaluation platform and NLP research initiative. We are building the foundational AI infrastructure for Kashmiri, a low-resource language spoken by over 7 million people that has been largely overlooked by mainstream machine translation systems. Our platform enables structured human evaluation of Kashmiri→English machine translation by native speakers, generating high-quality preference data for model training and benchmarking.
Our work spans AI research in Kashmir, parallel corpus construction, LLM fine-tuning with LoRA, and rigorous inter-annotator reliability analysis. As a pioneer in NLP for low-resource languages, we aim to set a reproducible standard for Kashmiri language technology — with open datasets, published benchmarks, and a growing community of native speaker evaluators contributing to the future of machine translation for Kashmiri.
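Inter-annotator reliability is usually reported as a chance-corrected agreement statistic. The sketch below computes Cohen's kappa for two evaluators who rated the same items. The choice of kappa is an assumption, since the text does not say which statistic the project uses, and the example labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labeled at random according to
    # their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented adequacy labels from two hypothetical evaluators.
annotator_1 = [1, 1, 2, 2]
annotator_2 = [1, 1, 2, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the two evaluators agree on three of four items (0.75 observed) against 0.5 expected by chance, which gives a kappa of 0.5. With more than two evaluators or with missing ratings, a statistic such as Fleiss' kappa or Krippendorff's alpha would be the more usual choice.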