19-year-old boy from Bihar develops a 5.82B-parameter multimodal AI model using Rs 11 lakh from personal savings

Abhinav Anand said that two and a half years ago, he barely knew what AI was beyond hearing about ChatGPT.

Today, the 19-year-old Class 12 student from Bihar claims he has built a 5.82-billion-parameter multimodal AI model after spending nearly Rs 11 lakh from his personal savings and compute grants.

In a post shared on Reddit, Anand said he worked independently without a team, investors or a formal computer science degree. He described years of failed experiments before arriving at what he calls 'ArcleIntelligence,' a multimodal model that he claims can process text, images, documents, audio and video.

'Every failure taught me something real,' Anand wrote, recalling earlier attempts at building a YouTube analytics app, a voice assistant and an offline AI assistant.

According to Anand, the model supports image generation at 512x512 resolution, 24kHz speech output and a context window of more than 2 million tokens.

He also claimed the system achieved a score of 93.45 on OmniDocBench V1.5 during private testing, though those benchmark results have not been independently verified.

Before beginning work on the multimodal model, Anand said he had trained a text-to-video system on his laptop with no outside funding and later released it publicly through Lightning AI as a studio template. The teenager said the project was funded through personal savings, RunPod compute grants, DigitalOcean credits and GitHub Student Pack benefits.

He estimated that GPU compute alone cost his family about Rs 64,000, which he described as a significant amount for a middle-class household in Bihar.

'My father is a government officer. My mother is a housewife. This is a middle-class family in Bihar,' Anand wrote.

He added that the project is still in training and that he is seeking about $35,000 in funding to complete the pipeline. Anand said he plans to release the model weights on Hugging Face and eventually open-source the full codebase on GitHub.

'The West has OpenAI. The East has DeepSeek. India deserves its own,' he wrote.

19 year old from Bihar, no team, no investors, no CS degree - spent $11,560 of personal savings building a 5.82B multimodal AI. 93.45 on OmniDocBench V1.5 in private testing. Trying to release it open source.
by u/That-Bookkeeper-8316 in indianstartups

Netizens' reactions

The claims quickly drew attention online, with reactions ranging from admiration to skepticism. Some users questioned the technical credibility of the project and asked for more transparency around the training process and datasets.

'What proprietary data have you used? Why a multimodal model? The reason I say is, for the amount you are asking, a small multimodal model is not going to be production use on any task,' a user wrote.

'I won't talk about the model but the pitch. 19yo, young talent signalled. Bihar, middle class, poor nr sympathy signalled. No CS degree, hard worker signalled,' another wrote.

Others raised doubts about the public evidence shared online.

'Your Twitter account doesn't say anything about your journey. There's no post with real engagement, and the Lightning AI reach-out you mention is a screenshot which can be easily doctored,' a third wrote.

Several users also accused the project of relying heavily on AI-generated code and 'vibe coding.'

'Just checked the source code, it's purely vibe-coded, nothing new,' another added.

Still, some users praised the ambition behind the effort, stating that even attempting to train a large AI model independently from Bihar was notable in itself.