Should AI Training Require Licensing or Lawful Access? Tech Industry Submissions to DPIIT Reveal Divide


The Business Software Alliance (BSA) argued in its submission to the Department for Promotion of Industry and Internal Trade (DPIIT) committee that training artificial intelligence models on copyrighted material should not require permission or licensing.

Instead, it urged the committee to introduce a statutory text and data mining (TDM) exception into India’s copyright framework, allowing developers to train AI models on copyrighted content once they have lawful access through purchase, subscription or licence.

“The right approach to promoting AI training while protecting the legitimate interests of copyright holders and creators is to allow for lawful text and data mining for AI model training,” the submission said, adding that copyright law should focus on remedies where “AI-generated output infringes their works.”

For context, the BSA is a global software industry group whose members include Adobe, Amazon Web Services, Microsoft, OpenAI, Oracle, IBM, Salesforce, Cisco, Shopify, Zoom, and SAP (Systems, Applications, and Products in Data Processing), among others.

The submission forms part of the government’s ongoing consultation on artificial intelligence and copyright and reflects the position of companies that build, deploy and commercialise large-scale AI systems.

Importantly, the submissions reveal a sharp disagreement over where control and compensation should sit in the AI training process. The question is whether copyright law should regulate access to training data through upfront licensing and royalties or intervene only when AI-generated outputs infringe protected works.

AI training should not require permission, BSA argues

The alliance’s core argument is that training AI models on copyrighted material does not amount to copyright infringement.

According to the BSA, developers do not use copyrighted works for their expressive value. Instead, they apply mathematical and statistical techniques to analyse large volumes of data.

“The core value of AI training lies in uncovering non-copyrightable information — probabilities, relationships, and patterns — across large bodies of data,” the submission said.

Moreover, the alliance said developers typically break content into machine-readable units, or tokens, and analyse them to identify correlations across datasets. “This computational analysis does not use the underlying data for its expressive content and, therefore, does not infringe any copyright in the underlying data,” it said.

On this basis, BSA argued that requiring permission or licences for AI training misunderstands how these systems function and risks introducing legal barriers that could slow development.

Why does the BSA oppose licensing-based models?

The submission also clearly rejected licensing-led approaches to artificial intelligence training.

While acknowledging that licensing has a role in copyright law, the alliance argued that it does not scale for AI development. “Relying solely on direct or statutory licensing for AI training data may be impractical and may not yield the best outcomes,” the submission said.

Moreover, the alliance warned that restricting training datasets to licensed or public-domain material could weaken AI models and “ironically, increase the risk that outputs simply reflect trends and biases of the limited training data sets.”

Additionally, the submission cautioned that mandatory royalties and centralised licensing mechanisms could introduce compliance burdens and operational costs. According to the alliance, these costs could slow AI development and investment, particularly for smaller companies and emerging developers. Instead, it framed broad data access and legal certainty as essential for building high-quality models and sustaining innovation at scale.

How the BSA situates India in the global context

The submission also placed India’s policy choices within the broader international landscape.

The alliance pointed to Japan and Singapore, which allow text and data mining on lawfully accessed content, as examples of copyright frameworks that support AI research and development while providing legal certainty.

Moreover, it argued that adopting a similar approach would help India remain competitive as a destination for AI investment and deployment and align with the government’s broader objectives under the IndiaAI Mission.

Where does the BSA accept limits in AI training?

However, while advocating permissive rules at the training stage, the submission acknowledged limits on artificial intelligence use.

“Copyright holders should have full and effective remedies when their rights are infringed,” the submission said, adding that this principle “applies equally to output generated using AI systems and output generated in other ways.”

Additionally, the alliance supported targeted protections against the commercial dissemination of unauthorised AI-generated digital replicas of a person’s name, image, likeness or voice. However, it cautioned that new policy frameworks in this area should remain “clear, practical, and not too broad”, warning that expansive rules could block legitimate and socially beneficial uses of artificial intelligence.

What this reveals about the fault lines in the AI copyright debate

How the committee draws the line

The committee’s working paper rejects both extremes. It dismisses a blanket TDM exception, arguing that such an approach would allow developers to extract value from copyrighted works without compensating creators. At the same time, it rules out consent-based and opt-out models as unworkable at scale, citing enforcement and transparency constraints.

However, the committee’s hybrid solution introduces risks of its own. By making licensing mandatory and routing compensation through a single, centralised royalty mechanism, the framework allows developers to train on copyrighted works without consent while offering creators payment as a substitute for control.

This shifts copyright from a right to exclude into a right to remuneration, raising unresolved questions about how value will be calculated, how royalties will be distributed across different kinds of creators, and whether compensation alone can sustain creative incentives over time.

Why did the industry reject that approach?

Industry submissions, including that of the Business Software Alliance, push in the opposite direction. They treat AI training as the extraction of non-copyrightable patterns and argue that lawful access alone should permit training, with copyright enforcement triggered only at the output stage.

Yet this approach also leaves gaps. It assumes that rights holders can realistically identify and challenge infringing AI outputs. In practice, it is often unclear which works an AI system has drawn from, whether an output copies protected expression or merely resembles it, and how infringement can be proved. The model also leaves unaddressed concerns about market concentration and the cumulative extraction of value by large developers.

Overall, the consultation exposes unresolved trade-offs rather than a settled policy direction. How the government balances scale and predictability for AI development against control and leverage for creators will shape both copyright reform and the structure of India’s AI ecosystem.
