Where China’s AI models make their money

Unlike overseas rivals that rely on subscriptions and APIs, Chinese AI vendors are monetizing through cloud platforms, project contracts and compute usage
By LeadLeo Research Institute
As commercialization of large AI models accelerates globally, overseas and Chinese markets have begun to diverge structurally in usage patterns, competitive dynamics and how value is distributed. Overseas markets are built mainly around mature subscription models and direct API payments. China’s market, in contrast, is dominated by enterprise usage, platform-based delivery and free or low-cost customer acquisition. The gap reflects different choices between open-source and closed-source models, and is also shaped by policy, supply concentration and payment culture. The situation is unlikely to reverse in the near term.
Overseas markets start from a large consumer base, such as ChatGPT’s roughly 700 million weekly active users, and extend into enterprise APIs and developer subscriptions, creating a full payment ladder from individuals to companies. That gives model developers relatively broad monetization coverage based on usage volume. China’s market is more polarized. Consumer-facing products such as Doubao and Tencent Yuanbao are mostly free to use, meaning direct commercial value from individual users is limited. Billable token consumption is concentrated mainly on the enterprise side and is often handled through cloud vendor platforms, meaning much of the actual usage monetization may not flow directly back to the original model developers.
Differences in the value density of use cases are a key reason subscription models have struggled to take shape in China. Overseas mainstream applications are concentrated in high-token knowledge work such as code generation and professional analysis, where a single call can create enough business value to support standardized billing. Large-scale deployments in China, however, are more often efficiency tools such as customer-service Q&A, marketing copywriting and document processing. These tasks usually involve shorter context windows and more fixed output formats, resulting in lower token density and lower spending per task. That leaves subscriptions without enough per-call value to support them.
Open- and closed-source strategies have further reshaped the payment structure. Under a closed-source model, model weights are not accessible, and all usage must pass through authorized billing points. Token consumption can flow back to the original developer, which can also directly accumulate customer data and strengthen its position in renewal negotiations. Under an open-source model, companies can download weights and deploy models themselves. The computing power consumed by private deployments is handled by cloud vendors or enterprises, while the original model developer’s billable touchpoints are largely limited to cloud-hosted inference. Customer relationships accumulate at the cloud platform and systems-integrator layers, making project-based delivery the main payment model.
In China, policy, supply concentration, procurement practices and payment culture have collectively reinforced the project-based model. High-value customers in finance, government and healthcare tend to prefer private deployments because of data-security requirements. The coexistence of several high-quality open-source models has narrowed capability gaps and shifted pricing power toward buyers. Free-use expectations formed during the mobile internet era, combined with aggressive price competition among vendors, have further weakened users’ willingness to pay. Corporate budgets are also commonly approved on a project-by-project basis, making recurring subscription spending a poor fit. Customized projects, by contrast, are often easier to implement.
Overseas markets, in contrast, have moved toward strategic alliances as barriers rise across computing power, cloud infrastructure and model capabilities. Data centers take years to build, Nvidia (NVDA.US) GPUs have made computing power a scarce resource, and AWS, Microsoft Azure and Google Cloud together control more than 60% of the global cloud market. At the same time, the cost of training frontier models continues to climb. No single company can easily achieve full vertical integration, prompting alliances among players such as OpenAI, Microsoft (MSFT.US) and Nvidia. Each controls a different part of the chain — model capabilities, cloud infrastructure, enterprise channels or the GPU ecosystem — creating end-to-end coordination.
Such alliances are also squeezing the resources, channels and pricing available to independent platforms. Computing power is allocated first within the alliance, leaving outside platforms at a disadvantage in both cost and timing. Model capabilities are being embedded into cloud platforms, office software and enterprise services, which then become the default entry point for corporate customers. Bundled pricing for computing power, models and cloud services also makes it difficult for independent platforms to match the alliances’ cost advantages. As a result, independent platforms are gradually shifting away from basic model access toward higher-value capabilities such as governance, orchestration, security and compliance.
Money in the cloud
For China’s mainstream vendors, model revenue essentially lies in the monetization of infrastructure consumption. Open weights lower the barrier to adoption, bringing more developers and enterprises into the chain of usage, fine-tuning and deployment. As the user base expands, inference, training, storage and network consumption are concentrated in cloud platform resource pools, with most revenue captured by the infrastructure layer. Alibaba Cloud Model Studio, or Bailian, is a typical example. It integrates models such as Qwen, GLM, MiniMax and DeepSeek, and charges based on input and output tokens. Its revenue covers model capabilities, inference computing power, data storage, network access and platform scheduling, rather than model capabilities alone.
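To make the billing mechanics concrete, the sketch below shows how a charge based on input and output tokens might be computed for one call. It is an illustration only: the prices, the token counts and the names TokenPrice and call_cost are assumptions made for this example, not Bailian’s actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical prices per million tokens (not any vendor's real rates)."""
    input_per_million: float
    output_per_million: float


def call_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call when input and output tokens are billed separately."""
    return (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000


# Assumed figures in an arbitrary currency unit, for illustration only.
price = TokenPrice(input_per_million=2.0, output_per_million=8.0)

# A short customer-service answer versus a long code-generation request.
short_qa = call_cost(price, input_tokens=500, output_tokens=200)
long_code = call_cost(price, input_tokens=20_000, output_tokens=4_000)

print(f"short Q&A call:       {short_qa:.4f}")
print(f"code-generation call: {long_code:.4f}")
```

Under these assumed numbers the short task bills a small fraction of what the long one does, which is consistent with the point that low-token efficiency tools generate little spending per call.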
Open-source large models can generate revenue. But under the current structure, it is difficult to sustain positive gross margins on model capabilities alone. The main paths to commercialization are instead APIs, private deployments, commercial licensing and fine-tuning, and each has limits. Pricing power for hosted inference APIs has been squeezed by low-cost supply and cloud vendor subsidies. Commercial licensing is not a mainstream revenue source for China’s open-source models. Private deployments offer relatively better margins, but that window is narrowing. Fine-tuning and training services have weaker margins, while customers’ in-house capabilities are improving. As a result, open weights function more as a tool for customer acquisition and ecosystem expansion, while monetization is moving outward to stable usage, dedicated deployments and industry-specific project delivery.
Future competition will shift toward a system-level contest of “model capability x infrastructure capacity.” Model capability will still determine the upper limit for tasks such as complex reasoning, coding and multimodal applications. It will also influence developers’ willingness to try a product and the premium it can command. But the open-source ecosystem will narrow capability gaps in general-purpose tasks. Once enterprises complete their initial vendor selection, renewals and capacity expansion will depend more on service stability, unit call cost, response latency, system integration, compliance and long-term supply assurance. Model capability is the entry ticket, but not necessarily a durable moat.
Infrastructure capacity, by contrast, creates barriers in inference cost, high concurrency, network scheduling, computing power supply, low latency and stability. These barriers require years of development and billions of dollars in capital investment, making it hard for new entrants to catch up quickly. China’s token price war has already spotlighted this trend. As unit prices continue to fall, the focus of competition will shift from model capability to inference costs and scale efficiency. Vendors with self-developed chips and large-scale computing reserves will be better positioned to build structural cost advantages.
LeadLeo Research Institute is an original content platform for research on banks and companies and an innovative digital research service provider with nearly 100 senior analysts. You can contact the platform at CS@leadleo.com
The views expressed in this commentary are those of the writer and do not necessarily reflect the views of Bamboo Works