We review vendors based on rigorous testing and research but also take into account your feedback and our affiliate commission with providers. Some providers are owned by our parent company.
vpnMentor was established in 2014 to review VPN services and cover privacy-related stories. Today, our team of hundreds of cybersecurity researchers, writers, and editors continues to help readers fight for their online freedom in partnership with Kape Technologies PLC, which also owns the following products: Holiday.com, ExpressVPN, CyberGhost, and Private Internet Access which may be ranked and reviewed on this website. The reviews published on vpnMentor are believed to be accurate as of the date of each article, and written according to our strict reviewing standards that prioritize professional and honest examination of the reviewer, taking into account the technical capabilities and qualities of the product together with its commercial value for users. The rankings and reviews we publish may also take into consideration the common ownership mentioned above, and affiliate commissions we earn for purchases through links on our website. We do not review all VPN providers and information is believed to be accurate as of the date of each article.

Leaked AI Dataset Reveals China’s Censorship Ambitions

By Husain Parvez, Cybersecurity Researcher. First published on March 30, 2025.

A leaked dataset has revealed how Chinese entities are training large language models (LLMs) to automate political censorship on a massive scale. The dataset, containing over 133,000 real-world content examples, includes posts about government corruption, rural poverty, Taiwan, and military affairs, topics the Chinese state typically considers sensitive.

According to TechCrunch, the system is designed to flag this content automatically, providing a glimpse into how artificial intelligence is being deployed to refine and scale digital repression. UC Berkeley researcher Xiao Qiang, who examined the dataset, argued that this is clear evidence that the Chinese government or its affiliates want to use LLMs to improve repression.

Unlike older systems that relied on keyword filters and human moderation, this LLM-based approach enables more efficient control over online discourse.

Security researcher NetAskari discovered the unsecured database on a Baidu server and found that it contained entries as recent as December 2024. Though its creators remain unidentified, the dataset is marked for “public opinion work,” a term widely associated with censorship operations led by the Cyberspace Administration of China.

While the TechCrunch report does not name a specific model, separate investigations suggest that DeepSeek AI, one of China’s most prominent open-source LLMs, is already exhibiting censorship behaviors consistent with the leaked system’s goals.

WIRED tested DeepSeek-R1 across platforms and found that the model censors topics like Taiwan and Tiananmen through both app-level filtering and pre-programmed bias. In one case, the model’s internal reasoning noted the need to “avoid mentioning events that could be sensitive,” while emphasizing China’s achievements under the Communist Party.

Adding to the long list of concerns, Feroot Security recently found that DeepSeek’s platform contains hidden code transmitting user data to servers controlled by China Mobile, a state-owned telecom company under US sanctions.

As AI tools become more embedded in everyday platforms, experts warn that state-aligned models could shape global information flows. Several countries, including the US, Italy, and Australia, are now evaluating bans or restrictions on Chinese AI systems. The rise of censorship-enabled LLMs, researchers say, marks a turning point in how digital authoritarianism is executed.

About the Author

Husain Parvez is a Cybersecurity Researcher and News Writer at vpnMentor, focusing on VPN reviews, detailed how-to guides, and hands-on tutorials. Husain is also a part of the vpnMentor Cybersecurity News bulletin and loves covering the latest events in cyberspace and data privacy.
