In June 2022, when Simon Gong, one of China’s top celebrities, released a new music video, it received 15 million views on the country’s social media website Weibo. But the event was also notable for a different reason, one that only the most observant viewers would have noticed.
The vocalist in the video was not Gong but a synthetic duplicate created by Baidu: an artificial intelligence (AI) “digital human.” The lyrics and melody were also created by AI, making this recording China’s first music video with AI-generated material.
Digital people, as defined by Deloitte, are AI-powered virtual individuals capable of producing an extensive array of human body language. In an effort to capture a rising market, firms that provide 24-hour services, as well as the media and entertainment industries, have increasingly used this emerging technology in recent years.
And as digital people become more prevalent in other industries such as retail, healthcare, and finance, Emergen Research predicts that the worldwide market for digital humans will increase from $10 billion in 2020 to about $530 billion in 2030.
Since the notion of “virtual idols” has been around for years, the introduction of Baidu’s digital superstar may not seem like a big deal. Since 2016, the US virtual influencer Lil Miquela has appeared in online ads and television commercials alongside actual human celebrities, earning over three million Instagram followers.
However, there is something unique about the virtual Chinese star: it is a digital person with unprecedented abilities to listen, talk, and engage with actual people. And the virtual Gong’s online duties extend beyond singing.
In the most recent upgrade of the Baidu App, the top search-plus-feed app in China, Gong appears on users’ smartphones, assisting with searches and enquiries in the real actor’s own voice. Since the debut of this dynamic search experience in 2021, voice searches on the Baidu App have increased by 18.2 percent.
In 2019, in conjunction with Shanghai Pudong Development (SPD) Bank, Baidu AI Cloud initiated the creation of a digital employee.
The two companies concentrated their efforts on developing a digital financial adviser that could give the same level of assistance as a human bank representative when actual staff were unavailable.
SPD Bank reports that more than 460,000 clients depend on digital people each month for banking services and portfolio management. “Access to digital people outside of normal business hours enables SPD Bank to provide client care 24 hours a day, seven days a week at minimal cost and with great efficiency,” a bank official explains.
A Baidu-created virtual anchor provided deaf viewers with live commentary in sign language at the 2022 Beijing Winter Games.
In addition to having a human-like appearance, the avatar was equipped with voice recognition and sign-language interpretation capabilities to enable speedy and precise input and output.
According to the World Health Organization, around 430 million people worldwide suffer from “disabling” hearing loss, and this technology has the potential to expand their access to a broad variety of material.
A new generation entirely based on an AI platform
Digital people are poised to play a larger part in our everyday lives, from entertainment to governmental services. However, beneath their smooth and seamless look lies a complex network of new and developing technologies that are pushing the limits of AI innovation.
AI image generators have been available for quite some time. For example, a tool called “this person does not exist” uses an algorithm to generate a face so believable that you would think the person existed in real life. It therefore comes as no surprise that this technology is spreading rapidly.
The digital celebrities and virtual sign-language anchors of Baidu AI Cloud were built using XiLing, a digital platform released in 2021.
At the Baidu World 2022 event held on June 21, the company introduced a new capability on XiLing that enables the construction of synthetic people who can sing, dance, and reply to comments in real time without ever taking a break.
XiLing is unique in its capacity to support the full process of building a digital human, from designing a realistic persona to equipping it with conversational and content-generation abilities. Speed is one of its most noticeable characteristics. In one to two weeks, the platform can build a 3D avatar based on a real person, while a 2D avatar may be created in a few minutes.
Moreover, utilising intelligent dialogue capabilities, developers may rapidly modify the conversational capacity of a synthetic person, allowing it to adapt and learn over time.
This feature is enabled by Baidu’s PLATO, a 100-billion-parameter dialogue model that allows artificial people to engage in open-domain discussions, i.e., to converse on any subject and deliver meaningful replies.
Voice recognition and lip-syncing that are more than 98.5% accurate enable the digital person to have interactions that are more natural and human-like. Li believes that the use of powerful AI technologies will continue to reduce the cost of creating digital people and dramatically enhance their interactions with actual humans.
The next generation of digital people will have unique sets of skills and abilities, just as actual people do.
Thanks to the recent advancements achieved by large AI models such as Baidu’s ERNIE, which can generate text and realistic images on request, this may entail giving digital people the capacity to be creative themselves.
Digital humans created to act as brand spokespeople, for instance, could autonomously create and publish social media posts, design posters, and perform in films.

Parul Mathur has been writing since 2009. That’s when she discovered her love for SEO and how it works. She developed an interest in learning HTML and CSS a couple of years later, and React in 2020. When she’s not writing, she’s either reading, walking her dog, messing up her garden, or doodling.