AI can only ever be as effective as the data used to train it, and AI models need a lot of training – so they need a lot of data. Founded in Beijing in 2005, SpeechOcean is today one of the world’s leading providers of visual, audio, and text training data for artificial intelligence and machine learning. With the company recognised as Most Innovative AI Data Resource Provider 2022 in this issue of Corporate Vision magazine, we learn more about it.
The name SpeechOcean may be slightly misleading, as the company’s expertise extends well beyond speech data. It works with the world’s leading technology and commercial enterprises, as well as academic institutions, to power their AI applications, covering everything from scheme design to data collection, labelling, and (most importantly) evaluation and quality control. Chances are you’ve interacted with an AI model somewhere in the world that’s been trained using SpeechOcean’s data.
Founder and Chairwoman, He Lin, is one of the world’s leading researchers in machine learning and related fields. Having graduated in computer science and technology from Peking University, she worked for many years in speech recognition, speech synthesis, language understanding and testing at the Chinese Academy of Sciences’ Institute of Acoustics before founding SpeechOcean with the vision of becoming the world’s leading AI data provider, an achievement today recognised many times over, including with this award.
There is still a great deal of misunderstanding surrounding AI technology, though. Concerns abound, ranging from privacy and security to whether the dystopian futures depicted in so many science fiction movies, brought about by a disgruntled AI, might somehow be just around the corner. True sentience in AI is still a long way off, however. We are about as close to it as we are, as a species, to realising time travel, so we can all sleep soundly knowing we’re not about to be invaded by liquid metal robots from our own future any time soon – unless of course there’s a time paradox in that statement…
The reality is far more mundane than most realise. The fact is, the data used to train AI and machine learning systems is usually very broad and generic, and contains no personally identifying information. AI models need to be able to understand an extremely wide range of inputs – from different languages, accents, and dialects to environments and other visual data – and to distinguish unwanted and unintended inputs. It’s a complex academic and technical challenge, but one which SpeechOcean has been at the forefront of for nearly twenty years.
The data SpeechOcean provides covers a wide range of situations, scenarios, participants, and recording devices to teach AI models to recognise these inputs so that when presented with a fresh enquiry they’re able to respond ‘intelligently’.
And because people and social behaviours are constantly changing, data also needs to change constantly to keep up with them and with emerging technological trends. Smartphones were almost non-existent when SpeechOcean started. Now most of us can’t imagine life without them, and many of the AI applications the company helps train use these devices as their primary mode of user input, so they have become integral to its process and systems.
All data SpeechOcean collects also goes through careful labelling or annotation so that target algorithms are able to recognise and understand the patterns within, often including subtle contextual and emotional cues. It’s a case of the more data an AI system can be trained with, the ‘smarter’ it’s able to appear. Much of this annotation is now automated or semi-automated (some element of human supervision is still always needed), which has helped the company dramatically improve project delivery times and accuracy and, ultimately, reduce cost.
Looking to the future, autonomous driving is a key area of development for the company as more automakers worldwide embrace this technology. Again, chances are that SpeechOcean’s data has been used in training the autonomous vehicles you may have ridden in, or could ride in one day soon.
Helping small and medium enterprises leverage the potential of AI is another key area for the company, which wants to make sure AI is not only the domain of big tech. It has more than 1,000 complete, ready-to-use datasets for this purpose, and even offers the algorithms SMEs need to get started.
Like every successful enterprise though, SpeechOcean’s business is based on trust, and it takes great pride in ensuring the quality and accuracy of the data it provides to its clients. Evidence of this comes in the number of clients who keep returning to the company as their machine learning needs grow and they look to keep their AI models up to date.
As with every other successful business, people are the lifeblood of SpeechOcean. The company is blessed with a smart, dedicated team who live to innovate, solving problems that help drive its clients and their businesses forward, and this is a key part of what differentiates it.
While the pandemic certainly presented its challenges, it also helped strengthen awareness and uptake of AI and machine learning technologies as face-to-face interactions diminished everywhere. 2023 promises to be a big year for SpeechOcean because of this, and because it is welcoming new senior leadership and expanding its presence to serve more markets globally. Having recently listed on the Shanghai Stock Exchange, the company is making key investments in acquisitions, talent, systems, and new technologies, which leave it poised to break new ground in redefining what AI can be to its clients and to continue to be their global data partner.