Voice and speech recognition in Fintech|Eliftech

Speech recognition in Fintech is the next evolutionary step after text recognition, as is the case with most voice and speech tech. According to a TeleSign study, most security professionals believe that two-factor authentication and behavioral biometrics will completely replace passwords. What are speech recognition, voice recognition, and biometrics, and how does this tech boost the Fintech landscape? This article answers these questions.

A study by the American information platform TeleSign showed that 69% of users consider passwords not secure enough, and 72% are convinced that other protection methods will replace them within nine years.

79% of security professionals are very concerned about frequent account hacks. Over the past year, the hacking-related costs of companies increased: 51% suffered financial losses because of hacks, and 42% lost customers. In addition, account hacks hurt the brand at 42% of companies and put more strain on employees at 45%. In 2022, 9 out of 10 companies believe that behavioral biometrics would be a huge advantage for their security system, and 54% intend to implement this protection method in the foreseeable future.

What is speech recognition in fintech?

Voice recognition systems are computing systems that can pick out the voice of a particular speaker from a common audio stream. This technology is related to speech recognition, which converts spoken words into digital text.

Both of these technologies are used in parallel: voice recognition identifies a particular user by voice, while speech recognition identifies the commands that user speaks. Voice recognition is used for biometric security purposes to recognize the voice of a particular person. This technology has become very popular in mobile banking. Its implementation requires a base of user speech samples, voice-based authentication, and voice commands that help clients complete transactions.

The global speech recognition segment is one of the fastest-growing in the voice industry. Most of the growth in the market comes from the Americas, followed by Europe, the Middle East and Africa (EMEA), and Asia Pacific (APAC). Healthcare, financial services, and the public sector contribute to the segment the most. However, other segments, such as telecommunications and transportation, are expected to see significant increases in growth over the next few years. As a result, the market is forecast to keep growing at a CAGR of 22.07 percent.
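
To make the compound annual growth rate (CAGR) concrete, here is a minimal Python sketch of how a 22.07 percent rate compounds. The starting value of 100 is a hypothetical index rather than a real market size, and the five-year horizon is an assumption, since the forecast period is not stated here.

```python
def project(value, cagr, years):
    """Compound a starting value at a constant annual growth rate."""
    return value * (1 + cagr) ** years

CAGR = 0.2207        # growth rate quoted in the article
BASE_INDEX = 100.0   # hypothetical starting value, not a real market size

for year in range(6):  # assumed five-year horizon
    print(f"year {year}: {project(BASE_INDEX, CAGR, year):.1f}")
```

At this rate the index passes 200 between year three and year four, so a segment growing this fast would roughly double in about three and a half years.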

Market growth drivers: Biometrics for security

The growth of the global voice recognition segment depends on many factors. One of the main factors is the increase in demand for voice biometrics services. With the rising complexity and frequency of security breaches, security continues to be a major requirement for businesses and government organizations.

Voice biometrics are unique to each individual, which makes them valuable for establishing a person's identity and explains the high demand for them. Some of the major drivers of the global speech recognition market are:

Increasing demand for voice biometrics services
Greater use of speaker identification for forensic purposes
Application of speech recognition for military purposes
High demand for voice recognition in healthcare

How does voice recognition work?

Let's start with the fact that our speech is a sequence of sounds. Sound, in turn, is a superposition of sound vibrations (waves) of different frequencies. As we know from physics, a wave is characterized by two attributes: amplitude and frequency. To save a sound signal on a digital medium, the signal is divided into many short intervals, and a single "averaged" value is recorded for each of them. This way, mechanical vibrations are converted into numbers suitable for processing on modern computers. The speech recognition task then comes down to matching a set of numerical values (the digital signal) against words from a dictionary.
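
The conversion of a wave into numbers can be sketched in a few lines of Python. This is a simplified illustration, not a production audio pipeline: it generates a single pure tone instead of real speech, takes the value at the start of each interval rather than an average, and uses an assumed 8 kHz sample rate and 16-bit depth. The function name `digitize` and all parameter values are hypothetical.

```python
import math

def digitize(frequency_hz, amplitude, sample_rate_hz, duration_s, bits=16):
    """Sample a pure tone and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16-bit audio
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                   # time of the n-th sample, in seconds
        value = amplitude * math.sin(2 * math.pi * frequency_hz * t)
        samples.append(round(value * max_level))  # quantization: real value -> integer
    return samples

# 10 ms of a 440 Hz tone at half of the maximum amplitude
tone = digitize(frequency_hz=440, amplitude=0.5, sample_rate_hz=8000, duration_s=0.01)
print(len(tone), tone[:5])
```

A higher sample rate means shorter intervals and a more faithful copy of the wave, at the cost of more numbers to store and process.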

One of the ways customers interact with companies is by email, and, as it turns out, it is not very popular. According to an American Express survey, 48% of people prefer to discuss the problem over the phone with a representative of the organization when a difficult situation arises.

At the same time, a commercial bank would not even consider sending an email in an emergency, for example, if a customer's credit card was stolen. Quick response and prompt communication can prevent a small annoyance from becoming a serious problem. This sets a new task for modern developers: building speech recognition systems capable of identifying words in a given context.

Connecting the human world to the digital one requires a careful step-by-step study of every facet of the process. This is where machine learning algorithms come into play. ML algorithms look for the most likely phonemes (components of sound) and possible sequences of words that can be extracted from frequency graphs. Then, depending on the configuration of the application, the system outputs a response in the required form (for example, text). In the case of a call center, this text response (or its binary equivalent) makes it possible to redirect the call to the right department instantly. A speech recognition system is complex, and building one takes a good deal of creativity. One of the essential components of its development is isolated word recognition.
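
The two steps described above, choosing the most likely word sequence and then routing the call, can be sketched as follows. This is a toy illustration under stated assumptions: the candidate transcripts, their probabilities, the department names, and the keyword lists are all made up, and a real recognizer would produce its hypotheses from audio rather than from a hard-coded list.

```python
import math

# Hypothetical recognizer output for one utterance: each candidate transcript
# has an acoustic probability (how well it fits the sound) and a language
# probability (how plausible the word sequence is). All numbers are made up.
CANDIDATES = [
    ("my card was stolen", 0.40, 0.30),
    ("my cart was stolen", 0.45, 0.05),
    ("my car was stolen", 0.15, 0.10),
]

# Hypothetical department keywords; a real call center would define its own.
ROUTES = {
    "cards": {"card", "stolen", "blocked", "pin"},
    "loans": {"loan", "mortgage", "credit"},
    "accounts": {"balance", "transfer", "statement"},
}

def most_likely(candidates):
    """Pick the transcript with the highest combined log-probability."""
    best = max(candidates, key=lambda c: math.log(c[1]) + math.log(c[2]))
    return best[0]

def route_call(transcript, default="operator"):
    """Route by counting keyword hits; fall back to a human operator."""
    words = set(transcript.lower().replace(",", " ").replace(".", " ").split())
    scores = {dept: len(words & keywords) for dept, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

text = most_likely(CANDIDATES)
print(text, "->", route_call(text))
```

Note that "my cart was stolen" fits the sound slightly better in this example, but the language probability pulls the decision toward "my card was stolen", which is how context helps the system identify the right words.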

How do companies build speech recognition technology?

Behavioral biometrics (also called passive biometrics) identifies a person by dynamic characteristics, such as handwriting and signature, voice and speech rhythm, electronic device usage, and gait.

Behavioral biometrics and the world's best solutions

Way back in 2014, Israeli company BioCatch invested $10 million to expand a biometric platform that collects and analyzes hundreds of behavioral signals that are then used by banks to detect suspicious online behavior. Today, the company offers customers a patented technology based on artificial intelligence. It allows real-time identification and verification of the client (in particular, by analyzing how users hold their smartphones and how they control them) to confirm or cancel the transaction when signs of fraud are detected. The program also uses a built-in privacy algorithm to protect user data.

The BioCatch product is currently used by the Brazilian banking company Itaú Unibanco, the large British commercial bank NatWest and the American financial company American Express. One of BioCatch's largest clients, the Royal Bank of Scotland, currently uses the company's service to protect more than 18 million accounts of legal entities and individuals. When a user logs into their RBS account, the program records more than 2,000 of their interactive gestures - from the angle at which they hold the mobile device to which fingers they use to control the application. For example, if a client uses the desktop version of the application, the program reads the speed of typing on the keyboard and pressing the mouse buttons.

In 2018, the program registered unusual activity coming from the account of a wealthy client. Once authenticated, the visitor used the mouse scroll wheel, something the client had never done before. The visitor then typed numbers using the row at the top of the keyboard instead of the numeric keypad on the side, which the client normally used. As a result, RBS blocked access to funds in the client's account, which, as it turned out, had been hacked. According to the director of innovation at RBS, thanks to the development from BioCatch, the bank managed to prevent a fraudster from stealing a seven-figure sum from the client's account.

As noted by The New York Times, despite the successful prevention of fraud by the Royal Bank of Scotland, this case is perceived ambiguously. In particular, a user's behavior is not always constant, since the same person can behave differently in different situations due to fatigue, intoxication, poor health, or haste. A person can also enter text differently, depending on whether they are at the computer screen in the office or lying on a couch at home.
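
The kind of check described in this case can be illustrated with a simple statistical sketch. It is not BioCatch's actual method, which is proprietary and far richer. The feature names, the history values, the z-score threshold, and the `min_std` floor below are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

# Hypothetical history of one client's past sessions (illustrative values only).
history = {
    "typing_speed_cps": [5.1, 4.8, 5.3, 5.0, 4.9],     # characters per second
    "scroll_wheel_events": [0, 0, 0, 0, 0],            # client never scrolls
    "numpad_share": [0.95, 1.0, 0.9, 1.0, 0.97],       # share of digits typed on the numpad
}

def anomaly_flags(history, session, z_threshold=3.0, min_std=0.05):
    """Return the features whose value is far from the client's usual range."""
    flags = []
    for feature, past in history.items():
        mu = mean(past)
        sigma = max(stdev(past), min_std)   # crude floor avoids division by zero
        z = abs(session[feature] - mu) / sigma
        if z > z_threshold:
            flags.append(feature)
    return flags

# A session resembling the case above: scroll wheel used, digits typed on the top row.
suspicious = {"typing_speed_cps": 5.0, "scroll_wheel_events": 12, "numpad_share": 0.0}
print(anomaly_flags(history, suspicious))
```

Because the same person can behave differently when tired or in a hurry, a real system would tune the threshold carefully and combine many signals before blocking a transaction.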

UnifyID has presented passive biometrics in the form of a mobile application. In 2016, the American company UnifyID introduced a new system that analyzes the behavior and habits of a user, their biometric characteristics, typing rhythm, location, and movement, as well as how they interact with different devices. The program uses machine learning to analyze the received information and identify the user.

The new system works as follows: a user installs a browser extension and a mobile application on a smartphone so that the program can study the user’s behavior and eventually recognize them by the characteristics discussed above. At the initial stages of "acquaintance," the program will call the user's phone and ask them to put their finger on the fingerprint scanner to confirm their identity. Consumer data collected by the system is stored on the local device in encrypted form. The UnifyID product is currently available for iOS and Android users.

The system development is built around three complementary elements:

Fraud detection and user behavior analysis
Continuous speech and voice-based authentication
Mobile multifactor authentication

Multifactor user authentication is critical in today's interconnected commercial space, and authenticating users attempting to access the network is key to reducing many types of fraud, including chargebacks.

Why Fintech companies adopt voice/speech-based technologies

Initially, the word "biometrics" was found only in medical theory. However, there has been a growing need for enhanced security among businesses and government agencies. The use of biometric technologies is one of the key factors in the global speech recognition market. Since each person's voice is different, voice recognition is used to authenticate a person. This uniqueness will ensure a high level of accuracy and safety.

The speech recognition segment accounts for 3.5% of the share of biometrics technologies in the global market, but this share is constantly growing. Also, the low cost of biometric devices increases the demand from small and medium-sized businesses.

Speaker identification for forensic purposes

Using speaker identification technology for forensic purposes is among the crucial driving forces in the global voice recognition market. There is a complex process of determining whether the voice of a person suspected of a crime matches the voice from the forensic samples. This technology allows law enforcement to identify criminals by the unique characteristics of their voice, thereby offering a relatively high level of accuracy. Forensic experts analyze the consistency of the suspect's voice with the samples until the perpetrator is found. Recently, this technology has been used to help solve some criminal cases.

Speech recognition for military purposes

Military departments in most countries have restricted access to their premises. The military use voice recognition systems to ensure the privacy and security of such premises. These systems help military establishments detect the presence of unauthorized intrusions into a protected area. The system contains a database of the voices of military personnel and government officials who have access to the protected area. These people are identified by the voice recognition system, thereby preventing the admission of people whose voices are not in the system's database.

In addition, the US Air Force uses voice commands to control the aircraft. Apart from that, military departments use speech recognition and the voice-to-text system to communicate with service people in other countries. For example, the US military is actively using speech recognition systems in their operations in Iraq and Afghanistan.

Voice recognition in healthcare

Biometric technologies such as vascular recognition, voice recognition, and retinal scanning are widely implemented in the healthcare industry. Voice recognition is expected to become one of the main modes of identification in medical settings. While addressing HIPAA standards, many healthcare companies in the United States use biometric technologies for secure and efficient patient registration, patient information accumulation, and patient medical records protection. Clinical trial institutions also use voice technology to identify individuals recruited for clinical trials. Thus, voice biometrics is one of the main modes of customer identification in the healthcare market in the Asia-Pacific region.

Voice recognition and biometric market requirements

The impact of issues and trends is assessed based on the intensity and duration of their effects on the current market. Impact magnitude is classified as follows:

Low - little or no effect on the market
Medium - medium level of market impact
Moderately high - significant market impact
High - very strong impact with a drastic effect on market growth

Although today’s voice and speech tech is a marvel, various challenges arise along with its growing popularity. The global voice recognition market faces serious obstacles to growth. Some of the major challenges facing the global voice recognition market are:

Inability to suppress external noise
High cost of voice recognition applications
Problems with recognition accuracy
Low security in speaker verification

The challenges of voice and speech tech in business

Inability to suppress external noise: Despite technological progress in the field of voice recognition, noise continues to be one of the main problems in the global voice recognition market. Voice recognition, voice biometrics, and speech recognition applications are all exposed to environmental noise, and any noise disturbance reduces recognition accuracy. Voice biometrics is particularly sensitive to noise compared to other types of biometrics. The inability to suppress ambient noise is a key factor preventing voice recognition systems from achieving high accuracy and taking a larger share of the global biometric technology market.
The high cost of voice and speech tech and seamless voice recognition applications: Another significant challenge hindering the development of speech recognition technologies is the large investment required for development and implementation. Large-scale deployment of voice recognition technology in an enterprise is time-consuming and expensive. In addition, budget savings lead to limited technology testing, so any failure can lead to large losses for the enterprise. Because they are cheaper, alternatives to voice recognition, such as swipe cards and keypads, are still actively used in many companies, especially small and medium-sized businesses. Thus, voice recognition applications require large investments, including the cost of system integration, additional equipment, and other expenses.
Issues with recognition accuracy: A common problem in the global voice recognition market is low recognition accuracy, even though voice recognition systems can handle various languages and verify the authenticity of a voice. Even a minor mistake in any part of the process can lead to an incorrect result. However, some manufacturers have begun developing systems with very low error levels, with less than 4% inaccurate results (for example, cases when biometric voice measurements misidentify and reject the voice of a person who has authorized access).
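
Error levels like the one mentioned above are commonly reported as two separate rates: the false rejection rate (an authorized person is rejected) and the false acceptance rate (an impostor is accepted). The sketch below shows how both could be computed from match scores at a given threshold. The scores and the threshold are made-up illustrative values, and the function name `error_rates` is hypothetical.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Compute (false rejection rate, false acceptance rate) at a threshold.

    A score at or above the threshold counts as "accepted".
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return frr, far

# Hypothetical match scores between 0 and 1 (higher means more similar).
genuine = [0.91, 0.88, 0.95, 0.67, 0.90, 0.93, 0.85, 0.97, 0.89, 0.92]
impostor = [0.20, 0.35, 0.72, 0.10, 0.41, 0.55, 0.30, 0.15, 0.48, 0.25]

frr, far = error_rates(genuine, impostor, threshold=0.70)
print(f"FRR = {frr:.0%}, FAR = {far:.0%}")
```

Raising the threshold lowers the false acceptance rate but rejects more legitimate users, so the two rates have to be balanced against each other for each use case.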

Trends & tendencies of the voice recognition market

The various trends emerging in the market are expected to offset the effect of its challenges. One such trend is the increasing demand for speech recognition on mobile devices. Mobile device manufacturers in the global voice recognition market are developing innovative applications.

[Figure: Europe speech and voice recognition market]

The development of innovative mobile applications is one of the future driving factors. In addition, the growing demand for voice technology in banking, such as speech-based and voice-based authentication, is another positive trend in the voice recognition market. Some of the major trends in the global voice recognition market are:

Increasing demand for speech recognition on mobile devices
Growing demand for speech-based and voice technology in banking
Integration of voice verification and speech recognition
Increase in mergers and acquisitions

[Figure: US voice and speech recognition software market size]

Integration of voice verification and speech recognition

Some vendors are working towards integrating voice verification and speech recognition technology. Instead of selling voice verification as a separate product, manufacturers offer products that combine the functionality of voice verification and speech recognition. Voice verification determines who is speaking, and speech recognition determines what the person is saying. Most manufacturers have started launching speech recognition applications that integrate the two technologies described above.

Speech recognition on mobile devices

The growing number of traffic regulations prohibiting the use of mobile devices while driving has increased the demand for speech recognition applications. The countries with strict restrictions are Australia, the Philippines, the USA, the UK, India, and Chile. In the United States, more than 13 states allow drivers to use the speakerphone while driving despite the "Regulation on the use of mobile devices." Consequently, consumers are increasingly choosing mobile devices equipped with speech recognition applications that let them operate the device without being distracted by it.

To meet the growing demand for speech recognition applications in mobile devices, manufacturers have increased the amount of research and development work to develop speech command options for mobile devices. As a result, many speech recognition applications have been included in mobile devices, such as music playlist management, address reading, caller name reading, SMS voice messages, and so on.

Speech/voice-based authentication for mobile banking

The need for stronger verification is driving the widespread integration of speech-based and voice-based authentication in mobile banking. Many customers already use voice recognition in banking, for instance, in North America and Western Europe. As a result, many financial institutions rely on speech-based and voice-based authentication to decide whether to accept or reject mobile transactions. In addition, it is cost-effective and at the same time provides a higher level of security. Institutions that offer phone-based services are partnering with providers of speech-based and voice-based authentication solutions and with biometrics companies, which gives them a significant competitive advantage.

How does biometrics drive the development of Fintech?

Voice, fingerprint, and retina scanning are already used in payments. However, the ability to use the body as an identity verification tool has been known for over 100 years. In 1892, the British anthropologist, sociologist, and psychologist Francis Galton created the first fingerprint classification system.

The dynamic urbanization that accompanied the Industrial Revolution stirred up interest in methods of identifying people, so Galton's discovery was adopted and adapted by various institutions. Currently, biometric tools include fingerprints, voice, facial features, and cornea. The principle behind biometric technology is to scan a person's physiological characteristics and compare them with the records stored in a database. When there is a match, the person is identified.
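
The scan-and-compare principle can be sketched as a nearest-template search. This is a minimal illustration, assuming each biometric sample has already been turned into a short numeric feature vector. The three-number vectors, the names in the database, and the 0.95 similarity threshold are hypothetical. Real systems use much longer vectors produced by specialized models.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical database of enrolled templates (one feature vector per person).
enrolled = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}

def identify(sample, database, threshold=0.95):
    """Return the best-matching enrolled person, or None if nobody is close enough."""
    best_id, best_score = None, -1.0
    for person_id, template in database.items():
        score = cosine_similarity(sample, template)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id if best_score >= threshold else None

print(identify([0.88, 0.12, 0.31], enrolled))   # close to alice's template
print(identify([0.5, 0.5, 0.5], enrolled))      # not close enough to anyone
```

The threshold decides how close a scan must be to a stored template before the system declares a match, which is where the trade-off between convenience and security is set.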

Fintech uses the technology not only for speech-based and voice-based authentication but also to speed up transactions. For example, you can scan your fingerprint to verify a payment instead of using a credit card or cash. To illustrate, biometrics is used in the fingerprint scanners on modern smartphones and tablets that allow access to Google Pay and Apple Pay. In this way, the technology serves banks and financial services for both authentication and payments.

Biometrics is breaking down the barriers

Customers tend to abandon the transaction when facing obstacles during checkout. Biometric technology mitigates these risks by providing merchants with the verification data they need while offering the convenience the customer seeks. Moreover, biometrics is developing in parallel with smartphones. With more and more smartphone owners worldwide, Fintech can rely on the evolution of the device as a vector of innovation in biometric payments.
The priority is convenience for the user: Since many biometric technologies are still in their infancy, the focus is on consumer convenience rather than security. As a result, Fintech enhances the usability of biometric technologies, making them much easier to adopt later.
Biometric technologies complement the traditional ones: Traditional technologies such as ATMs incorporate biometrics into their infrastructure, thereby increasing security and spreading these innovations to other financial industry segments.

Use cases of voice, speech, and biometrics in Fintech

Recently, the international payment system Mastercard launched a service to confirm Internet purchases using a selfie in 12 European countries. In addition, Mastercard has partnered with the fintech company FinGo, which integrates its tokenization service to store personal data securely. This will allow registered users to make payments by scanning a unique finger vein pattern.

Meanwhile, PayByFace uses advanced facial recognition technology to facilitate transactions. Users set up the service by registering a payment card, a selfie, and a unique PIN code. Thus, you can leave your wallet and gadgets at home and pay in the store with your face.

Visa is introducing biometric authentication in e-commerce in cooperation with Abu Dhabi Islamic Bank (ADIB). The system uses the biometric sensors built into a standard smartphone. Thus, ADIB's clients can use facial recognition or fingerprints to confirm their identity in the mobile application.

Another example of cooperation is between Samsung and Kookmin Bank, one of the most "heavyweight" South Korean banks. They implemented a new biometric system that lets users access their accounts with an iris scan.

Certainly, the development of biometric solutions for mobile payments has been fueled by the COVID pandemic. After all, cashless and contactless forms of payment have become necessary components of the campaign to contain the spread of the virus. So, Apple Pay introduced mobile POS payments that use fingerprints through the Apple Touch ID biometric system in the smartphone. The solution increases security during identity verification and simplifies online payments.

Final thoughts: The future of biometrics in Fintech

Experts say the demand for biometric data is only growing. The popularity of the tech is mostly due to the intense interest in more accurate and simple identification methods and the general improvement of the capabilities of artificial intelligence technologies. Compared to remembering a password or entering a PIN code, biometrics offers a better alternative. Juniper Research analysts estimate that biometric data will be used to authenticate $2 trillion worth of payments.

Unlike conventional authentication procedures, speech-based and voice-based biometrics make it more difficult for attackers to use illegally obtained consumer credentials. On the other hand, conventional credentials can simply be changed if they are compromised, which cannot be done with a voice or a face. Thus, it is prudent to pay attention to the development of cybersecurity measures that can adequately address these problems.

While many systems currently use a single type of biometric ID, the technology is expected to include multifactor authentication in the future, consisting of fingerprint, retina, and even heartbeat scans. This will provide adequate protection while maintaining comfort and speed of use.

Ready to power your security efforts with voice recognition and biometrics? Contact our experts for a consult.