Building Real-Time Engagement Applications with Agora and Symbl.ai

Developers building the bleeding edge of voice, video, chat, or broadcast experiences with Agora are now in a position to take its Software-Defined Real-Time Network (SD-RTN) to the next level of real-time engagement (RTE) with Symbl.ai's Conversation Intelligence platform.

Developers building on Agora can extend their RTE applications to truly understand natural human conversations across video, audio and messaging channels with Symbl.ai. Through comprehensive APIs, developers can easily integrate with Symbl.ai and access the full suite of conversation intelligence capabilities:

  • Intelligent speech recognition
  • Contextual insights such as sentiment and custom intents
  • Auto-generated and classified topics, questions, follow-ups, entities, and action items
  • Advanced conversation analytics

Combining Symbl.ai and Agora enables developers to easily augment and enhance RTE applications by adding live captioning, real-time or post-call coaching, compliance, content moderation, intelligence-driven search and more.

Symbl.ai’s proprietary AI/ML technology combines the power of contextual understanding of human language and deep learning models trained with diverse conversation data to truly understand natural conversations, beyond keywords or pre-set dictionaries. Symbl.ai is domain agnostic and requires no upfront training, eliminating the need to build, maintain and train ML models – empowering developers to easily and securely integrate AI and ML with their RTE apps.

The next section will walk you through the steps to deploy a multi-party video-conferencing application that demonstrates Symbl’s Real-time APIs in combination with the Agora SDK.


Features

  • Live Closed Captioning
  • Real-time Transcription
  • Real-time Insights: Questions, Action-Items and Follow-ups
  • Real-time Topics with sentiments
  • Video conferencing with real-time video and audio
  • Enable/Disable camera
  • Mute/unmute mic
  • Screen sharing

Browser Support

This application is supported only on Google Chrome and Firefox.

Credentials

Get your Symbl credentials (App Id and App Secret) from the Symbl Platform Console.
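
If you want to sanity-check these credentials before wiring up the app, you can exchange them for an access token. Here's a minimal Node.js sketch (assuming Node 18+ with built-in fetch):

// Exchange your Symbl App Id and App Secret for an access token.
async function getAccessToken(appId, appSecret) {
  const res = await fetch("https://api.symbl.ai/oauth2/token:generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: "application", appId, appSecret })
  });
  const { accessToken, expiresIn } = await res.json();
  console.log(`Token valid for ${expiresIn} seconds`);
  return accessToken;
}

getAccessToken("<Your App Id>", "<Your App Secret>").catch(console.error);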

Get your Agora credentials (App Id) from the Agora Platform Console. See the Agora documentation for more information on how to do that.

Setup the Database

  • Download and install PostgreSQL, following the installation guide for your platform.
  • Create a database with a name of your choice (see the PostgreSQL documentation for details).
  • Note the username, password, and database name that you created.

Setup the Backend

1. Clone the repo.

2. Navigate to the Symbl-Powered-Agora-Backend-master directory and open the config.json file.

3. Add your Symbl App Id and App Secret values in the respective fields below:

"SYMBL_APPID": ""  "SYMBL_SECRET": ""

4. Open the file models/db.go and replace the following line (19) within the CreateDB function:

db, err := gorm.Open("postgres", os.Getenv("PG_DB_DETAILS"))

with your PostgreSQL database user, password, host and database name as described below:

db, err := gorm.Open("postgres","postgres://<user>:<password>@<host>/<db_name>?sslmode=disable")

This sample application uses GORM for connecting to the PostgreSQL database; you can learn more about it in the GORM documentation.

Run the Backend server

1. Navigate to the Symbl-Powered-Agora-Backend-master directory and run the following command:

go run server.go

Your backend server should be running on port 8080 and you should see a log message similar to the following:

{"level":"debug","time":"9999-99-99T99:99:99-07:00","message":"Backend server running on port: 8080"}

You can also navigate to http://localhost:8080 to make sure the server is up and running without any issues; you should see a sample web page.

Setup the Frontend

1. Open the file config.json under the folder Symbl-Powered-Agora-Master and provide your Agora project name, display name and App Id in the respective fields below:

"projectName": ""  "displayName": ""  "AppID": ""

2. Add the Backend URL in the respective field below:

"backEndURL": "http://localhost:8080"

Run the Frontend

Navigate to the Symbl-Powered-Agora-master directory and run the following command:

npm install

The command above will install all the necessary frontend dependencies.

Run the following command to start the frontend application:

npm run webM

Your frontend server should be running on port 3000 (http://localhost:3000). You should see the web application ready to be used.

Testing the Application

With your backend and frontend servers up and running, navigate to http://localhost:3000, click on the Create a meeting button, enter a room name and click the Create a meeting button again.

When the meeting URL is created, click on the Enter Meeting (as host) button to enter the meeting.

Select your camera, microphone and type your display name before clicking on the Join Room button.

Conclusion

This application allows you to join an Agora video conference meeting with Symbl Transcripts and Insights, Topics and Sentiments enabled and displayed on the screen in real-time.

Community

Symbl.ai invites developers to reach out to us via email at developer@symbl.ai, join our Slack channels, participate in our hackathons, fork our Postman public workspace, or git clone our repos at Symbl.ai's GitHub.

Make Long-form Voice and Video Content Distribution Easy with Conversation Intelligence

Conversation Intelligence can automatically clip and index your long-form voice and video content (like podcasts, webinars, and conferences), facilitating easy content distribution. You can use conversation intelligence APIs to achieve this automatically – indexing by topic, parent-child hierarchy, or Q&As – for an improved user experience.

The problem with long-form voice and video content

We have all experienced the rise of digital content, with podcasts, webinars, and conferences held remotely. If users can easily access the parts of long-form content that interest them the most without having to spend hours navigating recordings, and creators can distribute their content in a focused and tailored way to their audience, everyone benefits.

Done well, identifying which content suits a certain audience is smart and useful: that audience is more likely to be engaged and to trust that your message, product, or brand understands what they want and need. This is similar to what Netflix does when it shows different TV shows to different geographical audiences across the world.

How to navigate long-form content with speech recognition

Long-form content is hard to navigate because it contains free-flowing, continuous information.

Most businesses today manually clip long-form content into smaller, consumable pieces for distribution and consumption. To do this, someone has to review the content to see what is being discussed at each point and then clip or add tags to the video or audio (a tedious process). When you've clipped the file, you can create an index from which you can navigate, pull out topics or information, or skip to the section you're most interested in.

Another way to navigate long-form voice or video content is to get a transcription, and then you can use that to search for keywords and topics.

The problem is, these methods are time-consuming and there are not many solutions that can automatically and contextually understand the content to identify the different topics it contains.

…Enter conversation intelligence.

Conversation intelligence helps you level up the value of long-form content

Symbl.ai offers a comprehensive suite of conversation intelligence APIs that can analyze human conversations and speech without having to use upfront training data, wake words, or custom classifiers.

Summaries

One of Symbl.ai's features that's especially useful for long-form voice and video content is the Summary API. With this API you can create summaries of your long-form content in real time or after it has ended. The summaries it produces are succinct and context-based.
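
As a rough sketch of requesting one for an already-processed conversation (assuming the Conversation API's summary endpoint, a valid access token, and Node 18+ with built-in fetch):

// Fetch the generated summary for a processed conversation.
async function getSummary(conversationId, token) {
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/summary`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const data = await res.json();
  // The response carries one or more summary objects containing the summary text.
  console.log(data.summary);
  return data;
}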

Indexing

With conversation intelligence you can automatically index voice or video content. You can index by topic, using a parent-child hierarchy, or by Q&A.

Three ways to index long-form content with Symbl.ai’s conversation intelligence API:

1. Topics

Symbl.ai can identify topics, which are the main subjects talked about, and add them to the long-form voice or video content. Every time a context switch happens in the content, Symbl.ai’s topic algorithm detects the change and extracts the most important topics out of it.

Every topic has one or many Symbl.ai Message IDs, which are unique identifiers of the corresponding messages or sentences. Topic A might be the main theme of your content and might have six different Message IDs (which refer to timestamps in the conversation). Being able to automatically identify topics in context makes it easier to search your content because you don't need to think about specific keywords.

Once you have the topic(s) of the content and their Message IDs, then you can automatically index your long-form voice or video content and give users a flexible and easy way to navigate or search hours of recordings.
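
A minimal sketch of pulling topics and their Message IDs from the Conversation API (assuming a processed conversation, a valid access token, and Node 18+ with built-in fetch):

// Fetch topics for a processed conversation and list their Message IDs.
async function getTopics(conversationId, token) {
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/topics`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const { topics } = await res.json();
  // Each topic carries the messageIds of the sentences it was derived from,
  // which you can map back to timestamps to build your index.
  topics.forEach((t) => console.log(t.text, t.messageIds));
  return topics;
}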

2. Parent-Child Hierarchy

Any conversation or presentation can have multiple related topics that can be organized into a hierarchy for better insights and consumption. Symbl.ai's Topic Hierarchy algorithm finds patterns in the conversation and creates parent (global) topics; each parent topic can have multiple child (local) topics within it.

Example of parent-child topic hierarchy in a conversation.

  • Parent Topic: The highest-level organization of content ideas. These are the key points that the speakers expanded on and discussed at length.
  • Child Topic: The subtopics that aggregate under, or originate from, a parent topic. Child topics are linked to their parent because they form the time chunks that make up the parent topic.

Example of a parent-child topic hierarchy with Symbl.ai Topics API. 

Once the long-form content is split into parent-child topics, you can then use these to build a timeline and split the content very easily using the corresponding Message IDs.
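
Here's a sketch of fetching the hierarchy, assuming the Topics API's parentRefs query parameter:

// Fetch topics with parent references to reconstruct the hierarchy.
async function getTopicHierarchy(conversationId, token) {
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/topics?parentRefs=true`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const { topics } = await res.json();
  // Topics with an empty parentRefs array are parent (global) topics;
  // child (local) topics point at their parent through parentRefs.
  const parents = topics.filter((t) => !t.parentRefs || t.parentRefs.length === 0);
  const children = topics.filter((t) => t.parentRefs && t.parentRefs.length > 0);
  return { parents, children };
}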

3. Clips of Q&As and Sentiments

You can create clips of the most important questions and their answers and then display them however you want. For example, you can take a podcast and clip the Q&As throughout and compile them into a highlight reel. You could do this with all the Q&As or surface certain ones by topic.

The same process can be followed to create clips corresponding to sentiments, so for example you could clip all the positive, neutral, or negative sentiments within the voice/video content and use them to create another highlight reel.
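
A sketch of gathering the raw material for such clips (assuming the questions endpoint and a sentiment=true parameter on the messages endpoint):

// Collect questions and sentence-level sentiments for clip generation.
async function getClipSources(conversationId, token) {
  const headers = { Authorization: `Bearer ${token}` };
  const base = `https://api.symbl.ai/v1/conversations/${conversationId}`;

  const { questions } = await (await fetch(`${base}/questions`, { headers })).json();
  const { messages } = await (await fetch(`${base}/messages?sentiment=true`, { headers })).json();

  // Keep only the messages tagged as positive, e.g. for a highlight reel.
  const positives = messages.filter(
    (m) => m.sentiment && m.sentiment.suggested === "positive"
  );
  return { questions, positives };
}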

Upgrade the user experience with Symbl.ai

In this busy world, your ability to automatically create a summary or comprehensive view of one or many videos or recordings saves everyone time and helps people consume the content easily. And once you have indexed your long-form content you maximize your chances of increasing user engagement and retention with greater accessibility for end users.

You can also spin up a customized video summary experience that puts your own brand on recordings and gets you to market fast.

Example of a customized video summary experience with Symbl.ai

Learn more about Symbl.ai's suite of APIs to automatically index your long-form audio content.

Augment Your Business Intelligence Tools with Conversation Intelligence

Business intelligence systems can be augmented with conversation intelligence to help you unlock the data found within human to human conversations in a way that makes it easy to see the patterns and connections.

Business intelligence (BI) systems have been helping companies make highly informed decisions that are backed by data for decades. These systems ingest the data that you have in your business (either in real-time or asynchronously) and present it in dashboards that make it easy to see the patterns and trends that exist within your company.

Depending on the data that gets fed into a BI system, you can use these platforms to determine everything from the most common time of day that people purchase your products to the kinds of emails that drive the most sales for your business.

When you think beyond the more traditional types of data that you can use in a BI system, the use cases grow. At Symbl.ai, we're big fans of using data found within human to human (H2H) conversations. This data can unlock a whole range of insights, like surfacing trending topics with sentiments and flagging critical business insights around a specific promotion or offer in real time.

Combining conversation analytics with existing and new BI data

Adding conversation analytics to your BI platform opens up a whole new level of understanding of how your business is operating. If you're accurately capturing audio recordings, video files, emails, or even other forms of conversations and processing them using a conversation intelligence system like ours, you can uncover insights that add a new layer to that understanding.

To do this, you need to be able to take the data you've captured and push it into your BI systems.
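
As an illustrative sketch of that hand-off, you might pull conversation analytics from Symbl.ai and forward them to your BI platform's ingestion API (the BI endpoint below is a hypothetical placeholder; substitute whatever your warehouse or dashboard exposes):

// Forward Symbl conversation analytics to a BI ingestion endpoint.
const BI_INGEST_URL = "https://bi.example.com/ingest"; // hypothetical placeholder

async function pushToBI(conversationId, token) {
  // Talk time, silence, pace, and similar metrics from the Conversation API.
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/analytics`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const analytics = await res.json();

  await fetch(BI_INGEST_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ conversationId, analytics })
  });
}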

What happens when you integrate BI and conversation intelligence

Businesses that have a BI system in place and are collecting customer conversation data at scale have a lot to gain by integrating the two. Doing so allows you to create conversation intelligence systems that offer all the benefits BI provides (better access to data and information) combined with the intelligence needed to understand trends and gain actionable insights at scale.

The contextual understanding that the conversation intelligence system brings to the table allows you to access more granular aspects of the conversation data that you may need in real-time or for a historic view of large volumes of conversation.

How you can use this data

Once you’ve got your data in your BI system, you gain access to powerful insights that are available in real time. This data can be used in a variety of use cases across different industries:

  • Distance learning or online education – If you're using a software application to run remote courses, training, or school classes in parallel for a large number of students or attendees, understanding how effective the content is can be tough. It's often not until test results come back that you realize what's been working and not working. By feeding conversation data into a BI system, you can see which teachers or instructors are delivering the best results at scale, who is answering students' questions in the most effective way, and even which teachers are getting more questions. This information can be used to guide who teaches what subjects, what information is currently missing from the curriculum, and what needs to be covered most during exam reviews. Furthermore, such systems can provide real-time alerts when profanity is detected and help ensure compliance across the content.
  • Sales calls – Conversation intelligence helps you understand how the various members of your sales team are doing at scale. BI platforms, when augmented with conversation analytics, can provide a detailed look at the kinds of questions that are being asked, what the most effective answers are, and who's using the best techniques when it comes to closing deals. Having this information displayed visually makes it easier to see patterns as they emerge and allows your sales reps to act on them in the moment, while the sales conversation is happening.
  • Telehealth – Conversation intelligence can help you better identify which questions and symptoms from patients are likely to lead to certain diagnoses. A BI platform can provide an easy-to-follow large scale representation of the ongoing conversations on telehealth platforms with identification of key symptoms and common diagnoses based on the questions that are being asked. Along with better diagnostics, it also becomes possible to notice localized (or not-so-localized) outbreaks faster based on the data coming from the conversations.

Ready to integrate voice data into a BI system?

At Symbl.ai, we're big fans of making sure that you get the most out of the voice or video conversation data you're capturing from calls, meetings, or sessions. Symbl.ai's APIs make it possible to easily capture and analyze conversations at scale, both asynchronously and in real time.

Symbl.ai's Streaming API gives you the power to add a new form of data and insights from voice and video to your existing BI coverage. The Symbl.ai platform APIs make it very easy to integrate this intelligence into your current systems by removing the need to build machine learning models or custom data pipelines in your business. Symbl.ai makes it possible to get real-time, actionable insights from your BI system, where traditional data sources can be combined with all the sources of conversation and visualized under common themes.

If you’d like to learn more about how conversation intelligence can help, our documentation is a great place to start.

Additional Reading

Symbl.ai documentation – Conversation Analytics 

Building a Conversation Intelligence System

The 5 Dimensions of Conversation Intelligence

8 Tips and Tricks to Improve Video and Audio Transcription Accuracy with Symbl.ai

The main variables for accurate audio transcriptions are frequency and quality of audio. You can improve the accuracy even more if you: pre-feed custom vocabulary, set up different streams of audio, identify dialects and accents, keep your audio clean, beware of noise cancellation, avoid using automatic gain control (AGC), avoid audio clipping, and position the user close to the microphone.

Why is video and audio transcription accuracy important?

When you provide video and audio transcriptions for your clients, you'll want them to be as close to the spoken word as possible to ensure they're correct, helpful, and professional.

The main variables that can affect audio transcription accuracy are the frequency and quality of your audio. The type of video or audio file that you’re transcribing and how it has been created will affect how much you can improve the resulting transcription.

The three most commonly transcribed types of audio or video

You'll see next that the audio sampling rate can range from 8 to 48 kHz, depending on the type of stream you use to produce transcriptions. The higher the audio quality and sample rate, the more accurate your transcriptions will be.

1. Recorded files

You can use recorded files, either audio or video, and create transcriptions after the event. You can ingest the files with Symbl.ai's Async API and then use the platform's Conversation API to produce the transcription.
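
For a recording that's reachable at a public URL, a minimal sketch of the first step looks like this (assuming Node 18+ with built-in fetch and a valid access token):

// Submit a recorded file by URL for asynchronous processing.
async function processRecording(url, token) {
  const res = await fetch("https://api.symbl.ai/v1/process/audio/url", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ url })
  });
  // The response includes a conversationId for the Conversation API
  // and a jobId you can poll until processing finishes.
  const { conversationId, jobId } = await res.json();
  return { conversationId, jobId };
}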

2. Real time WebSocket based integration

A WebSocket is a protocol for establishing two-way communication streams over the Internet. If you use an API for WebSockets, you can create transcriptions in real-time conversations. WebSockets facilitate communications between clients or servers in real time without the connection suffering from sluggish, high-latency, and bandwidth-intensive HTTP API calls.

Symbl.ai supports most common audio formats with a sample rate range of 8 to 48 kHz. Symbl.ai recommends you use the audio format OPUS because it provides the most flexibility in terms of audio transportation. OPUS also has built-in loss-recovery mechanisms, like the Forward Error Correction (FEC) feature, which work especially well in low-bandwidth scenarios.
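
A rough sketch of opening a streaming connection with an OPUS configuration follows; the exact message shape can vary between API versions, so treat this as illustrative (assumes the ws package and a valid access token):

const WebSocket = require("ws"); // npm install ws

const token = "<your-access-token>";
const connectionId = "my-unique-meeting-id"; // any unique string for this session

const ws = new WebSocket(
  `wss://api.symbl.ai/v1/streaming/${connectionId}?access_token=${token}`
);

ws.on("open", () => {
  // Tell Symbl what audio to expect before streaming binary audio frames.
  ws.send(JSON.stringify({
    type: "start_request",
    insightTypes: ["question", "action_item"],
    config: {
      speechRecognition: {
        encoding: "OPUS",      // recommended format
        sampleRateHertz: 48000 // within the supported 8-48 kHz range
      }
    }
  }));
});

ws.on("message", (msg) => console.log(msg.toString()));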

3. Real time SIP and PSTN-based integration (also known as telephony)

Session Initiation Protocol (SIP) is the foundation of voice over internet protocol (VoIP). It enables businesses to make voice and video calls over internet-connected networks. If you use SIP, your chosen meeting assistant can connect to the stream and listen like another user. Using a SIP line provides a higher sample rate (Zoom’s sample rate is 16-48 kHz, which is very good) and consequently provides more accurate transcriptions.

The Public Switched Telephone Network (PSTN) is the traditional worldwide circuit-switched telephone network that carries your calls when you dial in from a landline phone. It includes privately-owned and government-owned infrastructure. The audio sample rate for PSTN is a maximum of 8 kHz.

Symbl.ai’s 8 tips and tricks to improve transcription accuracy

Often you won't have any control over the type of audio or video stream your client provides for transcription. The type of stream will be a major determinant of transcription accuracy. However, there are other ways you can optimize your speech recognition results.

1. Boost accuracy with custom vocabularies

If your subject matter is specialized or technical, you can help out the machine by pre-feeding it with some custom vocabulary that it might expect to hear. An example of custom vocabulary in a medical context would be specific medical terminology or abbreviations.
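
In Symbl.ai's Streaming API this is typically done through a custom vocabulary option in the start request. Here's a sketch, assuming a customVocabulary field (its exact placement may vary by API version; the terms are illustrative):

// Build a start_request that pre-feeds domain terms to the recognizer.
function buildStartRequest(vocabulary) {
  return {
    type: "start_request",
    config: {
      speechRecognition: { encoding: "LINEAR16", sampleRateHertz: 16000 }
    },
    // Terms the recognizer should expect, e.g. medical abbreviations.
    customVocabulary: vocabulary
  };
}

const request = buildStartRequest(["tachycardia", "echocardiogram", "NSAID"]);
console.log(JSON.stringify(request, null, 2));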

2. Set up different streams of audio

If there are a lot of people on the same channel, you'll be faced with two options: write your code in a way that creates separate audio/video streams for each speaker, or leave it as one stream.

If you create a separate audio stream for each speaker, you'll get better speech recognition accuracy and stream handling. There is a cost factor to consider if there are many people on the recording, because you'll need more infrastructure and resources to manage all the separate streams.

Let’s say you have three speakers and a large audience in an interactive webinar. You could attribute a single stream to each of the speakers so they can be individually identified and then create one additional stream for the whole audience. In this scenario, all questions would be displayed in a transcription as “audience”.

3. Identify dialects and accents

If you know where the speakers on the audio stream are from, then you can identify the accent or dialect that they are using (e.g. American vs. Scottish accents). By pre-teaching your model to identify and contextually understand different relevant accents and dialects, you can avoid simple errors and improve the accuracy of your transcription.
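One concrete lever for this is the languageCode parameter that the Async API already accepts; a sketch (supported codes vary, so check the documentation for your region):

// Submit a recording with a languageCode matching the speakers' dialect,
// e.g. en-GB for British English or en-AU for Australian English.
async function processWithDialect(url, token, languageCode = "en-GB") {
  const res = await fetch(
    `https://api.symbl.ai/v1/process/audio/url?languageCode=${languageCode}`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({ url })
    }
  );
  return res.json();
}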

4. Keep your audio clean

It's best to provide audio for transcription that is as clean as possible. Excessive background noise and echoes can reduce transcription accuracy. The balance between speech and noise is measured by the speech-to-noise ratio (SNR), which compares the level of recognizable speech to the level of unwanted noise in an audio stream.

A low SNR can negatively affect your system's performance by limiting operating range or reducing receiver sensitivity. When you understand how to calculate and manage it, you'll be able to create a robust, accurate system for real-life situations.

5. Beware of noise cancellation

If you are considering noise-canceling techniques, you should be aware that they may result in information loss and reduced accuracy. If you’re unsure whether the techniques you are considering will do this, it’s best to avoid noise cancellation.

6. Don’t use automatic gain control

A disadvantage of automatic gain control (AGC) is that when recording something with both quiet and loud periods, the AGC will tend to make the quiet passages louder and the loud passages quieter, compressing the dynamic range. The result can be reduced audio quality if the signal is not re-expanded on playback.

7. Position the speaker close to the microphone whenever possible

The proximity of the speaker to the microphone will affect the audio quality. You can disrupt your audio if your microphone is too close to your mouth or too far away. It’s usually best to position your face about two inches away from the microphone. Hold your hand in front of your face with your fingers pointed up and spread naturally – that’s about the right distance. Any closer and the microphone will pick up your mouth sounds. Any farther and the microphone will pick up room sounds. Try to maintain a constant distance from the microphone throughout the recording.

8. Avoid audio clipping

Audio clipping is a form of waveform distortion. When you push an amplifier beyond its maximum limit, it goes into overdrive. The overdriven signal causes the amplifier to attempt to produce an output voltage beyond its capability, which is when clipping occurs. If your audio is clipping, you are overloading your audio interface or recording device. In doing so, you have run out of headroom in your recording equipment. There are ways to avoid audio clipping, like using any attenuator technology built into your camera or recorder, and/or creating a safety channel.

How do you know how good your transcription is?

To analyze the quality of the transcription, you can measure the word error rate (WER) and/or the sentence error rate (SER). To have a basis of comparison, a human also needs to do the same transcription from the same audio or video stream.
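
For reference, the standard word error rate formula counts the word-level edits needed to turn the machine transcript into the human reference:

WER = (S + D + I) / N

where S is the number of substituted words, D the number of deleted words, I the number of inserted words, and N the total number of words in the reference transcript. SER is computed analogously as the fraction of sentences containing at least one error.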

There is no set norm of what a good percentage accuracy would be. Based on what is achievable with today’s technology, 80% or above can be considered a very good level of accuracy.

Symbl.ai provides real-time and asynchronous transcription capabilities that can help you achieve better accuracy in your transcriptions. Symbl.ai is a conversation intelligence platform with a secure and scalable infrastructure that provides programmable APIs and SDKs for developers and doesn’t require building or training a machine learning model.

If you use Symbl.ai for your audio transcriptions, you’ll usually be able to achieve up to 90% audio transcription accuracy. Of course, your accuracy always depends on the audio quality you have to work with, but by deploying the tips and tricks from Symbl.ai in this article you can better optimize your results.

Build AI-driven Webinar Experiences with Symbl.ai APIs for Conversation Intelligence

It takes a lot of effort to build and train a machine learning model that can contextually understand webinars and pull valuable insights, like topics, sentiments, and summaries. With Symbl.ai’s plug-and-play API, developers can optimize the webinar experience for those who build and use it.

What is an AI-driven webinar experience?

An AI-driven webinar experience often includes live captioning capabilities, but it can also offer so much more to creators and attendees. Adding conversation intelligence to a webinar means that everything can be contextually understood and analyzed in real time. The analysis can include identification of the topics covered, specific reactions (positive, negative, emojis, etc), the sentiment of what is being said at the topic or message level, what questions are being asked, and what follow-ups are required.

Webinar pain points

Webinars offer great potential benefits for both creators and attendees, but those benefits often go unrealized. To take the value of webinars to the next level, there are a number of pain points that you first need to address.

For the creator:

  • Creating structured, searchable, and interactive content that’s optimized for online learning takes a lot of time.
  • It’s difficult to make content consumable after the webinar or webcast to promote distribution for higher engagement.
  • Creating an unbiased webinar format can be tricky. You want to avoid bias towards a certain business use case to ensure a wider understanding from your audience’s perspective. So, it’s best to use an API platform that is designed for all business use cases.
  • Finding topics for growth, identifying sales opportunities, and surfacing Q&As.

For attendees:

  • Navigating to relevant parts of the content based on specific topics.
  • Needing a complete post-event summary experience, particularly one that’s relevant to their specific interests.

What AI for webinar features does Symbl.ai offer?

You can add value to the webinar experience by using Symbl.ai, a conversation intelligence API platform that provides real-time, contextual AI capabilities for webinars, including:

  • Real-time transcription
  • Transcription of recordings
  • Speaker Separation (speaker events, multi-channel audio, AI-powered speaker separation)
  • Sentiments (at sentence and topic level)
  • Automatic topic detection and topic hierarchy
  • Pre-built UI
  • Q&A detection

How to enhance webinar experiences with Symbl.ai’s conversation intelligence API

When you incorporate Symbl.ai’s conversation intelligence features into a webinar, you improve the quality and value of the experience for the creator and attendee.

Here’s what you can do with Symbl.ai’s conversation intelligence API:

  • Enable conversation-intelligence-powered search across webinars based on entities, topics, sentiments, and custom keywords.
  • Identify specific topics that the webinar covers. Then the conversation intelligence system can perform tasks like creating a topic hierarchy (using a parent-child hierarchy that has multiple levels to track the topic relationships), clip long-form content, or clip Q&A reels into different topics. This way subject areas can be searched for and surfaced as required allowing for easy content distribution.
  • Give access to AI-powered recordings and personalized content to find upsell opportunities and increase customer retention. A person accessing a post-webinar recording can select and receive the information that most interests them. This approach is efficient for the user, and the creator can identify and target opportunities reflecting the user's preferred topics.
  • Highlight the popular sections of your webinar. When you use conversation intelligence, reactions in the chat or with emojis can be contextually understood and identified, and then you can include these in a webinar highlight reel. Similarly, you can analyze which topics have received positive feedback and then optimize the topic in the future (e.g. with more emphasis or details).
  • Customize a pre-built webinar summary UI for easy content navigation. Using Symbl's Video Summary UI, you can provide a screen where users can select key elements like topics, transcripts, and insights. The interface will surface the timestamp where each occurred and begin playback from there.

Using Symbl.ai vs. doing it yourself

Building all these capabilities from scratch is a time-consuming task. With Symbl.ai's flexible API, you can skip hours of being hunched over your keyboard training machine learning models to understand context and recognize speech. Depending on the platform, you can use the Symbl.ai Adaptor and then unleash Symbl.ai on the recorded file after the webinar with the Async API, or have it work in real time with the Streaming API.
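
For the asynchronous path, an end-to-end sketch might look like the following (assuming the /v1/process/video/url and /v1/job endpoints, a valid access token, and Node 18+ with built-in fetch):

// Process a recorded webinar, wait for the job to finish, then fetch its topics.
async function analyzeWebinar(recordingUrl, token) {
  const headers = {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json"
  };

  // 1. Submit the recording for asynchronous processing.
  const submit = await fetch("https://api.symbl.ai/v1/process/video/url", {
    method: "POST",
    headers,
    body: JSON.stringify({ url: recordingUrl })
  });
  const { conversationId, jobId } = await submit.json();

  // 2. Poll the job until processing completes.
  let status = "in_progress";
  while (status !== "completed" && status !== "failed") {
    await new Promise((r) => setTimeout(r, 5000));
    const job = await fetch(`https://api.symbl.ai/v1/job/${jobId}`, { headers });
    ({ status } = await job.json());
  }

  // 3. Fetch the detected topics to index the webinar content.
  const topics = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/topics`,
    { headers }
  );
  console.log(await topics.json());
}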

Additional reading

Check out our sample GitHub apps to learn more about how you can use Symbl.ai's conversation intelligence APIs to short-circuit development time and release AI capabilities in your webinar platform.

Getting started with speech analytics in your application

TLDR: Voice data can be captured during business communications, like video calls, and run through speech analytics to unlock insights found in human to human conversations.

It’s no secret that businesses generate a lot of data. It comes from the software we use to work, from customers using our products, and from our cell phones, computers, and tablets.

For the most part, we do an excellent job of using that data to help fuel innovation, create better products, and better serve our customers. However, we’re often able to capture far more data than we actually use.

Voice data is a great example of data that many businesses aren't using to their advantage. Voice data is generated from the conversations we have with each other while doing business. This can be everything from meetings with colleagues to conversations with customers.

Capturing the valuable information that’s found in our conversations lets you create apps that can help businesses better serve their customers or provide better answers to patients in medical settings.

What kind of voice data can be captured (and how)?

In its simplest form, voice data is any kind of information that can be pulled from a conversation between two or more humans. When you don't tap into this data, you introduce the need to do more things manually, like having someone listen to calls after the fact to extract the key points you want.

Voice data contains insights like:

  • Follow up meetings mentioned during a conversation
  • Transcriptions of the conversation itself
  • Action items discussed in the meeting
  • Any next steps that may have come up
  • Sentiment analysis from customer service calls

This data can be captured in two ways: asynchronously or in real time. With asynchronous data, you get a recording of the conversation after it has happened and then process it with a speech analysis AI system. In real time, you can use speech analytics to analyze the conversation as it happens.

Not surprisingly, real-time analysis of voice data is more challenging because you have to consider factors like what protocol is being used to transmit the data (SIP, PSTN, WebSocket) and whether or not you can even access the data stream.

Real-time data collection is especially hard if you’re not using a system that lets you add voice APIs, which are the easiest way to get started when you’re working with voice data.

How Symbl.ai can help you collect (and analyze) voice data

The good news is that even with the challenges around capturing real-time data, getting started with speech analytics is fairly straightforward.

The best way to get started is to use a communication system that allows you to build out the custom functionality you need in tools like VoIP or video conferencing platforms using Symbl.ai’s APIs.

These APIs help you stop worrying about how to handle and capture voice data in your applications, and instead, let you focus more on the insights gained from speech analysis AI.

For example, Symbl.ai's Async APIs can help you process any recording (both audio and video) you have to reveal valuable insights. You can process files in several formats; let's start with an MP3 audio recording.

First, you need to create an account on the Symbl.ai platform to get your appId and appSecret. Once you have those, you can use them to generate the authorization token necessary to make a call to Symbl.ai's Async API.

Below are sample cURL requests for generating an authorization token and then sending an audio file to Symbl.ai for processing.

Getting your OAuth token:

curl --location --request POST 'https://api.symbl.ai/oauth2/token:generate' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "type": "application",
    "appId": "<Your APPID>",
    "appSecret": "<Your APPSecret>"
  }'

You will get a token in the response; use it in the next cURL request below, which sends a locally stored audio recording for processing:

curl --location --request POST 'https://api.symbl.ai/v1/process/audio' \
  --header 'Content-Type: audio/mpeg' \
  --header "Authorization: Bearer $AUTH_TOKEN" \
  --data-binary '@/file/location/audio.mp3'

You will get a conversationId as a response to the second request; save it. Using the conversationId, you can call our Conversation API to extract the valuable information found within the recorded conversation (like action items, transcripts, etc.). You get the output as JSON, which makes it possible to use it however you want.
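
For instance, here's a small sketch that pulls the full transcript for a conversationId (assuming Node 18+ with built-in fetch):

// Retrieve the transcript messages for a processed conversation.
async function getTranscript(conversationId, token) {
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/messages`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const { messages } = await res.json();
  // Each message carries the spoken text plus speaker and timing metadata.
  messages.forEach((m) => console.log(m.text));
  return messages;
}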

These APIs not only help you mine this data from your conversations, but also do it without having to use separate platforms for everything. Instead of having to use a separate automatic speech recognition platform or analytics vendor, you can manage everything from the communication tool you’re already using.

What's great about all of this is that it helps unlock the data found in conversations across the entire business and in a variety of industries. Call centers are one of the major use cases for conversation analytics. Conversation AI can help call center agents more easily navigate calls and provide better customer experiences with real-time insights, sentiment analysis, knowledge-base searches, automatic follow-ups, and action items.

Other use cases include:

  • Business meetings — Surface action items, meetings, and other useful information that arose during the conversation.
  • Sales calls — Not only can you pull the same kind of data you would from a business meeting, but you’re also able to analyze the effectiveness of your sales teams at closing deals.
  • Telehealth — AI analysis can help you train your system with contextual data to help make better diagnoses.
  • Distance education — E-learning platforms gain the ability to provide real-time transcriptions to students who are watching live lectures, increasing the accessibility of the lecture. The AI could also collect key points made during the lessons and send out summaries to students after the lesson ends.
  • Sales staff — You can learn who your best salesperson is and, best of all, understand what phrases or tactics make them so effective.

Want help getting started with voice data?

Symbl.ai makes it easy to get started analyzing your voice data with voice APIs that provide out-of-the-box advanced capture and analysis functionality unlocking the ability to work in both real time and asynchronously. This reduces the amount of time you would normally spend building and training your AI down to virtually nothing, allowing you to accelerate time to value and scale with ease.

Check out our documentation page to explore all the different ways you can leverage voice data in your business.

How to Elevate your Voicemails with Conversation Intelligence and Symbl.ai

By adding conversation intelligence to your voicemail services, you can get better contextual transcriptions, identify voicemail action items, aggregate your voicemails to identify trends, catch spam, tag or bookmark your voicemails, and even recommend and trigger other collaborative workflows, like emails or calendar entries.

Smarter voicemails

Voicemails are one of the most popular forms of asynchronous communication. But wading through them to find relevant information and act on it is time-consuming in a modern work environment, and most voicemail services will only provide you with transcription.

Adding conversation intelligence to your voicemails elevates them to a new level of sophistication — providing you with voicemail speech analytics, like contextually accurate transcriptions and trend analysis.

Conversation intelligence analytics lead to many useful benefits, like saving you time because you don't have to listen to every voicemail yourself, and surfacing action items and key information asynchronously or in real time. This leap in technology is available now and makes your voicemail system work harder and smarter for you.

You can add conversation intelligence and analytics to your voicemail using a conversation intelligence API platform like Symbl.ai, which provides contextual AI in real time.

Smart voicemail features and benefits with Symbl.ai

1. The most accurate transcriptions

Some voicemail service providers already offer transcriptions, but they're not very accurate or contextual. Symbl.ai's contextual AI doesn't just recognize the words, but actually understands their meaning in that specific context. This gives you a high level of accuracy, ensures fewer mistakes and misunderstandings, and so projects a more professional business image.

2. Identifying voicemail action items

Having contextually understood and transcribed your voicemail, Symbl.ai can identify action items based on action phrases, insights, and entities. When an action item is identified, it will be associated with a corresponding entity.

An entity is an organization, place, person, date, or number. Here’s a voicemail transcript example with the entities highlighted:

“Hi Neeraj, it’s Doug here. Let’s have a Zoom meeting on Friday at 10am to discuss the sales strategy”.

From this example voicemail, Symbl.ai will determine that the action item required is to schedule this Zoom meeting in the calendars of the people mentioned. You can then create a workflow to automatically create or suggest a meeting.
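
Here's a sketch of that workflow, reading action items from the Conversation API and pushing dated ones to a calendar service (the calendar endpoint is a hypothetical placeholder; substitute your own integration):

// Turn detected action items with due dates into calendar suggestions.
async function suggestMeetings(conversationId, token) {
  const res = await fetch(
    `https://api.symbl.ai/v1/conversations/${conversationId}/action-items`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const { actionItems } = await res.json();

  for (const item of actionItems) {
    // dueBy is present when a date/time entity was associated with the item.
    if (item.dueBy) {
      await fetch("https://calendar.example.com/api/events", { // hypothetical
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ title: item.text, start: item.dueBy })
      });
    }
  }
}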

3. Aggregating voicemails to understand trends

Most businesses receive so many voicemails that getting an overview of the material and tracking patterns manually is too daunting. You can use a conversation intelligence API, like Symbl.ai, to aggregate all of the messages and analyze the overall context.

If, for example, you get 30 voicemail messages from one business or person in a month, Symbl.ai can identify these as a group and tell you which are the most important topics, what the customers are calling about, and/or overall trends in context. You can also use these tools to identify the sentiment (positive, negative, or neutral) of the messages.

4. Auto-identifying spam in voicemails

Symbl.ai offers a useful feature called Trackers that identifies context based on certain vocabulary or intents. You can process your voicemail with Symbl.ai and analyze it with a spam vocabulary dictionary, so you’ll know which ones to listen to and which to delete — saving you valuable time.

For example, if you want to avoid finance offers from banks and a voicemail includes, “XY bank has Z offer for you,” or “we can offer you finance,” then Symbl.ai can contextually understand the message and its intent. It doesn’t matter if the words in the voicemail aren’t the exact phrases or words you initially ran through the model to be identified because of Symbl.ai’s contextual understanding.

5. Bookmarking certain voicemails

Another way you can use Symbl.ai’s Trackers feature is to create custom tags or bookmarks in your voicemails. For example, you could ask Symbl.ai to tag promotional offers or voicemails with positive sentiments. As with identifying spam, you provide the Trackers feature with a predefined vocabulary or intent and Symbl.ai will intelligently search in context and categorize the data.
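
Here's a sketch of creating such a tracker up front (assuming the tracker management endpoint; the name and vocabulary are illustrative):

// Create a reusable tracker whose vocabulary flags promotional voicemails.
async function createPromoTracker(token) {
  const res = await fetch("https://api.symbl.ai/v1/manage/tracker", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      name: "Finance offers",
      vocabulary: [
        "we can offer you finance",
        "special offer for you",
        "limited time offer"
      ]
    })
  });
  return res.json(); // includes the tracker id to reference when processing audio
}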

6. Recommend other collaboration workflows

Symbl.ai understands voicemail contextually, so it can suggest follow-ups for you, like sending an email, phoning someone back, or putting a meeting in the diary. Based on what's said in the voicemail, you can trigger the email automatically.

For example, if a customer leaves a voicemail asking about a specific deal you're offering, you can use the Symbl.ai API Trackers feature to create a workflow that sends an automatic reply to the customer with the right information.

Symbl.ai “How To”:

a) Process multiple voicemail recordings using Symbl.ai

Let's imagine you have publicly accessible URLs for multiple voicemail recordings; you can simply loop through them, submitting each URL to Symbl.ai for processing.

Here is a sample code snippet in JavaScript. This loop will submit each file to be processed by Symbl.ai and print the Conversation ID that corresponds to it.

for (var i in urls) {
  var myHeaders = new Headers();
  myHeaders.append("x-api-key", "Insert Symbl token here");
  myHeaders.append("Content-Type", "application/json");
  // Pass the URL string directly; stringifying it twice would send a quoted string.
  var raw = JSON.stringify({
    "url": urls[i],
    "confidenceThreshold": 0.6,
    "timezoneOffset": 0
  });
  var requestOptions = {
    method: 'POST',
    headers: myHeaders,
    body: raw,
    redirect: 'follow'
  };
  fetch("https://api.symbl.ai/v1/process/audio/url?languageCode=en-US", requestOptions)
    .then(response => response.text())
    .then(result => {
      console.log(result);
      console.log(urls[i]);
    })
    .catch(error => console.log('error', error));
}

Note: The limit for concurrent requests that can be submitted for processing to Symbl.ai is 50 at the time of publication of this article. Please reach out to support@symbl.ai to get it increased.

b) Get the corresponding voicemail analytics.

For each conversation ID created, you can use Symbl's Conversation API to get the corresponding analytics, i.e., action items, follow-ups, questions, sentiment, etc.

For example, make this request to print all the action items corresponding to the conversation ID used:

var myHeaders = new Headers();
myHeaders.append("x-api-key", "Insert Symbl token here");

var requestOptions = {
  method: 'GET',
  headers: myHeaders,
  redirect: 'follow'
};

fetch("https://api.symbl.ai/v1/conversations/Insert Conversation ID here/action-items", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));

c) Use telephony endpoints (SIP and PSTN) in Symbl.ai to get real-time voicemail analytics.

In this example, let's walk through how you can get live transcription and insight events in a telephone call. You'll need to provide a phone number, Symbl AppId, and AppSecret to get started. You'll also need to download the Symbl Node SDK. Once you run this, you will receive a call on the specified phone number, and you should then see the live transcript and insights on your screen. Here's the code:

const {sdk, SpeakerEvent} = require("symbl-node");

sdk.init({
  // Your appId and appSecret from https://platform.symbl.ai
  appId: 'your_appId',
  appSecret: 'your_appSecret'
}).then(async () => {
  console.log('SDK initialized.');
  try {
    const connection = await sdk.startEndpoint({
      endpoint: {
        type: 'pstn', // when making a regular phone call
        // Replace this with a real phone number
        phoneNumber: '1XXXXXXXXXX' // include country code, example - 19998887777
      }
    });
    const {connectionId} = connection;
    console.log('Successfully connected. Connection Id: ', connectionId);

    // Subscribe to connection using connectionId.
    sdk.subscribeToConnection(connectionId, (data) => {
      const {type} = data;
      if (type === 'transcript_response') {
        const {payload} = data;
        // You get live transcription here!!
        process.stdout.write('Live: ' + (payload && payload.content) + '\r');
      } else if (type === 'message_response') {
        const {messages} = data;
        // You get processed messages in the transcript here!!! Real-time but not live! 🙂
        messages.forEach(message => {
          process.stdout.write('Message: ' + message.payload.content + '\n');
        });
      } else if (type === 'insight_response') {
        const {insights} = data;
        // You get any insights here!!!
        insights.forEach(insight => {
          process.stdout.write(`Insight: ${insight.type} - ${insight.text} \n\n`);
        });
      }
    });

    // Stop the call automatically after 60 seconds.
    setTimeout(async () => {
      await sdk.stopEndpoint({connectionId});
      console.log('Stopped the connection');
      console.log('Conversation ID:', connection.conversationId);
    }, 60000); // Increase 60000 if you want the call to continue for more time.
  } catch (e) {
    console.error(e);
  }
}).catch(err => console.error('Error in SDK initialization.', err));

Learn more about how Symbl.ai can elevate your voicemails and make them work more intelligently for your business.
