Improved Punctuation & Video OCR: Announcing Intelligent Voice v5.3 & 5.4

We have welcomed a number of new faces to the IV family this summer, which has expedited a number of improvements and new features for our new and existing customers.

In a world where physical face-to-face presentations are being replaced by remote online video conferences, we understand the need to extract the content of presented slides and materials as well as a transcription of the meeting. IV is pleased to announce its new ‘Video OCR’ feature, battle-tested as part of IV’s Myna prosumer product since 2020. Video OCR extracts the text from presented content, adds IV’s patented “Topics” and makes the text searchable and available through the IV API.

IV is constantly looking at ways to improve the presentation and punctuation of its transcripts. Accurate identification of words is of course key, but without a logical presentation in line with expected grammar, the value of that accuracy is diluted. This release includes a new punctuator model (v4.1), which significantly improves the placement of full stops, commas and question marks, as well as the presentation of abbreviations. We have also introduced Inverse Text Normalisation (ITN), a new transcript formatting technology that improves readability and the identification of semantically important punctuation in transcripts.
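To give a flavour of what inverse text normalisation means in general, here is a minimal toy sketch in Python. It is not IV's implementation: the function name `inverse_normalise`, the word tables and the rules are all assumptions made for illustration, and it only handles spoken numbers from 0 to 99 followed by an optional "percent".

```python
# Toy illustration of inverse text normalisation (ITN).
# Hypothetical sketch only: the names and rules below are assumptions,
# not the technology shipped in Intelligent Voice.

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def inverse_normalise(text: str) -> str:
    """Rewrite spoken-form numbers (0-99) and 'percent' in written form."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i].lower()
        if tok in TENS:
            value = TENS[tok]
            # Combine "twenty" + "five" into 25 when a unit 1-9 follows.
            if i + 1 < len(tokens) and 1 <= UNITS.get(tokens[i + 1].lower(), 0) <= 9:
                value += UNITS[tokens[i + 1].lower()]
                i += 1
            out.append(str(value))
        elif tok in UNITS:
            out.append(str(UNITS[tok]))
        elif tok == "percent" and out and out[-1].isdigit():
            # Attach the percent sign to the number just emitted.
            out[-1] += "%"
        else:
            out.append(tokens[i])
        i += 1
    return " ".join(out)


print(inverse_normalise("revenue grew twenty five percent in nineteen months"))
# prints: revenue grew 25% in 19 months
```

A production system has to cope with far more than this, for example dates, currencies and words such as "one" that are not always numbers, which is why ITN is usually driven by much richer grammars or trained models rather than a simple lookup.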

Much of our recent work has involved integrating IV’s transcription capabilities, via our APIs, into well-established partner applications that have their own workflows and front-ends. We have therefore invested further in this area, creating new web services for Video OCR and for checking the status of model adaption. We have also added many new import and export options to the API parameters. For example, you can switch Video OCR on or off, select a punctuator model version, switch Tagger on or off, and set the export type and template. For a full list, please see our API documentation.

We are also releasing a brand-new look and feel for our Relativity plugin, bringing it in line with our most recent SmartTranscripts, along with a slew of bug fixes. Another major product enhancement is support for complex search highlighting when reviewing audio from Relativity Trace.

In addition to these major improvements, we have introduced a number of smaller enhancements that will add value for you, our customers. We remain committed to continual, iterative development as we bring you cutting-edge technology for all your audio, video and chat surveillance needs. This includes further work on Dockerizing the Stack, which will be completed for our v6 release at the end of the year. Sneak peek? We’ll be introducing a dramatic enhancement to the IV Biometric capability, allowing for full metadata verification and unknown speaker search, as well as v1 of our new meeting summarisation technology, which will extend the impact and usability of Teams meeting recordings.

If you are looking for a demo of current or new functionality, please contact your usual sales representative or sales@intelligent-voice.test.

Existing users can download the new release from their support portal, but please feel free to get in touch with your support representative to talk more about deployment options, or contact support@intelligent-voice.test.