Language Transcription


Tuesday, 11 January 2011

Reliability of Voice Recognition Technology

Posted on 19:30 by Unknown
Voice recognition technology recognizes spoken words and converts them into text. Many voice recognition programs are on the market, the most popular being Dragon NaturallySpeaking by Nuance. This post looks at that particular software.
 
Dragon NaturallySpeaking

Voice recognition software is very helpful to anyone whose keyboard skills are poor. Dragon is designed to give the user a straightforward interface to the software and the fullest possible use of its features.

To start with, the software needs to be trained. Every new user creates an individual profile and then begins training the software. Dragon comes with a module in which the user trains it to recognize the tone of the user's voice. The module takes the user through a series of steps so that the software becomes accustomed to that voice. Once the user is comfortable with the software's commands, the user can start using it on live jobs.

For a transcriber, a live job is an audio file to be transcribed. With this software, a transcriber can listen to the audio and speak the lines aloud as they are heard. The software also has some built-in intelligence, which is an added advantage for the user. When the user dictates a line, the software can interpret the content to some extent and use the context to avoid confusing similar-sounding phrases such as ‘I scream’ and ‘ice cream.’
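
To illustrate the idea of choosing between similar-sounding phrases by context, here is a minimal sketch. It does not reflect how Dragon works internally; the cue-word lists and the function name pick_phrase are assumptions made up for this example. It simply picks the candidate phrase whose cue words overlap most with the surrounding words.

```python
# Hypothetical cue words for each candidate phrase.
# These lists are illustrative assumptions, not Dragon's actual data or model.
CONTEXT_CUES = {
    "ice cream": {"eat", "ate", "cone", "flavor", "dessert", "cold", "vanilla"},
    "I scream": {"loud", "afraid", "horror", "fear", "yell", "night"},
}


def pick_phrase(candidates, context_words):
    """Return the candidate whose cue words overlap most with the context.

    On a tie, the first candidate in the list is returned.
    """
    # Normalize the context: lowercase and strip common punctuation.
    context = {w.lower().strip(".,!?") for w in context_words}
    return max(candidates, key=lambda c: len(CONTEXT_CUES.get(c, set()) & context))


if __name__ == "__main__":
    heard = "We ate a vanilla cone of".split()
    print(pick_phrase(["I scream", "ice cream"], heard))
```

Real speech recognition software relies on statistical language models trained on large amounts of text rather than hand-written word lists, but the principle is similar: the surrounding words make one reading more likely than the other.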

This software is also helpful for those with a limited English vocabulary. Difficult and rarely used words such as ‘habiliments’ and ‘sacerdotal,’ if spoken properly and clearly, can be typed out by Dragon even if the transcriber does not know them. It is useful in the same way for place names. The more the software is used, the more accustomed it becomes to the speaker's voice and tone, which helps it grasp the context and subject matter of the file. This makes it easier to capture words that would otherwise be time-consuming to look up.

Using 'Dragon' for the actual work
At times, a file contains medical terms specific to a particular disease. Here too Dragon helps to some extent, working out the words from the context of the sentence. Dragon can spell a disease name such as amebiasis more or less correctly, provided the user pronounces it correctly.

In general, even ordinary English words that appear frequently in a file and are long and tedious to type, such as ‘differentiation,’ are handled easily by Dragon once it gets used to the speaker’s accent, tone, and pronunciation. All in all, Dragon reduces the time spent typing the file and lets the transcriber devote more time to research. This results in better-quality transcripts.


Tuesday, 4 January 2011

Speaker Identification – Expectations and Limitations

Posted on 19:30 by Unknown
Speaker identification is the process of identifying the different speakers in an audio file or transcript. It is usually done by one of the following methods.

1.    Reference Material/Agenda – For a meeting, conference, or symposium, if we have the minute-by-minute agenda along with the names of the speakers, we match each voice to the speaker names given in the agenda. This is the simplest way to get accurate speaker names. We therefore always encourage our clients to send us the agenda or draft of the event at the time of job confirmation so that their jobs get accurate speaker identification.

2.    Video Reference – Speaker identification becomes easy if the client provides professionally recorded video files as reference, with the focus entirely on the speakers. In such cases, the transcriber can readily identify the speakers by watching the video.

3.    Googling/Research skill based – In many cases the client provides no reference material or speaker names, which makes the task very challenging for the transcriber. The transcriber tries to tell the voices of the different speakers apart, labeling them Speaker1 or Speaker2 along the way. If the speakers introduce themselves while speaking, the transcriber then uses search engine skills to look up the name on the internet and checks it against the content of the file to judge whether it is likely to be the same person. The limitation here is a high chance of identifying the wrong person, since the internet is not a very reliable source and a search engine returns a wide variety of results for the same name.

4.    Voice differentiation – This is the most difficult method. If there is no reference material or agenda and the audio has more than three or four speakers, it is very challenging to tell the speakers apart by the tone of their voices. This is especially so in a discussion where the speakers’ voices overlap and one cannot make out what each speaker is saying. Identifying speakers is almost impossible if the audio quality is bad. In such cases, we use our listening skills to differentiate the voices as well as we can. The limitation here is that the result depends on the listening skills of the transcriber, and there is a high chance of mixing up speakers.
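
As an illustration of the first method, matching a point in the recording to the agenda can be sketched as a simple lookup. This is only a sketch under assumptions: the agenda entries, the placeholder names, the fallback label, and the function name speaker_at are all hypothetical, and real meetings rarely follow the agenda to the minute, so the transcriber still has to confirm each match by ear.

```python
from bisect import bisect_right

# Hypothetical agenda: (start minute, speaker name), sorted by start time.
# The names are placeholders, not taken from any real job.
AGENDA = [(0, "Chairperson"), (10, "Dr. A"), (40, "Dr. B")]


def speaker_at(minute, agenda=AGENDA, fallback="[Speaker]"):
    """Return the agenda speaker scheduled at the given minute.

    Falls back to a generic label when the minute is before the first slot.
    """
    starts = [start for start, _ in agenda]
    # Index of the last slot that starts at or before the given minute.
    i = bisect_right(starts, minute) - 1
    return agenda[i][1] if i >= 0 else fallback


if __name__ == "__main__":
    for minute in (5, 12, 45):
        print(minute, speaker_at(minute))
```

The lookup only suggests who was scheduled to speak at a given time; questions from the audience or a session that runs late will not appear in the agenda, which is one reason the other methods are still needed.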

As our standard service at Cripton, speakers are identified as [Male] and [Female] or as Interviewer and Interviewee. Beyond that, our transcribers and editors always strive to identify speakers to the best of their ability by applying one of the methods above. But these methods have their limitations: with more than three speakers, it becomes difficult to identify a particular speaker without a set agenda as reference.

Hence, the only way to ensure correct speaker identification is to provide proper reference material, such as the agenda or draft of the conference, meeting, or symposium, the speaker names, or video files. It is also very important that the client uploads all the reference material along with the audio job itself and not later, because speaker identification is done at an early stage, while the transcriber is working on the document. This will ensure the delivery of a transcript with accurate speaker identification.