US20030097253A1 - Device to edit a text in predefined windows - Google Patents
- Publication number
- US20030097253A1 (application US 10/294,016)
- Authority
- US
- United States
- Prior art keywords
- text
- spoken
- editing
- recognized
- spoken text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Definitions
- the doctor uses the telephone 5 to dial the telephone number of the transcription device 1 and identifies himself to the transcription device 1. To do this he says the words “Doctor's Data” and then states his name “Dr. Haunold”, his hospital “Rudolfstiftung” and a code number assigned to him, “2352”.
- the doctor dictates the patient's data. To do this he says the words “Patient's Data” and “F. Mueller . . . male . . . forty seven . . . WGKK . . . one two . . . three”. Then he starts to dictate the medical history. To do this, he says the words “Medical History” and “The patient . . . and had pain in his left leg . . . ”.
- the spoken words “Doctor's Data”, “Patient's Data” and “Medical History” form marking information MI for the assignment of parts of the spoken text GT to display windows, which will be described in more detail below.
- the telephone 5 transmits a telephone signal containing the spoken text GT dictated by the doctor “Dr. Haunold” via the telephone network PSTN to the telephone interface 6.
- the digital data containing the spoken text GT are then stored by the telephone interface 6 in the storage means 7 .
- the transcription means 4 determine the recognized text ET assigned to the stored spoken text GT during the execution of the voice recognition software and store it in the storage means 7 .
- the transcription means 4 are designed to recognize the spoken commands in the spoken text GT and to generate the marking information MI, which assigns the subsequent spoken text GT in the dictation to a display window.
- the marking information MI is also stored in the storage means 7 .
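The patent does not disclose how the transcription means detect spoken commands; the following is a hypothetical sketch of one way marking information MI could be derived from a recognized word stream. The command words are taken from the application example, but the function name, dictionary, and window identifiers (D1 to D3) are illustrative assumptions, not from the patent.

```python
# Hypothetical sketch: derive marking information (MI) from spoken
# commands embedded in the dictation. A spoken command switches the
# target display window for all subsequent words until the next command.
# Command names follow the application example; all identifiers here
# are illustrative, not the patent's implementation.

SPOKEN_COMMANDS = {
    "doctor's data": "D1",
    "patient's data": "D2",
    "medical history": "D3",
}

def generate_marking_info(recognized_words):
    """Split a recognized word stream into (window_id, text) segments."""
    segments = []
    current_window = None
    buffer = []
    words = [w.lower() for w in recognized_words]
    i = 0
    while i < len(words):
        # Try to match a two-word spoken command at this position.
        pair = " ".join(words[i:i + 2])
        if pair in SPOKEN_COMMANDS:
            if current_window and buffer:
                segments.append((current_window, " ".join(buffer)))
            current_window = SPOKEN_COMMANDS[pair]
            buffer = []
            i += 2
        else:
            # Words dictated before any command are discarded here;
            # a real system would need a policy for them.
            buffer.append(recognized_words[i])
            i += 1
    if current_window and buffer:
        segments.append((current_window, " ".join(buffer)))
    return segments
```

For the example dictation, the words following “Doctor's Data” would be assigned to the first window and the words following “Medical History” to the third.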
- a corrector starts to correct or edit the recognized text ET in the dictation by the doctor “Dr. Haunold” and accordingly uses the keyboard 9 to activate the second computer 3, whereupon the monitor 10 displays the image shown in FIG. 2.
- the editing means 8 are designed to output the spoken text GT read out from the storage means 7 to the loudspeaker 11 for the acoustic reproduction of the spoken text.
- the editing means 8 now have activation means 14 which are designed to activate the display of the display window during the acoustic reproduction of the spoken text GT, the display window being identified by the marking information MI assigned to the spoken text GT which has just been acoustically reproduced.
- the third display window D3 could be displayed on the entire monitor 10 in order to enable a larger part of the medical history to be viewed at once. If the spoken text GT stored in the storage means 7, for which the associated recognized text ET is displayed in the first display window D1, is acoustically reproduced, then in accordance with the invention the display of the first display window D1 is activated and hence the first display window D1 is displayed in front of the third display window D3. This enables the corrector to listen to the spoken text GT while the relevant display windows D1 to D3 are activated at the correct time and shown in the foreground.
- the activation means 14 are also designed to activate the relevant display window assigned by the marking information MI as an input window for editing the recognized text ET during the acoustic reproduction of the spoken text GT. This achieves the advantage that, if the corrector notices an error in the recognized text ET or would like to make other changes to it, the display window whose associated spoken text GT is currently being reproduced is already activated as an input window.
- a display window is activated as an input window if a text cursor C is positioned and displayed therein.
- the text cursor C indicates the position in the recognized text ET at which a text entry by the corrector would be entered with the keyboard 9 .
- the first display window has a double frame and is hence identified to the corrector as the active display window and input window.
- the transcription means 4 are furthermore designed to determine link information during the transcription, said link information identifying the associated recognized text ET for each part of the spoken text GT.
- the editing means 8 are designed for the acoustic reproduction of the spoken text GT and for the synchronous visual marking of the associated recognized text ET identified by the link information.
- the display window may also be activated at the correct time by means of the link information. Therefore, in this case, the link information also forms marking information for the activation of display windows.
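As a minimal sketch of this mechanism, link information can be modeled as a mapping from playback time intervals of the spoken text to the display window holding the corresponding recognized text; during acoustic reproduction, the window for the current playback position is activated. The class and function names, and the use of millisecond timestamps, are illustrative assumptions, not the patent's implementation.

```python
# Illustrative sketch: link information maps playback time intervals of
# the spoken text GT to the display window holding the corresponding
# recognized text ET, so the correct window can be activated in sync
# with acoustic reproduction. Names and timestamps are assumptions.
from bisect import bisect_right

class LinkInfo:
    def __init__(self, entries):
        # entries: list of (start_ms, window_id), sorted by start time.
        self.starts = [start for start, _ in entries]
        self.windows = [window for _, window in entries]

    def window_at(self, position_ms):
        """Return the window to activate at a given playback position."""
        idx = bisect_right(self.starts, position_ms) - 1
        return self.windows[idx] if idx >= 0 else None

def activate_windows(link_info, positions_ms):
    """Yield the window brought to the foreground for each playback
    position, skipping repeats so no redundant switches occur."""
    active = None
    for pos in positions_ms:
        target = link_info.window_at(pos)
        if target is not None and target != active:
            active = target
            yield target
```

Sampling the playback position periodically and feeding it to `activate_windows` would switch the foreground window exactly when the reproduction crosses into the next marked part of the dictation.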
- a user of the transcription device 1 can enter marking information MI in many different ways. For example, he could actuate a button on the keypad of the telephone 5 at the beginning and/or end of each part of the spoken text GT to be assigned to a display window. The user could also record the dictation in advance with a dictation device and use a marking button on the dictation device to enter marking information MI. However, it is particularly advantageous, as explained with reference to the application example, to enter marking information MI for marking parts of the spoken text GT by means of spoken commands contained in the spoken text GT.
- the transcription device 1 could also be formed by a computer which executes voice recognition software and text processing software.
- This one computer could, for example, be formed by a server connected to the Internet.
- the division of the parts of the recognized text ET into files according to the invention in accordance with the user's marking information MI may be performed by the transcription means 4 .
- the editing means 8 would display parts of the recognized text in separate files in separate display windows, such as is the case, for example, with Windows® programs.
- a computer program product in accordance with the invention, which is executed by the computer, may be stored on an optically or magnetically readable data carrier.
- an editing device in accordance with the invention may alternatively be designed for the manual typing of a spoken text together with the associated marking information.
- a typist would listen to the spoken text and write it manually with the aid of the computer keyboard.
- activation means would activate the associated display window as an input window in accordance with the marking information assigned to the spoken text at the correct time and position the text cursor in the input window. This achieves the advantage that the typist only has to concentrate on entering the text and not on changing the input window.
- the spoken text and the marking information may also be received by a digital dictation device as digital data via a data modem in the transcription device.
Abstract
A user of a transcription device (1) can output a spoken text (GT) with associated marking information (MI) to the transcription device (1). The transcription device (1) performs an automatic transcription of the spoken text (GT) into a recognized text (ET) and assigns parts of the recognized text (ET) to display windows (D1, D2, D3) in accordance with the marking information (MI). The parts of the recognized text (ET) are displayed in the display windows (D1, D2, D3) identified by the marking information (MI), with the corresponding display window (D1, D2, D3) being activated at the correct time during the acoustic reproduction of the spoken text (GT).
Description
- The invention relates to a transcription device for the transcription of a spoken text into a recognized text and for editing the recognized text.
- The invention further relates to an editing device for editing a text recognized by a transcription device.
- The invention further relates to an editing process for editing a text recognized during the execution of a transcription process.
- The invention further relates to a computer program product which may be loaded directly into the internal memory of a digital computer and comprises software code sections.
- A transcription device of this kind, an editing device of this kind, an editing process of this kind, and a computer program product of this kind are known from the document U.S. Pat. No. 5,267,155, in which a so-called “online” dictation device is disclosed. The known dictation device is formed by a computer which executes voice recognition software and text processing software. A user of the known dictation device may dictate a spoken text into a microphone connected to the computer. The voice recognition software forming transcription means executes a voice recognition process and in doing so assigns a recognized word to each spoken word of the spoken text, thereby obtaining recognized text for the spoken text.
- The computer which executes the text processing software forms an editing device; it stores the recognized text and facilitates the editing or correction of the recognized text. A monitor is connected to the computer, and editing means in the editing device facilitate the simultaneous display of texts in several display windows on the monitor. Here, a first display window shows a standard text and a second display window shows words which may be inserted into the standard text.
- The user of the known dictation device can position a text cursor in the first display window forming an input window at a specific position in the standard text and speak one of the insertable words shown in the second display window into the microphone. The spoken word is recognized by the transcription means and the recognized word is inserted into the standard text at the position of the text cursor. This facilitates the simple generation of standard letters, which may be adapted by the user for the individual case in question by means of spoken words.
- The known transcription device also facilitates the completion of forms with the aid of spoken commands and spoken texts. For this, the editing means displays the form to be completed in a display window and the user may speak into a microphone firstly a command to mark the field in the form and then the text to be entered into this marked field of the form.
- It has been found to be a drawback with the known transcription device that a user always has to activate the display window in which the text recognized by the transcription device is to be displayed. Another drawback identified is that the user does not receive any support from the editing device when editing the text recognized by the transcription device.
- It is an object of the invention to create a transcription device of the type specified in the first paragraph, an editing device of the type specified in the second paragraph, an editing process of the type described in the third paragraph, and a computer program product of the type described in the fourth paragraph in which the aforesaid drawbacks are avoided.
- To achieve the aforesaid object, a transcription device of this kind is provided with features according to the invention so that the transcription device may be characterized in the way described in the following.
- A transcription device for the transcription of a spoken text into a recognized text and for editing the recognized text, with
- reception means for the reception of the spoken text together with associated marking information which assigns parts of the spoken text to specific display windows, and with
- transcription means for transcribing the spoken text and for outputting the associated recognized text, and with
- storage means for storing the spoken text, the marking information, and the recognized text, and with
- editing means for editing the recognized text such that it is possible to display the recognized text visually in at least two display windows in accordance with the associated marking information.
- To achieve the aforesaid object, an editing device of this type is provided with features according to the invention, so that the editing device may be characterized in the way described in the following.
- An editing device for editing a text recognized by a transcription device with
- reception means for receiving a spoken text together with associated marking information which assigns parts of the spoken text to specific display windows, and for receiving a text recognized by the transcription device for the spoken text, and with
- storage means for storing the spoken text, the marking information, and the recognized text, and with
- editing means for editing the recognized text such that it is possible to display the recognized text visually in at least two display windows in accordance with the associated marking information.
- To achieve the aforesaid object, an editing process of this kind is provided with features according to the invention, so that the editing process may be characterized in the way described in the following.
- An editing process for editing a text recognized during the execution of a transcription process with the following steps being executed:
- reception of a spoken text together with associated marking information which assigns parts of the spoken text to specific display windows;
- reception of a recognized text for the spoken text during the transcription process;
- storage of the spoken text, the marking information, and the recognized text;
- editing of the recognized text, such that it is possible to display the recognized text visually in at least two display windows in accordance with the associated marking information.
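The four steps above can be sketched as a minimal data flow, under the assumption that marking information arrives as character ranges over the recognized text assigned to window identifiers. The class, method names, and range representation are illustrative, not from the patent.

```python
# Minimal sketch of the claimed editing process. Assumption: marking
# information is represented as (window_id, start, end) character ranges
# over the recognized text. All names here are illustrative.

class EditingProcess:
    def __init__(self):
        self.spoken_text = None      # audio data, treated as opaque here
        self.marking_info = []       # [(window_id, start, end)]
        self.recognized_text = ""

    def receive(self, spoken_text, marking_info, recognized_text):
        # Steps 1-3: reception and storage of the spoken text, the
        # marking information, and the recognized text.
        self.spoken_text = spoken_text
        self.marking_info = marking_info
        self.recognized_text = recognized_text

    def window_contents(self):
        # Step 4: distribute the recognized text over the display
        # windows in accordance with the marking information.
        return {
            window_id: self.recognized_text[start:end]
            for window_id, start, end in self.marking_info
        }
```

A rendering layer would then draw one display window per entry of the returned mapping, giving the "at least two display windows" of the claim whenever the marking information names at least two windows.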
- To achieve the aforesaid object, a computer program product of this type is provided with features according to the invention, so that the computer program product may be characterized in the way described in the following.
- A computer program product which may be loaded directly into the internal memory of a digital computer and which comprises software code sections such that the computer executes the steps of the process in accordance with claim 10 when the product is running on the computer.
- The features according to the invention enable the author of a dictation to assign, already during dictation, parts of the spoken text to the specific display windows in which the associated recognized text is to be displayed after the automatic transcription by the transcription device. This is particularly advantageous with a so-called “offline” transcription device to which the author transmits the dictation and by which the automatic transcription is first performed. Following this, the text automatically recognized by the transcription device is manually edited by a corrector with the aid of the editing device.
- Therefore, advantageously, the corrector does not have to worry about distributing the recognized text over display windows. Usually, each part of the recognized text shown in a display window is also stored in an individual computer file. These parts of the recognized text stored in separate computer files may subsequently be subjected to different types of processing, which is also advantageous.
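The per-window file storage described above can be sketched as follows; the function name, file-naming scheme, and directory layout are assumptions for illustration, not details from the patent.

```python
# Hedged sketch: persist each display window's recognized text to its
# own file, so the parts can later be processed independently (e.g.
# billing data separately from the medical history). The naming scheme
# "<window_id>.txt" is an illustrative assumption.
from pathlib import Path

def store_window_parts(parts, directory):
    """Write {window_id: text} to one file per display window and
    return the created file paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for window_id, text in parts.items():
        path = directory / f"{window_id}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Downstream processing, such as forwarding the medical history by email while routing the billing data to a database import, can then operate on each file without parsing the other parts.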
- The measures in claim 2, in claim 8, and in claim 11 achieve the advantage that, during the acoustic reproduction of the spoken text stored in the storage means, the display window containing the recognized text for the spoken text which has just been acoustically reproduced is automatically activated as an input window, to support the manual correction by the corrector. This means that the corrector can concentrate on the correction of the recognized text and does not first need to activate the associated display window before making a correction.
- If the parts of the recognized text are displayed in several display windows, it may occur that not all display windows are visible simultaneously. In addition, it may be desirable always to display only one display window on the monitor. The measures in claim 3, in claim 9, and in claim 12 achieve the advantage that the display of the display window containing the recognized text for the spoken text that has just been reproduced is automatically activated. In this way, there is an advantageous automatic switch between the display windows containing the recognized text during the acoustic reproduction of the spoken text.
- The measures in claim 4 achieve the advantage that they permit a synchronous type of reproduction to support the corrector during the correction of the recognized text.
- The measures in claim 5 achieve the advantage that the link information transmitted by the transcription device for the synchronous type of reproduction is used as marking information, and the display windows corresponding to the link information for the spoken text which has just been acoustically reproduced are activated.
- The author of the spoken text could use a button on the microphone or a button on his dictation device to enter marking information to mark parts of the spoken text. The measures in claim 6 achieve the advantage that the author can enter the marking information in the form of spoken commands. This greatly simplifies the entry of the marking information, and the author's microphone and dictation device do not have to provide any special input facilities.
- The invention will be further described with reference to embodiments shown in the drawings, to which, however, the invention is not restricted.
- FIG. 1 shows a transcription device for the transcription of a spoken text into a recognized text, with the parts of the recognized text being displayed in three different display windows.
- FIG. 2 shows the recognized text displayed on a monitor in three different display windows.
- FIG. 1 shows a
transcription device 1 for the transcription of a spoken text GT into a recognized text ET and for editing incorrectly recognized text parts of the recognized text ET. Thetranscription device 1 facilitates a transcription service with which doctors from several hospitals may dictate medical histories as the spoken text GT with the aid of their telephones in order to obtain a written medical history as the recognized text ET by post or email from thetranscription device 1. The operators of the hospitals will pay the operator of the transcription service for the use of the transcription service. Transcription services of this kind are widely used particularly in America and save the hospitals a large number of typists. - The
transcription device 1 is formed by afirst computer 2 and a large number ofsecond computers 3, of whichsecond computers 3, however, only one is shown in FIG. 1. Thefirst computer 2 executes voice recognition software and in doing so forms transcription means 4. The transcription means 4 are designed for the transcription of a spoken text GT received from atelephone 5 via a telephone network PSTN into a recognized text ET. Voice recognition software of this type has been known for a long time and was, for example, marketed by the applicant under the name “Speech Magic™” and therefore will not be dealt with in any more detail here. - The
first computer 2 also has a telephone interface 6. The telephone interface 6 forms reception means for the reception of the spoken text GT, which according to the invention also contains associated marking information MI. The marking information MI assigns parts of the spoken text GT to specific display windows D, as will be described in further detail with reference to FIG. 2. - The
first computer 2 also has storage means 7 for storing the received spoken text GT, the marking information MI, and the text ET recognized by the transcription means 4. The storage means 7 are formed by a RAM (random access memory) and a hard disk in the first computer 2. - Correctors at the transcription service edit or correct the text ET recognized by the transcription means 4. Each of these correctors has access to one of these
second computers 3, which forms an editing device for editing the recognized text ET. The second computer 3 executes text processing software—such as, for example, "Word for Windows®"—and in doing so forms editing means 8. Connected to the second computer 3 are a keyboard 9, a monitor 10, a loudspeaker 11, and a data modem 12. A text ET recognized by the transcription means 4 and edited with the editing means 8 may be transmitted by the editing means 8 via the data modem 12 and a data network NET to a third computer 13 belonging to the doctor in the hospital in the form of an email. This will be described in further detail with reference to the following example of an application of the transcription device 1. - For the purposes of this example, it is assumed that a doctor "Dr. Haunold" from the hospital "Rudolfstiftung" dictates the medical history of a patient "F. Mueller" in order to obtain a written medical history. In addition, at the same time, all the data required for arranging payment for the transcription service with the operator of the transcription service and for arranging payment for the medical services with the medical insurance scheme are to be entered into the relevant databases.
- In order to use the transcription service, the doctor uses the
telephone 5 to dial the telephone number of the transcription device 1 and identifies himself to the transcription device 1. To do this, he says the words "Doctor's Data" and then states his name "Dr. Haunold", his hospital "Rudolfstiftung" and a code number "2352" assigned to him. - Then the doctor dictates the patient's data. To do this, he says the words "Patient's Data" and "F. Mueller . . . male . . . forty seven . . . WGKK . . . one two . . . three". Then he starts to dictate the medical history. To do this, he says the words "Medical History" and "The patient . . . and had pain in his left leg . . . ". Here, the spoken words "Doctor's Data", "Patient's Data" and "Medical History" form marking information MI for the assignment of parts of the spoken text GT to display windows, as will be described in more detail below.
- The
telephone 5 will transmit a telephone signal via the telephone network PSTN to the telephone interface 6 containing the spoken text GT dictated by the doctor "Dr. Haunold". The digital data containing the spoken text GT are then stored by the telephone interface 6 in the storage means 7. - The transcription means 4 then determine the recognized text ET assigned to the stored spoken text GT during the execution of the voice recognition software and store it in the storage means 7. In addition, the transcription means 4 are designed to recognize the spoken commands in the spoken text GT and to generate the marking information MI, which assigns the subsequent spoken text GT in the dictation to a display window. The marking information MI is also stored in the storage means 7.
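- Purely as an illustration outside the patent text, the command-based generation of marking information described above might be sketched as follows. The two-word command matching, the window identifiers and all function names are assumptions of this sketch, not part of the invention as claimed:

```python
# Hypothetical sketch: the recognized word stream is scanned for the spoken
# commands; each command opens a new section that collects the subsequent
# words and is tagged with the display window it is assigned to.

COMMANDS = {
    "doctor's data": "D1",   # first display window
    "patient's data": "D2",  # second display window
    "medical history": "D3", # third display window
}

def segment_by_commands(words):
    """Split a recognized word list into (window_id, text) sections."""
    sections = []
    current_window, current_words = None, []
    i = 0
    while i < len(words):
        # Try to match a two-word command phrase at the current position.
        phrase = " ".join(words[i:i + 2]).lower()
        if phrase in COMMANDS:
            if current_window is not None:
                sections.append((current_window, " ".join(current_words)))
            current_window, current_words = COMMANDS[phrase], []
            i += 2
        else:
            current_words.append(words[i])
            i += 1
    if current_window is not None:
        sections.append((current_window, " ".join(current_words)))
    return sections
```

Each returned pair then plays the role of a part of the recognized text ET together with its marking information MI.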
- When a corrector starts to correct or edit the recognized text ET of the dictation by the doctor "Dr. Haunold" and accordingly uses the
keyboard 9 to activate the second computer 3, the monitor 10 displays the image shown in FIG. 2. The part of the recognized text identified by the marking information MI="Doctor's Data" is inserted by the editing means 8 into a form in a first display window D1. This is possible because, when making his dictation, the doctor adhered to the sequence of the data to be entered into the form. The part of the recognized text identified by the marking information MI="Patient's Data" is entered into a form in a second display window D2, and the part of the recognized text identified by the marking information MI="Medical History" is inserted into a text field in a third display window D3. - This achieves the advantage that the corrector does not have to divide the text ET recognized by the transcription means 4 into parts and assign them to the individual display windows D1 to D3 manually by means of "Copy" and "Paste". Another advantage is that, owing to the marking information MI, the parts of the recognized text ET assigned to a display window can also be stored in files of their own. This need not always be done, but it is particularly advantageous in this application, because the data for calculating the accounts with the operator of the transcription service and with the medical insurance scheme have to be processed differently.
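- A brief sketch of how a recognized section could be inserted into such a form: because the doctor dictated the items in the order of the form fields, the editing means can fill the fields positionally. The field names and the splitting rule (first two tokens form the name, the last token the code number) are assumptions made for this illustration only:

```python
# Hypothetical positional form filling for the "Doctor's Data" section.
# The rule below assumes the doctor adhered to the sequence of the form
# fields, as the description requires.

def fill_doctor_form(section_text):
    tokens = section_text.split()
    return {
        "name": " ".join(tokens[:2]),        # e.g. "Dr. Haunold"
        "hospital": " ".join(tokens[2:-1]),  # e.g. "Rudolfstiftung"
        "code_number": tokens[-1],           # e.g. "2352"
    }
```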
- The editing means 8 are designed to output the spoken text GT read out from the storage means 7 to the
loudspeaker 11 for the acoustic reproduction of the spoken text. The editing means 8 further have activation means 14 which are designed to activate the display of the display window during the acoustic reproduction of the spoken text GT, said display window being identified by the marking information MI assigned to the spoken text GT which has just been acoustically reproduced.
monitor 10. For example, the third display window D3 could be displayed on the entire monitor 10 in order to enable a larger part of the medical history to be viewed at once. If the spoken text GT stored in the storage means 7, for which the associated recognized text ET is displayed in the first display window D1, is acoustically reproduced, then in accordance with the invention the display of the first display window D1 is activated and hence the first display window D1 is displayed in front of the third display window D3. This enables the corrector to listen to the spoken text GT while the relevant display windows D1 to D3 are activated at the correct time and shown in the foreground. - The activation means 14 are also designed to activate the relevant display window assigned by the marking information MI as an input window for editing the recognized text ET during the acoustic reproduction of the spoken text GT. This achieves the advantage that, if the corrector recognizes an error in the recognized text ET or would like to make other changes to the recognized text ET, the display window for which he/she is currently listening to the associated spoken text GT is already activated as an input window.
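- The time-dependent behaviour of the activation means can be illustrated as follows, under the assumption (made only for this sketch) that each item of marking information MI carries the audio time range of its part of the spoken text GT:

```python
# Hypothetical sketch of the activation means: given the current playback
# position, find the display window whose part of the spoken text is being
# reproduced, so that it can be raised and made the input window.

def window_for_position(mi_entries, position):
    """Return the display window to activate at the given playback time."""
    for start, end, window in mi_entries:
        if start <= position < end:
            return window
    return None  # position lies outside every marked part

# Assumed time-stamped marking information for the example dictation.
mi = [(0.0, 8.5, "D1"), (8.5, 20.0, "D2"), (20.0, 95.0, "D3")]
```

During playback, the editing means would call this lookup repeatedly and bring the returned window to the foreground whenever it changes.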
- It may be mentioned that a display window is activated as an input window if a text cursor C is positioned and displayed therein. The text cursor C indicates the position in the recognized text ET at which a text entry by the corrector would be entered with the
keyboard 9. As shown in FIG. 2, the first display window has a double frame and is hence identified to the corrector as the active display window and input window. - The transcription means 4 are furthermore designed to determine link information during the transcription, said link information identifying the associated recognized text ET for each part of the spoken text GT. In addition, with the synchronous type of reproduction activated in the
transcription device 1, the editing means 8 are designed for the acoustic reproduction of the spoken text GT and for the synchronous visual marking of the associated recognized text ET identified by the link information. - This achieves the advantage that during the acoustic reproduction of the spoken text GT the relevant recognized word for the reproduced spoken word is visually marked, and in addition the active display window is changed at the correct time. The corrector can therefore advantageously concentrate particularly well on the content of the recognized text ET to be corrected.
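- One possible representation of such link information, offered only as an illustrative sketch, links an audio interval of the spoken text GT to a character span of the recognized text ET; the tuple layout and lookup below are assumptions, not the claimed implementation:

```python
# Hypothetical link information for synchronous reproduction: each entry is
# (audio_start, audio_end, char_start, char_end), mapping a spoken word's
# audio interval to the character span of the recognized word in ET.
import bisect

def span_to_mark(links, playback_time):
    """Return the ET character span to mark visually at playback_time.

    links must be sorted by audio_start and non-overlapping.
    """
    starts = [l[0] for l in links]
    i = bisect.bisect_right(starts, playback_time) - 1
    if i >= 0 and playback_time < links[i][1]:
        return links[i][2], links[i][3]
    return None  # no recognized text for this moment of audio
```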
- If the text ET recognized by the transcription means 4 has already been assigned to display windows or files by the editing means 8 in accordance with the marking information MI, then with the synchronous type of reproduction the display window may also be activated at the correct time by means of the link information. In this case, therefore, the link information also forms marking information for the activation of display windows.
- A user of the
transcription device 1 can enter marking information MI in many different ways. For example, he could actuate a button on the keypad of the telephone 5 at the beginning and/or end of each part of the spoken text GT to be assigned to a display window. The user could also record the dictation in advance with a dictation device and use a marking button on the dictation device to enter marking information MI. However, it is particularly advantageous, as explained with reference to the example of an application, to enter the marking information MI for marking parts of the spoken text GT by spoken commands contained in the spoken text GT. - It may be mentioned that the
transcription device 1 could also be formed by a single computer which executes both voice recognition software and text processing software. This one computer could, for example, be formed by a server connected to the Internet. - Similarly, the division of the parts of the recognized text ET into files according to the invention, in accordance with the user's marking information MI, may be performed by the transcription means 4. In this case, the editing means 8 would display the parts of the recognized text, stored in separate files, in separate display windows, as is the case, for example, with Windows® programs.
- It may be mentioned that the measures according to the invention are advantageous in particular with so-called "offline" transcription devices, as described with reference to the application example. However, it is also possible to provide these measures in a so-called "online" transcription device, with which the words spoken by a user are directly transcribed by transcription means and displayed on a monitor.
- It may be mentioned that a computer program product in accordance with the invention, which is executed by the computer, may be stored on an optically or magnetically readable data carrier.
- It may be mentioned that an editing device in accordance with the invention may alternatively be designed for the manual transcription of a spoken text together with the associated marking information. In this case, a typist would listen to the spoken text and type it manually with the aid of the computer keyboard. In accordance with the invention, activation means would activate the associated display window as an input window in accordance with the marking information assigned to the spoken text at the correct time and would position the text cursor in the input window. This achieves the advantage that the typist only has to concentrate on entering the text and not on changing the input window.
- It may be mentioned that the spoken text and the marking information may also be received by a digital dictation device as digital data via a data modem in the transcription device.
Claims (14)
1. A transcription device (1) for the transcription of a spoken text (GT) into a recognized text (ET) and for editing the recognized text (ET), with
reception means (6) for receiving the spoken text (GT) together with associated marking information (MI) which assigns parts of the spoken text (GT) to specific display windows (D1, D2, D3), and with
transcription means (4) for transcribing the spoken text (GT) and for outputting the associated recognized text (ET), and with
storage means (7) for storing the spoken text (GT), the marking information (MI), and the recognized text (ET), and with
editing means (8) for editing the recognized text (ET), such that it is possible to display the recognized text (ET) visually in at least two display windows (D1, D2, D3) in accordance with the associated marking information (MI).
2. A transcription device (1) as claimed in claim 1 , wherein it is possible to reproduce the spoken text (GT) acoustically, and activation means (14) are provided which are designed to activate the display window (D1, D2, D3) as an input window for editing the recognized text (ET) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
3. A transcription device (1) as claimed in claim 1 , wherein activation means (14) are provided which are designed to activate the display of the display window (D1, D2, D3) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
4. A transcription device (1) as claimed in claim 1 , wherein the transcription means (4) are designed to determine link information during transcription, said link information identifying the associated recognized text (ET) for every part of the spoken text (GT), and wherein, with the synchronous type of reproduction activated in the transcription device (1), the editing means (8) are designed for the acoustic reproduction of the spoken text (GT) and for the synchronous visual marking of the associated recognized text (ET) identified by the link information.
5. A transcription device (1) as claimed in claim 4 , wherein activation means (14) are provided which are designed to activate the display window (D1, D2, D3) as an input window for editing the recognized text (ET) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the link information assigned to the spoken text (GT) which has just been acoustically reproduced.
6. A transcription device (1) as claimed in claim 1 , wherein the marking information (MI) is formed by a spoken command which is contained in the spoken text (GT) at the beginning and/or the end of a respective part of the spoken text (GT) assigned to a display window (D1, D2, D3).
7. An editing device (3) for editing a text (ET) recognized by a transcription device (1) with
reception means for receiving a spoken text (GT) together with the associated marking information (MI) which assigns parts of the spoken text (GT) to specific display windows (D1, D2, D3), and for receiving a text (ET) recognized by the transcription device (1) for the spoken text (GT), and with
storage means for storing the spoken text (GT), the marking information (MI), and the recognized text (ET), and with
editing means (8) for editing the recognized text (ET), such that it is possible to display visually the recognized text (ET) in at least two display windows (D1, D2, D3) in accordance with the associated marking information (MI).
8. An editing device (3) as claimed in claim 7 , wherein it is possible to acoustically reproduce the spoken text (GT), and activation means (14) are provided which are designed for the activation of the display window (D1, D2, D3) as an input window for editing the recognized text (ET) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
9. An editing device (3) as claimed in claim 7 , wherein activation means (14) are provided which are designed to activate the display of the display window (D1, D2, D3) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
10. An editing process for editing a recognized text (ET) during the execution of a transcription process, wherein the following steps are executed:
reception of a spoken text (GT) together with associated marking information (MI), which assigns parts of the spoken text (GT) to specific display windows (D1, D2, D3);
reception of a recognized text (ET) for the spoken text (GT) during the transcription process;
storage of the spoken text (GT), the marking information (MI), and the recognized text (ET);
editing of the recognized text (ET), such that it is possible to display the recognized text (ET) visually in at least two display windows (D1, D2, D3) in accordance with the associated marking information (MI).
11. An editing process as claimed in claim 10 , wherein the following further step is executed: acoustic reproduction of the spoken text (GT), wherein the display window (D1, D2, D3) is activated as an input window for editing the recognized text (ET) during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
12. An editing process as claimed in claim 10 , wherein the following further step is executed: the display of the display window (D1, D2, D3) is activated during the acoustic reproduction of the spoken text (GT), said display window (D1, D2, D3) being identified by the marking information (MI) assigned to the spoken text (GT) which has just been acoustically reproduced.
13. A computer program product which may be loaded directly into the internal memory of a digital computer (1) and contains software code sections, wherein the computer (1) executes the steps of the procedure as claimed in claim 10 when the product is run on the computer (1).
14. A computer program product as claimed in claim 13 , said product being stored on a computer-readable medium.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP01000639 | 2001-11-16 | ||
EP01000639.3 | 2001-11-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
US20030097253A1 (en) | 2003-05-22 |
Family
ID=8176089
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/294,016 Abandoned US20030097253A1 (en) | 2001-11-16 | 2002-11-12 | Device to edit a text in predefined windows |
Country Status (5)
Country | Link |
---|---|
US (1) | US20030097253A1 (en) |
EP (1) | EP1456838A1 (en) |
JP (1) | JP2005509906A (en) |
CN (1) | CN1585969A (en) |
WO (1) | WO2003042975A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060036438A1 (en) * | 2004-07-13 | 2006-02-16 | Microsoft Corporation | Efficient multimodal method to provide input to a computing device |
US8639505B2 (en) * | 2008-04-23 | 2014-01-28 | Nvoq Incorporated | Method and systems for simplifying copying and pasting transcriptions generated from a dictation based speech-to-text system |
TWI664536B (en) * | 2017-11-16 | 2019-07-01 | 棣南股份有限公司 | Phonetic control method and phonetic control system of clerical editing software |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5267155A (en) * | 1989-10-16 | 1993-11-30 | Medical Documenting Systems, Inc. | Apparatus and method for computer-assisted document generation |
US5799273A (en) * | 1996-09-24 | 1998-08-25 | Allvoice Computing Plc | Automated proofreading using interface linking recognized words to their audio data while text is being changed |
US5873064A (en) * | 1996-11-08 | 1999-02-16 | International Business Machines Corporation | Multi-action voice macro method |
US5960447A (en) * | 1995-11-13 | 1999-09-28 | Holt; Douglas | Word tagging and editing system for speech recognition |
US20010018653A1 (en) * | 1999-12-20 | 2001-08-30 | Heribert Wutte | Synchronous reproduction in a speech recognition system |
US6477499B1 (en) * | 1992-03-25 | 2002-11-05 | Ricoh Company, Ltd. | Window control apparatus and method having function for controlling windows by means of voice-input |
US6611802B2 (en) * | 1999-06-11 | 2003-08-26 | International Business Machines Corporation | Method and system for proofreading and correcting dictated text |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001031634A1 (en) * | 1999-10-28 | 2001-05-03 | Qenm.Com, Incorporated | Proofreading system and method |
-
2002
- 2002-10-29 WO PCT/IB2002/004588 patent/WO2003042975A1/en not_active Application Discontinuation
- 2002-10-29 JP JP2003544728A patent/JP2005509906A/en active Pending
- 2002-10-29 EP EP02781470A patent/EP1456838A1/en not_active Withdrawn
- 2002-10-29 CN CNA028226216A patent/CN1585969A/en active Pending
- 2002-11-12 US US10/294,016 patent/US20030097253A1/en not_active Abandoned
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030212554A1 (en) * | 2002-05-09 | 2003-11-13 | Vatland Danny James | Method and apparatus for processing voice data |
US7590534B2 (en) * | 2002-05-09 | 2009-09-15 | Healthsense, Inc. | Method and apparatus for processing voice data |
US20050091064A1 (en) * | 2003-10-22 | 2005-04-28 | Weeks Curtis A. | Speech recognition module providing real time graphic display capability for a speech recognition engine |
US20080235014A1 (en) * | 2005-10-27 | 2008-09-25 | Koninklijke Philips Electronics, N.V. | Method and System for Processing Dictated Information |
US8452594B2 (en) * | 2005-10-27 | 2013-05-28 | Nuance Communications Austria Gmbh | Method and system for processing dictated information |
US8712772B2 (en) | 2005-10-27 | 2014-04-29 | Nuance Communications, Inc. | Method and system for processing dictated information |
US10423721B2 (en) * | 2006-06-29 | 2019-09-24 | Nuance Communications, Inc. | Insertion of standard text in transcription |
US11586808B2 (en) | 2006-06-29 | 2023-02-21 | Deliverhealth Solutions Llc | Insertion of standard text in transcription |
US20160078865A1 (en) * | 2014-09-16 | 2016-03-17 | Lenovo (Beijing) Co., Ltd. | Information Processing Method And Electronic Device |
US10699712B2 (en) * | 2014-09-16 | 2020-06-30 | Lenovo (Beijing) Co., Ltd. | Processing method and electronic device for determining logic boundaries between speech information using information input in a different collection manner |
US10665231B1 (en) * | 2019-09-06 | 2020-05-26 | Verbit Software Ltd. | Real time machine learning-based indication of whether audio quality is suitable for transcription |
Also Published As
Publication number | Publication date |
---|---|
CN1585969A (en) | 2005-02-23 |
EP1456838A1 (en) | 2004-09-15 |
WO2003042975A1 (en) | 2003-05-22 |
JP2005509906A (en) | 2005-04-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: KONINKLIJKE PHILIPS ELECTRONICS N. V., NETHERLANDS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HOI, DIETER;REEL/FRAME:013504/0512 Effective date: 20021025 |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |