US10748533B2 - Proximity aware voice agent - Google Patents
Proximity aware voice agent
- Publication number
- US10748533B2 (application US15/807,055)
- Authority
- US
- United States
- Prior art keywords
- name
- companion device
- user
- companion
- room location
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires 2038-06-07
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B15/00—Systems controlled by a computer
- G05B15/02—Systems controlled by a computer electric
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/027—Spatial or constructional arrangements of microphones, e.g. in dummy heads
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Definitions
- aspects of the disclosure generally relate to a proximity aware voice agent.
- voice agent devices are becoming increasingly popular. These devices may include voice-controlled personal assistants that implement artificial intelligence based on user audio commands. Some examples of voice agent devices may include Amazon Echo, Amazon Dot, Google Home, etc. Such voice agents may use voice commands as the main interface with their processors. The audio commands may be received at a microphone within the device. The audio commands may then be transmitted to the processor for implementation of the command.
- a personal assistant device configured to control companion devices may include a memory configured to maintain a companion device library including a plurality of companion devices, each associated with at least one long-name, short-cut name, and companion device room location, and a processor.
- the processor may be configured to receive a user command from a microphone, extract a companion device name and action from the user command, determine whether the companion device name includes a unique name, and command a companion device associated with the unique name to perform the action from the user command in response to the user command including the unique name.
- a personal assistant device configured to control companion devices may include a memory configured to maintain a companion device library including a plurality of companion devices, each associated with at least one long-name, short-cut name, and companion device room location, a microphone configured to receive a user command, and a processor.
- the processor may be configured to receive the user command from the microphone, identify a user room location from the user command, extract a companion device name from the user command, identify a companion device room location based on the companion device name within the companion device library, determine whether the user room location is the same as the companion device room location, and provide at least one command to a companion device associated with the companion device name in response to the user room location being the same as the companion device room location.
- a method may include receiving a user command and extracting a companion device name and an action from the user command.
- the method may further include identifying a companion device room location based on the companion device name, determining whether the user command was received from a user room location that is the same as the companion device room location, and providing at least one command to a companion device associated with the companion device name in response to the user room location being the same as the companion device room location.
- FIG. 1 illustrates a system including an example intelligent personal assistant device, in accordance with one or more embodiments
- FIG. 2 illustrates an example companion device library
- FIG. 3 illustrates an example home including a plurality of rooms, personal assistant devices, and companion devices
- FIG. 4 illustrates an example database for personal assistant identification
- FIG. 5 illustrates an example process for the personal assistant device to identify a room and provide a command to one of the companion devices.
- Personal assistant devices may include voice controlled personal assistants that implement artificial intelligence based on user audio commands.
- voice agent devices may include Amazon Echo, Amazon Dot, Google Home, etc.
- voice agents may use voice commands as the main interface with processors of the same.
- the audio commands may be received at a microphone within the device.
- the audio commands may then be transmitted to the processor for implementation of the command.
- the audio commands may be transmitted externally, to a cloud-based processor, such as those used by Amazon Echo, Amazon Dot, Google Home, etc.
- a single home may include more than one personal assistant device.
- a home may include a personal assistant device located in each of the kitchen, bedroom, home office, etc.
- the personal assistant devices may also be portable and may be moved from room to room within a home.
- the location of the personal assistant device may give valuable context and enable the device to better tailor the information that it provides, as well as control other devices according to the location.
- each may be able to control other companion devices such as speakers, lights, alarm systems, etc.
- each of the kitchen and the bedroom may have lights controlled via audio commands issued to the personal assistant devices.
- when a user is in the bedroom and says, “turn on the lights,” the bedroom lights may turn on.
- without knowledge of the user's location, however, the personal assistant device may instruct the wrong lights to turn on, e.g., the kitchen lights.
- Users may label such lights as “bedroom lights.” However, such labeling or clustering may require that the user remember each label.
- guests and children may not know the label associated with each device or each room.
- a group or cluster could not have the same name such as “lights,” but would instead require distinct names for each group, such as “kitchen lights”.
- the voice agent device may then perform several verifications. The first verification may be determining whether the companion device name is recognized. For example, the voice agent device may determine whether it controls a companion device by the name given in the user command. The second verification may be determining whether the companion device name is a unique name, that is, whether the device name is associated with a specific companion device or group of devices. For example, “kitchen light” would be a unique device name, whereas “light” would likely not be. The third verification may include determining whether the companion device defined by the device name is located within the user's current room or location.
- the personal assistant device may determine which companion devices the user wishes to control. If the user says a command that includes a device name of a companion device that is not located within the user's current room, the personal assistant device may generate an alert and indicate the same to the user. Thus, less learning and memorizing on the part of the user is required, more accurate control of companion devices is achieved, and an overall more flexible and easier to use system results.
- FIG. 1 illustrates a system 100 including an example intelligent personal assistant device 102 .
- the personal assistant device 102 receives audio through a microphone 104 or other audio input, and passes the audio through an analog to digital (A/D) converter 106 to be identified or otherwise processed by an audio processor 108 .
- the audio processor 108 also generates speech or other audio output, which may be passed through a digital to analog (D/A) converter 112 and amplifier 114 for reproduction by one or more loudspeakers 116 .
- the personal assistant device 102 also includes a controller 118 connected to the audio processor 108 and configured to manage various companion devices via the companion device library 132 .
- the controller 118 also interfaces with a wireless transceiver 124 to facilitate communication of the personal assistant device 102 with a communications network 126 .
- the controller 118 also is connected to one or more Human Machine Interface (HMI) controls 128 to receive user input, as well as a display screen 130 to provide visual output.
- the illustrated system 100 is merely an example, and more, fewer, and/or differently located elements may be used.
- the A/D converter 106 receives audio input signals from the microphone 104 .
- the A/D converter 106 converts the received signals from an analog format into a digital signal in a digital format for further processing by the audio processor 108 .
- one or more audio processors 108 may be included in the personal assistant device 102 .
- the audio processors 108 may be one or more computing devices capable of processing audio and/or video signals, such as a computer processor, microprocessor, a digital signal processor, or any other device, series of devices or other mechanisms capable of performing logical operations.
- the audio processors 108 may operate in association with a memory 110 to execute instructions stored in the memory 110 .
- the instructions may be in the form of software, firmware, computer code, or some combination thereof, and when executed by the audio processors 108 may provide the audio recognition and audio generation functionality of the personal assistant device 102 .
- the instructions may further provide for audio cleanup (e.g., noise reduction, filtering, etc.) prior to the recognition processing of the received audio.
- the memory 110 may be any form of one or more data storage devices, such as volatile memory, non-volatile memory, electronic memory, magnetic memory, optical memory, or any other form of data storage device.
- operational parameters and data may also be stored in the memory 110 , such as a phonemic vocabulary for the creation of speech from textual data.
- the D/A converter 112 receives the digital output signal from the audio processor 108 and converts it from a digital format to an output signal in an analog format. The output signal may then be made available for use by the amplifier 114 or other analog components for further processing.
- the amplifier 114 may be any circuit or standalone device that receives audio input signals of relatively small magnitude, and outputs similar audio signals of relatively larger magnitude. Audio input signals may be received by the amplifier 114 and output on one or more connections to the loudspeakers 116 . In addition to amplification of the amplitude of the audio signals, the amplifier 114 may also include signal processing capability to shift phase, adjust frequency equalization, adjust delay or perform any other form of manipulation or adjustment of the audio signals in preparation for being provided to the loudspeakers 116 . For instance, the loudspeakers 116 can be the primary medium of instruction when the device 102 has no display screen 130 or the user desires interaction that does not involve looking at the device. The signal processing functionality may additionally or alternately occur within the domain of the audio processor 108 . Also, the amplifier 114 may include capability to adjust volume, balance and/or fade of the audio signals provided to the loudspeakers 116 .
- the amplifier 114 may be omitted, such as when the loudspeakers 116 are in the form of a set of headphones, or when the audio output channels serve as the inputs to another audio device, such as an audio storage device or a further audio processor device.
- the loudspeakers 116 may include the amplifier 114 , such that the loudspeakers 116 are self-powered.
- the loudspeakers 116 may be of various sizes and may operate over various ranges of frequencies. Each of the loudspeakers 116 may include a single transducer, or in other cases multiple transducers. The loudspeakers 116 may also be operated in different frequency ranges such as a subwoofer, a woofer, a midrange and a tweeter. Multiple loudspeakers 116 may be included in the personal assistant device 102 .
- the controller 118 may include various types of computing apparatus in support of performance of the functions of the personal assistant device 102 described herein.
- the controller 118 may include one or more processors 120 configured to execute computer instructions, and a storage medium 122 (or storage 122 ) on which the computer-executable instructions and/or data may be maintained.
- a computer-readable storage medium (also referred to as a processor-readable medium or storage 122 ) includes any non-transitory medium that participates in providing instructions or other data that may be read by the processor 120 .
- a processor 120 loads instructions and/or data, e.g., from the storage 122 , into a memory and executes the instructions using the data, thereby performing one or more processes, including one or more of the processes described herein.
- Computer-executable instructions may be compiled or interpreted from computer programs created using a variety of programming languages and/or technologies including, without limitation, and either alone or in combination, Java, C, C++, C#, Assembly, Fortran, Pascal, Visual Basic, Python, JavaScript, Perl, PL/SQL, etc.
- the processor 120 may be located within a cloud, another server, another one of the devices 102 , etc.
- the controller 118 may include a wireless transceiver 124 or other network hardware configured to facilitate communication between the controller 118 and other networked devices over the communications network 126 .
- the wireless transceiver 124 may be a cellular network transceiver configured to communicate data over a cellular telephone network.
- the wireless transceiver 124 may be a Wi-Fi transceiver configured to connect to a local-area wireless network to access the communications network 126 .
- the controller 118 may receive input from human machine interface (HMI) controls 128 to provide for user interaction with personal assistant device 102 .
- the controller 118 may interface with one or more buttons or other HMI controls 128 configured to invoke functions of the controller 118 .
- the controller 118 may also drive or otherwise communicate with one or more displays 130 configured to provide visual output to users, e.g., by way of a video controller.
- the display 130 (also referred to herein as the display screen 130 ) may be a touch screen further configured to receive user touch input via the video controller, while in other cases the display 130 may be a display only, without touch input capabilities.
- the companion device library 132 includes a database of companion devices each identified by short-cut name, long-name, and a room location.
- the room location may be the room in which the virtual assistant or the companion device is located.
- An example companion device library 132 is illustrated in FIG. 2 .
- the companion device library 132 may be stored within the device 102 , as well as on a separate server, cloud-based computing system, etc.
- the companion device library 132 may include a plurality of short-cut names 220 , each associated with a long-name 222 and a companion device room location 224 .
- the companion device room location 224 may be the room in which the companion device is located.
- Each room may be associated with certain audio settings applied to the audio signal when the device 102 is located at that location. That is, the audio settings may be specific to each location. For example, the starting music genre and volume associated with an outdoor space may differ from those associated with the home office; e.g., the outdoor volume may be louder.
- Other audio processing attributes such as equalization, filtering, etc., may be specific to each location and defined within the companion device library for that location.
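As a rough illustration of such per-location settings, the sketch below maps rooms to a few audio attributes; the setting names and values are invented for illustration, since the description only says such settings may be defined within the companion device library for each location.

```python
# Sketch: per-room audio settings applied when the device 102 is at that
# location. Setting names and values are illustrative assumptions.
room_audio_settings = {
    "outdoor": {"starting_volume": 0.8, "starting_genre": "pop", "eq": "loudness"},
    "home office": {"starting_volume": 0.3, "starting_genre": "ambient", "eq": "flat"},
}

def settings_for(room: str) -> dict:
    # Fall back to a neutral default when a room has no stored settings.
    return room_audio_settings.get(room, {"starting_volume": 0.5})

print(settings_for("outdoor")["starting_volume"])  # 0.8: louder outdoors
```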
- the short-cut names may include generic names that may be applicable to more than one companion device, such as “lights” or “tv”.
- the long-names may be unique names that identify a specific device and are not duplicative. Notably, some of the short-cut names 220 may be the same as the long-names 222 .
- in order to control a certain companion device such as lights or speakers, the companion device may be called by name. This name may include either the short-cut name or the long-name.
- the database maintains the names and locations in order to efficiently and accurately respond to user commands received at the microphone 104 of the device 102 .
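To make the structure concrete, the companion device library 132 of FIG. 2 can be modeled as a small list of records; the Python rendering below is a minimal sketch, and the field names and example devices are illustrative assumptions rather than anything specified by the patent.

```python
# A minimal sketch of the companion device library 132 (FIG. 2); the field
# names and example devices are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CompanionDevice:
    short_cut_name: str   # generic name 220, e.g. "light"
    long_name: str        # unique name 222, e.g. "den light"
    room_location: str    # companion device room location 224

library = [
    CompanionDevice("light", "den light", "den"),
    CompanionDevice("light", "kitchen light", "kitchen"),
    CompanionDevice("tv", "den tv", "den"),
]

def lookup(name: str) -> list:
    """Return every device whose short-cut name or long-name matches."""
    return [d for d in library if name in (d.short_cut_name, d.long_name)]

print(len(lookup("light")))      # 2 matches: the short-cut name is ambiguous
print(len(lookup("den light")))  # 1 match: the long-name is unique
```

A lookup that returns more than one record signals an ambiguous short-cut name, which is what forces the room-location checks described below.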
- the location of the user may also be relevant in controlling the various companion devices 240 (e.g., devices 240 - 1 , 240 - 2 , 240 - 3 , 240 - 4 , 240 - 5 , etc., as illustrated in FIG. 3 ).
- the processor 120 may more accurately control the companion devices 240 .
- the user room location may be identified in various ways.
- the user room location may be identified by a room sample collected from the microphone 104 of the personal assistant device.
- the room sample may be collected upon start-up of the device 102 .
- a stimulus noise may be emitted from the loudspeaker 116 and the room sample may be subsequently recorded.
- the room sample may include unique room impulse responses (RIR). These impulse responses may be unique to each room and therefore be used to identify the room as the device is moved between various locations.
- the RIRs may include an amplitude envelope (i.e., amplitude over time).
- a RIR of a room may vary slightly depending on the exact location of the device 102 within the room. However, a RIR of two different rooms may vary dramatically.
- the RIR acquired by the room sample may be used to classify or identify a room or location of the device 102 .
- a sample RIR of a room sample may be compared to stored RIRs. If a certain number of amplitudes of the sample response align or match with those of a stored response associated with a known room, then the room may be identified based on the stored response. This is discussed in more detail herein.
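As a rough sketch of that comparison, assume each stored RIR is reduced to a short amplitude envelope and that a room is declared a match when enough envelope points align within a tolerance; the envelope values, tolerance, and match threshold below are all invented for illustration.

```python
# Sketch of RIR-based room classification: compare a sampled amplitude
# envelope against stored envelopes and return the best-aligned room.
# Envelope values, tolerance, and threshold are illustrative assumptions.
import numpy as np

stored_rirs = {  # room -> stored RIR amplitude envelope (amplitude over time)
    "bedroom": np.array([1.00, 0.55, 0.30, 0.18, 0.10]),
    "kitchen": np.array([1.00, 0.70, 0.52, 0.35, 0.22]),
}

def identify_room(sample_env, tolerance=0.05, min_matches=4):
    """Return the room whose stored envelope best aligns with the sample."""
    best_room, best_count = None, 0
    for room, env in stored_rirs.items():
        # Count envelope points that align within the tolerance.
        count = int(np.sum(np.abs(env - sample_env) <= tolerance))
        if count > best_count:
            best_room, best_count = room, count
    return best_room if best_count >= min_matches else None

sample = np.array([0.98, 0.57, 0.28, 0.20, 0.09])  # envelope measured at startup
print(identify_room(sample))  # -> "bedroom"
```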
- FIG. 3 illustrates an example home 200 including a plurality of rooms 202 .
- the rooms 202 may include, for example, a bedroom 202 - 1 , a home office 202 - 2 , a kitchen 202 - 3 , a living room 202 - 4 , and an outdoor space or patio 202 - 5 .
- Various other rooms and locations may be appreciated and included.
- a home gym, basement, etc. may also be included in the home 200 .
- Multiple personal assistant devices 102 may be included in the rooms throughout the home 200 .
- a first device 102 - 1 is located in the bedroom 202 - 1 .
- a second device 102 - 2 is located within the home office 202 - 2 .
- a third device 102 - 3 is located in the living room 202 - 4 , and so on.
- each room may also include various companion devices 240 .
- a companion device 240 may include a device that may interface and communicate with the personal assistant device 102 .
- the personal assistant device 102 may provide instructions or commands to the companion devices 240 in response to user commands.
- the companion device 240 may be a light 240 - 6 and may respond to the command “turn on the kitchen light.”
- Other examples of companion devices may include televisions, speakers, monitors, security cameras, outlets, thermostats, etc.
- the companion devices 240 may communicate with the personal assistant device 102 via a wireless network such as a home network.
- the companion device 240 may be registered and paired with the personal assistant device 102 upon configuration, and thereafter may respond to commands provided by the assistant device 102 .
- each room 202 may include one or more companion devices 240 .
- the bedroom 202 - 1 may include a tv 240 - 1 , a lamp 240 - 2 , and an overhead light 240 - 3 .
- the office 202 - 2 may include a light 240 - 4 , a tv 240 - 5 , another light 240 - 6 , etc.
- Each of the companion devices 240 may be associated with a short-cut name and a long-name, as explained above with respect to FIG. 2 .
- the processor 120 may determine the location of the user, and then turn on one of the bedroom tv 240 - 1 and the office tv 240 - 5 , based on the user's location.
- FIG. 4 illustrates an example database 400 for personal assistant device 102 identification.
- This database 400 or chart may be part of the companion device library 132 .
- Each device 102 may be determined to be in a certain room 202 .
- the database 400 may store the relationship between the personal assistant device 102 and the room location.
- Each device 102 may be associated with a unique identification. Once a device 102 , as identified by its unique identification, is determined to be within a certain room, the association thereof is saved in database 400 .
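A minimal sketch of that association follows, with hypothetical device identifiers; the patent does not specify the storage format of database 400.

```python
# Sketch of database 400: map each personal assistant device's unique ID
# (420) to the room 202 it was classified into. IDs are hypothetical.
assistant_rooms: dict = {}

def register_assistant(device_id: str, room: str) -> None:
    """Save the device/room association once the room has been identified."""
    assistant_rooms[device_id] = room

register_assistant("assistant-102-1", "bedroom")
register_assistant("assistant-102-2", "home office")
print(assistant_rooms["assistant-102-1"])  # -> "bedroom"
```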
- FIG. 5 illustrates an example process 500 for the personal assistant device 102 to identify a room 202 and provide a command to one of the companion devices 240 .
- a user may have multiple devices 102 throughout his or her home. Each device 102 may receive various commands. In some situations, more than one device 102 may receive a single command.
- the process 500 allows for a better user experience by allowing the user to control various devices without the need to memorize device names.
- the verbal commands from the users may be parsed and analyzed to control the desired device based on the user's location.
- the processor 120 may identify the room based on the received voice command. This may include parsing the voice command to extract various command information. Some voice commands may include phrases with long-names for the device to be controlled, such as “den light.” Other commands simply include a short-cut name such as “light.” For example, if the phrase is “turn on the den light” or “turn the den light on,” the device name may be “den light,” the action may be “on,” and the room location may be the den. Thus, this phrase includes a long-name “den light,” and both the device and room location are identified by use of the long-name.
- the voice command may include a short-cut name for a device.
- the device name may be “light”
- the action may be “on”
- the room location would need to be determined.
- the room location may be determined by a mechanism other than parsing out the room name from the phrase.
- One such mechanism may include using the RIR, as explained above. Once the room is identified based on the RIR, the RIR may be included as part of the personal assistant device ID 420 . Thus, the RIR may be used to look up the room location when the room is not included in the voice command.
- the process 500 illustrates an example process for identifying a room location based on a voice command.
- the process 500 begins at block 502 .
- the processor 120 receives a voice command from the microphone 104 .
- the voice command may include a verbal instruction from the user to control various companion devices 240 such as “turn on the lights,” or “tune the TV to channel 7.”
- the processor 120 may parse the received voice command in order to extract command information such as the companion device name, action, room location, etc. For example, if the phrase is “turn on the den light” or “turn the den light on,” the device name may be “den light,” the action may be “on,” and the room location may be the den. Thus, this phrase includes a long-name “den light,” and both the device and room location are identified by use of the long-name. In another example, the phrase may be “turn the light on.” In this phrase, the device name may be “light” and the action may be “on.” Because a short-cut name was used, the room location is not available from the parsed command information.
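A toy sketch of this parsing step is shown below, assuming a small fixed vocabulary of actions and room words; a real implementation would sit behind full speech recognition and natural-language parsing, so treat this as illustrative only.

```python
# Illustrative parser for block 504: pull an action, an optional room, and
# a device name out of a phrase. The vocabulary sets are assumptions.
ACTIONS = {"on", "off"}
ROOMS = {"den", "kitchen", "bedroom"}

def parse_command(phrase: str) -> dict:
    """Extract action, room location (if spoken), and device name."""
    words = phrase.lower().replace("turn", "").split()
    action = next((w for w in words if w in ACTIONS), None)
    room = next((w for w in words if w in ROOMS), None)
    # Whatever is left besides actions and filler words is the device name.
    device_words = [w for w in words if w not in ACTIONS | {"the"}]
    return {"action": action, "room": room, "device_name": " ".join(device_words)}

print(parse_command("turn on the den light"))
# {'action': 'on', 'room': 'den', 'device_name': 'den light'}  (long-name)
print(parse_command("turn the light on"))
# {'action': 'on', 'room': None, 'device_name': 'light'}  (short-cut name)
```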
- the processor 120 may determine whether the device name is recognized.
- the processor 120 may compare the parsed name acquired in block 504 with the list of device names 220 , 222 stored in the companion device library 132 .
- the processor 120 may compare the parsed device name with the short-cut names 220 , as well as the long-names 222 . If the processor 120 matches the parsed device name with one of the names 220 , 222 , then the process 500 proceeds to block 508 . If not, the process 500 proceeds to block 510 .
- the processor 120 may determine whether the parsed device name is a long-name 222 . If the parsed device name includes a long name, such as “den light,” then the parsed device name is considered a unique name because the device (e.g., “den light”) is specifically identified instead of generically identified (e.g., “light”). If the parsed device name is a long-name 222 , the process 500 proceeds to block 512 . If not, the process 500 proceeds to block 514 .
- the processor may instruct the speaker 116 to output a device error message.
- the device error message may indicate that a companion device by the parsed device name could not be located. For example, if the parsed device name was “den light,” then the device error message may include an audible response such as “I'm sorry, I couldn't find a device or group named den light.” In another example, the error message may specifically ask for the device name, such as “I'm sorry, I couldn't find a device or group named den light, could you please repeat the name of the device?”
- the processor 120 may identify the user room location, or the room that the user is currently in. This may be determined using the RIR, which may indicate which room the user is currently in. The room location may also be determined by which personal assistant device 102 received the command, or which device 102 received the highest quality version of the command, indicating which device 102 the user is closest to.
- the processor 120 may determine whether the companion device 240 associated with the long-name 222 is in the same room as the user; in other words, whether the companion device room location is the same as the user room location. In the example above, the processor 120 may determine whether the user is in the den. If the processor 120 determines that the user is in the same room as the companion device 240 identified in the command, the process 500 proceeds to block 516 . If not, the process 500 proceeds to block 518 .
- the processor 120 may determine the current room of the user, similar to block 512 . As explained above, this may be done using the RIR, or by determining which device 102 received the voice command in block 502 .
- the processor 120 may determine whether a companion device 240 by the parsed device name is associated with the current room. For example, if the short-cut name is “light,” and the user is currently in the den or the master bedroom, then the short-cut name will be recognized. However, if the user is in the garage and no device by the name of “light” exists within the garage, the device name will not be recognized. If the device name is associated with a companion device name within the current room, the process proceeds to block 516 . If not, the process 500 may proceed to block 522 .
- the processor 120 may apply the parsed action to the companion device 240 associated with the parsed device name in the parsed or identified room.
- the companion device may be the den light, which may be turned on.
- the processor 120 in response to determining that the user is not in the same room as the companion device associated with the parsed device name, may issue a room warning or message. This message may alert the user that a device by the parsed device name exists, but not within the room that the user is currently in. For example, the processor 120 may instruct the speaker 116 to emit “a device by that name is not located within this room, would you still like to control the device?”
- the processor 120 may determine whether the microphone 104 has received an affirmative user answer to the inquiry transmitted at block 518 . For example, the user may respond “yes” to the question of whether he or she would like to control a device not within the same room as the user. If so, the process proceeds to block 516 . If an affirmative response is not received, for example if the user responds “no,” the process 500 may end.
- the processor 120 may instruct the speaker 116 to emit a device and room error message. For example, the speaker 116 may emit “I'm sorry, but I couldn't find a device by that name in this room.”
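Taken together, blocks 506 through 522 amount to the decision flow sketched below, reusing the lookup() helper from the library sketch above; the returned messages and the confirm() callback stand in for the speaker prompts and the user's verbal answer, and are paraphrased assumptions rather than the patent's exact prompts.

```python
# Sketch of the process 500 decision flow; reuses lookup() from the
# companion device library sketch. Messages are paraphrased assumptions.
def handle_command(device_name, action, user_room, confirm=lambda q: False):
    matches = lookup(device_name)                       # block 506: name recognized?
    if not matches:
        return f"couldn't find a device or group named {device_name}"  # block 510
    exact = [d for d in matches if d.long_name == device_name]
    if exact:                                           # block 508: unique long-name
        device = exact[0]
        if device.room_location == user_room:           # block 512 + same-room check
            return f"{device.long_name} -> {action}"    # block 516: apply action
        # blocks 518/520: warn, then proceed only on an affirmative answer
        if confirm("that device is not in this room, control it anyway?"):
            return f"{device.long_name} -> {action}"    # block 516
        return "cancelled"
    # short-cut name path: block 514 + in-room check
    in_room = [d for d in matches if d.room_location == user_room]
    if in_room:
        return f"{in_room[0].long_name} -> {action}"    # block 516
    return "couldn't find a device by that name in this room"  # block 522

print(handle_command("den light", "on", "den"))   # -> "den light -> on"
print(handle_command("light", "on", "kitchen"))   # -> "kitchen light -> on"
print(handle_command("light", "on", "garage"))    # -> room error message
```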
- the user voice command may provide a query of potential devices.
- the user command may be “what devices are in this room?”
- the processor 120 may determine the room location using RIR, express identification, or other mechanisms, and may scan the companion device library 132 to determine which devices are associated with the current room. Once determined, the processor 120 may instruct the speaker 116 to emit a list of companion devices 240 , as well as the short-cut names associated therewith, for example “the devices in this room are the den light, which can be referred to as light, and the den tv, which can be referred to as tv.”
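A one-line sketch of that library scan, again building on the library sketched earlier; the tuple output format is an assumption chosen for readability.

```python
# Sketch of the "what devices are in this room?" query: filter the
# companion device library by the current room location.
def devices_in_room(room):
    return [(d.long_name, d.short_cut_name) for d in library
            if d.room_location == room]

print(devices_in_room("den"))
# [('den light', 'light'), ('den tv', 'tv')]
```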
- the system may also deduce a room location.
- a guest bedroom may include three companion devices named “guest bedroom light,” “guest bedroom tv,” and “guest bedroom lamp.” Commands given using the long-names would identify the current room as the guest bedroom. After receiving several commands using the long-names, the processor 120 may instruct the speaker 116 to emit a prompt such as “it appears we are in the guest bedroom, is this correct?” A positive or affirmative response from the user would allow the system to add the “guest bedroom” as the room location for the devices in the database. Additionally or alternatively, during setup, as the companion devices are added to the database, the processor 120 may determine that all three devices are within the guest bedroom based on all three having “guest bedroom” in their long-names. Thus, the processor 120 may deduce the room location during initial setup and configuration.
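One simple way to realize this deduction is a common-prefix heuristic over the long-names, sketched below; treating the shared leading words as the room name is an assumption, since the description does not spell out the matching mechanism.

```python
# Sketch: deduce a room location from companion device long-names that share
# a leading room phrase. The prefix heuristic is an illustrative assumption.
import os

def deduce_room(long_names):
    """Return the words shared at the start of all long-names, if any."""
    # commonprefix is character-wise, so trim back to a whole-word boundary.
    prefix = os.path.commonprefix(long_names).rsplit(" ", 1)[0].strip()
    return prefix or None

names = ["guest bedroom light", "guest bedroom tv", "guest bedroom lamp"]
print(deduce_room(names))  # -> "guest bedroom"
print(deduce_room(["den light", "kitchen light"]))  # -> None (no shared room)
```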
- this system may be beneficial for use in the medical industry, specifically hospitals having multiple rooms. When patients, doctors, nurses, etc., move from room to room, each may issue commands such as “turn on the lights,” or “turn on the tv.” Without having to know the specific name of the device, the user issuing the commands may control the various companion devices. This is due to the processor 120 determining the user's location and controlling the companion devices based on that location. This may also increase security and avoid undue interruptions by only permitting users to control the devices within the same room as the user.
- While the systems and methods above are described as being performed by the processor 120 of a personal assistant device 102 , the processes may be carried out by another device, or within a cloud computing system.
- the processor 120 may not necessarily be located within the room with a companion device, and may be remote from the home in general.
- companion devices that may be controlled via virtual assistant devices may be easily commanded by users not familiar with the specific device long-names associated with the companion devices.
- Short-cut names such as “lights” may be enough to control lights in near proximity to the user, e.g., in the same room as the user.
- the personal assistant device may react to user commands to efficiently, easily, and accurately control companion devices.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Automation & Control Theory (AREA)
- General Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Selective Calling Equipment (AREA)
- User Interface Of Digital Computer (AREA)
- Telephonic Communication Services (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/807,055 US10748533B2 (en) | 2017-11-08 | 2017-11-08 | Proximity aware voice agent |
KR1020180133024A KR20190052618A (en) | 2017-11-08 | 2018-11-01 | Proximity aware voice agent |
CN201811312589.3A CN109754795B (en) | 2017-11-08 | 2018-11-06 | Proximity-aware voice agents |
EP18205168.0A EP3483722B1 (en) | 2017-11-08 | 2018-11-08 | Proximity aware voice agent |
US16/932,904 US11721337B2 (en) | 2017-11-08 | 2020-07-20 | Proximity aware voice agent |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/807,055 US10748533B2 (en) | 2017-11-08 | 2017-11-08 | Proximity aware voice agent |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/932,904 Continuation US11721337B2 (en) | 2017-11-08 | 2020-07-20 | Proximity aware voice agent |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190139542A1 (en) | 2019-05-09 |
US10748533B2 (en) | 2020-08-18 |
Family
ID=64267664
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/807,055 Active 2038-06-07 US10748533B2 (en) | 2017-11-08 | 2017-11-08 | Proximity aware voice agent |
US16/932,904 Active 2039-02-14 US11721337B2 (en) | 2017-11-08 | 2020-07-20 | Proximity aware voice agent |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/932,904 Active 2039-02-14 US11721337B2 (en) | 2017-11-08 | 2020-07-20 | Proximity aware voice agent |
Country Status (4)
Country | Link |
---|---|
US (2) | US10748533B2 (en) |
EP (1) | EP3483722B1 (en) |
KR (1) | KR20190052618A (en) |
CN (1) | CN109754795B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200349945A1 (en) * | 2017-11-08 | 2020-11-05 | Harman International Industries, Incorporated | Proximity aware voice agent |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI689865B (en) * | 2017-04-28 | 2020-04-01 | 塞席爾商元鼎音訊股份有限公司 | Smart voice system, method of adjusting output voice and computre readable memory medium |
US11275553B2 (en) * | 2018-12-07 | 2022-03-15 | Google Llc | Conditionally assigning various automated assistant function(s) to interaction with a peripheral assistant control device |
EP4055998B1 (en) * | 2019-11-04 | 2024-03-06 | Signify Holding B.V. | Defining one or more groups in a configurable system based on device name similarity |
WO2023080341A1 (en) * | 2021-11-02 | 2023-05-11 | Samsung Electronics Co., Ltd. | Dynamic positioning of ai speaker in an iot ecosystem |
CN114093365A (en) * | 2021-11-11 | 2022-02-25 | 四川虹美智能科技有限公司 | Method, server, terminal and system for updating corpus in real time |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060247913A1 (en) * | 2005-04-29 | 2006-11-02 | International Business Machines Corporation | Method, apparatus, and computer program product for one-step correction of voice interaction |
US20120163603A1 (en) * | 2009-09-14 | 2012-06-28 | Sony Corporation | Server and method, non-transitory computer readable storage medium, and mobile client terminal and method |
US20130024018A1 (en) * | 2011-07-22 | 2013-01-24 | Htc Corporation | Multimedia control method and multimedia control system |
US8453058B1 (en) * | 2012-02-20 | 2013-05-28 | Google Inc. | Crowd-sourced audio shortcuts |
US20130179173A1 (en) * | 2012-01-11 | 2013-07-11 | Samsung Electronics Co., Ltd. | Method and apparatus for executing a user function using voice recognition |
US20140064501A1 (en) * | 2012-08-29 | 2014-03-06 | Bang & Olufsen A/S | Method and a system of providing information to a user |
US20140244013A1 (en) * | 2013-02-26 | 2014-08-28 | Sonos, Inc. | Pre-caching of Audio Content |
US20150010169A1 (en) * | 2012-01-30 | 2015-01-08 | Echostar Ukraine Llc | Apparatus, systems and methods for adjusting output audio volume based on user location |
US20150104037A1 (en) * | 2013-10-10 | 2015-04-16 | Samsung Electronics Co., Ltd. | Audio system, method of outputting audio, and speaker apparatus |
US20150189438A1 (en) * | 2014-01-02 | 2015-07-02 | Harman International Industries, Incorporated | Context-Based Audio Tuning |
US20150222987A1 (en) * | 2014-02-06 | 2015-08-06 | Sol Republic Inc. | Methods for operating audio speaker systems |
WO2015183401A1 (en) | 2014-05-30 | 2015-12-03 | Apple Inc. | Intelligent assistant for home automation |
US9484030B1 (en) * | 2015-12-02 | 2016-11-01 | Amazon Technologies, Inc. | Audio triggered commands |
US20160353218A1 (en) * | 2015-05-29 | 2016-12-01 | Sound United, LLC | System and method for providing user location-based multi-zone media |
US20170242653A1 (en) * | 2016-02-22 | 2017-08-24 | Sonos, Inc. | Voice Control of a Media Playback System |
EP3483722A1 (en) | 2017-11-08 | 2019-05-15 | Harman International Industries, Incorporated | Proximity aware voice agent |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8126172B2 (en) | 2007-12-06 | 2012-02-28 | Harman International Industries, Incorporated | Spatial processing stereo system |
US9031268B2 (en) * | 2011-05-09 | 2015-05-12 | Dts, Inc. | Room characterization and correction for multi-channel audio |
US9332363B2 (en) | 2011-12-30 | 2016-05-03 | The Nielsen Company (Us), Llc | System and method for determining meter presence utilizing ambient fingerprints |
EP3691179A1 (en) * | 2012-12-18 | 2020-08-05 | Samsung Electronics Co., Ltd. | Method and device for controlling home device remotely in home network system |
CN105308679A (en) * | 2013-05-28 | 2016-02-03 | 汤姆逊许可公司 | Method and system for identifying location associated with voice command to control home appliance |
US9698999B2 (en) * | 2013-12-02 | 2017-07-04 | Amazon Technologies, Inc. | Natural language control of secondary device |
US20170069309A1 (en) * | 2015-09-03 | 2017-03-09 | Google Inc. | Enhanced speech endpointing |
KR102366617B1 (en) * | 2017-03-28 | 2022-02-23 | 삼성전자주식회사 | Method for operating speech recognition service and electronic device supporting the same |
- 2017-11-08: US application US15/807,055 filed (granted as US10748533B2, active)
- 2018-11-01: KR application KR1020180133024 filed (published as KR20190052618A, ceased)
- 2018-11-06: CN application CN201811312589.3A filed (granted as CN109754795B, active)
- 2018-11-08: EP application EP18205168.0A filed (granted as EP3483722B1, active)
- 2020-07-20: US continuation US16/932,904 filed (granted as US11721337B2, active)
Patent Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060247913A1 (en) * | 2005-04-29 | 2006-11-02 | International Business Machines Corporation | Method, apparatus, and computer program product for one-step correction of voice interaction |
US20120163603A1 (en) * | 2009-09-14 | 2012-06-28 | Sony Corporation | Server and method, non-transitory computer readable storage medium, and mobile client terminal and method |
US20130024018A1 (en) * | 2011-07-22 | 2013-01-24 | Htc Corporation | Multimedia control method and multimedia control system |
US20130179173A1 (en) * | 2012-01-11 | 2013-07-11 | Samsung Electronics Co., Ltd. | Method and apparatus for executing a user function using voice recognition |
US20150010169A1 (en) * | 2012-01-30 | 2015-01-08 | Echostar Ukraine Llc | Apparatus, systems and methods for adjusting output audio volume based on user location |
US8453058B1 (en) * | 2012-02-20 | 2013-05-28 | Google Inc. | Crowd-sourced audio shortcuts |
US20140064501A1 (en) * | 2012-08-29 | 2014-03-06 | Bang & Olufsen A/S | Method and a system of providing information to a user |
US20140244013A1 (en) * | 2013-02-26 | 2014-08-28 | Sonos, Inc. | Pre-caching of Audio Content |
US20150104037A1 (en) * | 2013-10-10 | 2015-04-16 | Samsung Electronics Co., Ltd. | Audio system, method of outputting audio, and speaker apparatus |
US20150189438A1 (en) * | 2014-01-02 | 2015-07-02 | Harman International Industries, Incorporated | Context-Based Audio Tuning |
US20150222987A1 (en) * | 2014-02-06 | 2015-08-06 | Sol Republic Inc. | Methods for operating audio speaker systems |
WO2015183401A1 (en) | 2014-05-30 | 2015-12-03 | Apple Inc. | Intelligent assistant for home automation |
US20160353218A1 (en) * | 2015-05-29 | 2016-12-01 | Sound United, LLC | System and method for providing user location-based multi-zone media |
US9484030B1 (en) * | 2015-12-02 | 2016-11-01 | Amazon Technologies, Inc. | Audio triggered commands |
US20170242653A1 (en) * | 2016-02-22 | 2017-08-24 | Sonos, Inc. | Voice Control of a Media Playback System |
US9826306B2 (en) * | 2016-02-22 | 2017-11-21 | Sonos, Inc. | Default playback device designation |
EP3483722A1 (en) | 2017-11-08 | 2019-05-15 | Harman International Industries, Incorporated | Proximity aware voice agent |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200349945A1 (en) * | 2017-11-08 | 2020-11-05 | Harman International Industries, Incorporated | Proximity aware voice agent |
US11721337B2 (en) * | 2017-11-08 | 2023-08-08 | Harman International Industries, Incorporated | Proximity aware voice agent |
Also Published As
Publication number | Publication date |
---|---|
KR20190052618A (en) | 2019-05-16 |
US20190139542A1 (en) | 2019-05-09 |
CN109754795B (en) | 2024-08-06 |
CN109754795A (en) | 2019-05-14 |
US11721337B2 (en) | 2023-08-08 |
EP3483722A1 (en) | 2019-05-15 |
EP3483722B1 (en) | 2022-07-06 |
US20200349945A1 (en) | 2020-11-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED, CONNECTICUT Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:GUNTHER, CRAIG;REEL/FRAME:044074/0696 Effective date: 20171030 |
|
FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |