US20190304443A1 - Electronic message transmission - Google Patents
- Publication number
- US20190304443A1 (application US 15/941,395)
- Authority
- US
- United States
- Prior art keywords
- electronic message
- trigger
- trigger phrase
- audio segment
- devices
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/167—Audio in a user interface, e.g. using voice commands for navigating, audio feedback
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. Transmission Power Control [TPC] or power classes
- H04W52/02—Power saving arrangements
- H04W52/0209—Power saving arrangements in terminal devices
- H04W52/0225—Power saving arrangements in terminal devices using monitoring of external events, e.g. the presence of a signal
- H04W52/0229—Power saving arrangements in terminal devices using monitoring of external events, e.g. the presence of a signal where the received signal is a wanted signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W84/00—Network topologies
- H04W84/02—Hierarchically pre-organised networks, e.g. paging networks, cellular networks, WLAN [Wireless Local Area Network] or WLL [Wireless Local Loop]
- H04W84/10—Small scale networks; Flat hierarchical networks
- H04W84/12—WLAN [Wireless Local Area Networks]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Definitions
- Many devices, such as smartphones, tablets, voice command devices and/or other types of virtual assistant devices, may allow a user to provide a command (e.g., using a conversational interface) and/or perform an action based upon the command.
- a device may rely on the user to say a trigger phrase in order to receive the command.
- the device may continuously (e.g., constantly) monitor audio (e.g., of an area around the device) using one or more microphones.
- the (e.g., continuous) monitoring may deplete a power source (e.g., a battery) of the device and/or the (e.g., continuous) monitoring of the device (e.g., which may be connected to the internet) may compromise privacy of the user.
- audio received via a microphone of a first device may be monitored. Responsive to detecting a first trigger phrase in a first audio segment identified during the monitoring, a first electronic message comprising instructions to activate a microphone function of a second device may be generated and the first electronic message may be transmitted to the second device. Responsive to detecting a second trigger phrase in a second audio segment identified during the monitoring, a second electronic message comprising instructions to activate a microphone function of a third device may be generated and the second electronic message may be transmitted to the third device.
- audio received via a microphone of a first device may be monitored. Responsive to detecting a trigger phrase in a first audio segment identified during the monitoring and a command in a second audio segment identified during the monitoring, an electronic message comprising instructions to perform an action associated with the command may be generated and the electronic message may be transmitted to a second device.
- audio received via a microphone of a first device may be monitored. Responsive to detecting a trigger phrase in a first audio segment identified during the monitoring, a first electronic message comprising instructions to activate an input function of a second device may be generated and the first electronic message may be transmitted to the second device.
- FIG. 1 is an illustration of a scenario involving various examples of networks that may connect servers and clients.
- FIG. 2 is an illustration of a scenario involving an example configuration of a server that may utilize and/or implement at least a portion of the techniques presented herein.
- FIG. 3 is an illustration of a scenario involving an example configuration of a client that may utilize and/or implement at least a portion of the techniques presented herein.
- FIG. 4A is a flow chart illustrating an example method for detecting trigger phrases and transmitting electronic messages to devices.
- FIG. 4B is a flow chart illustrating an example method for detecting trigger phrases, detecting commands and transmitting electronic messages to devices.
- FIG. 5A is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a second trigger phrase in a first audio segment is detected.
- FIG. 5B is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a first audio segment is compared to a first trigger phrase and/or a second trigger phrase.
- FIG. 5C is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a first electronic message is transmitted to a third device.
- FIG. 6A is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first trigger phrase in a first audio segment and a first command in a second audio segment are detected.
- FIG. 6B is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first audio segment is compared to a second trigger phrase and/or a second audio segment is transcribed to generate a text transcription.
- FIG. 6C is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first electronic message is transmitted to a second device.
- FIG. 7 is an illustration of a scenario featuring an example non-transitory machine readable medium in accordance with one or more of the provisions set forth herein.
- FIG. 1 is an interaction diagram of a scenario 100 illustrating a service 102 provided by a set of servers 104 to a set of client devices 110 via various types of networks.
- the servers 104 and/or client devices 110 may be capable of transmitting, receiving, processing, and/or storing many types of signals, such as in memory as physical memory states.
- the servers 104 of the service 102 may be internally connected via a local area network 106 (LAN), such as a wired network where network adapters on the respective servers 104 are interconnected via cables (e.g., coaxial and/or fiber optic cabling), and may be connected in various topologies (e.g., buses, token rings, meshes, and/or trees).
- the servers 104 may be interconnected directly, or through one or more other networking devices, such as routers, switches, and/or repeaters.
- the servers 104 may utilize a variety of physical networking protocols (e.g., Ethernet and/or Fibre Channel) and/or logical networking protocols (e.g., variants of an Internet Protocol (IP), a Transmission Control Protocol (TCP), and/or a User Datagram Protocol (UDP)).
- the local area network 106 may include, e.g., analog telephone lines, such as a twisted wire pair, a coaxial cable, full or fractional digital lines including T1, T2, T3, or T4 type lines, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, or other communication links or channels, such as may be known to those skilled in the art.
- the local area network 106 may be organized according to one or more network architectures, such as server/client, peer-to-peer, and/or mesh architectures, and/or a variety of roles, such as administrative servers, authentication servers, security monitor servers, data stores for objects such as files and databases, business logic servers, time synchronization servers, and/or front-end servers providing a user-facing interface for the service 102 .
- the local area network 106 may comprise one or more sub-networks, such as may employ differing architectures, may be compliant or compatible with differing protocols and/or may interoperate within the local area network 106 . Additionally, a variety of local area networks 106 may be interconnected; e.g., a router may provide a link between otherwise separate and independent local area networks 106 .
- the local area network 106 of the service 102 is connected to a wide area network 108 (WAN) that allows the service 102 to exchange data with other services 102 and/or client devices 110 .
- the wide area network 108 may encompass various combinations of devices with varying levels of distribution and exposure, such as a public wide-area network (e.g., the Internet) and/or a private network (e.g., a virtual private network (VPN) of a distributed enterprise).
- the service 102 may be accessed via the wide area network 108 by a user 112 of one or more client devices 110 , such as a portable media player (e.g., an electronic text reader, an audio device, or a portable gaming, exercise, or navigation device); a portable communication device (e.g., a camera, a phone, a wearable or a text chatting device); a workstation; and/or a laptop form factor computer.
- client devices 110 may communicate with the service 102 via various connections to the wide area network 108 .
- one or more client devices 110 may comprise a cellular communicator and may communicate with the service 102 by connecting to the wide area network 108 via a wireless local area network 106 provided by a cellular provider.
- one or more client devices 110 may communicate with the service 102 by connecting to the wide area network 108 via a wireless local area network 106 provided by a location such as the user's home or workplace (e.g., a WiFi (Institute of Electrical and Electronics Engineers (IEEE) Standard 802.11) network or a Bluetooth (IEEE Standard 802.15.1) personal area network).
- the servers 104 and the client devices 110 may communicate over various types of networks.
- Other types of networks that may be accessed by the servers 104 and/or client devices 110 include mass storage, such as network attached storage (NAS), a storage area network (SAN), or other forms of computer or machine readable media.
- FIG. 2 presents a schematic architecture diagram 200 of a server 104 that may utilize at least a portion of the techniques provided herein.
- a server 104 may vary widely in configuration or capabilities, alone or in conjunction with other servers, in order to provide a service such as the service 102 .
- the server 104 may comprise one or more processors 210 that process instructions.
- the one or more processors 210 may optionally include a plurality of cores; one or more coprocessors, such as a mathematics coprocessor or an integrated graphical processing unit (GPU); and/or one or more layers of local cache memory.
- the server 104 may comprise memory 202 storing various forms of applications, such as an operating system 204 ; one or more server applications 206 , such as a hypertext transport protocol (HTTP) server, a file transfer protocol (FTP) server, or a simple mail transport protocol (SMTP) server; and/or various forms of data, such as a database 208 or a file system.
- the server 104 may comprise a variety of peripheral components, such as a wired and/or wireless network adapter 214 connectible to a local area network and/or wide area network; one or more storage components 216 , such as a hard disk drive, a solid-state storage device (SSD), a flash memory device, and/or a magnetic and/or optical disk reader.
- the server 104 may comprise a mainboard featuring one or more communication buses 212 that interconnect the processor 210 , the memory 202 , and various peripherals, using a variety of bus technologies, such as a variant of a serial or parallel AT Attachment (ATA) bus protocol; a Universal Serial Bus (USB) protocol; and/or a Small Computer System Interface (SCSI) bus protocol.
- a communication bus 212 may interconnect the server 104 with at least one other server.
- Other components that may optionally be included with the server 104 (though not shown in the schematic diagram 200 of FIG. 2 ) include a display adapter, such as a graphical processing unit (GPU); input peripherals, such as a keyboard and/or mouse; and a flash memory device that may store a basic input/output system (BIOS) routine that facilitates booting the server 104 to a state of readiness.
- the server 104 may operate in various physical enclosures, such as a desktop or tower, and/or may be integrated with a display as an “all-in-one” device.
- the server 104 may be mounted horizontally and/or in a cabinet or rack, and/or may simply comprise an interconnected set of components.
- the server 104 may comprise a dedicated and/or shared power supply 218 that supplies and/or regulates power for the other components.
- the server 104 may provide power to and/or receive power from another server and/or other devices.
- the server 104 may comprise a shared and/or dedicated climate control unit 220 that regulates climate properties, such as temperature, humidity, and/or airflow. Many such servers 104 may be configured and/or adapted to utilize at least a portion of the techniques presented herein.
- FIG. 3 presents a schematic architecture diagram 300 of a client device 110 whereupon at least a portion of the techniques presented herein may be implemented.
- a client device 110 may vary widely in configuration or capabilities in order to provide a variety of functionality to a user such as the user 112 .
- the client device 110 may be provided in a variety of form factors, such as a desktop or tower workstation; an “all-in-one” device integrated with a display 308 ; a laptop, tablet, convertible tablet, or palmtop device; a wearable device mountable in a headset, eyeglass, earpiece, and/or wristwatch, and/or integrated with an article of clothing; and/or a component of a piece of furniture, such as a tabletop, and/or of another device, such as a vehicle or residence.
- the client device 110 may serve the user in a variety of roles, such as a workstation, kiosk, media player, gaming device, and/or appliance.
- the client device 110 may comprise one or more processors 310 that process instructions.
- the one or more processors 310 may optionally include a plurality of cores; one or more coprocessors, such as a mathematics coprocessor or an integrated graphical processing unit (GPU); and/or one or more layers of local cache memory.
- the client device 110 may comprise memory 301 storing various forms of applications, such as an operating system 303 ; one or more user applications 302 , such as document applications, media applications, file and/or data access applications, communication applications such as web browsers and/or email clients, utilities, and/or games; and/or drivers for various peripherals.
- the client device 110 may comprise a variety of peripheral components, such as a wired and/or wireless network adapter 306 connectible to a local area network and/or wide area network; one or more output components, such as a display 308 coupled with a display adapter (optionally including a graphical processing unit (GPU)), a sound adapter coupled with a speaker, and/or a printer; input devices for receiving input from the user, such as a keyboard 311 , a mouse, a microphone, a camera, and/or a touch-sensitive component of the display 308 ; and/or environmental sensors, such as a global positioning system (GPS) receiver 319 that detects the location, velocity, and/or acceleration of the client device 110 , a compass, accelerometer, and/or gyroscope that detects a physical orientation of the client device 110 .
- Other components that may optionally be included with the client device 110 include one or more storage components, such as a hard disk drive, a solid-state storage device (SSD), a flash memory device, and/or a magnetic and/or optical disk reader; and/or a flash memory device that may store a basic input/output system (BIOS) routine that facilitates booting the client device 110 to a state of readiness; and a climate control unit that regulates climate properties, such as temperature, humidity, and airflow.
- the client device 110 may comprise a mainboard featuring one or more communication buses 312 that interconnect the processor 310 , the memory 301 , and various peripherals, using a variety of bus technologies, such as a variant of a serial or parallel AT Attachment (ATA) bus protocol; the Universal Serial Bus (USB) protocol; and/or the Small Computer System Interface (SCSI) bus protocol.
- the client device 110 may comprise a dedicated and/or shared power supply 318 that supplies and/or regulates power for other components, and/or a battery 304 that stores power for use while the client device 110 is not connected to a power source via the power supply 318 .
- the client device 110 may provide power to and/or receive power from other client devices.
- descriptive content in the form of signals or stored physical states within memory may be identified.
- Descriptive content may be stored, typically along with contextual content.
- Contextual content may identify the source of a phone number (e.g., a communication received from another user via an instant messenger application).
- Contextual content may identify circumstances surrounding receipt of a phone number (e.g., the date or time that the phone number was received), and may be associated with descriptive content.
- Contextual content may, for example, be used to subsequently search for associated descriptive content. For example, a search for phone numbers received from specific individuals, received via an instant messenger application or at a given date or time, may be initiated.
- the client device 110 may include one or more servers that may locally serve the client device 110 and/or other client devices of the user 112 and/or other individuals.
- a locally installed webserver may provide web content in response to locally submitted web requests.
- Many such client devices 110 may be configured and/or adapted to utilize at least a portion of the techniques presented herein.
- One or more computing devices and/or techniques for detecting trigger phrases and transmitting electronic messages to devices are provided.
- many devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices and/or other types of virtual assistant devices) may allow a user to provide a command and/or perform an action based upon the command.
- Such a device may rely on detecting a trigger phrase (e.g., corresponding to a virtual assistant of the device) in order to activate the virtual assistant and/or receive the command.
- the device may continuously (e.g., constantly) monitor audio (e.g., of an area around the device) using a microphone (e.g., of the device).
- the (e.g., continuous) monitoring may deplete a power source (e.g., a battery) of the device and/or the (e.g., continuous) monitoring of the device may compromise privacy and/or security of the user.
- the device may be connected to the internet and/or one or more servers. Accordingly, the device may transmit one or more audio segments of the audio to the one or more servers (e.g., without knowledge and/or consent of the user). Alternatively and/or additionally, the audio may be accessed (e.g., and/or hacked) by entities via the internet.
- a second device may monitor audio received via a second microphone of the second device. Responsive to detecting a trigger phrase, the second device may generate an electronic message comprising instructions to activate a microphone function of the device (e.g., and/or the virtual assistant of the device). The second device may (e.g., then) transmit the electronic message to the device. Accordingly, the device may not rely on continuously monitoring audio using the microphone of the device to detect the trigger phrase and/or activate the virtual assistant.
- the second device may be a trusted device that the user may believe does not compromise privacy and/or security of the user. In some examples, the second device may not be connected (e.g., directly) to the internet.
- a user such as user Jill, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.).
- Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device.
- the first device may interact with a second device (e.g., and/or activate a microphone function of the second device) of the plurality of devices responsive to detecting a first trigger phrase corresponding to the second device (e.g., while monitoring audio received via a first microphone of the first device).
- the second device and/or the first trigger phrase may be (e.g., determined to be) associated with a first virtual assistant (e.g., and thus the first trigger phrase may be determined to correspond to the second device).
- the first virtual assistant may be installed onto the second device.
- the first device may interact with a third device (e.g., and/or activate a microphone function of the third device) of the plurality of devices responsive to detecting a second trigger phrase corresponding to the third device (e.g., while monitoring audio received via a first microphone of the first device).
- the third device and/or the second trigger phrase may be (e.g., determined to be) associated with a second virtual assistant (e.g., and thus the second trigger phrase may be determined to correspond to the third device).
- the second virtual assistant may be installed onto the third device.
- the first device may be selected by the user.
- the first device may be a trusted device that the user may believe does not compromise privacy and/or security of the user.
- the first device may be selected using a device of the plurality of devices.
- the second device may scan an environment such as a local area network, a personal area network, etc. The second device may then determine a second plurality of devices that are connected to the local area network, the personal area network, etc. The second plurality of devices may (e.g., then) be ranked based upon security.
- the second plurality of devices may be (e.g., further) ranked based upon a capability for monitoring audio and/or interacting with devices of the plurality of devices responsive to detecting trigger phrases corresponding to the devices.
- the first device may be ranked higher than a fourth device (e.g., of the second plurality of devices) because the first device may not be (e.g., continuously) connected to the internet and the fourth device may be (e.g., continuously) connected to the internet.
- the first device may be ranked higher than the fourth device because the first device is manufactured by a first manufacturer and the fourth device is manufactured by a second manufacturer, wherein the first manufacturer may be associated with a first security level (e.g., and/or a first trust level) and the second manufacturer may be associated with a second security level (e.g., and/or a second trust level).
- the first security level (e.g., and/or the first trust level) may be higher than the second security level (e.g., and/or the second trust level).
- the first device may be ranked highest (e.g., of the second plurality of devices) and/or selected automatically by the second device.
- the second device may present a ranked list of the second plurality of devices. The first device may be selected by the user from the ranked list of the second plurality of devices.
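- For illustration, a minimal sketch of such a security-based ranking is given below; the device attributes, scoring weights, and function names are assumptions for the sketch rather than anything specified by this disclosure.

```python
from dataclasses import dataclass

@dataclass
class CandidateDevice:
    name: str
    continuously_online: bool   # continuously connected to the internet
    manufacturer_trust: int     # assumed scale, e.g., 0 (unknown) to 10 (highly trusted)
    can_monitor_audio: bool     # capable of monitoring audio and messaging other devices

def security_score(device: CandidateDevice) -> int:
    """Higher scores indicate a better candidate for the first (monitoring) device."""
    score = device.manufacturer_trust
    if not device.continuously_online:
        score += 10             # devices not continuously connected to the internet rank higher
    if device.can_monitor_audio:
        score += 5              # capability-based component of the ranking
    return score

def rank_candidates(devices: list[CandidateDevice]) -> list[CandidateDevice]:
    """Return the second plurality of devices ranked from most to least suitable."""
    return sorted(devices, key=security_score, reverse=True)
```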
- the first device may comprise a dedicated power source (e.g., a battery) and/or may be connected to a power supply and/or an adaptor.
- the first device may comprise an appliance (e.g., a home appliance) and/or an electronic device (e.g., a consumer electronic device), such as a television, a video game console, a laptop, a desktop computer, a motor-vehicle computer, a smartphone, a tablet, an e-reader, etc.
- the first device may be configured and/or manufactured specifically for monitoring audio received via the first microphone and/or interacting with devices of the plurality of devices responsive to detecting trigger phrases corresponding to the devices.
- the first device may be configured to connect to the internet and/or access one or more servers (e.g., accessed by the first device via a network connection).
- the first device may have various security settings.
- a high security setting may require that the first device does not connect to (e.g., and/or access) the internet.
- a medium security setting may require that the first device connects to (e.g., and/or accesses) the internet (e.g., only) for voice recognition information.
- a low security setting may allow the first device to connect to the internet for voice recognition information and/or to receive and/or transmit (e.g., other types of) information using the internet.
- the security settings may be changed (e.g., by the user) by using a security interface.
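- One possible way to represent these security settings is sketched below; the setting names follow the description above, but the policy fields and helper function are purely illustrative assumptions.

```python
# Illustrative policy table for the high/medium/low security settings described above.
SECURITY_POLICIES = {
    "high":   {"internet": False, "voice_recognition_lookup": False, "other_traffic": False},
    "medium": {"internet": True,  "voice_recognition_lookup": True,  "other_traffic": False},
    "low":    {"internet": True,  "voice_recognition_lookup": True,  "other_traffic": True},
}

def may_use_internet(setting: str, purpose: str) -> bool:
    """Decide whether the first device may contact the internet for a given purpose."""
    policy = SECURITY_POLICIES[setting]
    if not policy["internet"]:
        return False                                  # high security: never connect
    if purpose == "voice_recognition":
        return policy["voice_recognition_lookup"]     # medium security: recognition lookups only
    return policy["other_traffic"]                    # low security: other traffic allowed
```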
- audio received via the first microphone of the first device may be monitored (e.g., by the first device).
- the first microphone may be (e.g., continuously) active and/or may receive audio of an area around the first microphone (e.g., and/or the first device).
- the first device may be configured to detect speech during the monitoring of the audio. Responsive to detecting speech, an audio segment comprising the speech may be extracted and/or analyzed and the audio segment may be compared to a plurality of trigger phrases associated with the plurality of devices and/or the plurality of virtual assistants. For example, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices and/or a virtual assistant of the plurality of virtual assistants.
- Responsive to determining that the audio segment does not comprise speech similar to (e.g., any trigger phrase of) the plurality of trigger phrases, the audio segment may be discarded (e.g., and/or the first device may continue monitoring audio received via the first microphone).
- information associated with the plurality of trigger phrases may be comprised within a voice recognition database.
- the voice recognition database may comprise voice recognition information corresponding to the plurality of trigger phrases.
- the voice recognition database may comprise a data structure of the plurality of trigger phrases, wherein each trigger phrase of the trigger phrases is linked to a set of voice recognition information of the trigger phrase.
- the voice recognition database may be stored in one or more servers accessed by the first device via a network connection. Alternatively and/or additionally, the voice recognition database may be stored in a memory device of the first device.
- voice recognition information may be extracted from the voice recognition database to compare the audio segment to the plurality of trigger phrases.
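- A simplified sketch of this monitoring loop follows; capture_segment, contains_speech, and similarity are assumed placeholders for the speech-detection and voice-recognition machinery, and the threshold value is illustrative.

```python
def monitor_for_triggers(microphone, trigger_db, threshold=0.85):
    """Yield (trigger_phrase, audio_segment) whenever a trigger phrase is detected.

    trigger_db maps each trigger phrase to its voice recognition information,
    as extracted from the voice recognition database.
    """
    while True:
        segment = capture_segment(microphone)          # short window of monitored audio
        if not contains_speech(segment):
            continue                                   # no speech detected: keep monitoring
        for phrase, recognition_info in trigger_db.items():
            if similarity(segment, recognition_info) >= threshold:
                yield phrase, segment                  # trigger phrase detected
                break
        # segments that match no trigger phrase are simply discarded
```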
- a first electronic message may be generated.
- the first electronic message may (e.g., then) be transmitted to the second device.
- the first electronic message may comprise instructions to activate a microphone function (e.g., and/or an input function) of the second device.
- the first electronic message may comprise a first push notification.
- the first audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the first audio segment comprises speech similar to the first trigger phrase. In an example, the first device may determine that the first audio segment comprises speech 89% similar to the first trigger phrase and/or the first audio segment comprises speech 12% similar to the second trigger phrase. In this example, the first trigger phrase may be identified based upon the determination that the first audio segment comprises speech 89% similar to the first trigger phrase.
- the first trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment and the first trigger phrase is above a trigger phrase threshold (e.g., 70%, 80%, 85%, 90%, etc.).
- the first device may stop comparing the first audio segment with trigger phrases of the plurality of trigger phrases.
- each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices.
- the first trigger phrase may be associated with the second device.
- the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon the first trigger phrase.
- each trigger phrase may be associated with a virtual assistant of the plurality of virtual assistants. Accordingly, the first virtual assistant may be selected from amongst the plurality of virtual assistants based upon the first trigger phrase. Further, the second device may be selected from amongst the plurality of devices based upon the first virtual assistant.
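- The selection and message generation might look like the sketch below; the phrase strings follow the example of FIGS. 5A-5C, while the device identifiers and message fields are hypothetical.

```python
# Illustrative mapping from trigger phrases to target devices / virtual assistants.
TRIGGER_TO_DEVICE = {
    "hello alpha": "second_device",   # first virtual assistant
    "hey beta":    "third_device",    # second virtual assistant
}

def build_activation_message(trigger_phrase: str) -> dict:
    """Build an electronic message instructing the target device to activate input functions."""
    return {
        "target": TRIGGER_TO_DEVICE[trigger_phrase.lower()],
        "type": "activate",
        "functions": ["microphone", "speaker", "screen"],  # functions to activate on the target
        "detected_trigger": trigger_phrase,
    }
```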
- the first electronic message may (e.g., further) comprise instructions to activate a speaker function of the second device and/or a screen function of the second device. In some examples, the first electronic message may (e.g., merely) comprise an indication that the first trigger phrase was detected.
- the second device may interpret the indication that the first trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone function of the second device, instructions (e.g., and/or a trigger) to activate the input function of the second device (e.g., such as a camera function), instructions (e.g., and/or a trigger) to activate the speaker function of the second device and/or instructions (e.g., and/or a trigger) to activate the screen function of the second device.
- the second device may prompt the user to provide a first command to the second device.
- the second device may activate (e.g., turn on) a screen of the second device, display a first graphical object using the screen of the second device, output a first audio message and/or a first sound effect using a speaker of the second device, activate a first camera of the second device (e.g., in order to receive the first command) and/or activate a second microphone of the second device (e.g., in order to receive the first command).
- the first graphical object, the first audio message and/or the first sound effect may indicate to the user that the second device is ready to receive the first command via the first camera and/or the second microphone.
- the first graphical object (e.g., displayed on the screen of the second device) may comprise “Listening for your command” and/or the first audio message (e.g., outputted using the speaker of the second device) may comprise “Please state your command”.
- the second device may (e.g., then) receive the first command using the second microphone, using the first camera of the second device and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the second device.
- the second device may (e.g., then) perform a first action based upon the first command.
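- On the receiving side, the second device's handling of such a message could be sketched as follows; the device methods (activate_screen, show, say, listen, perform) are assumptions standing in for whatever APIs the device actually exposes.

```python
def handle_activation_message(message: dict, device) -> None:
    """Run on the second device when an activation message arrives from the first device."""
    functions = message.get("functions", ["microphone"])
    if "screen" in functions:
        device.activate_screen()
        device.show("Listening for your command")      # first graphical object
    if "speaker" in functions:
        device.say("Please state your command")        # first audio message
    device.activate_microphone()
    command = device.listen()                          # or camera / on-screen keyboard input
    device.perform(command)                            # first action based upon the first command
```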
- a second electronic message, associated with the third device, may be generated.
- the second electronic message may (e.g., then) be transmitted to the third device.
- the second electronic message may comprise instructions to activate a microphone function (e.g., and/or an input function) of the third device.
- the second electronic message may comprise a second push notification.
- the first device may determine that the second audio segment comprises speech similar to the second trigger phrase.
- the first device may determine that the second audio segment comprises speech 93% similar to the second trigger phrase and/or the second audio segment comprises speech 13% similar to the first trigger phrase.
- the second trigger phrase may be identified based upon the determination that the second audio segment comprises speech 93% similar to the second trigger phrase.
- the second trigger phrase may be identified responsive to determining that a similarity of at least a portion of the second audio segment and the second trigger phrase is above the trigger phrase threshold.
- the first device may stop comparing the second audio segment with trigger phrases of the plurality of trigger phrases.
- the second virtual assistant and/or the third device may be selected based upon the second trigger phrase.
- the second electronic message may (e.g., further) comprise instructions to activate a speaker function of the third device and/or a screen function of the third device. In some examples, the second electronic message may (e.g., merely) comprise an indication that the second trigger phrase was detected.
- the third device may interpret the indication that the second trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone function of the third device, instructions (e.g., and/or a trigger) to activate the input function of the third device (e.g., such as a camera function), instructions (e.g., and/or a trigger) to activate the speaker function of the third device and/or instructions (e.g., and/or a trigger) to activate the screen function of the third device.
- the third device may prompt the user to provide a second command to the third device.
- the third device may activate (e.g., turn on) a screen of the third device, display a second graphical object using the screen of the third device, output a second audio message and/or a second sound effect using a speaker of the third device, activate a second camera of the third device (e.g., in order to receive the second command) and/or activate a third microphone of the third device (e.g., in order to receive the second command).
- the second graphical object, the second audio message and/or the second sound effect may indicate to the user that the third device is ready to receive the second command via the second camera and/or the third microphone.
- the third device may (e.g., then) receive the second command using the third microphone, using the second camera and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the third device.
- the third device may (e.g., then) perform a second action based upon the second command.
- the first device, the second device and/or the third device may be connected to the local area network (e.g., via Ethernet, Wi-Fi, etc.). Accordingly, the first device may transmit the first electronic message to the second device via the local area network. Alternatively and/or additionally, the first device may transmit the second electronic message to the third device via the local area network. Alternatively and/or additionally, the first device may be connected to the second device using a first wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the second device using Bluetooth. Accordingly, the first device may transmit the first electronic message to the second device using the first wireless connection.
- the first device may be connected to the third device using a second wireless connection (e.g., a Bluetooth connection, a personal area network, etc.).
- the first device may be paired to the third device using Bluetooth. Accordingly, the first device may transmit the second electronic message to the third device using the second wireless connection.
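- A small sketch of choosing between these transports is given below; the send helpers stand in for whatever local-network or Bluetooth APIs the devices provide and are not part of this disclosure.

```python
def transmit_message(message: dict, target) -> None:
    """Send an electronic message to a target device over an available connection."""
    if target.on_local_area_network:
        send_over_lan(target.lan_address, message)          # e.g., Ethernet or Wi-Fi
    elif target.paired_over_bluetooth:
        send_over_bluetooth(target.bt_address, message)     # personal area network
    else:
        raise ConnectionError(f"no route to {target!r}")
```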
- a user such as user Jack, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.).
- Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device.
- the first device may interact with a second device of the plurality of devices responsive to detecting a first trigger phrase and/or a first command corresponding to the second device (e.g., while monitoring audio received via a first microphone of the first device).
- the second device and/or the first trigger phrase may be associated with a first virtual assistant.
- the first virtual assistant may be installed onto the second device.
- the first device may be selected by the user.
- the first device may be a trusted device that the user may believe does not compromise privacy of the user.
- the first device may comprise a dedicated power source (e.g., a battery) and/or may be connected to a power supply and/or an adaptor.
- the first device may be selected automatically using a device of the plurality of devices.
- the first device may be configured to connect to the internet and/or access one or more servers for voice recognition information.
- the first device may have various security settings.
- a high security setting may require that the first device does not connect to (e.g., and/or access) the internet.
- a medium security setting may require that the first device connects to (e.g., and/or accesses) the internet (e.g., only) for voice recognition information.
- a low security setting may allow the first device to connect to the internet for voice recognition information and/or to receive and/or transmit (e.g., other types of) information using the internet.
- audio received via the first microphone of the first device may be monitored (e.g., by the first device).
- the first microphone may be (e.g., continuously) active and/or may receive audio of an area around the first microphone (e.g., and/or the first device).
- the first device may be configured to detect speech during the monitoring of the audio. Responsive to detecting speech, an audio segment comprising the speech may be analyzed and/or extracted and the audio segment may be compared to a plurality of trigger phrases associated with the plurality of devices and/or the plurality of virtual assistants. For example, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices and/or a virtual assistant of the plurality of virtual assistants.
- Responsive to determining that the audio segment does not comprise speech similar to (e.g., any trigger phrase of) the plurality of trigger phrases, the audio segment may be discarded (e.g., and/or the first device may continue monitoring audio received via the first microphone).
- a first electronic message may be generated.
- the first electronic message may be transmitted to the second device.
- the first electronic message may comprise instructions to perform an action associated with the first command.
- the first electronic message may comprise a first push notification.
- at least a portion of the first audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the first audio segment comprises speech similar to the first trigger phrase.
- the first device may prompt the user to provide the first command to the first device.
- the first device may display a first graphical object using a screen of the first device and/or output a first audio message and/or a first sound effect using a speaker of the first device.
- the first device may detect the first trigger phrase and the first command within a threshold period of time and/or with less than a threshold number of words between the first trigger phrase and the first command (e.g., consecutively) and/or may not prompt the user to provide the first command to the first device.
- the first device may (e.g., continue) monitoring audio received via the first microphone to detect the first command in the second audio segment.
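- The decision to skip the prompt when the trigger phrase and the command are spoken consecutively might be expressed as below; the threshold values are illustrative assumptions.

```python
MAX_GAP_SECONDS = 3.0        # assumed threshold period of time
MAX_WORDS_BETWEEN = 2        # assumed threshold number of intervening words

def spoken_consecutively(trigger_end: float, command_start: float, words_between: int) -> bool:
    """True if the command followed the trigger phrase closely enough to skip the prompt."""
    return (command_start - trigger_end) <= MAX_GAP_SECONDS and words_between < MAX_WORDS_BETWEEN
```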
- the second audio segment may be transcribed to generate a text transcription (e.g., of the second audio segment).
- the second audio segment may be transcribed using voice recognition information.
- the voice recognition information may be comprised within a voice recognition database.
- the voice recognition database may be stored in one or more servers accessed by the first device via a network connection. Alternatively and/or additionally, the voice recognition database may be stored in a memory device of the first device.
- voice recognition information may be extracted from the voice recognition database to transcribe the second audio segment.
- the voice recognition database may be exclusive (e.g., generated and/or maintained exclusively) for the user.
- the voice recognition database may be dynamically updated to add new voice recognition information (e.g., based upon analyzing speech of the user, by receiving the new voice recognition information from one or more servers accessed by the first device via a network connection, by receiving the new voice recognition information from the second device, etc.).
- the first command may be determined by the first device based upon an evaluation of the text transcription.
- the instructions to perform the action may comprise the text transcription (e.g., of the second audio segment).
- the second audio segment may comprise the user saying, “Give me directions to Brooklyn Park”. Accordingly, the second audio segment may be transcribed to generate the text transcription comprising, “Give me directions to Brooklyn Park”.
- the first electronic message (e.g., and/or the instructions to perform the action) may comprise (e.g., a representation of) the text transcription.
- the second audio segment may not be transcribed (e.g., to generate the text transcription) by the first device.
- the first device may generate an audio file based upon the second audio segment.
- the first electronic message (e.g., and/or the instructions to perform the action) may comprise the audio file rather than the text transcription.
- the second device may transcribe the audio file to generate the text transcription.
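- Packaging the command either as a text transcription or as a raw audio file could be sketched as follows; transcribe and encode_audio are assumed helpers for the voice recognition and audio encoding steps.

```python
def build_command_message(command_segment, transcribe_locally: bool) -> dict:
    """Build the electronic message carrying the detected command."""
    message = {"type": "perform_action"}
    if transcribe_locally:
        # the first device transcribes, e.g., "Give me directions to Brooklyn Park"
        message["text"] = transcribe(command_segment)
    else:
        # forward the audio file so the receiving device can transcribe it itself
        message["audio"] = encode_audio(command_segment)
    return message
```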
- each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices.
- the first trigger phrase may be associated with the second device.
- the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon the first trigger phrase.
- the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon an analysis of the first command.
- the first device may analyze the first command, the second audio segment, the text transcription and/or the audio file to determine a context, a subject matter, etc. of the first command.
- the second device may (e.g., then) be selected based upon a determination that the second device is (e.g., best) fit for performing the action responsive to receiving the first electronic message.
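- One way to approximate this context-based selection is sketched below; the capability keywords per device are invented for illustration, loosely following the directions and text-message examples in this description.

```python
# Hypothetical capability keywords used to judge which device best fits a command.
DEVICE_CAPABILITIES = {
    "second_device": {"directions", "map", "navigate", "call"},
    "third_device":  {"text", "messages", "email", "music"},
}

def select_target_device(command_text: str) -> str:
    """Pick the device whose capabilities best overlap the command's subject matter."""
    words = set(command_text.lower().split())
    return max(DEVICE_CAPABILITIES, key=lambda d: len(words & DEVICE_CAPABILITIES[d]))
```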
- the second device may perform the action. For example, the second device may activate a screen of the second device, display a second graphical object associated with the action using the screen of the second device and/or output a second audio message associated with the action using a speaker of the second device.
- the second device may display the second graphical object comprising a map interface and/or GPS directions corresponding to the first electronic message and/or the second device may output the second audio message corresponding to the first electronic message using the speaker of the second device, comprising “Here are directions to Brooklyn Park”.
- the second device may output an inquiry associated with the action using the speaker of the second device and/or the screen of the second device.
- the second device may output an inquiry using the speaker of the second device comprising “Did you mean Brooklyn Park in New York City or Brooklyn Park in Kansas City?”.
- the second device may transmit a second electronic message to the first device comprising instructions to monitor audio received via the first microphone of the first device to detect a second command, associated with the first command.
- the second device may not transmit the second electronic message to the first device.
- the first trigger phrase may be detected by the first device (e.g., after the inquiry is outputted by the speaker of the second device).
- the first device may monitor audio received via the first microphone of the first device to detect the second command.
- the first device may generate a third electronic message comprising instructions to perform a second action associated with the second command.
- the third electronic message may comprise a third push notification.
- the third audio segment may be transcribed to generate a second text transcription (e.g., of the third audio segment).
- the third electronic message (e.g., and/or the instructions to perform the second action) may comprise the second text transcription.
- the third audio segment may comprise the user saying, “New York City”. Accordingly, the third audio segment may be transcribed to generate the second text transcription comprising, “New York City”.
- the third audio segment may not be transcribed (e.g., to generate the second text transcription) by the first device.
- the first device may generate a second audio file based upon the third audio segment.
- the third electronic message (e.g., and/or the instructions to perform the second action) may comprise the second audio file rather than the second text transcription.
- the second device may transcribe the second audio file to generate the second text transcription.
- the first device may transmit the third electronic message to the second device.
- the second device may perform the second action. For example, the second device may activate the screen of the second device, display a fourth graphical object associated with the second action using the screen of the second device and/or output a third audio message associated with the second action using a speaker of the second device.
- the second device may display the fourth graphical object comprising a map interface and/or GPS directions corresponding to the third electronic message.
- the second device may (e.g., further) output the third audio message corresponding to the third electronic message using the speaker of the second device, comprising “Here are directions to Brooklyn Park in New York City”.
- the first device may interact with a third device of the plurality of devices responsive to detecting a second trigger phrase and/or a third command corresponding to the third device.
- the third device and/or the second trigger phrase may be associated with a second virtual assistant.
- the second virtual assistant may be installed onto the third device.
- a fourth electronic message may be generated.
- the fourth electronic message may (e.g., then) be transmitted to the third device.
- the fourth electronic message may comprise instructions to perform a third action associated with the third command.
- the fourth electronic message may comprise a fourth push notification.
- the fourth audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the fourth audio segment comprises speech similar to the second trigger phrase.
- the first device may prompt the user to provide the third command to the first device. For example, the first device may display a fifth graphical object using the screen of the first device and/or output a fourth audio message and/or a second sound effect using the speaker of the first device.
- the first device may (e.g., continue) monitoring audio received via the first microphone to detect the third command in the fifth audio segment.
- the fifth audio segment may be transcribed to generate a third text transcription (e.g., of the fifth audio segment).
- the fourth electronic message (e.g., and/or the instructions to perform the third action) may comprise the third text transcription.
- the fifth audio segment may not be transcribed (e.g., to generate the third text transcription) by the first device.
- the first device may generate a third audio file based upon the fifth audio segment.
- the fourth electronic message (e.g., and/or the instructions to perform the third action) may comprise the third audio file rather than the third text transcription.
- the third device may transcribe the third audio file to generate the third text transcription.
- the fifth audio segment may comprise the user saying, “Please read me my unread text messages”.
- the second trigger phrase may be associated with the third device. Accordingly, the third device may be selected (e.g., for transmission of the fourth electronic message) from amongst the plurality of devices based upon the second trigger phrase.
- the third device may be selected (e.g., for transmission of the fourth electronic message) from amongst the plurality of devices based upon an analysis of the third command.
- the first device may analyze the third command, the fifth audio segment, the third text transcription and/or the third audio file to determine a context, a subject matter, etc. of the third command.
- the third device may (e.g., then) be selected based upon a determination that the third device is (e.g., best) fit and/or match (e.g., relative to one or more other devices) for performing the third action responsive to receiving the fourth electronic message.
- the fit and/or match of each device may be determined (e.g., scored) based upon one or more criteria, which may be retrieved from a database, or (e.g., manually) input/defined by a user.
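- Scoring each device against such criteria could be as simple as the weighted sum below; the criteria names and weights are assumptions and could instead be retrieved from a database or entered by the user.

```python
def score_fit(device_profile: dict, criteria: dict) -> float:
    """Weighted-sum fit score for a device; criteria maps an attribute name to a weight."""
    # e.g., criteria = {"has_screen": 2.0, "has_speaker": 1.0, "stores_text_messages": 3.0}
    return sum(weight for attribute, weight in criteria.items() if device_profile.get(attribute))
```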
- the third device may perform the third action. For example, the third device may activate a screen of the third device, display a sixth graphical object associated with the third action using the screen of the third device, and/or output a fifth audio message associated with the third action using a speaker of the third device.
- the third device may display the sixth graphical object comprising a list of unread text messages (e.g., comprising one or more unread text messages received by the third device) and/or the third device may output one or more audio messages using the speaker of the third device, wherein each audio message of the one or more audio messages may correspond to an unread text message of the one or more unread text messages.
- the first device may detect the first trigger phrase, the first command, the second trigger phrase and/or the third command, consecutively. For example, while the second device performs the first action (e.g., and/or the second action), the first device may detect the second trigger phrase and/or the third command. Accordingly, the first device may transmit the fourth electronic message (e.g., corresponding to the third command) to the third device while the second device performs the first action (e.g., and/or the second action). In this way, the first action (e.g., and/or the second action) may be performed by the second device and the third action may be performed by the third device, simultaneously.
- the first device, the second device and/or the third device may be connected to a local area network (e.g., via Ethernet, Wi-Fi, etc.). Accordingly, the first device may transmit the first electronic message and/or the third electronic message to the second device via the local area network. Alternatively and/or additionally, the second device may transmit the second electronic message to the first device via the local area network. Alternatively and/or additionally, the first device may transmit the fourth electronic message to the third device via the local area network. Alternatively and/or additionally, the first device may be connected to the second device using a first wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the second device using Bluetooth.
- the first device may transmit the first electronic message and/or the third electronic message to the second device using the first wireless connection.
- the second device may transmit the second electronic message to the first device using the first wireless connection.
- the first device may be connected to the third device using a second wireless connection (e.g., a Bluetooth connection, a personal area network, etc.).
- the first device may be paired to the third device using Bluetooth. Accordingly, the first device may transmit the fourth electronic message to the third device using the second wireless connection.
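- As a rough illustration of delivering such an electronic message over the local area network, the sketch below sends a small JSON payload over a TCP socket; the field names, address, and port are assumptions, and a Bluetooth/personal area network transport would carry an analogous payload:

```python
# Illustrative sketch of transmitting an "electronic message" to another device on
# the local area network; the JSON fields, host, and port are assumed for the example.
import json
import socket

def send_electronic_message(host: str, port: int, message: dict) -> None:
    """Send a newline-terminated JSON message to a device listening on the LAN."""
    payload = (json.dumps(message) + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)

if __name__ == "__main__":
    try:
        # e.g., indicate to the third device that its trigger phrase was detected
        send_electronic_message(
            host="192.168.1.23",  # assumed LAN address of the third device
            port=8765,            # assumed listening port on the third device
            message={"type": "trigger_detected", "trigger_phrase": "Hey Beta"},
        )
    except OSError as error:
        print(f"no device listening at the assumed address: {error}")
```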
- FIGS. 5A-5C illustrate examples of a system 501 for detecting a trigger phrase and transmitting an electronic message (e.g., associated with the trigger phrase) to a device.
- a user such as user James, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.).
- Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device 502 .
- the first device 502 may monitor audio received via a first microphone 522 of the first device 502 .
- the plurality of devices may comprise a second device 504 (e.g., a smartphone) associated with a first virtual assistant (e.g., of the plurality of virtual assistants), a third device 506 (e.g., a tablet) associated with a second virtual assistant (e.g., of the plurality of virtual assistants) and/or one or more (e.g., other) devices associated with one or more (e.g., other) virtual assistants.
- the second device 504 may comprise a camera 528 , a screen 524 , a microphone 510 , a button 512 and/or a speaker 514 .
- the third device 506 may comprise a camera 530 , a screen 526 , a microphone 516 , a button 518 and/or a speaker 520 .
- the second device 504 and/or the first virtual assistant may be associated with a first trigger phrase of a plurality of trigger phrases corresponding to the plurality of virtual assistants.
- the third device 506 and/or the second virtual assistant may be associated with a second trigger phrase of the plurality of trigger phrases.
- the first trigger phrase may comprise “Hello Alpha” and the second trigger phrase may comprise “Hey Beta”.
- FIG. 5A illustrates the first device 502 detecting the second trigger phrase in a first audio segment 508 identified during the monitoring.
- the first audio segment 508 may comprise the user saying “Hey Beta”.
- at least a portion of the first audio segment 508 may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases (e.g., associated with the plurality of virtual assistants).
- FIG. 5B illustrates a backend system 550 (e.g., on the first device 502 , on a server connected to the first device 502 via a network, etc.) that may compare the first audio segment 508 to the first trigger phrase and/or the second trigger phrase.
- the first audio segment 508 may be compared to the first trigger phrase to determine that the first audio segment 508 comprises speech 26% similar to the first trigger phrase.
- the first audio segment 508 may be compared to the second trigger phrase to determine that the first audio segment 508 comprises speech 92% similar to the second trigger phrase.
- the second trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment 508 and the second trigger phrase is above a trigger phrase threshold (e.g., 85%).
- the second virtual assistant may be selected (e.g., from amongst the plurality of virtual assistants) based upon the second trigger phrase.
- the third device 506 may be selected (e.g., from amongst the plurality of devices) based upon the second virtual assistant (e.g., and/or the second trigger phrase).
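- A minimal sketch of this comparison and selection step is shown below, reusing the example similarity threshold of 85%; a real implementation would compare acoustic features rather than text, so the text-similarity stand-in and the phrase-to-device mapping are illustrative assumptions:

```python
# Toy comparison of an audio segment (represented here by its transcribed text)
# against the plurality of trigger phrases; text similarity stands in for the
# acoustic scoring a real backend system would perform.
from difflib import SequenceMatcher

TRIGGER_PHRASES = {
    "Hello Alpha": "second_device",  # first trigger phrase / first virtual assistant
    "Hey Beta": "third_device",      # second trigger phrase / second virtual assistant
}

def match_trigger(segment_text: str, threshold: float = 0.85):
    """Return (device, phrase, score) for the best match above the trigger phrase threshold."""
    best = None
    for phrase, device in TRIGGER_PHRASES.items():
        score = SequenceMatcher(None, segment_text.lower(), phrase.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (device, phrase, score)
    return best

if __name__ == "__main__":
    print(match_trigger("hey beta"))     # selects the third device's trigger phrase
    print(match_trigger("hello there"))  # None: below the threshold, segment is ignored
```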
- FIG. 5C illustrates the first device 502 transmitting a first electronic message 532 to the third device 506 .
- the first device 502 may generate the first electronic message 532 comprising instructions to activate an input function of the third device 506 .
- the first electronic message 532 may comprise a first push notification.
- the first electronic message 532 may (e.g., further) comprise instructions to activate a speaker function of the third device 506 and/or a screen function of the third device 506 .
- the first electronic message 532 may (e.g., merely) comprise an indication that the second trigger phrase was detected.
- the third device 506 may interpret the indication that the second trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone 516 , instructions (e.g., and/or a trigger) to activate the camera 530 , instructions (e.g., and/or a trigger) to activate the speaker 520 and/or instructions (e.g., and/or a trigger) to activate the screen 526 .
- the third device 506 may prompt the user to provide a first command to the third device 506 .
- the third device may activate (e.g., turn on) the screen 526 , display a first graphical object 536 using the screen 526 , output a first audio message and/or a first sound effect using the speaker 520 , activate the camera 530 (e.g., in order to receive the first command) and/or activate the microphone 516 (e.g., in order to receive the first command).
- the first graphical object 536 , the first audio message and/or the first sound effect may indicate to the user that the third device 506 is ready to receive the first command.
- the first graphical object 536 may comprise “LISTENING FOR YOUR COMMAND” and/or the first audio message (e.g., outputted using the speaker 520 ) may comprise “Please state your command”.
- the third device 506 may (e.g., then) receive the first command in a second audio segment 534 using the microphone 516 .
- the third device 506 may receive the first command using the camera 530 and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the third device 506 .
- the third device 506 may (e.g., then) perform a first action based upon the first command.
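- The sketch below illustrates, under assumed platform APIs, how a receiving device such as the third device 506 might interpret the indication that its trigger phrase was detected, prompt for a command, and then act on it; the Device class and its methods are hypothetical placeholders rather than an actual device interface:

```python
# Hypothetical device-side handling of a "trigger detected" electronic message.
class Device:
    def activate_screen(self): print("screen activated")
    def show(self, text): print(f"display: {text}")
    def play(self, text): print(f"speaker: {text}")
    def activate_microphone(self): print("microphone activated")
    def listen(self) -> str: return "What's the weather like?"  # stand-in for audio capture
    def perform_action(self, command): print(f"performing action for: {command}")

def handle_electronic_message(device: Device, message: dict) -> None:
    """Interpret the trigger indication as instructions to activate input and prompt the user."""
    if message.get("type") != "trigger_detected":
        return
    device.activate_screen()
    device.show("LISTENING FOR YOUR COMMAND")
    device.play("Please state your command")
    device.activate_microphone()
    command = device.listen()        # receive the command, e.g., via the microphone
    device.perform_action(command)   # perform the action based upon the command

if __name__ == "__main__":
    handle_electronic_message(Device(), {"type": "trigger_detected", "trigger_phrase": "Hey Beta"})
```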
- FIGS. 6A-6C illustrate examples of a system 601 for detecting a trigger phrase, detecting a command and transmitting an electronic message (e.g., associated with the command) to a device.
- a user such as user Janet, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.).
- Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device 602 .
- the first device 602 may monitor audio received via a first microphone 622 of the first device 602 .
- the plurality of devices may comprise a second device 604 (e.g., a smartphone) associated with a first virtual assistant (e.g., of the plurality of virtual assistants), a third device 606 (e.g., a tablet) associated with a second virtual assistant (e.g., of the plurality of virtual assistants) and/or one or more (e.g., other) devices associated with one or more (e.g., other) virtual assistants.
- the second device 604 may comprise a camera 628 , a screen 624 , a microphone 610 , a button 612 and/or a speaker 614 .
- the third device 606 may comprise a camera 630 , a screen 626 , a microphone 616 , a button 618 and/or a speaker 620 .
- the second device 604 and/or the first virtual assistant may be associated with a first trigger phrase of a plurality of trigger phrases corresponding to the plurality of virtual assistants.
- the third device 606 and/or the second virtual assistant may be associated with a second trigger phrase of the plurality of trigger phrases.
- the first trigger phrase may comprise “Hello Alpha” and the second trigger phrase may comprise “Hey Beta”.
- FIG. 6A illustrates the first device 602 detecting the first trigger phrase in a first audio segment 632 identified during the monitoring and a first command in a second audio segment 634 identified during the monitoring.
- the first audio segment 632 may comprise the user saying "Hello Alpha".
- at least a portion of the first audio segment 632 may be compared with (e.g., each trigger phrase of) a plurality of trigger phrases associated with the plurality of virtual assistants.
- the second audio segment 634 may comprise the user saying “How do you make pancakes?”.
- FIG. 6B illustrates a backend system 650 (e.g., on the first device 602 , on a server connected to the first device 602 via a network, etc.) that may compare the first audio segment 632 to the second trigger phrase and/or transcribe the second audio segment 634 to generate a text transcription.
- the first audio segment 632 may be compared to the first trigger phrase to determine that the first audio segment 632 comprises speech 88% similar to the first trigger phrase.
- the first trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment 632 and the first trigger phrase is above a trigger phrase threshold (e.g., 85%).
- the first device 602 may stop comparing the first audio segment 632 with trigger phrases of the plurality of trigger phrases.
- the second device 604 may be selected (e.g., from amongst the plurality of devices) based upon the first trigger phrase.
- the second audio segment 634 may (e.g., then) be transcribed to generate a text transcription (e.g., of the second audio segment 634 ).
- the text transcription may comprise “HOW DO YOU MAKE PANCAKES”.
- an audio file may be generated based upon the second audio segment 634 .
- FIG. 6C illustrates the first device 602 transmitting a first electronic message 638 to the second device 604 .
- the first device 602 may generate the first electronic message 638 comprising instructions to perform an action associated with the first command.
- the first electronic message 638 may comprise the text transcription and/or the audio file.
- the first electronic message 638 may comprise a first push notification.
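- A sketch of assembling such a command-bearing electronic message is shown below; the JSON field names and the base64 encoding of the audio file are assumptions used only for illustration:

```python
# Illustrative packaging of a detected command into an electronic message that
# carries the text transcription and, optionally, the audio of the command.
import base64
import json
from typing import Optional

def build_command_message(transcription: str, audio_bytes: Optional[bytes] = None) -> str:
    """Package a detected command as a JSON electronic message (push-notification style)."""
    message = {
        "type": "perform_action",
        "text_transcription": transcription,  # e.g., "HOW DO YOU MAKE PANCAKES"
    }
    if audio_bytes is not None:
        # include the audio file so the receiving assistant can run its own recognition
        message["audio_file"] = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps(message)

if __name__ == "__main__":
    print(build_command_message("HOW DO YOU MAKE PANCAKES", b"\x00\x01fake-audio"))
```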
- the second device 604 may perform the action.
- the second device 604 may activate the screen 624 , display a first graphical object 640 associated with the action using the screen 624 , and/or output a first audio message associated with the action using the speaker 614 .
- the first graphical object 640 may comprise “Pancake Recipe” (e.g., and/or the first audio message may comprise cooking instructions for preparing pancakes).
- the disclosed subject matter may assist a user in interacting with a plurality of virtual assistants by monitoring audio received using a microphone of a first device to detect trigger phrases and/or commands and/or by transmitting electronic messages based upon the trigger phrases and/or the commands to one or more devices of a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.) associated with the plurality of virtual assistants.
- Implementation of at least some of the disclosed subject matter may lead to benefits including, but not limited to, a reduction in power consumption and/or an increase in battery life of the plurality of devices (e.g., as a result of the first device comprising a dedicated power source and/or being connected to a power supply, as a result of the microphone of the first device being activated to monitor audio received using the microphone, as a result of detecting trigger phrases and/or commands during the monitoring, as a result of transmitting electronic messages to the plurality of devices corresponding to the trigger phrases and/or the commands, as a result of a plurality of microphones of the plurality of devices being deactivated, as a result of the plurality of devices not monitoring audio received by the plurality of microphones, etc.).
- implementation of at least some of the disclosed subject matter may lead to benefits including greater security and/or privacy for the user (e.g., as a result of the first device not being connected to the internet, as a result of the first device only connecting to the internet for voice recognition information responsive to detecting trigger phrases and/or commands, as a result of microphones of devices of the plurality of devices that are connected to the internet being deactivated, as a result of the devices that are connected to the internet not monitoring audio received by the microphones, etc.).
- implementation of at least some of the disclosed subject matter may lead to benefits including a reduction in bandwidth (e.g., as a result of identifying the trigger phrases and/or transcribing the commands by accessing a voice recognition database stored in a memory device of the first device rather than in one or more servers, etc.).
- implementation of at least some of the disclosed subject matter may lead to benefits including an increase in speed and usability of the plurality of devices (e.g., as a result of fewer operations performed by the plurality of devices without monitoring audio received by the plurality of microphones, etc.).
- At least some of the disclosed subject matter may be implemented on a client device, and in some examples, at least some of the disclosed subject matter may be implemented on a server (e.g., hosting a service accessible via a network, such as the Internet).
- FIG. 7 is an illustration of a scenario 700 involving an example non-transitory machine readable medium 702 .
- the non-transitory machine readable medium 702 may comprise processor-executable instructions 712 that when executed by a processor 716 cause performance (e.g., by the processor 716 ) of at least some of the provisions herein (e.g., embodiment 714 ).
- the non-transitory machine readable medium 702 may comprise a memory semiconductor (e.g., a semiconductor utilizing static random access memory (SRAM), dynamic random access memory (DRAM), and/or synchronous dynamic random access memory (SDRAM) technologies), a platter of a hard disk drive, a flash memory device, or a magnetic or optical disc (such as a compact disc (CD), digital versatile disc (DVD), or floppy disk).
- the example non-transitory machine readable medium 702 stores computer-readable data 704 that, when subjected to reading 706 by a reader 710 of a device 708 (e.g., a read head of a hard disk drive, or a read operation invoked on a solid-state storage device), express the processor-executable instructions 712 .
- the processor-executable instructions 712 when executed, cause performance of operations, such as at least some of the example method 400 of FIG. 4A and/or the example method 450 of FIG. 4B , for example.
- the processor-executable instructions 712 are configured to cause implementation of a system, such as at least some of the example system 501 of FIGS. 5A-5C and/or the example system 601 of FIGS. 6A-6C , for example.
- As used in this application, "component," "module," "system", "interface", and/or the like are generally intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution.
- a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer.
- an application running on a controller and the controller can be a component.
- One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
- "First," "second," and/or the like are not intended to imply a temporal aspect, a spatial aspect, an ordering, etc. Rather, such terms are merely used as identifiers, names, etc. for features, elements, items, etc.
- a first object and a second object generally correspond to object A and object B or two different or two identical objects or the same object.
- "Example" is used herein to mean serving as an instance, illustration, etc., and not necessarily as advantageous.
- “or” is intended to mean an inclusive “or” rather than an exclusive “or”.
- “a” and “an” as used in this application are generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form.
- at least one of A and B and/or the like generally means A or B or both A and B.
- such terms are intended to be inclusive in a manner similar to the term “comprising”.
- the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter.
- article of manufacture as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media.
- one or more of the operations described may constitute computer readable instructions stored on one or more computer and/or machine readable media, which if executed will cause the operations to be performed.
- the order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Alternative ordering will be appreciated by one skilled in the art having the benefit of this description. Further, it will be understood that not all operations are necessarily present in each embodiment provided herein. Also, it will be understood that not all operations are necessary in some embodiments.
Abstract
Description
- Many devices, such as smartphones, tablets, voice command devices and/or (e.g., other types of) virtual assistant devices may allow a user to provide a command (e.g., using a conversational interface) and/or perform an action based upon the command. Such a device may rely on the user to say a trigger phrase in order to receive the command. Accordingly, the device may continuously (e.g., constantly) monitor audio (e.g., of an area around the device) using one or more microphones. However, the (e.g., continuous) monitoring may deplete a power source (e.g., a battery) of the device and/or the (e.g., continuous) monitoring of the device (e.g., which may be connected to the internet) may compromise privacy of the user.
- In accordance with the present disclosure, one or more computing devices and/or methods are provided. In an example, audio received via a microphone of a first device may be monitored. Responsive to detecting a first trigger phrase in a first audio segment identified during the monitoring, a first electronic message comprising instructions to activate a microphone function of a second device may be generated and the first electronic message may be transmitted to the second device. Responsive to detecting a second trigger phrase in a second audio segment identified during the monitoring, a second electronic message comprising instructions to activate a microphone function of a third device may be generated and the second electronic message may be transmitted to the third device.
- In an example, audio received via a microphone of a first device may be monitored. Responsive to detecting a trigger phrase in a first audio segment identified during the monitoring and a command in a second audio segment identified during the monitoring, an electronic message comprising instructions to perform an action associated with the command may be generated and the electronic message may be transmitted to a second device.
- In an example, audio received via a microphone of a first device may be monitored. Responsive to detecting a trigger phrase in a first audio segment identified during the monitoring, a first electronic message comprising instructions to activate an input function of a second device may be generated and the first electronic message may be transmitted to the second device.
- While the techniques presented herein may be embodied in alternative forms, the particular embodiments illustrated in the drawings are only a few examples that are supplemental of the description provided herein. These embodiments are not to be interpreted in a limiting manner, such as limiting the claims appended hereto.
-
FIG. 1 is an illustration of a scenario involving various examples of networks that may connect servers and clients. -
FIG. 2 is an illustration of a scenario involving an example configuration of a server that may utilize and/or implement at least a portion of the techniques presented herein. -
FIG. 3 is an illustration of a scenario involving an example configuration of a client that may utilize and/or implement at least a portion of the techniques presented herein. -
FIG. 4A is a flow chart illustrating an example method for detecting trigger phrases and transmitting electronic messages to devices. -
FIG. 4B is a flow chart illustrating an example method for detecting trigger phrases, detecting commands and transmitting electronic messages to devices. -
FIG. 5A is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a second trigger phrase in a first audio segment is detected. -
FIG. 5B is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a first audio segment is compared to a first trigger phrase and/or a second trigger phrase. -
FIG. 5C is a component block diagram illustrating an example system for detecting a trigger phrase and transmitting an electronic message to a device, where a first electronic message is transmitted to a third device. -
FIG. 6A is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first trigger phrase in a first audio segment and a first command in a second audio segment are detected. -
FIG. 6B is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first audio segment is compared to a second trigger phrase and/or a second audio segment is transcribed to generate a text transcription. -
FIG. 6C is a component block diagram illustrating an example system for detecting a trigger phrase, detecting a command and transmitting an electronic message to a device, where a first electronic message is transmitted to a second device. -
FIG. 7 is an illustration of a scenario featuring an example non-transitory machine readable medium in accordance with one or more of the provisions set forth herein. - Subject matter will now be described more fully hereinafter with reference to the accompanying drawings, which form a part hereof, and which show, by way of illustration, specific example embodiments. This description is not intended as an extensive or detailed discussion of known concepts. Details that are known generally to those of ordinary skill in the relevant art may have been omitted, or may be handled in summary fashion.
- The following subject matter may be embodied in a variety of different forms, such as methods, devices, components, and/or systems. Accordingly, this subject matter is not intended to be construed as limited to any example embodiments set forth herein. Rather, example embodiments are provided merely to be illustrative. Such embodiments may, for example, take the form of hardware, software, firmware or any combination thereof.
- 1. Computing Scenario
- The following provides a discussion of some types of computing scenarios in which the disclosed subject matter may be utilized and/or implemented.
- 1.1. Networking
-
FIG. 1 is an interaction diagram of a scenario 100 illustrating a service 102 provided by a set of servers 104 to a set of client devices 110 via various types of networks. The servers 104 and/or client devices 110 may be capable of transmitting, receiving, processing, and/or storing many types of signals, such as in memory as physical memory states. - The
servers 104 of the service 102 may be internally connected via a local area network 106 (LAN), such as a wired network where network adapters on the respective servers 104 are interconnected via cables (e.g., coaxial and/or fiber optic cabling), and may be connected in various topologies (e.g., buses, token rings, meshes, and/or trees). The servers 104 may be interconnected directly, or through one or more other networking devices, such as routers, switches, and/or repeaters. The servers 104 may utilize a variety of physical networking protocols (e.g., Ethernet and/or Fiber Channel) and/or logical networking protocols (e.g., variants of an Internet Protocol (IP), a Transmission Control Protocol (TCP), and/or a User Datagram Protocol (UDP)). The local area network 106 may include, e.g., analog telephone lines, such as a twisted wire pair, a coaxial cable, full or fractional digital lines including T1, T2, T3, or T4 type lines, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, or other communication links or channels, such as may be known to those skilled in the art. The local area network 106 may be organized according to one or more network architectures, such as server/client, peer-to-peer, and/or mesh architectures, and/or a variety of roles, such as administrative servers, authentication servers, security monitor servers, data stores for objects such as files and databases, business logic servers, time synchronization servers, and/or front-end servers providing a user-facing interface for the service 102. - Likewise, the
local area network 106 may comprise one or more sub-networks, such as may employ differing architectures, may be compliant or compatible with differing protocols and/or may interoperate within the local area network 106. Additionally, a variety of local area networks 106 may be interconnected; e.g., a router may provide a link between otherwise separate and independent local area networks 106. - In the
scenario 100 of FIG. 1, the local area network 106 of the service 102 is connected to a wide area network 108 (WAN) that allows the service 102 to exchange data with other services 102 and/or client devices 110. The wide area network 108 may encompass various combinations of devices with varying levels of distribution and exposure, such as a public wide-area network (e.g., the Internet) and/or a private network (e.g., a virtual private network (VPN) of a distributed enterprise). - In the
scenario 100 of FIG. 1, the service 102 may be accessed via the wide area network 108 by a user 112 of one or more client devices 110, such as a portable media player (e.g., an electronic text reader, an audio device, or a portable gaming, exercise, or navigation device); a portable communication device (e.g., a camera, a phone, a wearable or a text chatting device); a workstation; and/or a laptop form factor computer. The respective client devices 110 may communicate with the service 102 via various connections to the wide area network 108. As a first such example, one or more client devices 110 may comprise a cellular communicator and may communicate with the service 102 by connecting to the wide area network 108 via a wireless local area network 106 provided by a cellular provider. As a second such example, one or more client devices 110 may communicate with the service 102 by connecting to the wide area network 108 via a wireless local area network 106 provided by a location such as the user's home or workplace (e.g., a WiFi (Institute of Electrical and Electronics Engineers (IEEE) Standard 802.11) network or a Bluetooth (IEEE Standard 802.15.1) personal area network). In this manner, the servers 104 and the client devices 110 may communicate over various types of networks. Other types of networks that may be accessed by the servers 104 and/or client devices 110 include mass storage, such as network attached storage (NAS), a storage area network (SAN), or other forms of computer or machine readable media. - 1.2. Server Configuration
-
FIG. 2 presents a schematic architecture diagram 200 of a server 104 that may utilize at least a portion of the techniques provided herein. Such a server 104 may vary widely in configuration or capabilities, alone or in conjunction with other servers, in order to provide a service such as the service 102. - The
server 104 may comprise one or more processors 210 that process instructions. The one or more processors 210 may optionally include a plurality of cores; one or more coprocessors, such as a mathematics coprocessor or an integrated graphical processing unit (GPU); and/or one or more layers of local cache memory. The server 104 may comprise memory 202 storing various forms of applications, such as an operating system 204; one or more server applications 206, such as a hypertext transport protocol (HTTP) server, a file transfer protocol (FTP) server, or a simple mail transport protocol (SMTP) server; and/or various forms of data, such as a database 208 or a file system. The server 104 may comprise a variety of peripheral components, such as a wired and/or wireless network adapter 214 connectible to a local area network and/or wide area network; one or more storage components 216, such as a hard disk drive, a solid-state storage device (SSD), a flash memory device, and/or a magnetic and/or optical disk reader. - The
server 104 may comprise a mainboard featuring one or more communication buses 212 that interconnect the processor 210, the memory 202, and various peripherals, using a variety of bus technologies, such as a variant of a serial or parallel AT Attachment (ATA) bus protocol; a Universal Serial Bus (USB) protocol; and/or a Small Computer System Interface (SCSI) bus protocol. In a multibus scenario, a communication bus 212 may interconnect the server 104 with at least one other server. Other components that may optionally be included with the server 104 (though not shown in the schematic diagram 200 of FIG. 2) include a display; a display adapter, such as a graphical processing unit (GPU); input peripherals, such as a keyboard and/or mouse; and a flash memory device that may store a basic input/output system (BIOS) routine that facilitates booting the server 104 to a state of readiness. - The
server 104 may operate in various physical enclosures, such as a desktop or tower, and/or may be integrated with a display as an "all-in-one" device. The server 104 may be mounted horizontally and/or in a cabinet or rack, and/or may simply comprise an interconnected set of components. The server 104 may comprise a dedicated and/or shared power supply 218 that supplies and/or regulates power for the other components. The server 104 may provide power to and/or receive power from another server and/or other devices. The server 104 may comprise a shared and/or dedicated climate control unit 220 that regulates climate properties, such as temperature, humidity, and/or airflow. Many such servers 104 may be configured and/or adapted to utilize at least a portion of the techniques presented herein. - 1.3. Client Device Configuration
-
FIG. 3 presents a schematic architecture diagram 300 of a client device 110 whereupon at least a portion of the techniques presented herein may be implemented. Such a client device 110 may vary widely in configuration or capabilities, in order to provide a variety of functionality to a user such as the user 112. The client device 110 may be provided in a variety of form factors, such as a desktop or tower workstation; an "all-in-one" device integrated with a display 308; a laptop, tablet, convertible tablet, or palmtop device; a wearable device mountable in a headset, eyeglass, earpiece, and/or wristwatch, and/or integrated with an article of clothing; and/or a component of a piece of furniture, such as a tabletop, and/or of another device, such as a vehicle or residence. The client device 110 may serve the user in a variety of roles, such as a workstation, kiosk, media player, gaming device, and/or appliance. - The
client device 110 may comprise one or more processors 310 that process instructions. The one or more processors 310 may optionally include a plurality of cores; one or more coprocessors, such as a mathematics coprocessor or an integrated graphical processing unit (GPU); and/or one or more layers of local cache memory. The client device 110 may comprise memory 301 storing various forms of applications, such as an operating system 303; one or more user applications 302, such as document applications, media applications, file and/or data access applications, communication applications such as web browsers and/or email clients, utilities, and/or games; and/or drivers for various peripherals. The client device 110 may comprise a variety of peripheral components, such as a wired and/or wireless network adapter 306 connectible to a local area network and/or wide area network; one or more output components, such as a display 308 coupled with a display adapter (optionally including a graphical processing unit (GPU)), a sound adapter coupled with a speaker, and/or a printer; input devices for receiving input from the user, such as a keyboard 311, a mouse, a microphone, a camera, and/or a touch-sensitive component of the display 308; and/or environmental sensors, such as a global positioning system (GPS) receiver 319 that detects the location, velocity, and/or acceleration of the client device 110, a compass, accelerometer, and/or gyroscope that detects a physical orientation of the client device 110. Other components that may optionally be included with the client device 110 (though not shown in the schematic architecture diagram 300 of FIG. 3) include one or more storage components, such as a hard disk drive, a solid-state storage device (SSD), a flash memory device, and/or a magnetic and/or optical disk reader; and/or a flash memory device that may store a basic input/output system (BIOS) routine that facilitates booting the client device 110 to a state of readiness; and a climate control unit that regulates climate properties, such as temperature, humidity, and airflow. - The
client device 110 may comprise a mainboard featuring one or more communication buses 312 that interconnect the processor 310, the memory 301, and various peripherals, using a variety of bus technologies, such as a variant of a serial or parallel AT Attachment (ATA) bus protocol; the Universal Serial Bus (USB) protocol; and/or the Small Computer System Interface (SCSI) bus protocol. The client device 110 may comprise a dedicated and/or shared power supply 318 that supplies and/or regulates power for other components, and/or a battery 304 that stores power for use while the client device 110 is not connected to a power source via the power supply 318. The client device 110 may provide power to and/or receive power from other client devices. - In some scenarios, as a
user 112 interacts with a software application on a client device 110 (e.g., an instant messenger and/or electronic mail application), descriptive content in the form of signals or stored physical states within memory (e.g., an email address, instant messenger identifier, phone number, postal address, message content, date, and/or time) may be identified. Descriptive content may be stored, typically along with contextual content. For example, the source of a phone number (e.g., a communication received from another user via an instant messenger application) may be stored as contextual content associated with the phone number. Contextual content, therefore, may identify circumstances surrounding receipt of a phone number (e.g., the date or time that the phone number was received), and may be associated with descriptive content. Contextual content may, for example, be used to subsequently search for associated descriptive content. For example, a search for phone numbers received from specific individuals, received via an instant messenger application or at a given date or time, may be initiated. The client device 110 may include one or more servers that may locally serve the client device 110 and/or other client devices of the user 112 and/or other individuals. For example, a locally installed webserver may provide web content in response to locally submitted web requests. Many such client devices 110 may be configured and/or adapted to utilize at least a portion of the techniques presented herein. - 2. Presented Techniques
- One or more computing devices and/or techniques for detecting trigger phrases and transmitting electronic messages to devices are provided. For example, many devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices and/or other types of virtual assistant devices) may allow a user to provide a command (e.g., using a conversational interface) and/or may (e.g., then) perform an action based upon the command. Such a device may rely on detecting a trigger phrase (e.g., corresponding to a virtual assistant of the device) in order to activate the virtual assistant and/or receive the command. Accordingly, the device may continuously (e.g., constantly) monitor audio (e.g., of an area around the device) using a microphone (e.g., of the device).
- However, the (e.g., continuous) monitoring may deplete a power source (e.g., a battery) of the device and/or the (e.g., continuous) monitoring of the device may compromise privacy and/or security of the user. For example, the device may be connected to the internet and/or one or more servers. Accordingly, the device may transmit one or more audio segments of the audio to the one or more servers (e.g., without knowledge and/or consent of the user). Alternatively and/or additionally, the audio may be accessed (e.g., and/or hacked) by entities via the internet.
- Thus, in accordance with one or more of the techniques presented herein, a second device may monitor audio received via a second microphone of the second device. Responsive to detecting a trigger phrase, the second device may generate an electronic message comprising instructions to activate a microphone function of the device (e.g., and/or the virtual assistant of the device). The second device may (e.g., then) transmit the electronic message to the device. Accordingly, the device may not rely on continuously monitoring audio using the microphone of the device to detect the trigger phrase and/or activate the virtual assistant. The second device may be a trusted device that the user may believe does not compromise privacy and/or security of the user. In some examples, the second device may not be connected (e.g., directly) to the internet.
- An embodiment of detecting trigger phrases and transmitting electronic messages (e.g., associated with the trigger phrases) to devices is illustrated by an
example method 400 of FIG. 4A. A user, such as user Jill, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.). Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device. In some examples, the first device may interact with a second device (e.g., and/or activate a microphone function of the second device) of the plurality of devices responsive to detecting a first trigger phrase corresponding to the second device (e.g., while monitoring audio received via a first microphone of the first device). In some examples, the second device and/or the first trigger phrase may be (e.g., determined to be) associated with a first virtual assistant (e.g., and thus the first trigger phrase may be determined to correspond to the second device). For example, the first virtual assistant may be installed onto the second device. Alternatively and/or additionally, the first device may interact with a third device (e.g., and/or activate a microphone function of the third device) of the plurality of devices responsive to detecting a second trigger phrase corresponding to the third device (e.g., while monitoring audio received via a first microphone of the first device). In some examples, the third device and/or the second trigger phrase may be (e.g., determined to be) associated with a second virtual assistant (e.g., and thus the second trigger phrase may be determined to correspond to the third device). For example, the second virtual assistant may be installed onto the third device. - In some examples, the first device may be selected by the user. For example, the first device may be a trusted device that the user may believe does not compromise privacy and/or security of the user. In some examples, the first device may be selected using a device of the plurality of devices. For example, the second device may scan an environment such as a local area network, a personal area network, etc. The second device may then determine a second plurality of devices that are connected to the local area network, the personal area network, etc. The second plurality of devices may (e.g., then) be ranked based upon security. Alternatively and/or additionally, the second plurality of devices may be (e.g., further) ranked based upon a capability for monitoring audio and/or interacting with devices of the plurality of devices responsive to detecting trigger phrases corresponding to the devices. In an example, the first device may be ranked higher than a fourth device (e.g., of the second plurality of devices) because the first device may not be (e.g., continuously) connected to the internet and the fourth device may be (e.g., continuously) connected to the internet. In another example, the first device may be ranked higher than the fourth device because the first device is manufactured by a first manufacturer and the fourth device is manufactured by a second manufacturer, wherein the first manufacturer may be associated with a first security level (e.g., and/or a first trust level) and the second manufacturer may be associated with a second security level (e.g., and/or a second trust level). The first security level (e.g., and/or the first trust level) may be higher than the second security level (e.g., and/or the second trust level).
In some examples, the first device may be ranked highest (e.g., of the second plurality of devices) and/or selected automatically by the second device. Alternatively and/or additionally, the second device may present a ranked list of the second plurality of devices. The first device may be selected by the user from the ranked list of the second plurality of devices.
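- One possible sketch of ranking the second plurality of devices by security is shown below; the attributes considered and the scoring weights are assumptions rather than criteria specified in the disclosure:

```python
# Toy security-based ranking used to choose the monitoring (first) device.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    continuously_online: bool  # continuously connected to the internet
    manufacturer_trust: int    # assumed 0-10 trust/security level of the manufacturer
    dedicated_power: bool      # connected to a power supply rather than battery-only

def security_rank(candidates: list) -> list:
    """Order candidates from most to least suitable for monitoring audio."""
    def score(c: Candidate) -> float:
        return (0 if c.continuously_online else 5) + c.manufacturer_trust + (2 if c.dedicated_power else 0)
    return sorted(candidates, key=score, reverse=True)

if __name__ == "__main__":
    ranked = security_rank([
        Candidate("fourth_device", continuously_online=True, manufacturer_trust=4, dedicated_power=True),
        Candidate("first_device", continuously_online=False, manufacturer_trust=8, dedicated_power=True),
    ])
    print([c.name for c in ranked])  # the first device ranks above the fourth device
```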
- In some examples, the first device may comprise a dedicated power source (e.g., a battery) and/or may be connected to a power supply and/or an adaptor. In some examples, the first device (e.g., and/or the second plurality of devices) may comprise an (e.g., home) appliance and/or an (e.g., consumer) electronic device, such as a television, a video game console, a laptop, a desktop computer, a motor-vehicle computer, a smartphone, a tablet, an e-reader, etc. Alternatively and/or additionally, the first device (e.g., and/or the second plurality of devices) may be configured and/or manufactured specifically for monitoring audio received via the first microphone and/or interacting with devices of the plurality of devices responsive to detecting trigger phrases corresponding to the devices.
- In some examples, the first device may be configured to connect to the internet and/or access one or more servers (e.g., accessed by the first device via a network connection). In some examples, the first device may have various security settings. A high security setting may require that the first device does not connect to (e.g., and/or access) the internet. A medium security setting may require that the first device connects to (e.g., and/or accesses) the internet (e.g., only) for voice recognition information. A low security setting may allow the first device to connect to the internet for voice recognition information and/or to receive and/or transmit (e.g., other types of) information using the internet. In some examples, the security settings may be changed (e.g., by the user) by using a security interface.
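- The security settings described above might be modeled as a simple policy check, as in the sketch below; the enum names and request categories are illustrative assumptions:

```python
# Toy model of the high/medium/low security settings gating internet access.
from enum import Enum

class SecuritySetting(Enum):
    HIGH = "high"      # never connect to the internet
    MEDIUM = "medium"  # connect only for voice recognition information
    LOW = "low"        # connect for voice recognition and other information

def internet_access_allowed(setting: SecuritySetting, purpose: str) -> bool:
    """Decide whether an outbound connection is permitted under the current setting."""
    if setting is SecuritySetting.HIGH:
        return False
    if setting is SecuritySetting.MEDIUM:
        return purpose == "voice_recognition"
    return True  # LOW

if __name__ == "__main__":
    print(internet_access_allowed(SecuritySetting.MEDIUM, "voice_recognition"))  # True
    print(internet_access_allowed(SecuritySetting.MEDIUM, "telemetry"))          # False
```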
- At 405, audio received via the first microphone of the first device may be monitored (e.g., by the first device). For example, the first microphone may be (e.g., continuously) active and/or may receive audio of an area around the first microphone (e.g., and/or the first device). In some examples, the first device may be configured to detect speech during the monitoring of the audio. Responsive to detecting speech, an audio segment comprising the speech may be extracted and/or analyzed and the audio segment may be compared to a plurality of trigger phrases associated with the plurality of devices and/or the plurality of virtual assistants. For example, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices and/or a virtual assistant of the plurality of virtual assistants. Responsive to determining that the audio segment does not comprise speech similar to (e.g., any trigger phrase of) the plurality of trigger phrases, the audio segment may be discarded (e.g., and/or the first device may continue monitoring audio received via the first microphone).
- In some examples, information associated with the plurality of trigger phrases may be comprised within a voice recognition database. The voice recognition database may comprise voice recognition information corresponding to the plurality of trigger phrases. For example, the voice recognition database may comprise a data structure of the plurality of trigger phrases, wherein each trigger phrase of the trigger phrases is linked to a set of voice recognition information of the trigger phrase. In some examples, the voice recognition database may be stored in one or more servers accessed by the first device via a network connection. Alternatively and/or additionally, the voice recognition database may be stored in a memory device of the first device. In some examples, voice recognition information may be extracted from the voice recognition database to compare the audio segment to the plurality of trigger phrases.
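- A minimal sketch of such a voice recognition database and its use during the comparison is shown below; the feature representation and similarity measure are toy assumptions standing in for real voice recognition information:

```python
# Toy voice recognition database: each trigger phrase is linked to a set of voice
# recognition information (here, a short feature vector) and to its device.
VOICE_RECOGNITION_DB = {
    "Hello Alpha": {"device": "second_device", "features": [0.12, 0.87, 0.33]},
    "Hey Beta":    {"device": "third_device",  "features": [0.45, 0.22, 0.91]},
}

def compare(segment_features, reference_features) -> float:
    """Toy similarity between an audio segment and stored trigger phrase features."""
    diffs = [abs(a - b) for a, b in zip(segment_features, reference_features)]
    return 1.0 - sum(diffs) / len(diffs)

def detect_trigger(segment_features, threshold: float = 0.85):
    """Return the device linked to the best matching trigger phrase, or None to discard."""
    best_device, best_score = None, 0.0
    for phrase, entry in VOICE_RECOGNITION_DB.items():
        score = compare(segment_features, entry["features"])
        if score >= threshold and score > best_score:
            best_device, best_score = entry["device"], score
    return best_device  # None: segment is discarded and monitoring continues

if __name__ == "__main__":
    print(detect_trigger([0.44, 0.20, 0.93]))  # close to "Hey Beta" -> third_device
    print(detect_trigger([0.90, 0.90, 0.90]))  # no trigger phrase above threshold -> None
```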
- At 410, responsive to detecting the first trigger phrase in a first audio segment identified during the monitoring, a first electronic message may be generated. The first electronic message may (e.g., then) be transmitted to the second device. The first electronic message may comprise instructions to activate a microphone function (e.g., and/or an input function) of the second device. In some examples, the first electronic message may comprise a first push notification.
- In some examples, at least a portion of the first audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the first audio segment comprises speech similar to the first trigger phrase. In an example, the first device may determine that the first audio segment comprises speech 89% similar to the first trigger phrase and/or the first audio segment comprises speech 12% similar to the second trigger phrase. In this example, the first trigger phrase may be identified based upon the determination that the first audio segment comprises speech 89% similar to the first trigger phrase. For example, the first trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment and the first trigger phrase is above a trigger phrase threshold (e.g., 70%, 80%, 85%, 90%, etc.). In some examples, responsive to identifying the first trigger phrase, the first device may stop comparing the first audio segment with trigger phrases of the plurality of trigger phrases.
- In some examples, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices. For example, the first trigger phrase may be associated with the second device. Accordingly, the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon the first trigger phrase. Alternatively and/or additionally, each trigger phrase may be associated with a virtual assistant of the plurality of virtual assistants. Accordingly, the first virtual assistant may be selected from amongst the plurality of virtual assistants based upon the first trigger phrase. Further, the second device may be selected from amongst the plurality of devices based upon the first virtual assistant.
- In some examples, the first electronic message may (e.g., further) comprise instructions to activate a speaker function of the second device and/or a screen function of the second device. In some examples, the first electronic message may (e.g., merely) comprise an indication that the first trigger phrase was detected. The second device may interpret the indication that the first trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone function of the second device, instructions (e.g., and/or a trigger) to activate the input function of the second device (e.g., such as a camera function), instructions (e.g., and/or a trigger) to activate the speaker function of the second device and/or instructions (e.g., and/or a trigger) to activate the screen function of the second device.
- Responsive to receiving the first electronic message, the second device may prompt the user to provide a first command to the second device. For example, the second device may activate (e.g., turn on) a screen of the second device, display a first graphical object using the screen of the second device, output a first audio message and/or a first sound effect using a speaker of the second device, activate a first camera of the second device (e.g., in order to receive the first command) and/or activate a second microphone of the second device (e.g., in order to receive the first command). In an example, the first graphical object, the first audio message and/or the first sound effect may indicate to the user that the second device is ready to receive the first command via the first camera and/or the second microphone. For example, the first graphical object (e.g., displayed on the screen of the second device) may comprise “Listening for your command” and/or the first audio message (e.g., outputted using the speaker of the second device) may comprise “Please state your command”. The second device may (e.g., then) receive the first command using the second microphone, using the first camera of the second device and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the second device. The second device may (e.g., then) perform a first action based upon the first command.
- At 415, responsive to detecting the second trigger phrase in a second audio segment identified during the monitoring, a second electronic message may be generated. The second electronic message may (e.g., then) be transmitted to the third device. The second electronic message may comprise instructions to activate a microphone function (e.g., and/or an input function) of the third device. In some examples, the second electronic message may comprise a second push notification.
- In some examples, at least a portion of the second audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the second audio segment comprises speech similar to the second trigger phrase. In an example, the first device may determine that the second audio segment comprises speech 93% similar to the second trigger phrase and/or the second audio segment comprises speech 13% similar to the first trigger phrase. In this example, the second trigger phrase may be identified based upon the determination that the second audio segment comprises speech 93% similar to the second trigger phrase. For example, the second trigger phrase may be identified responsive to determining that a similarity of at least a portion of the second audio segment and the second trigger phrase is above the trigger phrase threshold. In some examples, responsive to identifying the second trigger phrase, the first device may stop comparing the second audio segment with trigger phrases of the plurality of trigger phrases. In some examples, the second virtual assistant and/or the third device may be selected based upon the second trigger phrase.
- In some examples, the second electronic message may (e.g., further) comprise instructions to activate a speaker function of the third device and/or a screen function of the third device. In some examples, the second electronic message may (e.g., merely) comprise an indication that the second trigger phrase was detected. The third device may interpret the indication that the second trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone function of the third device, instructions (e.g., and/or a trigger) to activate the input function of the third device (e.g., such as a camera function), instructions (e.g., and/or a trigger) to activate the speaker function of the third device and/or instructions (e.g., and/or a trigger) to activate the screen function of the third device.
- Responsive to receiving the second electronic message, the third device may prompt the user to provide a second command to the third device. For example, the third device may activate (e.g., turn on) a screen of the third device, display a second graphical object using the screen of the third device, output a second audio message and/or a second sound effect using a speaker of the third device, activate a second camera of the third device (e.g., in order to receive the second command) and/or activate a third microphone of the third device (e.g., in order to receive the second command). In an example, the second graphical object, the second audio message and/or the second sound effect may indicate to the user that the third device is ready to receive the second command via the second camera and/or the third microphone. The third device may (e.g., then) receive the second command using the third microphone, using the second camera and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the third device. The third device may (e.g., then) perform a second action based upon the second command.
- In some examples, the first device, the second device and/or the third device may be connected to the local area network (e.g., via Ethernet, Wi-Fi, etc.). Accordingly, the first device may transmit the first electronic message to the second device via the local area network. Alternatively and/or additionally, the first device may transmit the second electronic message to the third device via the local area network. Alternatively and/or additionally, the first device may be connected to the second device using a first wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the second device using Bluetooth. Accordingly, the first device may transmit the first electronic message to the second device using the first wireless connection. Alternatively and/or additionally, the first device may be connected to the third device using a second wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the third device using Bluetooth. Accordingly, the first device may transmit the second electronic message to the third device using the second wireless connection.
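- The transport choice described above might be sketched as follows; the send helpers are hypothetical placeholders, since actual LAN or Bluetooth I/O is outside the scope of this description.

```python
# Sketch only: prefer the local area network, otherwise fall back to a paired
# wireless connection (e.g., Bluetooth). send_over_lan / send_over_bluetooth are
# hypothetical stand-ins for a real transport layer.
def send_over_lan(device_id, payload):
    print(f"LAN -> {device_id}: {payload}")

def send_over_bluetooth(device_id, payload):
    print(f"Bluetooth -> {device_id}: {payload}")

def transmit(device, payload):
    if device.get("on_lan"):
        send_over_lan(device["id"], payload)
    elif device.get("paired"):
        send_over_bluetooth(device["id"], payload)
    else:
        raise RuntimeError("no connection to target device")

transmit({"id": "second_device", "on_lan": True}, {"trigger": "hello alpha"})
transmit({"id": "third_device", "on_lan": False, "paired": True}, {"trigger": "hey beta"})
```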
- An embodiment of detecting trigger phrases, detecting commands and/or transmitting electronic messages to devices is illustrated by an
example method 450 of FIG. 4B. A user, such as user Jack, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.). Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device. In some examples, the first device may interact with a second device of the plurality of devices responsive to detecting a first trigger phrase and/or a first command corresponding to the second device (e.g., while monitoring audio received via a first microphone of the first device). In some examples, the second device and/or the first trigger phrase may be associated with a first virtual assistant. For example, the first virtual assistant may be installed onto the second device. - In some examples, the first device may be selected by the user. In some examples, the first device may be a trusted device that the user may believe does not compromise privacy of the user. In some examples, the first device may comprise a dedicated power source (e.g., a battery) and/or may be connected to a power supply and/or an adaptor. In some examples, the first device may be selected automatically using a device of the plurality of devices.
- In some examples, the first device may be configured to connect to the internet and/or access one or more servers for voice recognition information. In some examples, the first device may have various security settings. A high security setting may require that the first device does not connect to (e.g., and/or access) the internet. A medium security setting may require that the first device connects to (e.g., and/or accesses) the internet (e.g., only) for voice recognition information. A low security setting may allow the first device to connect to the internet for voice recognition information and/or to receive and/or transmit (e.g., other types of) information using the internet.
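- The three security settings might be modeled as a simple connection policy, as in the sketch below; the names and structure are assumptions for illustration.

```python
# Sketch of the high/medium/low security settings as a connection policy.
from enum import Enum

class SecuritySetting(Enum):
    HIGH = "high"      # never connect to the internet
    MEDIUM = "medium"  # connect only for voice recognition information
    LOW = "low"        # general internet access permitted

def may_connect(setting: SecuritySetting, purpose: str) -> bool:
    """Return True if an internet connection is allowed for the given purpose."""
    if setting is SecuritySetting.HIGH:
        return False
    if setting is SecuritySetting.MEDIUM:
        return purpose == "voice_recognition"
    return True  # LOW: any purpose

assert may_connect(SecuritySetting.MEDIUM, "voice_recognition")
assert not may_connect(SecuritySetting.MEDIUM, "telemetry")
assert not may_connect(SecuritySetting.HIGH, "voice_recognition")
```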
- At 455, audio received via the first microphone of the first device may be monitored (e.g., by the first device). For example, the first microphone may be (e.g., continuously) active and/or may receive audio of an area around the first microphone (e.g., and/or the first device). In some examples, the first device may be configured to detect speech during the monitoring of the audio. Responsive to detecting speech, an audio segment comprising the speech may be analyzed and/or extracted and the audio segment may be compared to a plurality of trigger phrases associated with the plurality of devices and/or the plurality of virtual assistants. For example, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices and/or a virtual assistant of the plurality of virtual assistants. Responsive to determining that the audio segment does not comprise speech similar to (e.g., any trigger phrase of) the plurality of trigger phrases, the audio segment may be discarded (e.g., and/or the first device may continue monitoring audio received via the first microphone).
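- A bare-bones monitoring loop consistent with the description above is sketched below. Voice-activity detection and segment extraction are reduced to trivial stand-ins, and for simplicity the comparison is an exact lookup rather than the similarity scoring discussed elsewhere.

```python
# Sketch only: detect speech, extract a segment, compare it with the trigger
# phrases, and discard segments that match none of them.
def contains_speech(chunk: str) -> bool:
    return bool(chunk.strip())    # stand-in for real voice-activity detection

def extract_segment(chunk: str) -> str:
    return chunk.strip().lower()  # stand-in for real segment extraction

def monitor(audio_source, trigger_phrases, on_trigger):
    for chunk in audio_source:                    # e.g., frames from the first microphone
        if not contains_speech(chunk):
            continue
        segment = extract_segment(chunk)
        device_id = trigger_phrases.get(segment)  # exact match stands in for similarity
        if device_id is None:
            continue                              # discard the segment, keep monitoring
        on_trigger(device_id, segment)

monitor(["", "some chatter", "hey beta"],
        {"hello alpha": "second_device", "hey beta": "third_device"},
        lambda dev, seg: print(f"trigger detected for {dev}: {seg!r}"))
```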
- At 460, responsive to detecting the first trigger phrase in a first audio segment identified during the monitoring and the first command in a second audio segment identified during the monitoring, a first electronic message may be generated. The first electronic message may be transmitted to the second device. The first electronic message may comprise instructions to perform an action associated with the first command. In some examples, the first electronic message may comprise a first push notification. In some examples, at least a portion of the first audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the first audio segment comprises speech similar to the first trigger phrase. In some examples, responsive to detecting the first trigger phrase, the first device may prompt the user to provide the first command to the first device. For example, the first device may display a first graphical object using a screen of the first device and/or output a first audio message and/or a first sound effect using a speaker of the first device. Alternatively and/or additionally, the first device may detect the first trigger phrase and the first command within a threshold period of time and/or with less than a threshold number of words between the first trigger phrase and the first command (e.g., consecutively) and/or may not prompt the user to provide the first command to the first device.
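- The decision of whether to prompt the user for the command might look like the sketch below; the specific thresholds are assumptions, not values taken from the disclosure.

```python
# Sketch: prompt only when the command did not follow the trigger phrase closely enough.
MAX_GAP_SECONDS = 5.0     # assumed threshold period of time
MAX_WORDS_BETWEEN = 2     # assumed threshold number of intervening words

def needs_prompt(trigger_time: float, command_time: float, words_between: int) -> bool:
    close_in_time = (command_time - trigger_time) <= MAX_GAP_SECONDS
    close_in_words = words_between < MAX_WORDS_BETWEEN
    return not (close_in_time and close_in_words)

print(needs_prompt(10.0, 11.2, 0))   # False: trigger and command spoken consecutively
print(needs_prompt(10.0, 30.0, 0))   # True: command came later, so prompt the user
```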
- The first device may (e.g., continue) monitoring audio received via the first microphone to detect the first command in the second audio segment. In some examples, responsive to detecting the first command in the second audio segment identified during the monitoring, the second audio segment may be transcribed to generate a text transcription (e.g., of the second audio segment). In some examples, the second audio segment may be transcribed using voice recognition information. The voice recognition information may be comprised within a voice recognition database. In some examples, the voice recognition database may be stored in one or more servers accessed by the first device via a network connection. Alternatively and/or additionally, the voice recognition database may be stored in a memory device of the first device. In some examples, voice recognition information may be extracted from the voice recognition database to transcribe the second audio segment. In some examples, the voice recognition database may be exclusive (e.g., generated and/or maintained exclusively) for the user. In some examples, the voice recognition database may be dynamically updated to add new voice recognition information (e.g., based upon analyzing speech of the user, by receiving the new voice recognition information from one or more servers accessed by the first device via a network connection, by receiving the new voice recognition information from the second device, etc.).
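- The local-first transcription flow described above might be sketched as follows. The transcription functions are hypothetical stand-ins; no real speech recognition is performed here.

```python
# Sketch: consult a voice recognition database held locally, fall back to a server
# only when the security setting allows it, and dynamically update the local database.
def transcribe_locally(audio_segment_id, local_db):
    return local_db.get(audio_segment_id)          # None if unknown locally

def transcribe_on_server(audio_segment_id):
    return f"<server transcription of {audio_segment_id}>"  # hypothetical stand-in

def transcribe(audio_segment_id, local_db, allow_network: bool):
    text = transcribe_locally(audio_segment_id, local_db)
    if text is None and allow_network:
        text = transcribe_on_server(audio_segment_id)
        local_db[audio_segment_id] = text          # add new voice recognition information
    return text

db = {"segment-0001": "Give me directions to Brooklyn Park"}
print(transcribe("segment-0001", db, allow_network=False))
print(transcribe("segment-0002", db, allow_network=True))
```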
- In some examples, the first command may be determined by the first device based upon an evaluation of the text transcription. Alternatively and/or additionally, the instructions to perform the action may comprise the text transcription (e.g., of the second audio segment). In an example, the second audio segment may comprise the user saying, “Give me directions to Brooklyn Park”. Accordingly, the second audio segment may be transcribed to generate the text transcription comprising, “Give me directions to Brooklyn Park”. The first electronic message (e.g., and/or the instructions to perform the action) may comprise (e.g., a representation of) the text transcription.
- In some examples, the second audio segment may not be transcribed (e.g., to generate the text transcription) by the first device. For example, the first device may generate an audio file based upon the second audio segment. In some examples, the first electronic message (e.g., and/or the instructions to perform the action) may comprise the audio file rather than the text transcription. In some examples, the second device may transcribe the audio file to generate the text transcription.
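- The two payload options (a text transcription generated by the first device, or an audio file transcribed by the receiving device) might be expressed as in the sketch below; the field names are assumptions.

```python
# Sketch: the first electronic message carries either the text transcription or the
# audio file, depending on where transcription happens.
def build_command_message(audio_segment, transcriber=None):
    if transcriber is not None:
        return {"type": "command", "text": transcriber(audio_segment)}
    return {"type": "command", "audio": audio_segment}   # receiver transcribes later

print(build_command_message("raw-audio-bytes"))
print(build_command_message("raw-audio-bytes",
                            transcriber=lambda a: "Give me directions to Brooklyn Park"))
```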
- In some examples, each trigger phrase of the plurality of trigger phrases may be associated with a device of the plurality of devices. For example, the first trigger phrase may be associated with the second device. Accordingly, the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon the first trigger phrase. Alternatively and/or additionally, the second device may be selected (e.g., for transmission of the first electronic message) from amongst the plurality of devices based upon an analysis of the first command. For example, the first device may analyze the first command, the second audio segment, the text transcription and/or the audio file to determine a context, a subject matter, etc. of the first command. The second device may (e.g., then) be selected based upon a determination that the second device is (e.g., best) fit for performing the action responsive to receiving the first electronic message.
- In some examples, responsive to receiving the first electronic message, the second device may perform the action. For example, the second device may activate a screen of the second device, display a second graphical object associated with the action using the screen of the second device and/or output a second audio message associated with the action using a speaker of the second device. In an example, responsive to receiving the first electronic message, the second device may display the second graphical object comprising a map interface and/or GPS directions corresponding to the first electronic message and/or the second device may output the second audio message corresponding to the first electronic message using the speaker of the second device, comprising “Here are directions to Brooklyn Park”.
- Alternatively and/or additionally, the second device may output an inquiry associated with the action using the speaker of the second device and/or the screen of the second device. In an example, responsive to receiving the first electronic message comprising “Give me directions to Brooklyn Park”, the second device may output an inquiry using the speaker of the second device comprising “Did you mean Brooklyn Park in New York City or Brooklyn Park in Kansas City?”. In some examples, the second device may transmit a second electronic message to the first device comprising instructions to monitor audio received via the first microphone of the first device to detect a second command, associated with the first command.
- Alternatively and/or additionally, the second device may not transmit the second electronic message to the first device. For example, the first trigger phrase may be detected by the first device (e.g., after the inquiry is outputted by the speaker of the second device). Responsive to receiving the second electronic message and/or detecting the first trigger phrase, the first device may monitor audio received via the first microphone of the first device to detect the second command. In some examples, responsive to detecting the second command in a third audio segment identified during the monitoring, the first device may generate a third electronic message comprising instructions to perform a second action associated with the second command. In some examples, the third electronic message may comprise a third push notification.
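- The inquiry-and-follow-up exchange described above might be sketched as below; the helper names are assumptions and the listening step is reduced to a placeholder.

```python
# Sketch: the assistant device poses an inquiry, the first device listens for the
# clarifying command and forwards it in a further electronic message.
def handle_inquiry(inquiry_text, listen_for_command, send_message):
    print(f"assistant asks: {inquiry_text}")
    follow_up = listen_for_command()   # e.g., third audio segment: "New York City"
    send_message({"type": "command", "text": follow_up})

handle_inquiry(
    "Did you mean Brooklyn Park in New York City or Brooklyn Park in Kansas City?",
    listen_for_command=lambda: "New York City",
    send_message=lambda msg: print(f"third electronic message -> second device: {msg}"),
)
```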
- In some examples, the third audio segment may be transcribed to generate a second text transcription (e.g., of the third audio segment). In some examples, the third electronic message (e.g., and/or the instructions to perform the second action) may comprise the second text transcription. In an example, the third audio segment may comprise the user saying, “New York City”. Accordingly, the third audio segment may be transcribed to generate the second text transcription comprising, “New York City”.
- In some examples, the third audio segment may not be transcribed (e.g., to generate the second text transcription) by the first device. For example, the first device may generate a second audio file based upon the third audio segment. In some examples, the third electronic message (e.g., and/or the instructions to perform the second action) may comprise the second audio file rather than the second text transcription. In some examples, the second device may transcribe the second audio file to generate the second text transcription.
- The first device may transmit the third electronic message to the second device. In some examples, responsive to receiving the third electronic message, the second device may perform the second action. For example, the second device may activate the screen of the second device, display a fourth graphical object associated with the second action using the screen of the second device and/or output a third audio message associated with the second action using a speaker of the second device. In an example, the second device may display the fourth graphical object comprising a map interface and/or GPS directions corresponding to the third electronic message. The second device may (e.g., further) output the third audio message corresponding to the third electronic message using the speaker of the second device, comprising “Here are directions to Brooklyn Park in New York City”.
- In some examples, the first device may interact with a third device of the plurality of devices responsive to detecting a second trigger phrase and/or a third command corresponding to the third device. In some examples, the third device and/or the second trigger phrase may be associated with a second virtual assistant. For example, the second virtual assistant may be installed onto the third device.
- Responsive to detecting the second trigger phrase in a fourth audio segment identified during the monitoring and the third command in a fifth audio segment identified during the monitoring, a fourth electronic message may be generated. The fourth electronic message may (e.g., then) be transmitted to the third device. The fourth electronic message may comprise instructions to perform a third action associated with the third command. In some examples, the fourth electronic message may comprise a fourth push notification.
- In some examples, at least a portion of the fourth audio segment may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases. For example, the first device may determine that the fourth audio segment comprises speech similar to the second trigger phrase. In some examples, responsive to detecting the second trigger phrase, the first device may prompt the user to provide the third command to the first device. For example, the first device may display a fifth graphical object using the screen of the first device and/or output a fourth audio message and/or a second sound effect using the speaker of the first device.
- The first device may (e.g., continue) monitoring audio received via the first microphone to detect the third command in the fifth audio segment. In some examples, responsive to detecting the third command, the fifth audio segment may be transcribed to generate a third text transcription (e.g., of the fifth audio segment). In some examples, the fourth electronic message (e.g., and/or the instructions to perform the third action) may comprise the third text transcription. In some examples, the fifth audio segment may not be transcribed (e.g., to generate the third transcription) by the first device. For example, the first device may generate a third audio file based upon the fifth audio segment. In some examples, the fourth electronic message (e.g., and/or the instructions to perform the third action) may comprise the third audio file rather than the third text transcription. In some examples, the third device may transcribe the third audio file to generate the third text transcription. In an example, the fifth audio segment may comprise the user saying, “Please read me my unread text messages”.
- In some examples, the second trigger phrase may be associated with the third device. Accordingly, the third device may be selected (e.g., for transmission of the fourth electronic message) from amongst the plurality of devices based upon the second trigger phrase.
- Alternatively and/or additionally, the third device may be selected (e.g., for transmission of the fourth electronic message) from amongst the plurality of devices based upon an analysis of the third command. For example, the first device may analyze the third command, the fifth audio segment, the third text transcription and/or the third audio file to determine a context, a subject matter, etc. of the third command. The third device may (e.g., then) be selected based upon a determination that the third device is (e.g., best) fit and/or match (e.g., relative to one or more other devices) for performing the third action responsive to receiving the fourth electronic message. The fit and/or match of each device may be determined (e.g., scored) based upon one or more criteria, which may be retrieved from a database, or (e.g., manually) input/defined by a user.
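- One way such a fit/match score could be computed is sketched below; the criteria, weights and context keywords are assumptions, and in practice they might be retrieved from a database or defined by the user as noted above.

```python
# Sketch: score each candidate device against keywords derived from the command and
# pick the best fit relative to the other devices.
def score_device(device, command_context):
    return sum(weight for criterion, weight in device["criteria"].items()
               if criterion in command_context)

def select_device(devices, command_context):
    """Return the device with the highest fit score for the command."""
    return max(devices, key=lambda d: score_device(d, command_context))

devices = [
    {"id": "second_device", "criteria": {"navigation": 1.0, "maps": 0.8}},
    {"id": "third_device", "criteria": {"messaging": 1.0, "reading": 0.5}},
]
print(select_device(devices, {"messaging", "reading"})["id"])  # -> third_device
```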
- In some examples, responsive to receiving the fourth electronic message, the third device may perform the third action. For example, the third device may activate a screen of the third device, display a sixth graphical object associated with the third action using the screen of the third device, and/or output a fifth audio message associated with the third action using a speaker of the third device. In an example, responsive to receiving the fourth electronic message, the third device may display the sixth graphical object comprising a list of unread text messages (e.g., comprising one or more unread text messages received by the third device) and/or the third device may output one or more audio messages using the speaker of the third device, wherein each audio message of the one or more audio messages may correspond to an unread text message of the one or more unread text messages.
- In an example, the first device may detect the first trigger phrase, the first command, the second trigger phrase and/or the third command, consecutively. For example, while the second device performs the first action (e.g., and/or the second action), the first device may detect the second trigger phrase and/or the third command. Accordingly, the first device may transmit the fourth electronic message (e.g., corresponding to the third command) to the third device while the second device performs the first action (e.g., and/or the second action). In this way, the first action (e.g., and/or the second action) may be performed by the second device and the third action may be performed by the third device, simultaneously.
- In some examples, the first device, the second device and/or the third device may be connected to a local area network (e.g., via Ethernet, Wi-Fi, etc.). Accordingly, the first device may transmit the first electronic message and/or the third electronic message to the second device via the local area network. Alternatively and/or additionally, the second device may transmit the second electronic message to the first device via the local area network. Alternatively and/or additionally, the first device may transmit the fourth electronic message to the third device via the local area network. Alternatively and/or additionally, the first device may be connected to the second device using a first wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the second device using Bluetooth. Accordingly, the first device may transmit the first electronic message and/or the third electronic message to the second device using the first wireless connection. Alternatively and/or additionally, the second device may transmit the second electronic message to the first device using the first wireless connection. Alternatively and/or additionally, the first device may be connected to the third device using a second wireless connection (e.g., a Bluetooth connection, a personal area network, etc.). For example, the first device may be paired to the third device using Bluetooth. Accordingly, the first device may transmit the fourth electronic message to the third device using the second wireless connection.
-
FIGS. 5A-5C illustrate examples of a system 501 for detecting a trigger phrase and transmitting an electronic message (e.g., associated with the trigger phrase) to a device. A user, such as user James, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.). Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device 502. The first device 502 may monitor audio received via a first microphone 522 of the first device 502. The plurality of devices may comprise a second device 504 (e.g., a smartphone) associated with a first virtual assistant (e.g., of the plurality of virtual assistants), a third device 506 (e.g., a tablet) associated with a second virtual assistant (e.g., of the plurality of virtual assistants) and/or one or more (e.g., other) devices associated with one or more (e.g., other) virtual assistants. - The
second device 504 may comprise a camera 528, a screen 524, a microphone 510, a button 512 and/or a speaker 514. Alternatively and/or additionally, the third device 506 may comprise a camera 530, a screen 526, a microphone 516, a button 518 and/or a speaker 520. In some examples, the second device 504 and/or the first virtual assistant may be associated with a first trigger phrase of a plurality of trigger phrases corresponding to the plurality of virtual assistants. Alternatively and/or additionally, the third device 506 and/or the second virtual assistant may be associated with a second trigger phrase of the plurality of trigger phrases. The first trigger phrase may comprise “Hello Alpha” and the second trigger phrase may comprise “Hey Beta”. -
FIG. 5A illustrates the first device 502 detecting the second trigger phrase in a first audio segment 508 identified during the monitoring. For example, the first audio segment 508 may comprise the user saying “Hey Beta”. In some examples, at least a portion of the first audio segment 508 may be compared with (e.g., each trigger phrase of) the plurality of trigger phrases (e.g., associated with the plurality of virtual assistants). -
FIG. 5B illustrates a backend system 550 (e.g., on the first device 502, on a server connected to the first device 502 via a network, etc.) that may compare the first audio segment 508 to the first trigger phrase and/or the second trigger phrase. For example, the first audio segment 508 may be compared to the first trigger phrase to determine that the first audio segment 508 comprises speech 26% similar to the first trigger phrase. Alternatively and/or additionally, the first audio segment 508 may be compared to the second trigger phrase to determine that the first audio segment 508 comprises speech 92% similar to the second trigger phrase. Accordingly, the second trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment 508 and the second trigger phrase is above a trigger phrase threshold (e.g., 85%). The second virtual assistant may be selected (e.g., from amongst the plurality of virtual assistants) based upon the second trigger phrase. Alternatively and/or additionally, the third device 506 may be selected (e.g., from amongst the plurality of devices) based upon the second virtual assistant (e.g., and/or the second trigger phrase). -
FIG. 5C illustrates the first device 502 transmitting a first electronic message 532 to the third device 506. In some examples, the first device 502 may generate the first electronic message 532 comprising instructions to activate an input function of the third device 506. The first electronic message 532 may comprise a first push notification. The first electronic message 532 may (e.g., further) comprise instructions to activate a speaker function of the third device 506 and/or a screen function of the third device 506. Alternatively and/or additionally, the first electronic message 532 may (e.g., merely) comprise an indication that the second trigger phrase was detected. The third device 506 may interpret the indication that the second trigger phrase was detected as instructions (e.g., and/or a trigger) to activate the microphone 516, instructions (e.g., and/or a trigger) to activate the camera 530, instructions (e.g., and/or a trigger) to activate the speaker 520 and/or instructions (e.g., and/or a trigger) to activate the screen 526. - Responsive to receiving the first
electronic message 532, the third device 506 may prompt the user to provide a first command to the third device 506. For example, the third device may activate (e.g., turn on) the screen 526, display a first graphical object 536 using the screen 526, output a first audio message and/or a first sound effect using the speaker 520, activate the camera 530 (e.g., in order to receive the first command) and/or activate the microphone 516 (e.g., in order to receive the first command). In an example, the first graphical object 536, the first audio message and/or the first sound effect may indicate to the user that the third device 506 is ready to receive the first command. For example, the first graphical object 536 may comprise “LISTENING FOR YOUR COMMAND” and/or the first audio message (e.g., outputted using the speaker 520) may comprise “Please state your command”. The third device 506 may (e.g., then) receive the first command in a second audio segment 534 using the microphone 516. Alternatively and/or additionally, the third device 506 may receive the first command using the camera 530 and/or via text received using a keyboard (e.g., and/or an on-screen keyboard) of the third device 506. The third device 506 may (e.g., then) perform a first action based upon the first command. -
FIGS. 6A-6C illustrate examples of a system 601 for detecting a trigger phrase, detecting a command and transmitting an electronic message (e.g., associated with the command) to a device. A user, such as user Janet, may access and/or interact with a plurality of virtual assistants using a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.). Each device of the plurality of devices may have a virtual assistant of the plurality of virtual assistants (e.g., installed) and/or may be connected to a first device 602. The first device 602 may monitor audio received via a first microphone 622 of the first device 602. The plurality of devices may comprise a second device 604 (e.g., a smartphone) associated with a first virtual assistant (e.g., of the plurality of virtual assistants), a third device 606 (e.g., a tablet) associated with a second virtual assistant (e.g., of the plurality of virtual assistants) and/or one or more (e.g., other) devices associated with one or more (e.g., other) virtual assistants. - The
second device 604 may comprise a camera 628, a screen 624, a microphone 610, a button 612 and/or a speaker 614. Alternatively and/or additionally, the third device 606 may comprise a camera 630, a screen 626, a microphone 616, a button 618 and/or a speaker 620. In some examples, the second device 604 and/or the first virtual assistant may be associated with a first trigger phrase of a plurality of trigger phrases corresponding to the plurality of virtual assistants. Alternatively and/or additionally, the third device 606 and/or the second virtual assistant may be associated with a second trigger phrase of the plurality of trigger phrases. The first trigger phrase may comprise “Hello Alpha” and the second trigger phrase may comprise “Hey Beta”. -
FIG. 6A illustrates the first device 602 detecting the first trigger phrase in a first audio segment 632 identified during the monitoring and a first command in a second audio segment 634 identified during the monitoring. For example, the first audio segment 632 may comprise the user saying “Hello Alpha”. In some examples, at least a portion of the first audio segment 632 may be compared with (e.g., each trigger phrase of) a plurality of trigger phrases associated with the plurality of virtual assistants. The second audio segment 634 may comprise the user saying “How do you make pancakes?”. -
FIG. 6B illustrates a backend system 650 (e.g., on the first device 602, on a server connected to the first device 602 via a network, etc.) that may compare the first audio segment 632 to the first trigger phrase and/or transcribe the second audio segment 634 to generate a text transcription. For example, the first audio segment 632 may be compared to the first trigger phrase to determine that the first audio segment 632 comprises speech 88% similar to the first trigger phrase. Accordingly, the first trigger phrase may be identified responsive to determining that a similarity of at least a portion of the first audio segment 632 and the first trigger phrase is above a trigger phrase threshold (e.g., 85%). In some examples, responsive to identifying the first trigger phrase, the first device 602 may stop comparing the first audio segment 632 with trigger phrases of the plurality of trigger phrases. The second device 604 may be selected (e.g., from amongst the plurality of devices) based upon the first trigger phrase. - In some examples, the
second audio segment 634 may (e.g., then) be transcribed to generate a text transcription (e.g., of the second audio segment 634). For example, the text transcription may comprise “HOW DO YOU MAKE PANCAKES”. Alternatively and/or additionally, an audio file may be generated based upon the second audio segment 634. -
FIG. 6C illustrates the first device 602 transmitting a first electronic message 638 to the second device 604. In some examples, the first device 602 may generate the first electronic message 638 comprising instructions to perform an action associated with the first command. For example, the first electronic message 638 may comprise the text transcription and/or the audio file. The first electronic message 638 may comprise a first push notification. In some examples, responsive to receiving the first electronic message 638, the second device 604 may perform the action. For example, the second device 604 may activate the screen 624, display a first graphical object 640 associated with the action using the screen 624, and/or output a first audio message associated with the action using the speaker 614. For example, the first graphical object 640 may comprise “Pancake Recipe” (e.g., and/or the first audio message may comprise cooking instructions for preparing pancakes). - It may be appreciated that the disclosed subject matter may assist a user in interacting with a plurality of virtual assistants by monitoring audio received using a microphone of a first device to detect trigger phrases and/or commands and/or by transmitting electronic messages based upon the trigger phrases and/or the commands to one or more devices of a plurality of devices (e.g., smartphones, tablets, computers, smart speakers, voice command devices, etc.) associated with the plurality of virtual assistants.
- Implementation of at least some of the disclosed subject matter may lead to benefits including, but not limited to, a reduction in power consumption and/or an increase in battery life of the plurality of devices (e.g., as a result of the first device comprising a dedicated power source and/or being connected to a power supply, as a result of the microphone of the first device being activated to monitor audio received using the microphone, as a result of detecting trigger phrases and/or commands during the monitoring, as a result of transmitting electronic messages to the plurality of devices corresponding to the trigger phrases and/or the commands, as a result of a plurality of microphones of the plurality of devices being deactivated, as a result of the plurality of devices not monitoring audio received by the plurality of microphones, etc.).
- Alternatively and/or additionally, implementation of at least some of the disclosed subject matter may lead to benefits including greater security and/or privacy for the user (e.g., as a result of the first device not being connected to the internet, as a result of the first device only connecting to the internet for voice recognition information responsive to detecting trigger phrases and/or commands, as a result of microphones of devices of the plurality of devices that are connected to the internet being deactivated, as a result of the devices that are connected to the internet not monitoring audio received by the microphones, etc.).
- Alternatively and/or additionally, implementation of at least some of the disclosed subject matter may lead to benefits including a reduction in bandwidth (e.g., as a result of identifying the trigger phrases and/or transcribing the commands by accessing a voice recognition database stored in a memory device of the first device rather than in one or more servers, etc.). Alternatively and/or additionally, implementation of at least some of the disclosed subject matter may lead to benefits including an increase in speed and usability of the plurality of devices (e.g., as a result of fewer operations performed by the plurality of devices without monitoring audio received by the plurality of microphones, etc.).
- In some examples, at least some of the disclosed subject matter may be implemented on a client device, and in some examples, at least some of the disclosed subject matter may be implemented on a server (e.g., hosting a service accessible via a network, such as the Internet).
-
FIG. 7 is an illustration of a scenario 700 involving an example non-transitory machine readable medium 702. The non-transitory machine readable medium 702 may comprise processor-executable instructions 712 that when executed by a processor 716 cause performance (e.g., by the processor 716) of at least some of the provisions herein (e.g., embodiment 714). The non-transitory machine readable medium 702 may comprise a memory semiconductor (e.g., a semiconductor utilizing static random access memory (SRAM), dynamic random access memory (DRAM), and/or synchronous dynamic random access memory (SDRAM) technologies), a platter of a hard disk drive, a flash memory device, or a magnetic or optical disc (such as a compact disc (CD), digital versatile disc (DVD), or floppy disk). The example non-transitory machine readable medium 702 stores computer-readable data 704 that, when subjected to reading 706 by a reader 710 of a device 708 (e.g., a read head of a hard disk drive, or a read operation invoked on a solid-state storage device), express the processor-executable instructions 712. In some embodiments, the processor-executable instructions 712, when executed, cause performance of operations, such as at least some of the example method 400 of FIG. 4A and/or the example method 450 of FIG. 4B, for example. In some embodiments, the processor-executable instructions 712 are configured to cause implementation of a system, such as at least some of the example system 501 of FIGS. 5A-5C and/or the example system 601 of FIGS. 6A-6C, for example. - 3. Usage of Terms
- As used in this application, “component,” “module,” “system”, “interface”, and/or the like are generally intended to refer to a computer-related entity, either hardware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a controller and the controller can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
- Unless specified otherwise, “first,” “second,” and/or the like are not intended to imply a temporal aspect, a spatial aspect, an ordering, etc. Rather, such terms are merely used as identifiers, names, etc. for features, elements, items, etc. For example, a first object and a second object generally correspond to object A and object B or two different or two identical objects or the same object.
- Moreover, “example” is used herein to mean serving as an instance, illustration, etc., and not necessarily as advantageous. As used herein, “or” is intended to mean an inclusive “or” rather than an exclusive “or”. In addition, “a” and “an” as used in this application are generally to be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form. Also, at least one of A and B and/or the like generally means A or B or both A and B. Furthermore, to the extent that “includes”, “having”, “has”, “with”, and/or variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising”.
- Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing at least some of the claims.
- Furthermore, the claimed subject matter may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed subject matter. The term “article of manufacture” as used herein is intended to encompass a computer program accessible from any computer-readable device, carrier, or media. Of course, many modifications may be made to this configuration without departing from the scope or spirit of the claimed subject matter.
- Various operations of embodiments are provided herein. In an embodiment, one or more of the operations described may constitute computer readable instructions stored on one or more computer and/or machine readable media, which if executed will cause the operations to be performed. The order in which some or all of the operations are described should not be construed as to imply that these operations are necessarily order dependent. Alternative ordering will be appreciated by one skilled in the art having the benefit of this description. Further, it will be understood that not all operations are necessarily present in each embodiment provided herein. Also, it will be understood that not all operations are necessary in some embodiments.
- Also, although the disclosure has been shown and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art based upon a reading and understanding of this specification and the annexed drawings. The disclosure includes all such modifications and alterations and is limited only by the scope of the following claims. In particular regard to the various functions performed by the above described components (e.g., elements, resources, etc.), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component which performs the specified function of the described component (e.g., that is functionally equivalent), even though not structurally equivalent to the disclosed structure. In addition, while a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application.
Claims (20)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/941,395 US11151991B2 (en) | 2018-03-30 | 2018-03-30 | Electronic message transmission |
US17/503,810 US11922937B2 (en) | 2018-03-30 | 2021-10-18 | Electronic message transmission |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/941,395 US11151991B2 (en) | 2018-03-30 | 2018-03-30 | Electronic message transmission |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/503,810 Continuation US11922937B2 (en) | 2018-03-30 | 2021-10-18 | Electronic message transmission |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190304443A1 true US20190304443A1 (en) | 2019-10-03 |
US11151991B2 US11151991B2 (en) | 2021-10-19 |
Family
ID=68057113
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/941,395 Active 2039-01-22 US11151991B2 (en) | 2018-03-30 | 2018-03-30 | Electronic message transmission |
US17/503,810 Active 2038-07-23 US11922937B2 (en) | 2018-03-30 | 2021-10-18 | Electronic message transmission |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/503,810 Active 2038-07-23 US11922937B2 (en) | 2018-03-30 | 2021-10-18 | Electronic message transmission |
Country Status (1)
Country | Link |
---|---|
US (2) | US11151991B2 (en) |
Cited By (73)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190273963A1 (en) * | 2016-06-27 | 2019-09-05 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US10573321B1 (en) * | 2018-09-25 | 2020-02-25 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US10606555B1 (en) | 2017-09-29 | 2020-03-31 | Sonos, Inc. | Media playback system with concurrent voice assistance |
US10614807B2 (en) | 2016-10-19 | 2020-04-07 | Sonos, Inc. | Arbitration-based voice recognition |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US10714115B2 (en) | 2016-06-09 | 2020-07-14 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10743101B2 (en) | 2016-02-22 | 2020-08-11 | Sonos, Inc. | Content mixing |
US20200278832A1 (en) * | 2019-02-28 | 2020-09-03 | Qualcomm Incorporated | Voice activation for computing devices |
US10847178B2 (en) | 2018-05-18 | 2020-11-24 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
US10847143B2 (en) | 2016-02-22 | 2020-11-24 | Sonos, Inc. | Voice control of a media playback system |
US10847164B2 (en) | 2016-08-05 | 2020-11-24 | Sonos, Inc. | Playback device supporting concurrent voice assistants |
US10873819B2 (en) | 2016-09-30 | 2020-12-22 | Sonos, Inc. | Orientation-based playback device microphone selection |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US10878811B2 (en) | 2018-09-14 | 2020-12-29 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US10880644B1 (en) | 2017-09-28 | 2020-12-29 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US10891932B2 (en) | 2017-09-28 | 2021-01-12 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US10970035B2 (en) | 2016-02-22 | 2021-04-06 | Sonos, Inc. | Audio response playback |
US10971158B1 (en) * | 2018-10-05 | 2021-04-06 | Facebook, Inc. | Designating assistants in multi-assistant environment based on identified wake word received from a user |
US11017789B2 (en) | 2017-09-27 | 2021-05-25 | Sonos, Inc. | Robust Short-Time Fourier Transform acoustic echo cancellation during audio playback |
US11024311B2 (en) * | 2014-10-09 | 2021-06-01 | Google Llc | Device leadership negotiation among voice interface devices |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US11042355B2 (en) | 2016-02-22 | 2021-06-22 | Sonos, Inc. | Handling of loss of pairing between networked devices |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US11080005B2 (en) | 2017-09-08 | 2021-08-03 | Sonos, Inc. | Dynamic computation of system response volume |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11159880B2 (en) | 2018-12-20 | 2021-10-26 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
US11170787B2 (en) * | 2018-04-12 | 2021-11-09 | Spotify Ab | Voice-based authentication |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US11184969B2 (en) | 2016-07-15 | 2021-11-23 | Sonos, Inc. | Contextualization of voice inputs |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11183181B2 (en) | 2017-03-27 | 2021-11-23 | Sonos, Inc. | Systems and methods of multiple voice services |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
US11197096B2 (en) | 2018-06-28 | 2021-12-07 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
US11200889B2 (en) | 2018-11-15 | 2021-12-14 | Sonos, Inc. | Dilated convolutions and gating for efficient keyword spotting |
US11302326B2 (en) | 2017-09-28 | 2022-04-12 | Sonos, Inc. | Tone interference cancellation |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
US11343614B2 (en) | 2018-01-31 | 2022-05-24 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
US11380322B2 (en) | 2017-08-07 | 2022-07-05 | Sonos, Inc. | Wake-word detection suppression |
US20220215835A1 (en) * | 2021-01-06 | 2022-07-07 | Comcast Cable Communications, Llc | Evaluating user device activations |
US20220230634A1 (en) * | 2021-01-15 | 2022-07-21 | Harman International Industries, Incorporated | Systems and methods for voice exchange beacon devices |
US11405430B2 (en) | 2016-02-22 | 2022-08-02 | Sonos, Inc. | Networked microphone device control |
US11432030B2 (en) | 2018-09-14 | 2022-08-30 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
US20220301557A1 (en) * | 2021-03-19 | 2022-09-22 | Mitel Networks Corporation | Generating action items during a conferencing session |
US20220300249A1 (en) * | 2019-03-19 | 2022-09-22 | Spotify Ab | Refinement of voice query interpretation |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11482978B2 (en) | 2018-08-28 | 2022-10-25 | Sonos, Inc. | Audio notifications |
US11501773B2 (en) | 2019-06-12 | 2022-11-15 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11551700B2 (en) | 2021-01-25 | 2023-01-10 | Sonos, Inc. | Systems and methods for power-efficient keyword detection |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11556306B2 (en) | 2016-02-22 | 2023-01-17 | Sonos, Inc. | Voice controlled media playback system |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11641559B2 (en) | 2016-09-27 | 2023-05-02 | Sonos, Inc. | Audio playback settings for voice interaction |
US11646023B2 (en) | 2019-02-08 | 2023-05-09 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
US11664023B2 (en) | 2016-07-15 | 2023-05-30 | Sonos, Inc. | Voice detection by multiple devices |
US11676590B2 (en) | 2017-12-11 | 2023-06-13 | Sonos, Inc. | Home graph |
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
US11798553B2 (en) | 2019-05-03 | 2023-10-24 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
EP4301092A4 (en) * | 2021-02-25 | 2024-05-01 | Panasonic Intellectual Property Management Co., Ltd. | REGULATOR AND COOKING APPLIANCE WITH SAID REGULATOR |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US12254884B2 (en) | 2014-10-09 | 2025-03-18 | Google Llc | Hotword detection on multiple devices |
US12283269B2 (en) | 2020-10-16 | 2025-04-22 | Sonos, Inc. | Intent inference in audiovisual communication sessions |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030210770A1 (en) * | 2002-05-10 | 2003-11-13 | Brian Krejcarek | Method and apparatus for peer-to-peer voice communication using voice recognition and proper noun identification |
US20110054900A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Hybrid command and control between resident and remote speech recognition facilities in a mobile voice-to-speech application |
US20140229184A1 (en) * | 2013-02-14 | 2014-08-14 | Google Inc. | Waking other devices for additional data |
US20140278435A1 (en) * | 2013-03-12 | 2014-09-18 | Nuance Communications, Inc. | Methods and apparatus for detecting a voice command |
US20160217790A1 (en) * | 2014-10-09 | 2016-07-28 | Google Inc. | Hotword detection on multiple devices |
US20180247645A1 (en) * | 2015-08-19 | 2018-08-30 | Huawei Technologies Co., Ltd. | Communication Method, Server, and Device |
US20190325869A1 (en) * | 2018-04-23 | 2019-10-24 | Spotify Ab | Activation Trigger Processing |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8340975B1 (en) * | 2011-10-04 | 2012-12-25 | Theodore Alfred Rosenberger | Interactive speech recognition device and system for hands-free building control |
US20140007117A1 (en) * | 2012-06-13 | 2014-01-02 | Bluebox | Methods and apparatus for modifying software applications |
US9697378B2 (en) | 2013-12-13 | 2017-07-04 | International Business Machines Corporation | Network encrypted data object stored on an encrypted file system |
KR20160023089A (en) * | 2014-08-21 | 2016-03-03 | 엘지전자 주식회사 | Digital device and method for controlling the same |
US9699550B2 (en) * | 2014-11-12 | 2017-07-04 | Qualcomm Incorporated | Reduced microphone power-up latency |
US9691378B1 (en) * | 2015-11-05 | 2017-06-27 | Amazon Technologies, Inc. | Methods and devices for selectively ignoring captured audio data |
KR20170086814A (en) * | 2016-01-19 | 2017-07-27 | 삼성전자주식회사 | Electronic device for providing voice recognition and method thereof |
US10264030B2 (en) * | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US10638279B2 (en) * | 2016-07-08 | 2020-04-28 | Openback Limited | Method and system for generating local mobile device notifications |
KR102729069B1 (en) * | 2016-12-01 | 2024-11-13 | 삼성전자 주식회사 | Lamp device for inputting or outputting voice signals and a method of driving the lamp device |
US10916243B2 (en) * | 2016-12-27 | 2021-02-09 | Amazon Technologies, Inc. | Messaging from a shared device |
KR101889279B1 (en) * | 2017-01-16 | 2018-08-21 | 주식회사 케이티 | System and method for provining sercive in response to voice command |
US10524046B2 (en) * | 2017-12-06 | 2019-12-31 | Ademco Inc. | Systems and methods for automatic speech recognition |
US10672380B2 (en) * | 2017-12-27 | 2020-06-02 | Intel IP Corporation | Dynamic enrollment of user-defined wake-up key-phrase for speech enabled computer system |
-
2018
- 2018-03-30 US US15/941,395 patent/US11151991B2/en active Active
-
2021
- 2021-10-18 US US17/503,810 patent/US11922937B2/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030210770A1 (en) * | 2002-05-10 | 2003-11-13 | Brian Krejcarek | Method and apparatus for peer-to-peer voice communication using voice recognition and proper noun identification |
US20110054900A1 (en) * | 2007-03-07 | 2011-03-03 | Phillips Michael S | Hybrid command and control between resident and remote speech recognition facilities in a mobile voice-to-speech application |
US20140229184A1 (en) * | 2013-02-14 | 2014-08-14 | Google Inc. | Waking other devices for additional data |
US20140278435A1 (en) * | 2013-03-12 | 2014-09-18 | Nuance Communications, Inc. | Methods and apparatus for detecting a voice command |
US20160217790A1 (en) * | 2014-10-09 | 2016-07-28 | Google Inc. | Hotword detection on multiple devices |
US20180247645A1 (en) * | 2015-08-19 | 2018-08-30 | Huawei Technologies Co., Ltd. | Communication Method, Server, and Device |
US20190325869A1 (en) * | 2018-04-23 | 2019-10-24 | Spotify Ab | Activation Trigger Processing |
Cited By (147)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12254884B2 (en) | 2014-10-09 | 2025-03-18 | Google Llc | Hotword detection on multiple devices |
US11670297B2 (en) * | 2014-10-09 | 2023-06-06 | Google Llc | Device leadership negotiation among voice interface devices |
US12046241B2 (en) * | 2014-10-09 | 2024-07-23 | Google Llc | Device leadership negotiation among voice interface devices |
US20210249015A1 (en) * | 2014-10-09 | 2021-08-12 | Google Llc | Device Leadership Negotiation Among Voice Interface Devices |
US11024311B2 (en) * | 2014-10-09 | 2021-06-01 | Google Llc | Device leadership negotiation among voice interface devices |
US11832068B2 (en) | 2016-02-22 | 2023-11-28 | Sonos, Inc. | Music service selection |
US11405430B2 (en) | 2016-02-22 | 2022-08-02 | Sonos, Inc. | Networked microphone device control |
US10764679B2 (en) | 2016-02-22 | 2020-09-01 | Sonos, Inc. | Voice control of a media playback system |
US11184704B2 (en) | 2016-02-22 | 2021-11-23 | Sonos, Inc. | Music service selection |
US11863593B2 (en) | 2016-02-22 | 2024-01-02 | Sonos, Inc. | Networked microphone device control |
US11556306B2 (en) | 2016-02-22 | 2023-01-17 | Sonos, Inc. | Voice controlled media playback system |
US10847143B2 (en) | 2016-02-22 | 2020-11-24 | Sonos, Inc. | Voice control of a media playback system |
US12047752B2 (en) | 2016-02-22 | 2024-07-23 | Sonos, Inc. | Content mixing |
US11983463B2 (en) | 2016-02-22 | 2024-05-14 | Sonos, Inc. | Metadata exchange involving a networked playback system and a networked microphone system |
US10743101B2 (en) | 2016-02-22 | 2020-08-11 | Sonos, Inc. | Content mixing |
US11726742B2 (en) | 2016-02-22 | 2023-08-15 | Sonos, Inc. | Handling of loss of pairing between networked devices |
US11736860B2 (en) | 2016-02-22 | 2023-08-22 | Sonos, Inc. | Voice control of a media playback system |
US11212612B2 (en) | 2016-02-22 | 2021-12-28 | Sonos, Inc. | Voice control of a media playback system |
US11042355B2 (en) | 2016-02-22 | 2021-06-22 | Sonos, Inc. | Handling of loss of pairing between networked devices |
US11750969B2 (en) | 2016-02-22 | 2023-09-05 | Sonos, Inc. | Default playback device designation |
US10970035B2 (en) | 2016-02-22 | 2021-04-06 | Sonos, Inc. | Audio response playback |
US10971139B2 (en) | 2016-02-22 | 2021-04-06 | Sonos, Inc. | Voice control of a media playback system |
US11513763B2 (en) | 2016-02-22 | 2022-11-29 | Sonos, Inc. | Audio response playback |
US11006214B2 (en) | 2016-02-22 | 2021-05-11 | Sonos, Inc. | Default playback device designation |
US11514898B2 (en) | 2016-02-22 | 2022-11-29 | Sonos, Inc. | Voice control of a media playback system |
US11133018B2 (en) | 2016-06-09 | 2021-09-28 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US11545169B2 (en) | 2016-06-09 | 2023-01-03 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10714115B2 (en) | 2016-06-09 | 2020-07-14 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US11064248B2 (en) * | 2016-06-27 | 2021-07-13 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US20190273963A1 (en) * | 2016-06-27 | 2019-09-05 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US11356730B2 (en) * | 2016-06-27 | 2022-06-07 | Amazon Technologies, Inc. | Systems and methods for routing content to an associated output device |
US11184969B2 (en) | 2016-07-15 | 2021-11-23 | Sonos, Inc. | Contextualization of voice inputs |
US11664023B2 (en) | 2016-07-15 | 2023-05-30 | Sonos, Inc. | Voice detection by multiple devices |
US11979960B2 (en) | 2016-07-15 | 2024-05-07 | Sonos, Inc. | Contextualization of voice inputs |
US10847164B2 (en) | 2016-08-05 | 2020-11-24 | Sonos, Inc. | Playback device supporting concurrent voice assistants |
US11531520B2 (en) | 2016-08-05 | 2022-12-20 | Sonos, Inc. | Playback device supporting concurrent voice assistants |
US11641559B2 (en) | 2016-09-27 | 2023-05-02 | Sonos, Inc. | Audio playback settings for voice interaction |
US11516610B2 (en) | 2016-09-30 | 2022-11-29 | Sonos, Inc. | Orientation-based playback device microphone selection |
US10873819B2 (en) | 2016-09-30 | 2020-12-22 | Sonos, Inc. | Orientation-based playback device microphone selection |
US11308961B2 (en) | 2016-10-19 | 2022-04-19 | Sonos, Inc. | Arbitration-based voice recognition |
US11727933B2 (en) | 2016-10-19 | 2023-08-15 | Sonos, Inc. | Arbitration-based voice recognition |
US10614807B2 (en) | 2016-10-19 | 2020-04-07 | Sonos, Inc. | Arbitration-based voice recognition |
US11183181B2 (en) | 2017-03-27 | 2021-11-23 | Sonos, Inc. | Systems and methods of multiple voice services |
US12217748B2 (en) | 2017-03-27 | 2025-02-04 | Sonos, Inc. | Systems and methods of multiple voice services |
US11900937B2 (en) | 2017-08-07 | 2024-02-13 | Sonos, Inc. | Wake-word detection suppression |
US11380322B2 (en) | 2017-08-07 | 2022-07-05 | Sonos, Inc. | Wake-word detection suppression |
US11500611B2 (en) | 2017-09-08 | 2022-11-15 | Sonos, Inc. | Dynamic computation of system response volume |
US11080005B2 (en) | 2017-09-08 | 2021-08-03 | Sonos, Inc. | Dynamic computation of system response volume |
US11017789B2 (en) | 2017-09-27 | 2021-05-25 | Sonos, Inc. | Robust Short-Time Fourier Transform acoustic echo cancellation during audio playback |
US11646045B2 (en) | 2017-09-27 | 2023-05-09 | Sonos, Inc. | Robust short-time Fourier transform acoustic echo cancellation during audio playback |
US12047753B1 (en) | 2017-09-28 | 2024-07-23 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US12236932B2 (en) | 2017-09-28 | 2025-02-25 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US11538451B2 (en) | 2017-09-28 | 2022-12-27 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US11302326B2 (en) | 2017-09-28 | 2022-04-12 | Sonos, Inc. | Tone interference cancellation |
US11769505B2 (en) | 2017-09-28 | 2023-09-26 | Sonos, Inc. | Echo of tone interference cancellation using two acoustic echo cancellers |
US10880644B1 (en) | 2017-09-28 | 2020-12-29 | Sonos, Inc. | Three-dimensional beam forming with a microphone array |
US10891932B2 (en) | 2017-09-28 | 2021-01-12 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10606555B1 (en) | 2017-09-29 | 2020-03-31 | Sonos, Inc. | Media playback system with concurrent voice assistance |
US11288039B2 (en) | 2017-09-29 | 2022-03-29 | Sonos, Inc. | Media playback system with concurrent voice assistance |
US11893308B2 (en) | 2017-09-29 | 2024-02-06 | Sonos, Inc. | Media playback system with concurrent voice assistance |
US11175888B2 (en) | 2017-09-29 | 2021-11-16 | Sonos, Inc. | Media playback system with concurrent voice assistance |
US11451908B2 (en) | 2017-12-10 | 2022-09-20 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
US11676590B2 (en) | 2017-12-11 | 2023-06-13 | Sonos, Inc. | Home graph |
US11689858B2 (en) | 2018-01-31 | 2023-06-27 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US11343614B2 (en) | 2018-01-31 | 2022-05-24 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US11170787B2 (en) * | 2018-04-12 | 2021-11-09 | Spotify Ab | Voice-based authentication |
US11797263B2 (en) | 2018-05-10 | 2023-10-24 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10847178B2 (en) | 2018-05-18 | 2020-11-24 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
US11715489B2 (en) | 2018-05-18 | 2023-08-01 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection |
US11792590B2 (en) | 2018-05-25 | 2023-10-17 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
US11696074B2 (en) | 2018-06-28 | 2023-07-04 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US11197096B2 (en) | 2018-06-28 | 2021-12-07 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US11563842B2 (en) | 2018-08-28 | 2023-01-24 | Sonos, Inc. | Do not disturb feature for audio notifications |
US11482978B2 (en) | 2018-08-28 | 2022-10-25 | Sonos, Inc. | Audio notifications |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US11778259B2 (en) | 2018-09-14 | 2023-10-03 | Sonos, Inc. | Networked devices, systems and methods for associating playback devices based on sound codes |
US11551690B2 (en) | 2018-09-14 | 2023-01-10 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US10878811B2 (en) | 2018-09-14 | 2020-12-29 | Sonos, Inc. | Networked devices, systems, and methods for intelligently deactivating wake-word engines |
US11432030B2 (en) | 2018-09-14 | 2022-08-30 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
US11790937B2 (en) | 2018-09-21 | 2023-10-17 | Sonos, Inc. | Voice detection optimization using sound metadata |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US12230291B2 (en) | 2018-09-21 | 2025-02-18 | Sonos, Inc. | Voice detection optimization using sound metadata |
US10811015B2 (en) | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11031014B2 (en) * | 2018-09-25 | 2021-06-08 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US10573321B1 (en) * | 2018-09-25 | 2020-02-25 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US12165651B2 (en) | 2018-09-25 | 2024-12-10 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US11727936B2 (en) | 2018-09-25 | 2023-08-15 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
US12165644B2 (en) | 2018-09-28 | 2024-12-10 | Sonos, Inc. | Systems and methods for selective wake word detection |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US11790911B2 (en) | 2018-09-28 | 2023-10-17 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US10692518B2 (en) | 2018-09-29 | 2020-06-23 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US11501795B2 (en) | 2018-09-29 | 2022-11-15 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US12062383B2 (en) | 2018-09-29 | 2024-08-13 | Sonos, Inc. | Linear filtering for noise-suppressed speech detection via multiple network microphone devices |
US10971158B1 (en) * | 2018-10-05 | 2021-04-06 | Facebook, Inc. | Designating assistants in multi-assistant environment based on identified wake word received from a user |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
US11741948B2 (en) | 2018-11-15 | 2023-08-29 | Sonos Vox France Sas | Dilated convolutions and gating for efficient keyword spotting |
US11200889B2 (en) | 2018-11-15 | 2021-12-14 | Sonos, Inc. | Dilated convolutions and gating for efficient keyword spotting |
US11557294B2 (en) | 2018-12-07 | 2023-01-17 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11538460B2 (en) | 2018-12-13 | 2022-12-27 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
US11540047B2 (en) | 2018-12-20 | 2022-12-27 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
US11159880B2 (en) | 2018-12-20 | 2021-10-26 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
US11646023B2 (en) | 2019-02-08 | 2023-05-09 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
US12153858B2 (en) * | 2019-02-28 | 2024-11-26 | Qualcomm Incorporated | Voice activation for computing devices |
US20200278832A1 (en) * | 2019-02-28 | 2020-09-03 | Qualcomm Incorporated | Voice activation for computing devices |
US20220300249A1 (en) * | 2019-03-19 | 2022-09-22 | Spotify Ab | Refinement of voice query interpretation |
US12079541B2 (en) * | 2019-03-19 | 2024-09-03 | Spotify Ab | Refinement of voice query interpretation |
US11798553B2 (en) | 2019-05-03 | 2023-10-24 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
US11501773B2 (en) | 2019-06-12 | 2022-11-15 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
US11854547B2 (en) | 2019-06-12 | 2023-12-26 | Sonos, Inc. | Network microphone device with command keyword eventing |
US11354092B2 (en) | 2019-07-31 | 2022-06-07 | Sonos, Inc. | Noise classification for event detection |
US12211490B2 (en) | 2019-07-31 | 2025-01-28 | Sonos, Inc. | Locally distributed keyword detection |
US11710487B2 (en) | 2019-07-31 | 2023-07-25 | Sonos, Inc. | Locally distributed keyword detection |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11714600B2 (en) | 2019-07-31 | 2023-08-01 | Sonos, Inc. | Noise classification for event detection |
US11551669B2 (en) | 2019-07-31 | 2023-01-10 | Sonos, Inc. | Locally distributed keyword detection |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
US11862161B2 (en) | 2019-10-22 | 2024-01-02 | Sonos, Inc. | VAS toggle based on device orientation |
US11869503B2 (en) | 2019-12-20 | 2024-01-09 | Sonos, Inc. | Offline voice control |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11556307B2 (en) | 2020-01-31 | 2023-01-17 | Sonos, Inc. | Local voice data processing |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
US11961519B2 (en) | 2020-02-07 | 2024-04-16 | Sonos, Inc. | Localized wakeword verification |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
US11694689B2 (en) | 2020-05-20 | 2023-07-04 | Sonos, Inc. | Input detection windowing |
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
US12283269B2 (en) | 2020-10-16 | 2025-04-22 | Sonos, Inc. | Intent inference in audiovisual communication sessions |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US20220215835A1 (en) * | 2021-01-06 | 2022-07-07 | Comcast Cable Communications, Llc | Evaluating user device activations |
US20220230634A1 (en) * | 2021-01-15 | 2022-07-21 | Harman International Industries, Incorporated | Systems and methods for voice exchange beacon devices |
US11893985B2 (en) * | 2021-01-15 | 2024-02-06 | Harman International Industries, Incorporated | Systems and methods for voice exchange beacon devices |
US11551700B2 (en) | 2021-01-25 | 2023-01-10 | Sonos, Inc. | Systems and methods for power-efficient keyword detection |
EP4301092A4 (en) * | 2021-02-25 | 2024-05-01 | Panasonic Intellectual Property Management Co., Ltd. | Regulator and cooking appliance with said regulator |
US20220301557A1 (en) * | 2021-03-19 | 2022-09-22 | Mitel Networks Corporation | Generating action items during a conferencing session |
US11798549B2 (en) * | 2021-03-19 | 2023-10-24 | Mitel Networks Corporation | Generating action items during a conferencing session |
Also Published As
Publication number | Publication date |
---|---|
US20220036901A1 (en) | 2022-02-03 |
US11922937B2 (en) | 2024-03-05 |
US11151991B2 (en) | 2021-10-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11922937B2 (en) | | Electronic message transmission |
US10855676B2 (en) | | Audio verification |
US11580325B2 (en) | | Systems and methods for hyper parameter optimization for improved machine learning ensembles |
US10817256B2 (en) | | Volume control |
US10986062B2 (en) | | Subscription transfer |
US10019291B2 (en) | | Determining resource utilization by one or more tasks |
US11362979B2 (en) | | Displaying messaging interfaces based upon email conversations |
US20230214400A1 (en) | | Search across multiple user interfaces |
EP3493486A1 (en) | | Publishing message conversations to electronic forums |
US10686742B2 (en) | | Adjusting recipients of a message |
US20230099092A1 (en) | | Presenting content of an application |
US11314937B2 (en) | | Controlling a graphical user interface to present a message comprising a representation of an item |
US20180267914A1 (en) | | Device interfacing |
US11334219B2 (en) | | Presenting messages via graphical objects in a graphical user interface |
US11113079B2 (en) | | Mobile assistant |
US10536539B2 (en) | | Data sessionization |
US10846748B2 (en) | | Onboarding feature cues |
US20160063542A1 (en) | | Providing information associated with a product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: OATH INC., NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BHAGWAN, VARUN;REEL/FRAME:045396/0136 Effective date: 20180330 |
| FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| AS | Assignment | Owner name: VERIZON MEDIA INC., NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:OATH INC.;REEL/FRAME:054258/0635 Effective date: 20201005 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| AS | Assignment | Owner name: YAHOO ASSETS LLC, VIRGINIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAHOO AD TECH LLC (FORMERLY VERIZON MEDIA INC.);REEL/FRAME:058982/0282 Effective date: 20211117 |
| AS | Assignment | Owner name: ROYAL BANK OF CANADA, AS COLLATERAL AGENT, CANADA Free format text: PATENT SECURITY AGREEMENT (FIRST LIEN);ASSIGNOR:YAHOO ASSETS LLC;REEL/FRAME:061571/0773 Effective date: 20220928 |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |