CN113094475B - Dialog intention recognition system and method based on context attention flow - Google Patents
Classifications
- G06F16/3329 — Natural language query formulation
- G06F16/3344 — Query execution using natural language analysis
- G06F16/35 — Clustering; Classification
- G06F40/211 — Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
Abstract
The invention provides a dialogue intention recognition system and method based on context attention flow. The system comprises an input encoding module, an autocorrelation coefficient analysis module, a feedforward neural network, and a multi-task learning module. The input encoding module encodes an input sentence containing a plurality of words into a corresponding representation vector. The autocorrelation coefficient analysis module concatenates the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computes above-sentence representation vectors fused with question information; it then performs feature fusion on the above-sentence representation vectors to obtain context-sentence representation vectors fused with dialogue-context information; finally, it performs a dot-product operation on the representation vector of the current sentence and the context-sentence representation vectors to obtain a feature vector for intention recognition. The multi-task learning module optimizes the feature vector according to the system's total loss function, thereby improving the efficiency and accuracy of dialogue intention recognition.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a system and a method for recognizing dialog intentions based on contextual attention flow.
Background
The core functional module of a dialogue robot is intention recognition. The robot must first predict the intention of the dialogue sentence sent by the user and then, based on that intention, send the corresponding answer, thereby completing an online automatic response. A conversation is a multi-turn question-and-answer process, yet current online robots consider only the content of a single sentence when recognizing intention, and many dialogue intentions cannot be recognized from a single sentence alone. As a result, a considerable number of sentence intentions cannot be accurately recognized on a single-sentence basis, causing the robot's question-answer responses to fail.
In order to solve the problem of intention recognition in multi-turn dialogue question answering, industry and academia currently adopt two main types of methods:
Memory-network-based methods: a memory network generally includes an input encoding module, a memory module, and an output prediction module. Such methods maintain a memory-slot space (the memory module) that stores the historical sentences of the conversation, dynamically update the network's memory state by applying an attention mechanism, generate a feature vector from the memory state, and predict the dialogue intention based on that feature vector.
Reading-comprehension-based methods: a reading comprehension model generally uses an encoder to encode the input article and question, obtains word-granularity representations of the article through techniques such as mutual attention between article and question content and self-attention, constructs two prediction heads for the start and end positions, predicts for each word the probability P(start) of being the start position and P(end) of being the end position of the answer, and finally selects the phrase span with the maximum P(start)·P(end) as the answer to the question.
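The span-selection step of such reading-comprehension models can be sketched numerically. This is a minimal illustration only: the probability values and the `best_span` helper are invented for the example, not taken from the patent.

```python
import numpy as np

def best_span(p_start, p_end, max_len=10):
    """Return the (start, end) pair maximizing P(start) * P(end), with start <= end."""
    best, best_score = (0, 0), -1.0
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            s = p_start[i] * p_end[j]
            if s > best_score:
                best_score, best = s, (i, j)
    return best, best_score

# Made-up per-word probabilities for a 4-word article:
p_start = np.array([0.1, 0.6, 0.2, 0.1])
p_end = np.array([0.1, 0.1, 0.7, 0.1])
span, score = best_span(p_start, p_end)   # words 1..2 form the answer span
```

Enumerating start/end pairs with a length cap keeps the search quadratic in the (short) article length while ruling out degenerate spans.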
When reading-comprehension techniques are applied to the contextual dependencies of multi-turn conversations, the relevant historical sentences can be located accurately; however, the topic articles required for reading comprehension are difficult to obtain in industry, and after the relevant historical sentences are retrieved, a model must still be constructed to fuse their information with the current sentence in order to predict the dialogue intention. Memory-network-based models, in turn, cannot directly select the relevant historical dialogue sentences as the contextual dependency of the conversation, so it is difficult for them to accurately fuse the conversational context into the current sentence. Furthermore, they may repeatedly select the encoded features of certain sentences at each step, so the model cannot adequately attend to other relevant features, impairing its ability to model multi-turn conversations.
Therefore, a scheme is needed to improve the efficiency and accuracy of dialogue intention recognition and enhance the robot's response capability.
Disclosure of Invention
The invention aims to provide a system and a method for recognizing dialog intentions based on contextual attention flow, which are used for realizing the technical effect of improving the efficiency and the accuracy of dialog intention recognition.
In a first aspect, the present invention provides a contextual attention flow based dialog intent recognition system comprising: the system comprises an input coding module, an autocorrelation coefficient analysis module, a feedforward neural network and a multitask learning module;
the input encoding module is used for encoding an input sentence containing a plurality of words to obtain a corresponding representation vector; the input sentences comprise the current sentence and a plurality of historical dialogue sentences from the dialogue sample set whose dialogue intentions and dialogue types are known;
the autocorrelation coefficient analysis module is used for concatenating the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computing above-sentence representation vectors fused with question information; then performing feature fusion on the above-sentence representation vectors to obtain context-sentence representation vectors fused with dialogue-context information; and finally performing a dot-product operation on the representation vector of the current sentence and the context-sentence representation vectors to obtain a feature vector for intention recognition;
the feedforward neural network is used for processing the feature vector and inputting the processed feature vector into the multi-task learning module;
the multi-task learning module is used for computing a dialogue intention recognition loss function from the processing result of the feedforward neural network and the actual dialogue intention of each historical dialogue sentence; computing a dialogue above-text type recognition loss function from the processing result of the feedforward neural network and the actual type of each historical dialogue sentence; concatenating the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computing a dialogue intention evidence loss function through a conditional random field; and then computing the system's total loss function from the dialogue intention recognition loss function, the dialogue above-text type recognition loss function, and the dialogue intention evidence loss function, and optimizing the feature vector according to the total loss function;
the calculation mode of the expression vector of the above sentence is as follows:
in the formula,u i 1representing the expression vector of the above sentence; tanh represents a hyperbolic tangent function;W cq andb cq all represent the learning parameters of the above-question attention layer;qrepresenting a current sentence;u i is shown asiA history dialogue statement;Nrepresenting a total number of historical conversational utterances;irepresenting variables with the value range of 1-N;
the calculation mode of the expression vector of the context sentence is as follows:
in the formula,u i 2representing a context sentence representation vector;Nrepresenting a total number of historical conversational utterances;W self ∈Rd×d,Rd×drepresenting a real number matrix with d-dimension of rows and columns, wherein R represents a real number;attn ij representing the attention weight after the normalization processing of the softmax function;score ij an attention weight between the ith and jth history statements above representing the current statement;kis a variable, representing the second in a range of valueskA plurality of;
the calculation mode of the feature vector is as follows:
in the formula,vec feature the feature vector is represented by a vector of features,W qc andb qc all represent questions-the learning parameters of the above attention layers,qwhich is indicative of the current sentence,dotrepresenting a dot product operation.
Further, the autocorrelation coefficient analysis module comprises an above-question attention layer, a self-attention layer, and a question-above attention layer. The above-question attention layer concatenates the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computes, through a hyperbolic tangent function, above-sentence representation vectors fused with question information. The self-attention layer performs feature fusion on the above-sentence representation vectors through a self-attention mechanism to obtain context-sentence representation vectors fused with historical dialogue-context information. The question-above attention layer performs a dot-product operation on the representation vector of the current sentence and the context-sentence representation vectors to obtain a feature vector for intention recognition.
Further, the multi-task learning module comprises a dialogue intention recognition unit, a dialogue above-text type recognition unit, and a dialogue above-text evidence selection unit. The dialogue intention recognition unit computes the dialogue intention recognition loss function from the processing result of the feedforward neural network and the intention of each historical dialogue sentence; the dialogue above-text type recognition unit computes the dialogue above-text type recognition loss function from the processing result of the feedforward neural network and the type of each historical dialogue sentence; and the dialogue above-text evidence selection unit concatenates the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computes the dialogue above-text evidence selection loss function from the relevance between the current sentence and each historical dialogue sentence.
Further, the dialog intention recognition unit, the dialog context class recognition unit and the dialog context evidence selection unit are implemented in the following manner:
in the above equation, Loss1 represents the dialog intent recognition Loss function; loss2 represents the class recognition penalty function above the dialog; loss3 represents the evidence selection Loss function above the dialog; crf denotes a conditional random field;ffrepresenting a feed-forward neural network;θ acflow network parameters representing an autocorrelation coefficient analysis module;θ ff a network parameter representing a feedforward neural network;θ crf network parameters representing conditional random fields;qrepresenting a current sentence;u 1,u 2,…,u N representing each historical dialog statement;x k representing the second in a sample set of dialogkA sample is obtained;CErepresenting a cross entropy operation;MLErepresenting a maximum likelihood estimation operation;intent k to representx k A corresponding intent;type k to representx k A corresponding type;tag k to representx k A flag as to whether the current statement is relevant;sel N then the mark of each historical dialogue sentence in the mark sequence is represented, 0 represents irrelevant, and 1 represents relevant; d represents a data set in which each sample is containedx k Corresponding intentionintent k Type (c) oftype k And related indiciatag k 。
Further, the total loss function is calculated as:

min obj = λ_1·Loss1 + λ_2·Loss2 + λ_3·Loss3

where min obj denotes the total loss function; Loss1 denotes the dialogue intention recognition loss function; Loss2 denotes the dialogue above-text type recognition loss function; Loss3 denotes the dialogue above-text evidence selection loss function; and λ_1, λ_2, λ_3 denote hyper-parameters.
In a second aspect, an embodiment of the present invention provides a dialog intention recognition method based on contextual attention flow, which is applied to the dialog intention recognition system described above, and includes:
s1, coding an input sentence containing a plurality of words to obtain a corresponding characterization vector; the input statement comprises a plurality of historical dialogue statements and a current statement;
s2, splicing the characterization vectors of the current statement and the characterization vectors of the historical dialogue statements and then calculating to obtain a previous statement expression vector fused with problem information; then, performing feature fusion according to the above sentence expression vector to obtain a context sentence expression vector fused with historical dialogue context information; finally, performing dot product operation according to the representation vector of the current statement and the representation vector of the context statement to obtain a feature vector for intention identification;
s3, processing the feature vectors through a feedforward neural network and inputting the processed feature vectors into the multi-task learning module;
s4, optimizing the feature vector through a multi-task learning module according to a total loss function of the system;
and S5, analyzing according to the optimized feature vector to obtain the intention of the current statement.
The beneficial effects achievable by the invention are as follows: the system and method for dialogue intention recognition based on context attention flow obtain the feature vector for recognizing the current sentence's intention through the autocorrelation coefficient analysis module and optimize that feature vector through the trained multi-task learning module, thereby improving the efficiency and accuracy of dialogue intention recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a schematic diagram of a topology of a dialog intention recognition system based on contextual attention flow according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating a dialog intention recognition method based on contextual attention flow according to an embodiment of the present invention.
Icon: 10-dialog intention recognition system; 100-input coding module; 200-an autocorrelation coefficient analysis module; 300-a feed-forward neural network; 400-multitask learning module.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a schematic diagram illustrating a topology of a dialog intention recognition system based on contextual attention flow according to an embodiment of the present invention.
In one embodiment, the present invention provides a contextual attention flow based dialog intent recognition system 10 comprising an input encoding module 100, an autocorrelation coefficient analysis module 200, a feedforward neural network 300, and a multitasking learning module 400;
the input encoding module 100 is configured to encode an input sentence containing a plurality of words to obtain a corresponding representation vector; the input sentences comprise the current sentence and a plurality of historical dialogue sentences from the dialogue sample whose dialogue intentions and dialogue types are known;
the autocorrelation coefficient analysis module 200 is configured to concatenate the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and compute above-sentence representation vectors fused with question information; then perform feature fusion on the above-sentence representation vectors to obtain context-sentence representation vectors fused with dialogue-context information; and finally perform a dot-product operation on the representation vector of the current sentence and the context-sentence representation vectors to obtain a feature vector for intention recognition;
the feedforward neural network 300 is used for processing the feature vectors and inputting the processed feature vectors into the multi-task learning module;
the multitask learning module 400 is configured to calculate according to the processing result of the feedforward neural network and the actual dialogue intention of each historical dialogue statement to obtain a corresponding dialogue intention identification loss function; analyzing according to the processing result of the feedforward neural network and the actual type of each historical dialogue statement to obtain a corresponding dialogue upper-text type recognition loss function; simultaneously splicing the characterization vectors of the current statement and the characterization vectors of each historical dialogue statement, and calculating by a conditional random field to obtain a corresponding dialogue intention evidence loss function; and then, calculating to obtain a total loss function of the system according to the conversation intention identification loss function, the conversation upper text type identification loss function and the conversation intention evidence loss function, and optimizing the feature vector according to the total loss function.
Through the embodiment, the complexity of the system is reduced, and the efficiency and the accuracy of dialog intention recognition are improved.
In one embodiment, the input encoding module 100 may use an LSTM-CNN encoder. The encoder first uses a word-embedding layer based on GloVe word vectors to encode a sentence of n words into an n×d matrix, where each word corresponds to a d-dimensional vector in the matrix; an LSTM encoder then reads in the matrix, and the LSTM output is fed into a multi-kernel CNN. The CNN contains convolution kernels of lengths 1, 3, 5, 7, and 9 units; the convolution results of all kernels are concatenated and max-pooled to generate the representation vector of each encoded sentence.
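A minimal numpy sketch of the multi-kernel convolution and max-pooling step follows. The tensor sizes, the shortened kernel-width list, and the `conv1d_maxpool` helper are illustrative assumptions; a real implementation would use a deep-learning framework and include the GloVe embedding and LSTM stages.

```python
import numpy as np

def conv1d_maxpool(H, W):
    """Valid 1-D convolution of H (T x d) with kernel W (w x d x f), max-pooled over time."""
    T, _ = H.shape
    w = W.shape[0]
    outs = np.stack([np.tensordot(H[t:t + w], W, axes=([0, 1], [0, 1]))
                     for t in range(T - w + 1)])  # shape (T - w + 1, f)
    return outs.max(axis=0)                       # shape (f,) after max pooling

rng = np.random.default_rng(0)
T, d, f = 9, 8, 4                   # 9 tokens, 8-dim LSTM states, 4 filters per width
H = rng.standard_normal((T, d))     # stand-in for the LSTM output sequence
widths = [1, 3, 5]                  # the patent lists widths 1, 3, 5, 7, 9; shortened here
vec = np.concatenate([conv1d_maxpool(H, rng.standard_normal((w, d, f)))
                      for w in widths])
```

Concatenating the max-pooled outputs of all widths yields one fixed-length vector per sentence regardless of its token count.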
It should be noted that the input encoding module 100 is not limited to the LSTM-CNN encoder; other encoders, such as a Transformer network, may be used instead.
In one embodiment, the autocorrelation coefficient analysis module 200 comprises an above-question attention layer, a self-attention layer, and a question-above attention layer. The above-question attention layer concatenates the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computes, through a hyperbolic tangent function, above-sentence representation vectors fused with question information. The self-attention layer performs feature fusion on the above-sentence representation vectors through a self-attention mechanism to obtain context-sentence representation vectors fused with historical dialogue-context information. The question-above attention layer performs a dot-product operation on the representation vector of the current sentence and the context-sentence representation vectors to obtain a feature vector for intention recognition.
Specifically, the above-sentence representation vector is calculated as:

u_i^1 = tanh(W_{cq} [u_i; q] + b_{cq}), i = 1, …, N

where u_i^1 denotes the above-sentence representation vector; tanh denotes the hyperbolic tangent function; W_{cq} and b_{cq} denote the learnable parameters of the above-question attention layer; q denotes the current sentence; u_i denotes the i-th historical dialogue sentence; N denotes the total number of historical dialogue sentences; and i is a variable ranging from 1 to N.
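The above-question attention layer can be sketched as follows. The dimensions and the `above_question_attention` helper name are illustrative; the computation follows the variable definitions above (concatenate each history vector with the current-sentence vector, apply a linear map, then tanh).

```python
import numpy as np

def above_question_attention(U, q, W_cq, b_cq):
    """u_i^1 = tanh(W_cq [u_i; q] + b_cq) for each history sentence vector u_i."""
    Q = np.tile(q, (U.shape[0], 1))                     # repeat q for every u_i
    return np.tanh(np.concatenate([U, Q], axis=1) @ W_cq.T + b_cq)

rng = np.random.default_rng(1)
N, d = 3, 4
U = rng.standard_normal((N, d))        # representation vectors of N history sentences
q = rng.standard_normal(d)             # representation vector of the current sentence
W_cq = rng.standard_normal((d, 2 * d)) # maps the concatenated 2d vector back to d dims
b_cq = rng.standard_normal(d)
U1 = above_question_attention(U, q, W_cq, b_cq)
```

Each row of `U1` is a history-sentence vector already fused with the question information, bounded to (-1, 1) by the tanh.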
The context-sentence representation vector is calculated as:

score_{ij} = (u_i^1)^T W_{self} u_j^1, attn_{ij} = exp(score_{ij}) / Σ_{k=1}^{N} exp(score_{ik}), u_i^2 = Σ_{k=1}^{N} attn_{ik} u_k^1

where u_i^2 denotes the context-sentence representation vector; N denotes the total number of historical dialogue sentences; W_{self} ∈ R^{d×d}, where R^{d×d} denotes a real matrix with d rows and d columns and R denotes the real numbers; attn_{ij} denotes the attention weight after softmax normalization; score_{ij} denotes the attention weight between the i-th and j-th historical sentences above the current sentence; and k is a summation variable ranging from 1 to N.
The feature vector is calculated as:

a_i = dot(q, tanh(W_{qc} u_i^2 + b_{qc})), vec_feature = Σ_{i=1}^{N} softmax(a)_i u_i^2

where vec_feature denotes the feature vector; W_{qc} and b_{qc} denote the learnable parameters of the question-above attention layer; q denotes the current sentence; and dot denotes the dot-product operation.
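The self-attention and question-above steps can be sketched together. The exact pooling the patent uses after the dot-product step is not reproduced in this text, so the `context_features` helper below uses a standard attention-weighted sum as an assumed stand-in.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def context_features(U1, q, W_self, W_qc, b_qc):
    score = U1 @ W_self @ U1.T               # score_ij between fused history sentences
    attn = softmax(score, axis=1)            # attn_ij: each row sums to 1
    U2 = attn @ U1                           # u_i^2 = sum_k attn_ik * u_k^1
    gate = np.tanh(U2 @ W_qc.T + b_qc) @ q   # dot(q, tanh(W_qc u_i^2 + b_qc)) per i
    return softmax(gate) @ U2                # assumed attention-weighted pooling

rng = np.random.default_rng(2)
N, d = 3, 4
U1 = rng.standard_normal((N, d))  # outputs of the above-question attention layer
q = rng.standard_normal(d)
vec = context_features(U1, q,
                       rng.standard_normal((d, d)),   # W_self
                       rng.standard_normal((d, d)),   # W_qc
                       rng.standard_normal(d))        # b_qc
```

The result is a single d-dimensional feature vector that the feedforward network can consume for intent classification.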
In one embodiment, the multi-task learning module 400 comprises a dialogue intention recognition unit, a dialogue above-text type recognition unit, and a dialogue above-text evidence selection unit. The dialogue intention recognition unit computes the dialogue intention recognition loss function from the processing result of the feedforward neural network and the intention of each historical dialogue sentence; the dialogue above-text type recognition unit computes the dialogue above-text type recognition loss function from the processing result of the feedforward neural network and the type of each historical dialogue sentence; and the dialogue above-text evidence selection unit concatenates the representation vector of the current sentence with the representation vectors of the historical dialogue sentences and computes the dialogue above-text evidence selection loss function from the relevance between the current sentence and each historical dialogue sentence.
Specifically, the dialog intention recognition unit, the dialog context class recognition unit and the dialog context evidence selection unit are implemented in the following manners:
in the above equation, Loss1 represents the dialog intent recognition Loss function; loss2 represents the class recognition penalty function above the dialog; loss3 represents the evidence selection Loss function above the dialog; crf denotes a conditional random field;ffrepresenting a feed-forward neural network;θ acflow network parameters representing an autocorrelation coefficient analysis module;θ ff a network parameter representing a feedforward neural network;θ crf network parameters representing conditional random fields;qrepresenting a current sentence;u 1,u 2,…,u N representing each historical dialog statement;x k representing the second in a sample set of dialogkA sample is obtained;CErepresenting a cross entropy operation;MLErepresenting a maximum likelihood estimation operation;intent k to representx k A corresponding intent;type k to representx k A corresponding type;tag k to representx k A flag as to whether the current statement is relevant;sel N then the mark of each historical dialogue sentence in the mark sequence is represented, 0 represents irrelevant, and 1 represents relevant;d represents a data set in which each sample is containedx k Corresponding intentionintent k Type (c) oftype k And related indiciatag k 。
In one embodiment, the total loss function is calculated as:

min obj = λ_1·Loss1 + λ_2·Loss2 + λ_3·Loss3

where min obj denotes the total loss function; Loss1 denotes the dialogue intention recognition loss function; Loss2 denotes the dialogue above-text type recognition loss function; Loss3 denotes the dialogue above-text evidence selection loss function; and λ_1, λ_2, λ_3 denote hyper-parameters. The values of λ_1, λ_2, λ_3 can be obtained by a hyper-parameter grid search: for example, split a training data set into a training set and a validation set, measure the intention recognition accuracy under different hyper-parameter settings, and select the setting that achieves the highest accuracy on the validation set.
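The grid search over (λ_1, λ_2, λ_3) can be sketched as follows; `fake_eval` is an invented stand-in for the real train-and-validate loop, which would return validation accuracy.

```python
import itertools

def grid_search(candidates, evaluate):
    """Return the (l1, l2, l3) triple scoring highest under `evaluate`."""
    return max(itertools.product(candidates, repeat=3), key=evaluate)

def fake_eval(lams):
    # Stand-in for "train with these loss weights, return validation accuracy";
    # here the best point is placed at (1.0, 0.5, 0.5) by construction.
    l1, l2, l3 = lams
    return -((l1 - 1.0) ** 2 + (l2 - 0.5) ** 2 + (l3 - 0.5) ** 2)

best = grid_search([0.25, 0.5, 1.0], fake_eval)
```

With `c` candidate values per weight, the search trains c³ models; the candidate list is kept small for that reason.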
Through this embodiment, the feature vector used for dialog intent prediction becomes more accurate, thereby improving the accuracy of dialog intent recognition.
Referring to fig. 2, fig. 2 is a flowchart illustrating a dialog intention recognition method based on contextual attention flow according to an embodiment of the present invention.
In one embodiment, the present invention further provides a method for recognizing dialog intention based on contextual attention flow for the above-mentioned dialog intention recognition system, which is described in detail as follows.
S1, encoding an input sentence containing a plurality of words to obtain a corresponding characterization vector; the input sentences comprise a plurality of historical dialog sentences and a current sentence;
S2, splicing the characterization vector of the current sentence with the characterization vectors of the historical dialog sentences, then calculating to obtain above-sentence expression vectors fused with question information; next, performing feature fusion on the above-sentence expression vectors to obtain context sentence expression vectors fused with historical dialog context information; finally, performing a dot product operation on the characterization vector of the current sentence and the context sentence expression vectors to obtain a feature vector for intent recognition;
S3, processing the feature vector through a feedforward neural network and inputting the processed feature vector into the multi-task learning module;
S4, optimizing the feature vector through the multi-task learning module according to the total loss function of the system;
S5, analyzing the optimized feature vector to obtain the intent of the current sentence.
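The attention-flow computation in step S2 can be sketched as follows. This is a minimal toy illustration, not the patented implementation: the learnable parameters W_cq, b_cq, W_self, W_qc, b_qc are omitted (the above-question fusion is approximated here by element-wise addition plus tanh), and all function names and dimensions are illustrative assumptions.

```python
import math

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_flow(q, history):
    # Above-question attention: fuse the current sentence q into each
    # history sentence representation (patent uses tanh(W_cq[...] + b_cq))
    u1 = [tanh_vec([qi + ui for qi, ui in zip(q, u)]) for u in history]
    # Self-attention over the fused history representations:
    # score_ij -> softmax -> weighted sum gives context representations u2
    u2 = []
    for ui in u1:
        scores = [dot(ui, uj) for uj in u1]
        attn = softmax(scores)
        u2.append([sum(a * uk[d] for a, uk in zip(attn, u1))
                   for d in range(len(ui))])
    # Question-above attention: dot product of q with each context
    # representation yields the features used for intent recognition
    return [dot(q, u) for u in u2]

q = [0.1, 0.2, 0.3]                                   # current sentence vector
history = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]          # two history sentences
features = attention_flow(q, history)
print(len(features))  # one feature per history sentence
```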
Through the above process, the efficiency and accuracy of dialog intent recognition are improved.
Further, to address context-dependent intent recognition, the industry currently applies NLI (natural language inference) methods, memory networks, and other approaches. The present invention is compared against the following methods:
- BERT-NLI: uses BERT, an advanced natural language model, as the sentence encoder; the dialog context sentences are spliced together with the current dialog sentence into a single sequence and fed into BERT, whose pooled vector is then used as the feature vector for intent recognition.
- E2EMEM: an end-to-end memory network that forms a closed loop of input, memory update, and output for parameter updating.
- DMN: a dynamic memory network that updates its memory state with a dynamic gating algorithm, continuously refreshing the network's internal memory module.
- KVNet: a key-value network with parameterized key hashing that greatly enlarges the range of fact retrieval and improves retrieval-fusion precision; the dialog context can be regarded as the facts and the current dialog sentence as the retrieval request.
- DANet: a deep dialog-history fusion network that fuses dialog context information into the representation of the current dialog sentence via an attention mechanism, improving dialog intent recognition accuracy.
To test accuracy, we obtained about 900,000 dialogs from Taobao and manually labeled them (dialog intent, dialog type, and facts relevant to the dialog intent). We then trained and tested the method of the present invention (ACFlow) and the five industry methods above, using 90% of the data as the training set and 10% as the test set. The experimental results are shown in Table 1. As can be seen from Table 1, the accuracy of the proposed method is about 6-7% higher than that of the representative methods.
TABLE 1
In summary, embodiments of the present invention provide a dialog intent recognition system and method based on contextual attention flow, comprising an input encoding module, an autocorrelation coefficient analysis module, a feedforward neural network, and a multi-task learning module. The input encoding module encodes an input sentence containing a plurality of words to obtain a corresponding characterization vector; the input sentences comprise a current sentence and a plurality of historical dialog sentences whose dialog intents and dialog types are known in the dialog sample. The autocorrelation coefficient analysis module splices the characterization vector of the current sentence with the characterization vectors of the historical dialog sentences and calculates above-sentence expression vectors fused with question information; it then performs feature fusion on these vectors to obtain context sentence expression vectors fused with dialog context information; finally, it performs a dot product operation on the characterization vector of the current sentence and the context sentence expression vectors to obtain a feature vector for intent recognition. The feedforward neural network processes the feature vector and inputs the processed feature vector into the multi-task learning module, which optimizes the feature vector through the total loss function of the system. The efficiency and accuracy of dialog intent recognition are thereby improved.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A contextual attention flow based dialog intent recognition system comprising: the system comprises an input coding module, an autocorrelation coefficient analysis module, a feedforward neural network and a multitask learning module;
the input coding module is used for coding an input sentence containing a plurality of words to obtain a corresponding representation vector; the input sentences comprise a plurality of historical conversation sentences and current sentences of known conversation intents and conversation types in the conversation sample set;
the autocorrelation coefficient analysis module is used for splicing the characterization vectors of the current statement and the characterization vectors of the historical dialogue statements and then calculating to obtain the expression vector of the previous statement fused with the problem information; then, performing feature fusion according to the above sentence expression vector to obtain a context sentence expression vector fused with the dialogue context information; finally, performing dot product operation according to the representation vector of the current statement and the representation vector of the context statement to obtain a feature vector for intention identification;
the feedforward neural network is used for processing the characteristic vector and inputting the processed characteristic vector into the multi-task learning module;
the multi-task learning module is used for calculating according to the processing result of the feedforward neural network and the actual dialogue intention of each historical dialogue statement to obtain a corresponding dialogue intention identification loss function; analyzing according to the processing result of the feedforward neural network and the actual type of each historical dialogue statement to obtain a corresponding dialogue upper-part type recognition loss function; simultaneously splicing the characterization vectors of the current statement and the characterization vectors of each historical dialogue statement, and calculating by a conditional random field to obtain a corresponding dialogue intention evidence loss function; then, calculating to obtain a total loss function of the system according to the conversation intention identification loss function, the conversation upper text type identification loss function and the conversation intention evidence loss function, and optimizing the feature vector according to the total loss function;
the calculation mode of the expression vector of the above sentence is as follows:
In the formula, u_i^1 denotes the above-sentence expression vector; tanh denotes the hyperbolic tangent function; W_cq and b_cq denote learnable parameters of the above-question attention layer; q denotes the current sentence; u_i denotes the i-th historical dialog sentence; N denotes the total number of historical dialog sentences; and i is a variable ranging from 1 to N;
the calculation mode of the expression vector of the context sentence is as follows:
In the formula, u_i^2 denotes the context sentence expression vector; N denotes the total number of historical dialog sentences; W_self ∈ R^{d×d}, where R^{d×d} denotes a real-valued matrix with d rows and d columns and R denotes the real numbers; attn_ij denotes the attention weight after softmax normalization; score_ij denotes the attention weight between the i-th and j-th above sentences of the current sentence; and k is a variable denoting the k-th element of its value range;
the calculation mode of the feature vector is as follows:
In the formula, vec_feature denotes the feature vector; W_qc and b_qc denote learnable parameters of the question-above attention layer; q denotes the current sentence; and dot denotes the dot product operation.
2. The dialog intent recognition system of claim 1 wherein the autocorrelation coefficient analysis module comprises a above-question attention layer, a self-attention layer, and a question-above attention layer; the upper-question attention layer is used for splicing the representation vectors of the current statement and the representation vectors of the historical dialogue statements and then calculating through a hyperbolic tangent function to obtain an upper statement representation vector fused with question information; the self-attention layer is used for performing feature fusion on the above-mentioned sentence expression vector through a self-attention mechanism to obtain a context sentence expression vector fused with historical dialogue context information; the question-above attention layer is used for performing dot product operation according to the representation vector of the current statement and the representation vector of the context statement to obtain a feature vector for intention identification.
3. The dialog intent recognition system of claim 1 wherein the multitask learning module comprises a dialog intent recognition unit, a dialog context class recognition unit and a dialog context evidence selection unit; the dialogue intention recognition unit is used for calculating according to the processing result of the feedforward neural network and the intention of each historical dialogue statement to obtain a corresponding dialogue intention recognition loss function, and the dialogue upper class recognition unit is used for calculating according to the processing result of the feedforward neural network and the type of each historical dialogue statement to obtain a corresponding dialogue upper class recognition loss function; and the evidence selection unit above the dialog is used for splicing the representation vector of the current statement and the representation vectors of the historical dialog statements and then calculating according to the relevance between the current statement and the historical dialog statements to obtain a corresponding evidence selection loss function above the dialog.
4. The dialog intention recognition system according to claim 3, characterized in that the dialog intention recognition unit, the dialog context class recognition unit and the dialog context evidence selection unit are implemented in the following manner:
In the above equations, Loss1 denotes the dialog intent recognition loss function; Loss2 denotes the dialog-context type recognition loss function; Loss3 denotes the dialog-context evidence selection loss function; crf denotes a conditional random field; ff denotes a feedforward neural network; θ_acflow denotes the network parameters of the autocorrelation coefficient analysis module; θ_ff denotes the network parameters of the feedforward neural network; θ_crf denotes the network parameters of the conditional random field; q denotes the current sentence; u_1, u_2, …, u_N denote the historical dialog sentences; x_k denotes the k-th sample in the dialog sample set; CE denotes the cross-entropy operation; MLE denotes the maximum likelihood estimation operation; intent_k denotes the intent corresponding to x_k; type_k denotes the type corresponding to x_k; tag_k denotes the label indicating whether x_k is relevant to the current sentence; sel_N denotes the label of each historical dialog sentence in the label sequence, where 0 means irrelevant and 1 means relevant; and D denotes the data set containing each sample x_k together with its corresponding intent intent_k, type type_k, and relevance label tag_k.
5. The dialog intent recognition system of claim 4 wherein the total loss function is calculated by:
In the formula, min_obj denotes the total loss function; Loss1 denotes the dialog intent recognition loss function; Loss2 denotes the dialog-context type recognition loss function; Loss3 denotes the dialog-context evidence selection loss function; and λ1, λ2, λ3 denote hyperparameters.
6. A dialog intention recognition method based on contextual attention flow, applied to the dialog intention recognition system according to any one of claims 1 to 5, comprising:
S1, encoding an input sentence containing a plurality of words to obtain a corresponding characterization vector; the input sentences comprise a plurality of historical dialog sentences and a current sentence;
S2, splicing the characterization vector of the current sentence with the characterization vectors of the historical dialog sentences, then calculating to obtain above-sentence expression vectors fused with question information; next, performing feature fusion on the above-sentence expression vectors to obtain context sentence expression vectors fused with historical dialog context information; finally, performing a dot product operation on the characterization vector of the current sentence and the context sentence expression vectors to obtain a feature vector for intent recognition;
S3, processing the feature vector through a feedforward neural network and inputting the processed feature vector into the multi-task learning module;
S4, optimizing the feature vector through the multi-task learning module according to the total loss function of the system;
S5, analyzing the optimized feature vector to obtain the intent of the current sentence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110634398.4A CN113094475B (en) | 2021-06-08 | 2021-06-08 | Dialog intention recognition system and method based on context attention flow |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110634398.4A CN113094475B (en) | 2021-06-08 | 2021-06-08 | Dialog intention recognition system and method based on context attention flow |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113094475A CN113094475A (en) | 2021-07-09 |
CN113094475B true CN113094475B (en) | 2021-09-21 |
Family
ID=76664440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110634398.4A Active CN113094475B (en) | 2021-06-08 | 2021-06-08 | Dialog intention recognition system and method based on context attention flow |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113094475B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113590798B (en) * | 2021-08-09 | 2024-03-26 | 北京达佳互联信息技术有限公司 | Dialog intention recognition, training method for a model for recognizing dialog intention |
CN113849647B (en) * | 2021-09-28 | 2024-05-31 | 平安科技(深圳)有限公司 | Dialogue identity recognition method, device, equipment and storage medium |
CN114238549A (en) * | 2021-12-15 | 2022-03-25 | 平安科技(深圳)有限公司 | Training method and device of text generation model, storage medium and computer equipment |
CN114818738B (en) * | 2022-03-01 | 2024-08-02 | 达观数据有限公司 | Method and system for identifying intention track of customer service hotline user |
CN114611527B (en) * | 2022-03-01 | 2024-07-19 | 华南理工大学 | Task-oriented dialogue strategy learning method for user personality perception |
CN114420169B (en) * | 2022-03-31 | 2022-06-21 | 北京沃丰时代数据科技有限公司 | Emotion recognition method and device and robot |
CN115129844B (en) * | 2022-07-01 | 2025-02-07 | 北京师范大学 | A problem sustainability evaluation system based on supervised contrastive learning |
CN116822522B (en) * | 2023-06-13 | 2024-05-28 | 连连银通电子支付有限公司 | Semantic analysis method, semantic analysis device, semantic analysis equipment and storage medium |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101163010B1 (en) * | 2008-12-15 | 2012-07-09 | 한국전자통신연구원 | Apparatus for online advertisement selecting based on content affective and intend analysis and method thereof |
US10963273B2 (en) * | 2018-04-20 | 2021-03-30 | Facebook, Inc. | Generating personalized content summaries for users |
US10679613B2 (en) * | 2018-06-14 | 2020-06-09 | Accenture Global Solutions Limited | Spoken language understanding system and method using recurrent neural networks |
CN108920622B (en) * | 2018-06-29 | 2021-07-20 | 北京奇艺世纪科技有限公司 | Training method, training device and recognition device for intention recognition |
CN109241255B (en) * | 2018-08-20 | 2021-05-18 | 华中师范大学 | An Intent Recognition Method Based on Deep Learning |
US12087288B2 (en) * | 2018-09-06 | 2024-09-10 | Google Llc | Language understanding and dialogue state tracking in dialogue systems |
CN112699686B (en) * | 2021-01-05 | 2024-03-08 | 浙江诺诺网络科技有限公司 | Semantic understanding method, device, equipment and medium based on task type dialogue system |
CN112800196B (en) * | 2021-01-18 | 2024-03-01 | 南京明略科技有限公司 | FAQ question-answering library matching method and system based on twin network |
-
2021
- 2021-06-08 CN CN202110634398.4A patent/CN113094475B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN113094475A (en) | 2021-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113094475B (en) | Dialog intention recognition system and method based on context attention flow | |
CN113297364B (en) | Natural language understanding method and device in dialogue-oriented system | |
US20220129621A1 (en) | Bert-based machine-learning tool for predicting emotional response to text | |
CN110046221A (en) | A kind of machine dialogue method, device, computer equipment and storage medium | |
CN111061847A (en) | Dialogue generation and corpus expansion method and device, computer equipment and storage medium | |
Chen et al. | Joint multiple intent detection and slot filling via self-distillation | |
CN115186147B (en) | Dialog content generation method and device, storage medium, and terminal | |
CN118227769B (en) | Knowledge graph enhancement-based large language model question-answer generation method | |
CN111177325A (en) | Method and system for automatically generating answers | |
CN116991982B (en) | Interactive dialogue method, device, equipment and storage medium based on artificial intelligence | |
CN115497465A (en) | Voice interaction method and device, electronic equipment and storage medium | |
CN111914553A (en) | Financial information negative subject judgment method based on machine learning | |
CN113420136A (en) | Dialogue method, system, electronic equipment, storage medium and program product | |
CN110489730B (en) | Text processing method, device, terminal and storage medium | |
CN116341651A (en) | Entity recognition model training method and device, electronic equipment and storage medium | |
Rauf et al. | BCE4ZSR: Bi-encoder empowered by teacher cross-encoder for zero-shot cold-start news recommendation | |
CN113688636B (en) | Recommended method, device, computer equipment and storage medium for extended questions | |
CN112667788A (en) | Novel BERTEXT-based multi-round dialogue natural language understanding model | |
CN112818688B (en) | Text processing method, device, equipment and storage medium | |
Shi | E-Commerce Products Personalized Recommendation Based on Deep Learning | |
CN116136868A (en) | Reinforcement learning agent for multidimensional conversational action selection | |
Nishimoto et al. | Dialogue management with deep reinforcement learning: Balancing exploration and exploitation | |
Selamat et al. | Arabic script web documents language identification using decision tree-ARTMAP model | |
US12153884B1 (en) | Advanced transformer architecture with epistemic embedding for enhanced natural language processing | |
Aldabergen et al. | Question answering model construction by using transfer learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||