Beta version 1.0 released.

NDSS '19 talk (Feb 27) in San Diego

LipFuzzer Introduction

LipFuzzer aims to assess the problematic Intent Classifier (see the NDSS paper [1]) at large scale. The tool generates potentially dangerous voice commands that are likely to cause semantic inconsistency, such that a user reaches an unintended vApp or functionality (i.e., users think they are using voice commands correctly but get unwanted results).

LipFuzzer design

                            |            |        * Pronunciation
    Seed Voice      ==>     | NLP Engine | ==>    * Vocabulary       
    Command Input           |            |        * Grammar          
            .........     ---------             |       LipEngine      |
            /User   /     /Default/             |......................|
            /Defined/  +  /Fuzzing/    +++>     | *Module_1 *Module_2  |
            /Rules  /     /Rules  /   loading.. | *Module_3 ...        |
            ---------     ---------             | *Module_n            |
                                                   ||    ||  .... 
                                                   \/    \/

                                                  Potentially Dangerous
                                                  Voice Commands (Lapsus)

Seed Voice Command Input

The input can be any voice command (in English). In our study, we mainly focus on Voice Assistant Applications (vApps).


Natural language input such as voice commands does not carry enough information for the fuzzing tasks mentioned earlier. We leverage NLP techniques to retrieve computational linguistic information to build LAPSUS Models.

Pronunciation-level Information

We choose phonemes as the sound-level linguistic information since they are the basic sound units. We extract phonemes from each word using the CMU Pronouncing Dictionary in the NLTK package.

For vocabulary-level linguistic information, we use basic string metrics.

To tackle the ambiguity of natural language, we also use grammar-level linguistic information, i.e., PoS tagging, stemming, and lemmatization. In particular, PoS tagging captures grammar information by tagging tenses and other word contexts. Stemming and lemmatization are similar in function: both aim to reduce inflectional forms, and sometimes derivationally related forms, of a word to a common base form.
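As a rough illustration of the pronunciation-level and vocabulary-level data described above, the sketch below looks up phonemes and compares sequences with a basic string metric (Levenshtein distance). It is not LipFuzzer's own code. `TOY_CMUDICT`, `phonemes`, and `edit_distance` are hypothetical names, and the dictionary holds only a few toy entries written in the CMU Pronouncing Dictionary (ARPAbet) style so the example runs without downloading any corpus.

```python
# Toy entries in CMU Pronouncing Dictionary (ARPAbet) style; illustrative only.
# With NLTK installed and the corpus downloaded, nltk.corpus.cmudict.dict()
# returns a mapping of the same shape: word -> list of pronunciations.
TOY_CMUDICT = {
    "weather": [["W", "EH1", "DH", "ER0"]],
    "whether": [["W", "EH1", "DH", "ER0"]],
    "feather": [["F", "EH1", "DH", "ER0"]],
}


def phonemes(word, pron_dict=TOY_CMUDICT):
    """Return the first listed pronunciation of a word, or None if unknown."""
    prons = pron_dict.get(word.lower())
    return prons[0] if prons else None


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or phoneme lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Same sound, different spelling: phoneme distance 0, spelling distance 2.
print(edit_distance(phonemes("weather"), phonemes("whether")))
print(edit_distance("weather", "whether"))
```

Comparing at both levels is the point of the sketch: two words can be far apart in spelling yet identical in sound, which is the kind of case a voice interface cannot distinguish.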

What is being used? The following example demonstrates the linguistic data we use for fuzzing. We show one example from CoreNLP.

Fuzzing Rules

To instantiate each knowledge-transferred fuzzing rule, we define fuzzing rules based on the retrieved computational linguistic data. Specifically, each rule sets a “matching” condition and a mutation “action”. Please check the details in the Fuzzing Rule.
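One way to picture a rule made of a matching condition and a mutation action is sketched below. This is an assumption about the general shape, not LipFuzzer's actual rule format: the `FuzzingRule` class, its fields, and the `th_to_t` rule are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Phonemes = List[str]


@dataclass
class FuzzingRule:
    """Hypothetical rule shape: a matching condition plus a mutation action."""
    name: str
    matches: Callable[[Phonemes], bool]     # when does the rule fire?
    action: Callable[[Phonemes], Phonemes]  # how is the input mutated?

    def apply(self, phonemes: Phonemes) -> Optional[Phonemes]:
        """Return the mutated sequence, or None if the rule does not match."""
        return self.action(phonemes) if self.matches(phonemes) else None


# Illustrative pronunciation-level rule: replace the "TH" phoneme with "T".
th_to_t = FuzzingRule(
    name="th_to_t",
    matches=lambda ph: "TH" in ph,
    action=lambda ph: ["T" if p == "TH" else p for p in ph],
)

print(th_to_t.apply(["TH", "IH1", "NG", "K"]))  # rule matches and mutates
print(th_to_t.apply(["K", "AE1", "T"]))         # rule does not match
```

Keeping the condition and the action as separate callables means a rule that does not match leaves the command untouched, so rules can be tried in any order.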

LipFuzzer learns from existing linguistic knowledge to find potentially (and likely) dangerous voice commands. For example, we list one regional accent example in the following:


LipEngine conducts the actual mutation operations on voice commands. It consists of multiple modules that operate based on fuzzing rules.
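The module-based design can be sketched as a small engine that runs every loaded module over a seed command and collects the distinct mutants. This is a toy under stated assumptions, not the real LipEngine: `ToyLipEngine`, `drop_plural_s`, and `drop_filler_words` are hypothetical, and the two mutations are deliberately simple word-level stand-ins for rule-driven modules.

```python
from typing import Callable, List

# A module takes a command string and returns zero or more mutated commands.
Module = Callable[[str], List[str]]


def drop_plural_s(command: str) -> List[str]:
    """Grammar-style mutation: strip a trailing 's' from one word at a time."""
    words = command.split()
    mutants = []
    for i, w in enumerate(words):
        if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
            mutants.append(" ".join(words[:i] + [w[:-1]] + words[i + 1:]))
    return mutants


def drop_filler_words(command: str) -> List[str]:
    """Vocabulary-style mutation: omit short function words."""
    fillers = {"the", "a", "an", "to", "please"}
    kept = [w for w in command.split() if w.lower() not in fillers]
    mutated = " ".join(kept)
    return [mutated] if mutated and mutated != command else []


class ToyLipEngine:
    """Runs default and user-supplied modules over a seed command."""

    def __init__(self, default_modules, user_modules=()):
        self.modules: List[Module] = list(default_modules) + list(user_modules)

    def fuzz(self, seed: str) -> List[str]:
        seen, out = {seed}, []
        for module in self.modules:
            for mutant in module(seed):
                if mutant not in seen:  # skip duplicates and the seed itself
                    seen.add(mutant)
                    out.append(mutant)
        return out


engine = ToyLipEngine([drop_plural_s, drop_filler_words])
print(engine.fuzz("open the daily news"))
```

Because a module is just a callable, a custom module can be passed through `user_modules` without touching the defaults.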


The output of LipFuzzer is a set of mutated voice commands (Lapsus) that are arguably easy for vApp users to misspeak.

How to install LipFuzzer


We make our code available on GitHub. You can download it via the following command:

git clone

LipFuzzer Dependencies

We stand on the shoulders of giants. Please be aware of the following dependencies.

Stanford CoreNLP: You need to download the English version of the CoreNLP tool package and use its path in LipFuzzer.

Please download the tool (you will need the zip version) and extract it under the root folder. For example, you should see .jar files under:


Natural Language Toolkit (NLTK): Various ways can be used to install NLTK, for example:

sudo pip install -U nltk

Others:

sudo pip install pyenchant

sudo pip install inflect
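After running the pip commands, a quick check like the one below can confirm that the Python dependencies are importable. It is only a convenience sketch and not part of LipFuzzer; `REQUIRED_MODULES` and `missing_dependencies` are hypothetical names. Note that pyenchant is imported as `enchant`, and that the CMU Pronouncing Dictionary corpus is fetched separately from the NLTK package itself (for example with `nltk.download('cmudict')`).

```python
import importlib.util

# pip package name -> import module name (pyenchant is imported as "enchant").
REQUIRED_MODULES = {"nltk": "nltk", "pyenchant": "enchant", "inflect": "inflect"}


def missing_dependencies(required=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [pkg for pkg, mod in required.items()
            if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("missing:", ", ".join(missing) if missing else "none")
```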

Try it out

After installing LipFuzzer, use file to demonstrate a simple fuzzing test.


More about LipFuzzer

Current Version Features:

  1. Phoneme-, lemma-, dependency-, and vocabulary-based fuzzing
  2. Default LipEngine Modules are provided
  3. Custom Modules available
  4. Generate new fuzzing rules

Ongoing/future Development

Better linguistic model with automated weight training available.

Advanced chatbot for auto-checking.



[1]  Yangyong Zhang, Lei Xu, Abner Mendoza, Guangliang Yang, Phakpoom Chinprutthiwong, Guofei Gu. "Life after Speech Recognition: Fuzzing Semantic Misinterpretation for Voice Assistant Applications." In Proc. of the Network and Distributed System Security Symposium (NDSS'19), San Diego, California, Feb. 2019.