Mining Input Grammars for Security Testing
- Andreas Zeller | Saarland University
Knowing which part of a program processes which parts of an input can reveal the structure of the input as well as the structure of the program. In a URL “http://www.example.com/path/”, for instance, the protocol “http”, the host “www.example.com”, and the path “path” would be handled by different functions and stored in different variables. Given a set of sample inputs, we use _dynamic tainting_ to trace the data flow of each input character, and aggregate those input fragments that would be handled by the same function into lexical and syntactical entities. The result is a _context-free grammar_ that accurately reflects valid input structure; as it draws on function and variable names, it can be as readable as textbook examples: URL ::= PROTOCOL “://” HOST “/” PATH PROTOCOL ::= “http” | “https” | … HOST ::= /[a-zA-Z0-9.]+/ … We expect inferred grammars to considerably ease the understanding of file and input formats. Their most important use, however, will be in automatic fuzz testing, where grammars can easily be turned into producers that help to quickly cover program features. Our grammar-based LANGFUZZ fuzzer is in daily use at Mozilla and has uncovered more than 4,000 defects so far; mining grammars automatically will bring such techniques to a wide range of programs. For details on our work on grammar mining, see https://www.st.cs.uni-saarland.de/models/autogram/
-
-
Tom Zimmermann
Sr. Principal Researcher
-
-
Series: Microsoft Research Talks
-
Decoding the Human Brain – A Neurosurgeon’s Experience
- Dr. Pascal O. Zinn
-
-
-
-
-
-
Challenges in Evolving a Successful Database Product (SQL Server) to a Cloud Service (SQL Azure)
- Hanuma Kodavalla,
- Phil Bernstein
-
Improving text prediction accuracy using neurophysiology
- Sophia Mehdizadeh
-
Tongue-Gesture Recognition in Head-Mounted Displays
- Tan Gemicioglu
-
DIABLo: a Deep Individual-Agnostic Binaural Localizer
- Shoken Kaneko
-
-
-
-
Audio-based Toxic Language Detection
- Midia Yousefi
-
-
From SqueezeNet to SqueezeBERT: Developing Efficient Deep Neural Networks
- Forrest Iandola,
- Sujeeth Bharadwaj
-
Hope Speech and Help Speech: Surfacing Positivity Amidst Hate
- Ashique Khudabukhsh
-
-
-
Towards Mainstream Brain-Computer Interfaces (BCIs)
- Brendan Allison
-
-
-
-
Learning Structured Models for Safe Robot Control
- Subramanian Ramamoorthy
-