Simulation of Built-in PHP features for Precise Static Code Analysis

Johannes Dahse, Thorsten Holz

Annual Network & Distributed System Security Symposium (NDSS), San Diego, February 2014


The World Wide Web grew rapidly during the last decades and is used by millions of people every day for online shopping, banking, networking, and other activities. Many of these websites are developed with PHP, the most popular scripting language on the Web. However, PHP code is prone to different types of critical security vulnerabilities that can lead to data leakage, server compromise, or attacks against an application's users. This problem can be addressed by analyzing the source code of the application for security vulnerabilities before the application is deployed on a web server. In this paper, we present a novel approach for the precise static analysis of PHP code to detect security vulnerabilities in web applications. As dismissed by previous work in this area, a comprehensive configuration and simulation of over 900 PHP built-in features allows us to precisely model the highly dynamic PHP language. By performing an intra- and inter-procedural data flow analysis and by creating block and function summaries, we are able to efficiently perform a backward-directed taint analysis for 20 different types of vulnerabilities. Furthermore, string analysis enables us to validate sanitization in a context-sensitive manner. Our method is the first to perform fine-grained analysis of the interaction between different types of sanitization, encoding, sources, sinks, markup contexts, and PHP settings. We implemented a prototype of our approach in a tool called RIPS. Our evaluation shows that RIPS is capable of finding severe vulnerabilities in popular real-world applications: we reported 73 previously unknown vulnerabilities in five well-known PHP applications such as phpBB, osCommerce, and the conference management software HotCRP.


Tags: RIPS, Static Analysis, web security