Published and pre-published research works
A two-layered (proactive and reactive) security framework for distributed computing frameworks such as Apache Spark. In the proactive layer, we used program analysis to detect potentially dangerous and malicious code early. In the reactive layer, we implemented attribute-based access control using aspect-oriented programming and secured the environment with security manager-based sandboxing.
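The reactive layer's idea of intercepting calls and checking attributes before they run can be illustrated with a minimal sketch. This is not the framework's code: a Python decorator stands in for aspect-oriented advice, and the attribute names, the policy, and the functions `policy_allows`, `enforce`, and `read_dataset` are all hypothetical.

```python
from functools import wraps


class AccessDenied(Exception):
    """Raised when the policy rejects a request."""


def policy_allows(subject, resource, action):
    # Hypothetical policy: admins may do anything; analysts may only
    # read resources that belong to their own department.
    if subject.get("role") == "admin":
        return True
    return (
        action == "read"
        and subject.get("role") == "analyst"
        and subject.get("department") == resource.get("department")
    )


def enforce(action):
    """Wrap a function so the policy check runs before the function body."""
    def decorator(func):
        @wraps(func)
        def wrapper(subject, resource, *args, **kwargs):
            if not policy_allows(subject, resource, action):
                raise AccessDenied(f"{action} denied on {resource.get('name')}")
            return func(subject, resource, *args, **kwargs)
        return wrapper
    return decorator


@enforce("read")
def read_dataset(subject, resource):
    # Placeholder for the protected operation.
    return f"rows of {resource['name']}"
```

The protected function contains no access-control logic of its own; the check is attached from outside, which is the property the aspect-oriented approach relies on.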
Docker containers are widely used because they are lightweight and many instances can run on a single host. However, they are less isolated than virtual machines, which makes them more vulnerable to attacks. Several approaches have been developed to reduce the attack surface of containers, but measuring the performance of these debloaters is challenging. This paper presents a unified platform, DebloatBenchC, to benchmark container debloaters. The platform currently includes 7 workload applications and 3 container debloaters: Speaker and Confine (syscall reduction tools) and Slimtoolkit (an image size reduction tool).
Automated software debloating of program source or binary code has tremendous potential to improve both application performance and security. Unfortunately, measuring and comparing the effectiveness of various debloating methods is challenging due to the absence of a universal benchmarking platform that can accommodate diverse approaches. In this paper, we first present DebloatBenchA, an extensible and sustainable benchmarking platform that enables comparison of different research techniques. We then perform a holistic comparison of the techniques to assess the current progress.
A secure text- and image-based search engine using trusted processors. All the data indexing algorithms are data-oblivious to reduce information leakage.
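To make "data-oblivious" concrete, here is a minimal sketch of an oblivious table lookup, not taken from the search engine itself. The function name `oblivious_lookup` and the optional `trace` parameter are illustrative assumptions. The point is only that the sequence of memory accesses is the same for every secret index; Python gives no timing guarantees, so this shows the access pattern and nothing more.

```python
def oblivious_lookup(table, secret_index, trace=None):
    """Return table[secret_index] while reading every slot in the same order."""
    result = 0
    for i, value in enumerate(table):
        if trace is not None:
            trace.append(i)  # record which slot was touched
        # mask is 1 only at the secret slot; the selection is arithmetic,
        # so no slot is skipped based on the secret value.
        mask = int(i == secret_index)
        result = mask * value + (1 - mask) * result
    return result
```

An observer who sees only the order of accessed slots learns nothing about which slot was requested, at the cost of a full scan per lookup.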
An automated program analysis-based system to detect cryptographic API misuses in massive Java projects. CryptoGuard efficiently and effectively identifies intended program slices by excluding language-specific non-essential elements, which significantly reduces the false-positive rate. We helped harden the security of several high-impact Apache projects, including Spark, Ranger, and OFBiz.
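As a rough illustration of what a cryptographic API misuse looks like, the sketch below flags a few well-known weak choices in Java source text. It is a deliberately simpler, pattern-based check and not the slicing technique described above; the rule list and the function name `scan_java_source` are assumptions chosen for illustration, not CryptoGuard's rule set.

```python
import re

# Illustrative rules only: each pairs a regex over one source line with a reason.
WEAK_PATTERNS = [
    (re.compile(r'Cipher\.getInstance\(\s*"(DES|RC4)\b[^"]*"'), "weak cipher algorithm"),
    (re.compile(r'Cipher\.getInstance\(\s*"[^"]*/ECB/[^"]*"'), "ECB mode"),
    (re.compile(r'MessageDigest\.getInstance\(\s*"(MD5|SHA-?1)"'), "weak hash function"),
]


def scan_java_source(source):
    """Return (line_number, reason) pairs for lines matching a weak pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, reason in WEAK_PATTERNS:
            if pattern.search(line):
                findings.append((lineno, reason))
    return findings
```

A line-level pattern match like this misses any case where the algorithm name reaches the API through a variable or a method call, which is the kind of gap that slicing-based analysis is meant to close.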
Over the last few years, data storage in cloud-based services has become very popular due to the easy management and monetary advantages of cloud computing. Recent developments have shown that such data can be leaked through various attacks. To address some of these attacks, encrypting sensitive data before sending it to the cloud has emerged as an important protection mechanism. Still, indexing, querying, and running complex data analytics tasks on the encrypted data remain important challenges. In this dissertation, we address some of the encrypted data processing challenges using two different but complementary approaches. First, we explore what kind of data querying functionality we can provide for encrypted data even if we have no support from the server. Second, we provide solutions for the use cases where the cloud server provides a trusted processor for processing some of the encrypted data.
An encrypted data analytics framework that allows multiple mutually distrusting parties to perform complex analytics and ML tasks on large encrypted data sets.
An ETL-Query framework for performing complex queries, such as facial recognition, on encrypted images stored in simple cloud storage services without native computation capability (e.g., S3, Dropbox).