Research

Published and pre-published research works

SecureDL - The Queen's Guard: A Secure Enforcement of Fine-grained Access Control In Distributed Data Analytics Platforms

A two-layered security framework, combining a proactive and a reactive layer, for distributed data analytics platforms such as Apache Spark. In the proactive layer, we used program analysis to detect potentially dangerous or malicious code early. In the reactive layer, we implemented attribute-based access control using aspect-oriented programming and secured the environment with security manager-based sandboxing.
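
As a rough illustration of the reactive layer's idea, the sketch below wraps a data-access function with an attribute check, loosely mirroring how aspect-oriented advice intercepts a call. It is not SecureDL's implementation: the decorator `requires`, the exception `AccessDenied`, the function `read_salaries`, and the attribute names are all hypothetical, and a Python decorator merely stands in for aspect weaving.

```python
from functools import wraps


class AccessDenied(Exception):
    """Raised when the caller's attributes do not satisfy the policy."""


def requires(**required_attrs):
    """Hypothetical decorator: guard a function with an attribute check."""
    def decorator(func):
        @wraps(func)
        def wrapper(user_attrs, *args, **kwargs):
            # Deny unless every required attribute matches exactly.
            for name, expected in required_attrs.items():
                if user_attrs.get(name) != expected:
                    raise AccessDenied(f"requires {name}={expected!r}")
            return func(user_attrs, *args, **kwargs)
        return wrapper
    return decorator


@requires(role="analyst", department="finance")
def read_salaries(user_attrs, rows):
    """Data access guarded by the policy check above."""
    return [row["salary"] for row in rows]


rows = [{"name": "a", "salary": 10}, {"name": "b", "salary": 20}]

# A caller whose attributes satisfy the policy gets the data.
allowed = read_salaries({"role": "analyst", "department": "finance"}, rows)

# A caller with a mismatched attribute is rejected before any data is read.
try:
    read_salaries({"role": "intern", "department": "finance"}, rows)
    denied = False
except AccessDenied:
    denied = True
```

The policy lives outside the guarded function, so the access rule can change without touching the data-access code.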

Paper Presentation

Evaluating Container Debloaters

Docker containers are widely used because they are lightweight and many instances can run on a single machine. However, they are less isolated than virtual machines, which makes them more vulnerable to attacks. Several approaches have been developed to reduce the attack surface of containers, but measuring the performance of these debloaters is challenging. This paper presents a unified platform, DebloatBenchC, to benchmark container debloaters. The platform currently includes 7 workload applications and 3 container debloaters: Speaker and Confine (syscall reduction tools), and Slimtoolkit (an image size reduction tool).

Paper

SoK: A Tale of Reduction, Security, and Correctness - Evaluating Program Debloating Paradigms and Their Compositions

Automated software debloating of program source or binary code has tremendous potential to improve both application performance and security. Unfortunately, measuring and comparing the effectiveness of various debloating methods is challenging due to the absence of a universal benchmarking platform that can accommodate diverse approaches. In this paper, first, we present DebloatBenchA, an extensible and sustainable benchmarking platform that enables comparison of different research techniques. Then, we perform a holistic comparison of the techniques to assess the current progress.

Paper

SGX-IR: Secure Information Retrieval with Trusted Processors

A secure text- and image-based search engine built on trusted processors. All data indexing algorithms are data-oblivious to reduce information leakage.
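
To make "data-oblivious" concrete, the sketch below reads one element of a list while touching every element in the same order, so the sequence of memory accesses does not depend on the secret index. It is a generic illustration, not code from SGX-IR; the function name `oblivious_select` is hypothetical, and Python itself gives no constant-time guarantees, so only the access pattern is being illustrated.

```python
def oblivious_select(values, secret_index):
    """Return values[secret_index] while reading every element in order."""
    result = 0
    for i, value in enumerate(values):
        # mask is 1 only at the secret position and 0 everywhere else.
        mask = int(i == secret_index)
        # Arithmetic blend instead of a branch: keep the old result unless
        # this is the secret position.
        result = mask * value + (1 - mask) * result
    return result


picked = oblivious_select([10, 20, 30, 40], 2)
```

The cost is a full scan for every lookup, which is the usual trade-off of this style: an observer of the access pattern sees the same trace whichever index is requested.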

Paper Presentation Video

CryptoGuard: High precision detection of cryptographic vulnerabilities in massive-sized Java projects

An automated, program analysis-based system for detecting cryptographic API misuses in massive Java projects. CryptoGuard efficiently and effectively identifies the intended program slices by excluding language-specific non-essential elements, which significantly reduces the false-positive rate. We helped harden the security of several high-impact Apache projects, including Spark, Ranger, and OFBiz.

Paper

Secure Cloud Data Analytics with Trusted Processors

Over the last few years, data storage in cloud-based services has become very popular due to the easy management and monetary advantages of cloud computing. Recent developments have shown that such data can be leaked through various attacks. To address some of these attacks, encrypting sensitive data before sending it to the cloud has emerged as an important protection mechanism. Still, indexing, querying, and running complex data analytics tasks on encrypted data remain important challenges. In this dissertation, we address some of the encrypted data processing challenges using two different but complementary approaches. First, we explore what kind of data querying functionality we can provide for encrypted data even with no support from the server. Then, we provide solutions for use cases where the cloud server provides a trusted processor for processing some of the encrypted data.

Paper Presentation

SGX-BigMatrix: A practical encrypted data analytic framework with trusted processors

An encrypted data analytics framework that allows multiple distrusting parties to perform complex analytics and ML tasks on large encrypted data sets.

Paper Presentation Video

A practical framework for executing complex queries over encrypted multimedia data

An ETL-Query framework for performing complex queries, such as facial recognition, on encrypted images stored in simple cloud storage services without native computation capability (e.g., S3, Dropbox).

Paper