Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

Title: Privacy-Preserving Distributed Linear Regression on High-Dimensional Data
Publication Type: Conference Proceedings
Year of Conference: 2017
Authors: Gascon, A., P. Schoppmann, B. Balle, M. Raykova, J. Dorner, S. Zahur, and D. Evans
Conference Name: Privacy Enhancing Technologies
Publisher: De Gruyter

We propose privacy-preserving protocols for computing
linear regression models, in the setting where the training
dataset is vertically distributed among several parties. Our
main contribution is a hybrid multi-party computation protocol
that combines Yao’s garbled circuits with tailored protocols
for computing inner products. Like many machine learning
tasks, building a linear regression model involves solving
a system of linear equations. We conduct a comprehensive
evaluation and comparison of different techniques for securely
performing this task, including a new Conjugate Gradient Descent
(CGD) algorithm. This algorithm is suitable for secure
computation because it uses an efficient fixed-point representation
of real numbers while maintaining accuracy and convergence
rates comparable to those of a classical
solution using floating-point numbers. Our technique improves
on Nikolaenko et al.’s method for privacy-preserving
ridge regression (S&P 2013), and can be used as a building
block in other analyses. We implement a complete system and
demonstrate that our approach is highly scalable, solving data
analysis problems with one million records and one hundred
features in less than one hour of total running time.
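
To make the numerical core of the abstract concrete, the following is a minimal sketch, not the protocol from the paper: it runs the textbook conjugate gradient method on the normal equations using integer fixed-point arithmetic, with the features split column-wise between two hypothetical parties. Everything is computed in the clear. In a real deployment the cross-party inner products and the solver would run inside secure computation. The names (FRAC_BITS, to_fixed, fx_mul, conjugate_gradient_fixed), the 48-bit fractional precision, the toy data, and the assumption that one party holds the labels are all illustrative choices rather than details taken from the paper. Python integers are unbounded, so overflow of a fixed-width representation is not modelled here.

```python
# Illustrative sketch only: plain-text fixed-point conjugate gradient.
# No cryptography is involved; all names and parameters are assumptions.

FRAC_BITS = 48            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS


def to_fixed(x):
    """Encode a real number as a scaled integer."""
    return int(round(x * SCALE))


def from_fixed(a):
    """Decode a scaled integer back to a float."""
    return a / SCALE


def fx_mul(a, b):
    """Fixed-point product (rounds toward negative infinity)."""
    return (a * b) >> FRAC_BITS


def fx_div(a, b):
    """Fixed-point quotient (rounds toward negative infinity)."""
    return (a << FRAC_BITS) // b


def fx_dot(u, v):
    return sum(fx_mul(a, b) for a, b in zip(u, v))


def fx_matvec(A, v):
    return [fx_dot(row, v) for row in A]


def conjugate_gradient_fixed(A, b, iterations):
    """Approximately solve A x = b for symmetric positive definite A."""
    x = [0] * len(b)
    r = list(b)           # residual b - A x, with x = 0 initially
    p = list(r)           # search direction
    rs_old = fx_dot(r, r)
    for _ in range(iterations):
        if rs_old == 0:
            break         # residual vanished at this precision
        Ap = fx_matvec(A, p)
        pAp = fx_dot(p, Ap)
        if pAp <= 0:
            break         # rounding noise dominates; stop
        alpha = fx_div(rs_old, pAp)
        x = [xi + fx_mul(alpha, pi) for xi, pi in zip(x, p)]
        r = [ri - fx_mul(alpha, api) for ri, api in zip(r, Ap)]
        rs_new = fx_dot(r, r)
        beta = fx_div(rs_new, rs_old)
        p = [ri + fx_mul(beta, pi) for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Vertically partitioned toy data: each party holds some feature columns.
party_a_cols = [[1, 2, 3, 4]]                    # hypothetical party A
party_b_cols = [[1, 0, 1, 0], [0, 1, 1, 0]]      # hypothetical party B
y = [1.0, 4.5, 5.5, 8.0]                         # labels (assumed held by B)

cols = [[to_fixed(v) for v in c] for c in party_a_cols + party_b_cols]
y_fx = [to_fixed(v) for v in y]

# Entries of X^T X that pair a column of A with a column of B are the
# cross-party inner products; here they are simply computed in the clear.
lam = 0.0                                        # ridge penalty (assumed 0)
A = [[fx_dot(ci, cj) for cj in cols] for ci in cols]
for i in range(len(A)):
    A[i][i] += to_fixed(lam)
b = [fx_dot(ci, y_fx) for ci in cols]

theta = [from_fixed(t) for t in conjugate_gradient_fixed(A, b, iterations=6)]
print(theta)
```

The labels in this toy example were generated from the coefficients (2, -1, 0.5) without noise, so with a zero ridge penalty the printed estimate should be close to those values. Increasing lam turns the same system into ridge regression.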
