Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

Salem, Ahmed Mohamed Gamal; Bhattacharyya, Apratim; Backes, Michael; Fritz, Mario; Zhang, Yang


Salem, A. M. G., Bhattacharyya, A., Backes, M., Fritz, M., & Zhang, Y. (2019). Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning. Retrieved from http://arxiv.org/abs/1904.01067.

Item is Released


Basic


Item Permalink: https://hdl.handle.net/21.11116/0000-0003-EC51-8
Version Permalink: https://hdl.handle.net/21.11116/0000-0003-EC52-7

Genre: Paper

Files


arXiv:1904.01067.pdf (Preprint), 2MB


File Permalink:
https://hdl.handle.net/21.11116/0000-0003-EC53-6

Name:
arXiv:1904.01067.pdf

Description:
File downloaded from arXiv at 2019-07-03 11:07

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]


Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/


Creators


Creators:
Salem, Ahmed Mohamed Gamal¹, Author
Bhattacharyya, Apratim², Author
Backes, Michael¹, Author
Fritz, Mario¹, Author
Zhang, Yang¹, Author

Affiliations:
¹ External Organizations, ou_persistent22
² Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_1116547

Content


Free keywords: Computer Science, Cryptography and Security, cs.CR; Computer Science, Learning, cs.LG; Statistics, Machine Learning, stat.ML

Abstract: Machine learning (ML) has progressed rapidly during the past decade and the
major factor that drives such development is the unprecedented large-scale
data. As data generation is a continuous process, this leads to ML service
providers updating their models frequently with newly-collected data in an
online learning scenario. In consequence, if an ML model is queried with the
same set of data samples at two different points in time, it will provide
different results.
In this paper, we investigate whether the change in the output of a black-box
ML model before and after being updated can leak information of the dataset
used to perform the update. This constitutes a new attack surface against
black-box ML models and such information leakage severely damages the
intellectual property and data privacy of the ML model owner/provider. In
contrast to membership inference attacks, we use an encoder-decoder formulation
that allows inferring diverse information ranging from detailed characteristics
to full reconstruction of the dataset. Our new attacks are facilitated by
state-of-the-art deep learning techniques. In particular, we propose a hybrid
generative model (BM-GAN) that is based on generative adversarial networks
(GANs) but includes a reconstructive loss that allows generating accurate
samples. Our experiments show effective prediction of dataset characteristics
and even full reconstruction in challenging conditions.
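
The signal the abstract describes, the change in a black-box model's output on the same queries before and after an update, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy linear softmax model, the weights, the probe inputs, and the choice to encode the change as a plain element-wise difference of posteriors. The abstract does not specify these details, and this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Turn raw class scores into a probability (posterior) vector."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query(weights, x):
    """Black-box query: the caller sees only the posterior vector for x."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(scores)


def posterior_difference(weights_before, weights_after, probes):
    """Concatenate posterior(after) - posterior(before) over all probes."""
    diff = []
    for x in probes:
        before = query(weights_before, x)
        after = query(weights_after, x)
        diff.extend(a - b for a, b in zip(after, before))
    return diff


# Hypothetical two-class linear model before and after an update.
weights_before = [[1.0, 0.0], [0.0, 1.0]]
weights_after = [[1.5, -0.2], [0.0, 1.0]]

# Hypothetical fixed probe inputs, queried at both points in time.
probes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

delta = posterior_difference(weights_before, weights_after, probes)
print(delta)
```

In the setting the abstract outlines, a vector like `delta` would be the input to the attack's encoder, and a decoder would then predict characteristics of the update set or reconstruct it; that learned part is omitted here.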

Details


Language(s): eng - English

Dates: Created: 2019-04-01; Published Online: 2019

Publication Status: Published online

Pages: 15 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 1904.01067
URI: http://arxiv.org/abs/1904.01067
BibTex Citekey: Salem_arXiv1904.01067

Degree: -
