Supervised learning through the lens of compression

Ofir David, Shay Moran, Amir Yehudayoff

Advances in Neural Information Processing Systems 29 (NIPS 2016), Curran Associates, Inc., pp. 2784-2792. Editors: D. D. Lee, U. V. Luxburg, I. Guyon, R. Garnett. Poster presentation.

Abstract: This work continues the study of the relationship between sample compression schemes and statistical learning, which has mostly been investigated within the framework of binary classification. We first extend the investigation to multiclass categorization: we prove that in this case learnability is equivalent to compression of logarithmic sample size, and that the uniform convergence property implies compression of constant size. We use the compressibility-learnability equivalence to show that (i) for multiclass categorization, PAC and agnostic PAC learnability are equivalent, and (ii) a compactness theorem for learnability holds. We then consider supervised learning under general loss functions: we show that in this case, in order to maintain the compressibility-learnability equivalence, it is necessary to consider an approximate variant of compression. We use it to show that PAC and agnostic PAC are not equivalent, even when the loss function has only three values.
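The abstract is built around the notion of a sample compression scheme: a pair of maps where the compression map keeps a small subsequence of the training sample and the reconstruction map rebuilds a hypothesis from it that is still correct on the full sample. As a rough illustration only (not taken from the paper, and with helper names of my own choosing), the following minimal Python sketch shows a size-1 compression scheme for 1-D threshold classifiers h_t(x) = 1 iff x >= t, under the assumption that the sample is realizable by some threshold.

    # Illustrative sketch (not from the paper): a size-1 sample compression
    # scheme for 1-D threshold classifiers h_t(x) = 1 if x >= t else 0.
    # compress() keeps at most one labeled example; reconstruct() rebuilds a
    # hypothesis that is correct on every example of the original sample,
    # provided the sample was labeled by some threshold classifier.

    def compress(sample):
        """Keep the smallest positively labeled point, or nothing if all labels are 0."""
        positives = [x for x, y in sample if y == 1]
        return [(min(positives), 1)] if positives else []

    def reconstruct(kept):
        """Return the threshold classifier determined by the kept example."""
        if not kept:
            return lambda x: 0              # no positives kept: the all-zero hypothesis
        t = kept[0][0]
        return lambda x: 1 if x >= t else 0

    # Usage: a sample labeled by the threshold t = 2.5 is recovered exactly
    # from a single kept example.
    sample = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]
    h = reconstruct(compress(sample))
    assert all(h(x) == y for x, y in sample)

The paper's results concern how large such compression sets must be in general: logarithmic in the sample size for multiclass categorization, and only approximately recoverable under general loss functions.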