Implementations of differentially private analyses

Since the advent of differential privacy, a number of systems supporting differentially private data analyses have been implemented and deployed.

Research Projects and Prototypes

  • PINQ: An API implemented in C#.[1]
  • Airavat: A MapReduce-based system implemented in Java hardened with SELinux-like access control.[2]
  • Fuzz: A constant-time implementation of a domain-specific language, written in Caml Light.[3]
  • GUPT: Implementation of the sample-and-aggregate framework.[4]
  • KTELO: A framework and system for answering linear counting queries.[5]

Real-World Deployments

  • OnTheMap: An interactive US Census Bureau tool for the exploration of US commute patterns.[6][7]
  • PSI (Ψ): A Private data Sharing Interface developed as part of the Harvard University Privacy Tools Project.[10]
  • Application usage statistics in Microsoft Windows 10.[12]
  • Flex: A SQL-based system developed for internal Uber analytics.[13][14]
  • TensorFlow Privacy: An open-source library for differentially private machine learning.[15][16]
  • Differentially private synthetic microdata for the 2020 US Census.[17]

Attacks on Implementations

In addition to the standard software defects that can be identified through testing or fuzzing, implementations of differentially private mechanisms may suffer from the following vulnerabilities:

  • Subtle algorithmic or analytical mistakes.[18][19]
  • Timing side-channel attacks.[3] In contrast with timing attacks against implementations of cryptographic algorithms, which typically leak at a low rate and must be followed by non-trivial cryptanalysis, a timing channel in a differentially private system can lead to a catastrophic compromise: a targeted attack can exfiltrate the very bit that the system is designed to hide (see the first sketch following this list).
  • Leakage through floating-point arithmetic.[20] Differentially private algorithms are typically presented in the language of continuous probability distributions, which most naturally leads to implementations based on floating-point arithmetic. The abstraction of floating-point arithmetic is leaky, and without careful attention to detail a naive implementation may fail to provide differential privacy. (This is particularly the case for ε-differential privacy, which does not allow any probability of failure, even in the worst case.) For example, the support of a textbook sampler of the Laplace distribution (required, for instance, by the Laplace mechanism) is less than 80% of all double-precision floating-point numbers; moreover, the supports of the distributions with different means are not identical. A single sample from a naive implementation of the Laplace mechanism allows distinguishing between two adjacent datasets with probability greater than 35% (see the second sketch following this list).
  • Timing channel through floating-point arithmetic.[21] Unlike operations over integers, which are typically constant-time on modern CPUs, floating-point arithmetic exhibits significant input-dependent timing variability.[22] Handling of subnormals can be particularly slow, by as much as 100× compared to the typical case[23] (see the third sketch following this list).
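
The timing attack of [3] can be illustrated with a short sketch. The following Python fragment is hypothetical (the record fields and query interface are invented for illustration): the attacker supplies a per-record microquery whose noised aggregate reveals essentially nothing, but whose running time depends on one individual's data.

    import random
    import time

    def laplace_noise(scale):
        # The difference of two i.i.d. exponential variables is
        # Laplace-distributed with the same scale.
        return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

    def malicious_microquery(record):
        # Always returns 1, so the noised sum is uninformative -- but the
        # function stalls on the target, so the running time leaks one bit.
        if record["name"] == "target" and record["diagnosis"] == "positive":
            time.sleep(0.5)
        return 1

    def answer_query(dataset, epsilon=1.0):
        # Correct output perturbation (Laplace mechanism on a
        # sensitivity-1 sum), but no control over execution time.
        true_sum = sum(malicious_microquery(r) for r in dataset)
        return true_sum + laplace_noise(1.0 / epsilon)

    dataset = [{"name": f"user{i}", "diagnosis": "negative"} for i in range(100)]
    dataset.append({"name": "target", "diagnosis": "positive"})

    start = time.perf_counter()
    answer_query(dataset)
    elapsed = time.perf_counter() - start
    # A long response time exfiltrates the target's bit despite the noise.
    print(f"target is positive: {elapsed > 0.25}")

Defenses such as the one implemented in Fuzz[3] make every microquery evaluation take the same amount of time, closing this channel.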
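
The floating-point vulnerability and the "snapping" mitigation proposed in [20] can be sketched as follows. The snapping shown here is simplified: the full construction in [20] also requires exactly rounded intermediate arithmetic, and the clamping bound B and the resulting (slightly larger) privacy parameter must be accounted for in the analysis.

    import math
    import random

    def textbook_laplace(mu, b):
        # Inverse-CDF sampler. random.random() returns only multiples of
        # 2**-53, so after the log and the multiplication the reachable
        # outputs form a sparse subset of the doubles -- and a different
        # subset for each mean mu. A single output reachable for mu = 0
        # but not for mu = 1 distinguishes two adjacent datasets [20].
        # (The measure-zero edge case u == -0.5 is ignored here.)
        u = random.random() - 0.5
        return mu - b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

    def snapped_laplace(mu, b, B=2.0**40):
        # Simplified snapping mechanism [20]: clamp the noisy answer to
        # [-B, B], then round to a multiple of Lambda, the smallest power
        # of two >= b, so the output lattice no longer depends on mu.
        Lambda = 2.0 ** math.ceil(math.log2(b))
        y = max(-B, min(B, textbook_laplace(mu, b)))
        return round(y / Lambda) * Lambda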
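
The input-dependent timing of floating-point arithmetic[21][22][23] is easy to observe directly. The sketch below uses NumPy; the measured slowdown depends on the CPU and on whether flush-to-zero/denormals-are-zero modes are enabled, so the factor of roughly 100× reported in [23] should be treated as indicative.

    import timeit
    import numpy as np

    N = 1_000_000
    normals = np.full(N, 1.0)        # ordinary doubles
    subnormals = np.full(N, 1e-310)  # below ~2.2e-308, hence subnormal

    # Multiplying by 0.5 keeps each array in its regime: products of the
    # subnormal inputs remain subnormal.
    t_normal = timeit.timeit(lambda: normals * 0.5, number=200)
    t_subnormal = timeit.timeit(lambda: subnormals * 0.5, number=200)

    # An attacker who can steer an implementation's noise into the
    # subnormal range can read information off the response time [21].
    print(f"subnormal slowdown: {t_subnormal / t_normal:.1f}x")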

References

  1. McSherry, Frank (1 September 2010). "Privacy integrated queries" (PDF). Communications of the ACM. 53 (9): 89–97. doi:10.1145/1810891.1810916. S2CID 52898716.
  2. Roy, Indrajit; Setty, Srinath T.V.; Kilzer, Ann; Shmatikov, Vitaly; Witchel, Emmett (April 2010). "Airavat: Security and Privacy for MapReduce" (PDF). Proceedings of the 7th Usenix Symposium on Networked Systems Design and Implementation (NSDI).
  3. Haeberlen, Andreas; Pierce, Benjamin C.; Narayan, Arjun (2011). "Differential Privacy Under Fire". 20th USENIX Security Symposium.
  4. Mohan, Prashanth; Thakurta, Abhradeep; Shi, Elaine; Song, Dawn; Culler, David E. "GUPT: Privacy Preserving Data Analysis Made Easy". Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. pp. 349–360. doi:10.1145/2213836.2213876. S2CID 2135755.
  5. Zhang, Dan; McKenna, Ryan; Kotsogiannis, Ios; Hay, Michael; Machanavajjhala, Ashwin; Miklau, Gerome (June 2018). "KTELO: A Framework for Defining Differentially-Private Computations". SIGMOD'18: 2018 International Conference on Management of Data: 115–130. arXiv:1808.03555. doi:10.1145/3183713.3196921. S2CID 5033862.
  6. "OnTheMap".
  7. Machanavajjhala, Ashwin; Kifer, Daniel; Abowd, John; Gehrke, Johannes; Vilhuber, Lars (April 2008). "Privacy: Theory meets Practice on the Map". 2008 IEEE 24th International Conference on Data Engineering: 277–286. doi:10.1109/ICDE.2008.4497436. ISBN 978-1-4244-1836-7. S2CID 5812674.
  8. Erlingsson, Úlfar. "Learning statistics with privacy, aided by the flip of a coin".
  9. Erlingsson, Úlfar; Pihur, Vasyl; Korolova, Aleksandra (November 2014). "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security: 1054–1067. arXiv:1407.6981. Bibcode:2014arXiv1407.6981E. doi:10.1145/2660267.2660348. S2CID 6855746.
  10. Gaboardi, Marco; Honaker, James; King, Gary; Nissim, Kobbi; Ullman, Jonathan; Vadhan, Salil; Murtagh, Jack (June 2016). "PSI (Ψ): a Private data Sharing Interface".
  11. Differential Privacy Team (December 2017). "Learning with Privacy at Scale". Apple Machine Learning Journal. 1 (8).
  12. Ding, Bolin; Kulkarni, Janardhan; Yekhanin, Sergey (December 2017). "Collecting Telemetry Data Privately". 31st Conference on Neural Information Processing Systems: 3574–3583. arXiv:1712.01524. Bibcode:2017arXiv171201524D.
  13. Tezapsidis, Katie (13 July 2017). "Uber Releases Open Source Project for Differential Privacy".
  14. Johnson, Noah; Near, Joseph P.; Song, Dawn (January 2018). "Towards Practical Differential Privacy for SQL Queries". Proceedings of the VLDB Endowment. 11 (5): 526–539. arXiv:1706.09479. doi:10.1145/3187009.3177733.
  15. Radebaugh, Carey; Erlingsson, Ulfar (March 6, 2019). "Introducing TensorFlow Privacy: Learning with Differential Privacy for Training Data".
  16. "TensorFlow Privacy". GitHub. 2019-08-09.
  17. Abowd, John M. (August 2018). "The U.S. Census Bureau Adopts Differential Privacy". Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining: 2867. doi:10.1145/3219819.3226070. hdl:1813/60392. ISBN 9781450355520. S2CID 51711121.
  18. McSherry, Frank (25 February 2018). "Uber's differential privacy .. probably isn't". GitHub.
  19. Lyu, Min; Su, Dong; Li, Ninghui (1 February 2017). "Understanding the sparse vector technique for differential privacy". Proceedings of the VLDB Endowment. 10 (6): 637–648. arXiv:1603.01699. doi:10.14778/3055330.3055331. S2CID 5449336.
  20. Mironov, Ilya (October 2012). "On Significance of the Least Significant Bits for Differential Privacy" (PDF). Proceedings of the 2012 ACM Conference on Computer and Communications Security (ACM CCS). ACM: 650–661. doi:10.1145/2382196.2382264. ISBN 9781450316514. S2CID 3421585.
  21. Andrysco, Marc; Kohlbrenner, David; Mowery, Keaton; Jhala, Ranjit; Lerner, Sorin; Shacham, Hovav (May 2015). "On Subnormal Floating Point and Abnormal Timing". 2015 IEEE Symposium on Security and Privacy: 623–639. doi:10.1109/SP.2015.44. ISBN 978-1-4673-6949-7. S2CID 1903469.
  22. Kohlbrenner, David; Shacham, Hovav (August 2017). "On the Effectiveness of Mitigations Against Floating-point Timing Channels". Proceedings of the 26th USENIX Conference on Security Symposium. USENIX Association: 69–81.
  23. Dooley, Isaac; Kale, Laxmikant (September 2006). "Quantifying the interference caused by subnormal floating-point values" (PDF). Proceedings of the Workshop on Operating System Interference in High Performance Applications.