AAAI Publications, Second AAAI Conference on Human Computation and Crowdsourcing

Instance-Privacy Preserving Crowdsourcing
Hiroshi Kajino, Yukino Baba, Hisashi Kashima

Last modified: 2014-09-05


Crowdsourcing is a technique for outsourcing tasks to a number of workers. Despite its many advantages, crowdsourcing carries the risk that sensitive information may be leaked, which has limited its adoption. Task instances (the data that workers receive in order to process tasks) often contain sensitive information that workers can extract. For example, in an audio transcription task, each audio file is an instance, and the content of the audio (e.g., the abstract of a meeting) can be sensitive. In this paper, we propose a quantitative analysis framework for the instance privacy problem. The framework provides performance measures for instance-privacy-preserving protocols. As a case study, we apply the framework to an instance clipping protocol and analyze its properties. The protocol preserves privacy by clipping instances to limit the amount of information workers obtain. The results show that the protocol can balance task performance and instance privacy preservation. They also show that the proposed measure is consistent with standard measures, which supports its validity.


Crowdsourcing; Privacy; Human Computation


