Hafiz Imtiaz: Research

In the current age of connected services, the internet of things, and big data, our personal information is being collected by numerous services and platforms. For example, when we watch a video on YouTube or a movie on Netflix, the service provider collects and stores this information, and it does so for every user. With this huge amount of data, the provider trains its recommendation algorithm so that it can suggest relevant new videos or movies the next time we log in. If the suggestions are indeed relevant, it is very likely that we will engage with that content. This is a win-win situation: we get relevant content suggestions, and the service provider gets our attention and time (which it can sell to advertisers for profit).

However, our media consumption history is nobody's business but ours, right? So why are they using such personal information, and what are we going to do about it? This is where privacy-preserving machine learning algorithms can play a vital role. If the recommendation algorithm used by YouTube or Netflix is privacy-preserving and efficient, we will get content suggestions that are almost as good, while keeping our private information private. Privacy-sensitive learning matters well beyond recommendation systems: examples include human health research, business informatics, and location-based services, among others.

Releasing any function of private data, even summary statistics and other aggregates, can reveal information about the underlying data. Differential privacy (DP) is a cryptographically motivated and mathematically rigorous framework for measuring the risk associated with performing computations on private data. Many organizations have made efforts to honor user privacy, among them Google, Apple, Uber, and the US Census Bureau. In most recent machine learning based services, the data is decentralized among many nodes, users, or sites.
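To make the point about summary statistics concrete, here is a minimal sketch of one standard way to release an average under differential privacy: the Laplace mechanism. It is an illustration only, not a description of any particular system mentioned above. The function name `dp_mean`, its parameters, and the assumptions that every value can be clipped to a known range `[lower, upper]` and that the number of records is public are all choices made for this example.

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=None):
    """Release an epsilon-differentially private mean (illustrative sketch).

    Assumes the number of records is public and that neighboring datasets
    differ by replacing one record. `lower` and `upper` are assumed bounds.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    rng = np.random.default_rng() if rng is None else rng

    # Clip each record so that one person's value has bounded influence.
    x = np.clip(np.asarray(values, dtype=float), lower, upper)
    if x.size == 0:
        raise ValueError("values must be non-empty")

    # Replacing one record changes the clipped mean by at most this amount.
    sensitivity = (upper - lower) / x.size

    # Laplace noise with scale sensitivity / epsilon gives epsilon-DP.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(x.mean() + noise)
```

A smaller `epsilon` means more noise and a stronger privacy guarantee; a larger dataset lowers the sensitivity, so the same guarantee costs less accuracy. For instance, `dp_mean(ratings, 0.0, 5.0, epsilon=0.5)` would return a noisy average of a hypothetical list of ratings in the range 0 to 5.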
Differential privacy is also useful when the private data is decentralized over multiple locations and each site has its own dataset. Some noteworthy examples include:
A few examples of interesting machine learning algorithms/problems are: