Received: 29 September 2017
Accepted: 6 June 2018
First Online: 15 June 2018
Compliance with ethical standards
This work was funded entirely through the authors' employment at Microsoft (a full-time internship for MF and full-time employment for the other authors).
All user-identifying information was anonymized. We did not examine search queries containing personally identifiable or other sensitive information. All data access and analysis performed for this research were done in accordance with the published end-user license agreement, which was worded as follows: “By connecting to Microsoft Health you agree to allow Microsoft to share your data between Cortana and Microsoft Health, to provide valuable personal insights and recommendations to help you reach your fitness and wellness goals.” Visits to businesses were logged by Cortana to offer local services, and users agreed to this logging. Twitter data were not connected to specific users, but rather were based on publicly available tweets and were aggregated across many users who visited the business location. Our work was conducted offline, on data collected to support existing business operations, and did not influence the user experience. All data were anonymized and de-identified prior to analysis. Each user was represented by an anonymous identifier. We filtered search queries to only those matching a whitelist of keywords relevant to our study. The Ethics Advisory Committee at Microsoft Research considers these precautions sufficient for triggering the Common Rule, exempting this work from detailed ethics review.
Our data were collected between August 2015 and April 2016 from individuals who agreed to link their Cortana data and Microsoft Health data (including Band device data) for use in generating additional insights or recommendations about their sleep or activity.