Date: 2024-12-11

Microsoft has denied it is using content from Word and Excel documents to train artificial intelligence models without permission. While there's some fearmongering, the real problem seems to be a blanket declaration in its company-wide privacy statement.

The controversy involves Connected Experiences, a long-standing Office feature that connects to the Internet for added functionality. This includes tools such as grammar checking, translation and audio transcription. It also allows downloads of templates and images, for example, to put into a PowerPoint presentation.

The feature came to sudden attention when a blogger, who more commonly writes about fantasy topics including dragons, wrote a post starting:

"Microsoft Office, like many companies in recent months, has slyly turned on an 'opt-out' feature that scrapes your Word and Excel documents to train its internal AI systems. This setting is turned on by default, and you have to manually uncheck a box in order to opt out."

The writer went on to note that the setting is found at File -> Options -> Trust Center -> Trust Center Settings -> Privacy Options -> Privacy Settings -> Optional Connected Experiences, where users can uncheck the box labeled "Turn on optional connected experiences."

Microsoft Denial

For some reason this post wound up attracting huge attention, including from mainstream non-tech media. However, things are not quite as clear as they might seem.

Microsoft has responded with several statements, both on social media and in response to journalist enquiries. For example, it told Bleeping Computer that "Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models (LLMs). Additionally, the Connected Services setting has no connection to how Microsoft trains large language models." (Source: bleepingcomputer.com)

It also posted on X (formerly Twitter) to say that "In the [Microsoft] 365 apps, we do not use customer data to train LLMs. This setting only enables features requiring Internet access like co-authoring a document."

The company has also rejected the suggestion that the feature has only just been turned on. It says the setting has been on by default since the feature launched in April 2019.

Privacy Concerns

The real problem is that although Microsoft says it is not using customer content for AI training in this case, it may have the right to do so. Its overall privacy statement says "As part of our efforts to improve and develop our products, we may use your data to develop and train our AI models."

In other words, it isn't currently using customer data from Connected Experiences for AI training, but has the right to do so. That in turn raises the question of whether it is relying on opt-in or opt-out consent. The two approaches have different implications for whether such a policy and data use are lawful in different parts of the world. On that point, Microsoft's answers remain somewhat vague. (Source: theregister.com)

What's Your Opinion?

Had you heard the concerns about Connected Experiences? Do you ever read privacy policies for software? Should Microsoft get explicit permission to use customer data to train AI, or is it OK to simply use it unless customers actively opt out?