Scanning Office 365 documents

Radim Řehůřek personal data, PII Tools, security

You wouldn’t know it if you’re coming from a data science background, but Microsoft products permeate the businesses world. A lot of documents are stored and shared within Windows environments, not all of them easily discoverable and interpretable with respect to personal information and data privacy.

After we implemented scanning of on-prem Windows devices, endpoints and file shares into PII Tools, the nr. 1 request has been to find personal data inside Office 365.

screenshot of UI for o365 batch scans

Scanning Office 365 using either REST API or the web dashboard.

Since we listen to our customers, I have a happy announcement to make: Starting with release 1.6.0, PII Tools can automatically scans contents of Office 365 accounts!

This includes both structured and unstructured content in Microsoft OneDrive, Microsoft Exchange Online and (later this month) Sharepoint Online, via three new connectors.

This is a blog post about PII Tools, our commercial product. PII Tools lets corporations perform personal and sensitive discovery with unprecedented accuracy, using context-aware detectors and secure cloud-free deployment. See our special introductory pricing.

What does “support Office 365” mean?

You can now find and review personal and sensitive information for documents, emails and tables shared within your company’s Office 365 account(s), directly from PII Tools. No need to export or copy the data to external locations. Since PII Tools runs on your own hardware, there’s no need to send any data into the cloud either.

“Scan across Office 365” can sound a little nebulous, and navigating Microsoft’s enterprise offerings not always straightforward (ha!). So here’s what this means specifically:

  • Office 365 is a suite of products and services, some of which may store personal information or sensitive data: names, addresses, credit cards, faces, passport scans, sexual preferences, religious views…
  • The most salient O365 services from the privacy and security perspective are:
    • OneDrive: file hosting service operated as part of Office Online; there are drives for users, user groups and entire sites
    • Exchange Online: hosted email, one mailbox per user
    • Sharepoint Online: cloud service to share and manage company data; some documents shared on OneDrive.
  • PII Tools lets you apply our context-aware personal data detectors to:
    • mailboxes of an individual user, or of all users in Exchange Online
    • drives of an individual user, or of all users in OneDrive
    • drives of a single group, or all groups in OneDrive
    • all drives for a given OneDrive site
  • Technically, programmatic access to Office 365 happens through an API called Microsoft Graph, mgraph
  • PII Tools comes with step-by-step instructions on how to set up and authorize Office 365 scans

Where next?

The pace at which PII Tools evolves these days is hectic. It was only last week we released our new on-prem web dashboard, to complement the existing REST APIs.

Sign up to receive PII Tools news from the intersection of data privacy & machine learning.

 Unsubscribe anytime. We send at most 2 posts per month.

We’re really proud of this new Office365 capability. With O365 SharePoint on our roadmap for July 2018, and Microsoft Azure Blob right after that, we’ll be able to offer our customers discovery across pretty much all major environments, whether streamed, local, endpoints, or cloud.