By Ashok Mahajan, Sr. Partner Solutions Architect, Startups, AWS
By Ed Casmer, CTO, Cloud Storage Security
By Gokhul Srinivasan, Sr. Partner Solutions Architect, Startups, AWS
By Sean Falconer, Head of Marketing, Skyflow
Securing personally identifiable information (PII) while maintaining compliance can be a daunting task for organizations. Despite best intentions, PII often finds itself scattered across various repositories such as databases, data warehouses, log files, and backups. This makes the maintenance of robust security and compliance measures an uphill battle.
File management only adds to the complexity, requiring stringent security measures, strict access controls, and compliance-oriented storage practices. The risk of data loss and malware threats further intensifies when organizations receive files from external sources such as customers. Organizations must scan such external files for viruses and malware before processing them to mitigate potential threats.
To minimize risk and de-scope existing upstream and downstream systems, organizations use Skyflow, which is available in AWS Marketplace. Skyflow Data Privacy Vault delivers security, compliance, and data residency for your Amazon Web Services (AWS) workloads.
Skyflow, an AWS Partner, uses Cloud Storage Security (CSS) to automatically and asynchronously scan uploaded files for malicious code and malware. CSS is an AWS Specialization Partner with the Security Competency, and it helps to further protect your infrastructure and ease the burden of sensitive file management.
In this post, we'll show how to secure PII using Skyflow Data Privacy Vault and add malware protection using Cloud Storage Security on AWS.
Skyflow is a software-as-a-service (SaaS) offering that supports multi-tenant and single-tenant deployment models. Skyflow Data Privacy Vault isolates, protects, and governs access to sensitive customer data, which is transformed by the vault into opaque tokens that serve as references to this data. The non-sensitive tokens can be safely stored in any application storage systems or used in data warehouses.
A Skyflow vault can keep sensitive data in a specific geographic location, and tightly controls access to this data. Other systems only have access to non-sensitive tokenized data.
In the example below, a phone number (555-1212) is collected by a frontend application. This phone number, along with any other PII, is transformed by the vault, which is isolated outside of your company's existing infrastructure.
Any downstream services (such as a database) store only the token representation of the data (e.g. ABC123), and are removed from the scope of compliance. The token representation can preserve formatting as needed and be consistently generated to not break analytics and machine learning (ML) workflows.
Figure 1: Reducing compliance and security scope with a data privacy vault.
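To make the token flow concrete, here is a minimal sketch of consistent, format-preserving tokenization. It is not the Skyflow API: `ToyVault`, `tokenize`, and `detokenize` are hypothetical names, and the vault is just an in-memory dictionary, so treat it as an illustration of the idea only.

```python
import secrets
import string


class ToyVault:
    """Hypothetical in-memory stand-in for a data privacy vault (not the Skyflow API)."""

    def __init__(self):
        self._token_by_value = {}  # sensitive value -> token, so tokens stay consistent
        self._value_by_token = {}  # token -> sensitive value, kept inside the vault only

    def tokenize(self, value: str) -> str:
        # Reuse the existing token for a repeated value so joins and analytics keep working.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = self._format_preserving_token(value)
        # Retry in the unlikely case the token is already taken or equals the input.
        while token in self._value_by_token or token == value:
            token = self._format_preserving_token(value)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # A real vault would check access policies before returning the value.
        return self._value_by_token[token]

    @staticmethod
    def _format_preserving_token(value: str) -> str:
        # Swap digits for random digits and letters for random letters; keep punctuation.
        out = []
        for ch in value:
            if ch.isdigit():
                out.append(secrets.choice(string.digits))
            elif ch.isalpha():
                out.append(secrets.choice(string.ascii_uppercase))
            else:
                out.append(ch)
        return "".join(out)


vault = ToyVault()
token = vault.tokenize("555-1212")
downstream_db = {"user_42": {"phone": token}}  # only the token is stored downstream
```

The downstream store never holds the phone number, yet the token keeps the `NNN-NNNN` shape and is the same every time the same number is tokenized.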
A data privacy vault serves as core infrastructure for PII, and Skyflow Data Privacy Vault provides this core infrastructure as a service, which includes compute, storage, and network. The core architectural block is simplified to an API call, and Skyflow uses polymorphic encryption, which combines multiple forms of encryption to secure PII and make it usable. This allows you to perform operations over fully encrypted data.
You can build any PII-specific workload on a Skyflow vault for data sharing, analytics, and encrypted operations. For example, you can find all records with the same area code without decrypting the data, or calculate the average income of your customers, without exposing yourself, your employees, or your infrastructure to PII.
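As a rough intuition for matching on a value without seeing it, the sketch below uses a keyed-hash "blind index". This is a different, much simpler technique than Skyflow's polymorphic encryption and is not how the vault is implemented. The key, token strings, and function names are all hypothetical.

```python
import hashlib
import hmac

INDEX_KEY = b"demo-only-key"  # hypothetical; a real key would live in a key management service


def blind_index(value: str) -> str:
    # Keyed hash: equal inputs give equal tags, and the tag is not reversible without the key.
    return hmac.new(INDEX_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


# What a store might hold: a token plus a tag for the area code, and no plaintext.
records = [
    {"phone_token": "tok_a1", "area_code_tag": blind_index("415")},
    {"phone_token": "tok_b2", "area_code_tag": blind_index("212")},
    {"phone_token": "tok_c3", "area_code_tag": blind_index("415")},
]


def find_by_area_code(area_code: str) -> list[str]:
    # Compare tags, never plaintext values.
    wanted = blind_index(area_code)
    return [
        r["phone_token"]
        for r in records
        if hmac.compare_digest(r["area_code_tag"], wanted)
    ]


matches = find_by_area_code("415")
```

A blind index only supports equality matching and still reveals which records share a value, so it does not cover aggregate operations such as averaging income.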
While a data privacy vault isn't a database, Skyflow Data Privacy Vault was designed to have some similar properties. For example, a Skyflow vault supports a schema that can consist of tables, columns, and rows (see image below).
Figure 2: Vault schema with four tables.
The vault is specially designed for supporting the full lifecycle of sensitive data, and it understands the structure of PII and its uses. For example, a Skyflow vault understands a social security number as a data type, not simply a string. This means the vault natively supports use cases like showing only the last four digits of a social security number based on the roles and policies you set up, or securely sharing the full social security number with a third-party identity verification vendor.
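The last-four-digits use case can be sketched as a role-dependent rendering function. The role names, redaction levels, and output formats below are assumptions made for illustration, not Skyflow configuration values.

```python
import re

SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

# Hypothetical role names mapped to a hypothetical redaction level.
REDACTION_BY_ROLE = {
    "support_agent": "MASKED",
    "identity_verifier": "PLAIN_TEXT",
}


def render_ssn(ssn: str, role: str) -> str:
    # Treat the value as a typed field: reject anything that is not SSN-shaped.
    if not SSN_PATTERN.fullmatch(ssn):
        raise ValueError("not a formatted social security number")
    level = REDACTION_BY_ROLE.get(role, "REDACTED")  # unknown roles see nothing
    if level == "PLAIN_TEXT":
        return ssn
    if level == "MASKED":
        return "XXX-XX-" + ssn[-4:]
    return "*REDACTED*"


masked = render_ssn("123-45-6789", "support_agent")
```

Because the field is validated as a social security number first, the masking rule can rely on its structure instead of guessing at string positions.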
The vault not only transforms sensitive data into non-sensitive data, but it tightly controls access to sensitive data through a zero-trust model where no user account or process has access to data unless it's granted by explicit access control policies. These policies are built from the bottom up, granting access to specific columns and rows of PII. This allows you to control who sees what, when, where, for how long, and in what format.
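A default-deny model with column-level and row-level grants can be sketched as follows. The `Policy` shape, role names, and table layout are hypothetical and do not reflect Skyflow's policy syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    role: str
    table: str
    columns: frozenset    # columns this role may read
    row_owner_only: bool  # if True, the grant applies only to rows owned by the requester


POLICIES = [
    Policy("customer", "clients", frozenset({"name", "phone"}), row_owner_only=True),
    Policy("analyst", "clients", frozenset({"state"}), row_owner_only=False),
]


def read_row(role: str, user_id: str, table: str, row: dict) -> dict:
    """Return only the columns that some policy explicitly allows; the default is no access."""
    allowed = set()
    for policy in POLICIES:
        if policy.role != role or policy.table != table:
            continue
        if policy.row_owner_only and row.get("owner_id") != user_id:
            continue
        allowed |= policy.columns
    return {col: val for col, val in row.items() if col in allowed}


row = {"owner_id": "u1", "name": "Ana", "phone": "555-1212", "state": "CA"}
owner_view = read_row("customer", "u1", "clients", row)
```

Nothing is readable unless a policy names it, so adding a new column to the table exposes it to no one until a policy is written for it.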
To store, manage, and retrieve data with Skyflow, you can use APIs directly or software development kits (SDKs). Skyflow supports both frontend and backend SDKs. Which SDK you use depends on your needs and where you choose to integrate.
To learn more about the Skyflow SDKs and APIs, check out the documentation.
To demonstrate secure file storage and management through Skyflow, let's look at how this solution de-scopes both the frontend and backend applications from touching the sensitive documents.
The following architecture diagram illustrates the file upload flow with Skyflow, the AWS services mentioned above, and CSS.
Figure 3: Example of file upload processed through Skyflow and CSS.
To control access to the customer's vault, policies are created in Skyflow to allow programmatic writes into the vault table for client records.
Read and update access needs to be restricted to the single record owned by the currently logged-in user. Skyflow customers can use an authentication service like Auth0, so the customer application knows who the user is based on the Auth0 token.
The Skyflow vault respects the identity of the user and restricts access based on this identity. To support this requirement, customers use Skyflow's context-aware authorization.
Programmatic access to Skyflow APIs is controlled through a service account created within your Skyflow account. The service account's roles, and the policies attached to those roles, decide the level of access a service account has to a vault. The creation of Skyflow roles, policies, and service accounts is controlled programmatically through Skyflow's management APIs or through Skyflow Studio, Skyflow's web-based vault administration portal (see image below).
Figure 4: Example of creating a policy from Skyflow Studio.
Context-aware authorization lets your backend insert an additional claim for end user context into the JWT assertion. You can use any string that uniquely identifies the end user, such as the token provided by Auth0 after a client successfully logs in.
After the additional claim is added, the vault verifies the request and returns a bearer token with the context identifier. The diagram in Figure 5 below illustrates authentication with contextual information for the Skyflow customer and data retrieval.
Figure 5: Context-aware authorization flow diagram using Auth0 token for context.
Using the returned bearer token with the context restriction, the frontend customer application is able to retrieve the PII and files owned by the currently logged-in user and only that user (Step 6).
Further, the time-to-live (TTL) on the bearer token can be controlled, so the token can be set to live only long enough to retrieve the record for the client.
When collecting and managing sensitive data like files containing PII, it's best practice to take the entire application infrastructure out of security and compliance scope, including the frontend.
Skyflow Elements provides a secure way to collect and reveal sensitive data including files. It offers several benefits, including complete programmatic isolation from your frontend applications, end-to-end encryption, tokenization, and the ability to customize the look and feel of the data collection form.
When users interact with Skyflow Elements, various components work together to collect and reveal sensitive data. Here's how it works:
After a file is uploaded, Skyflow automatically scans it for viruses using the CSS integration within the vault. You can retrieve the status of a scan using the Get Status Scan API.
If the file doesn't contain a virus, a status of SCAN_CLEAN is returned and the file is available for downloading or in-page retrieval. Otherwise, a status of SCAN_INFECTED is returned and the file is moved into quarantine.
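Because the scan runs asynchronously, a client typically polls for the result before offering the file for download. The sketch below assumes a caller-supplied `get_status` function that wraps the status API. That function, `wait_for_scan`, and the `IN_PROGRESS` placeholder are hypothetical; only `SCAN_CLEAN` and `SCAN_INFECTED` come from the description above.

```python
import time
from typing import Callable


def wait_for_scan(get_status: Callable[[], str],
                  attempts: int = 10,
                  delay_seconds: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if the file is clean and False if it is infected.

    Raises TimeoutError if no final status arrives within the allowed attempts.
    """
    for _ in range(attempts):
        status = get_status()
        if status == "SCAN_CLEAN":
            return True
        if status == "SCAN_INFECTED":
            return False
        # Any other status is treated as "not finished yet".
        sleep(delay_seconds)
    raise TimeoutError("scan did not finish in time")


# Stand-in for the real status call: two unfinished responses, then a clean result.
fake_statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "SCAN_CLEAN"])
is_clean = wait_for_scan(lambda: next(fake_statuses), sleep=lambda _seconds: None)
```

Treating every unrecognized status as unfinished keeps the caller from exposing a file before a clean result has been confirmed.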
To reveal an uploaded file, the file is embedded into the web frontend as an iframe so the file never touches the customer's servers.
Skyflow enables a business to offload the security, privacy, and compliance responsibilities of sensitive file and PII handling so it can focus resources on its core business.
In this post, we discussed the challenges businesses face with managing sensitive customer data. We reviewed how to secure personally identifiable information (PII) using Skyflow Data Privacy Vault and add malware protection using Cloud Storage Security (CSS) on AWS.
We also showed how Skyflow Data Privacy Vault can securely collect, manage, and use sensitive data. Skyflow integrates with CSS to support automatic virus and malware detection and protection for files.
To learn more, contact Skyflow or try out Skyflow in AWS Marketplace. For additional information regarding Cloud Storage Security, check out CSS in AWS Marketplace.
Source: Protecting and Managing Sensitive Customer Data with Skyflow and Cloud Storage Security | Amazon Web Services - AWS Blog