Machine learning (ML)-based anomaly detection systems can enhance ransomware defenses by modeling the behavior of cloud identities (users, groups, roles) as they interact with data stores such as AWS S3, GCP BigQuery, and Azure Blob Storage. In this blog, we’ll explore ML-based anomaly detection and how it integrates with a comprehensive data security posture management (DSPM) solution.
Ransomware Ever On the Rise
It is no secret that ransomware incidents have been on the rise. Verizon’s 2023 Data Breach Investigations Report notes that a quarter of all breaches that occurred between November 2021 and November 2022 involved ransomware. Ransomware poses a serious threat to even the most security-conscious organizations, as evidenced by the compromise of companies like Intel, SAS airlines, and the City of Dallas (as stated in a May 2023 incidents report by Cyber Management Alliance).
A typical ransomware incident starts with an attacker gaining access to the victim’s environment through any number of means, including phishing, malicious downloads, or remote exploitation. The attacker then scans for sensitive data assets within the environment, often exfiltrating the data before encrypting it in place. The attacker retains the encryption key and demands a ransom from the victim in exchange for decrypting the affected data. More sophisticated ransomware campaigns also delete backup data and remove traces of the initial infection and of any persistence events.
With data becoming an attacker’s key target, it becomes increasingly important to adopt a more data-centric approach to security. The industry therefore needs security tools that can adapt quickly to the changing threat landscape. Enter machine learning capabilities.
Machine Learning, Anomaly Detection, and Data Security Posture Management
Anomaly detection systems use machine learning (ML) to capture the pattern of activity across assets of interest. In a cloud service provider (CSP) environment, for instance, an anomaly detection system could model the behavior of a user identity by taking operation activity from AWS CloudTrail, GCP BigQuery, or Azure Activity logs and encoding that behavior with ML algorithms such as IsolationForest or LocalOutlierFactor. Anomaly detection has become indispensable for organizations because attackers can already bypass detection mechanisms for widely known attack patterns. Security tools that rely on signatures to detect known attacks are slow to update, while attacks can migrate to new vectors within days. Anomaly detection is also critical for securing environments that use state-of-the-art technology, since the risk surface associated with new tools and their evolving capabilities becomes apparent only with time.
Using Symmetry DataGuard Anomaly Detection to Detect Ransomware
Symmetry Systems’ DataGuard uses state-of-the-art ML techniques to model the behavior of cloud identities (users, groups, roles) interacting with data stores (AWS S3, GCP BigQuery, Azure Blob Storage, etc.) by monitoring activity on those assets. The anomaly detection system built at Symmetry supports such activity monitoring across the three major cloud service providers (AWS, GCP, and Azure) for better data security. This means that DataGuard can monitor for and detect ransomware activity end to end.
DataGuard’s Anomaly Detection service relies on data-specific operations and events to build the behavioral ML model for each of the two asset types: data stores and identities. Figure 1 provides a simple outline of how anomaly detection works in the customer environment.
Figure 1: Anomaly Detection at Symmetry
Let’s consider how the DataGuard Anomaly Detector trains in AWS. As Figure 1 shows, the DataGuard proxy ingests CloudTrail logs to create data events, which are stored within Symmetry’s data lake for use by anomaly detection as well as other DataGuard services. These data events identify the source (typically an IAM identity such as a user) and the destination (a data store such as an S3 bucket). During the training phase (“AD training” in the figure), the Anomaly Detector retrieves recent operational activity for a set of assets and transforms it into feature vectors consumable by the machine learning algorithm. The trained models are then persisted and used by the inference service (“AD inference” in the figure) for live detection of anomalous events.
Note that the features and models used to fit user activity are chosen based on robust criteria. The criteria include a high AUC (“Area Under the Curve”) in cross-validated experiments, efficacy on simulated attack events, and performance observed on real customer data. Also note that the behavioral models we build are based on Data Activity Monitoring (DAM) of identities interacting with data stores. This allows us to build a behavioral profile for identities as well as for data stores and to monitor for asset-specific anomalies separately. For instance, a user identity that accesses an excessive number of data stores results in an identity anomaly, while an abnormal number of read operations on a single data store containing sensitive data results in a data-store anomaly. DataGuard customers can then specify policy actions for the specific type of anomaly that is discovered.
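To make the transformation step concrete, here is a minimal sketch of turning data events into per-identity feature vectors. The event fields (`source`, `destination`, `operation`), the operation vocabulary, and the choice of features are illustrative assumptions, not DataGuard’s actual schema or feature set.

```python
from collections import defaultdict

# Hypothetical data events: each names a source identity, a destination
# data store, and the operation performed. Field names are assumptions.
events = [
    {"source": "user/alice", "destination": "s3://reports", "operation": "GetObject"},
    {"source": "user/alice", "destination": "s3://reports", "operation": "GetObject"},
    {"source": "user/alice", "destination": "s3://logs", "operation": "PutObject"},
    {"source": "user/bob", "destination": "s3://reports", "operation": "ListBuckets"},
]

# Illustrative vocabulary; a real system would derive it from observed activity.
OPERATIONS = ["GetObject", "PutObject", "ListBuckets"]


def identity_feature_vectors(events, operations=OPERATIONS):
    """Return {identity: [count per operation..., distinct data stores touched]}."""
    op_counts = defaultdict(lambda: defaultdict(int))
    stores = defaultdict(set)
    for event in events:
        op_counts[event["source"]][event["operation"]] += 1
        stores[event["source"]].add(event["destination"])
    return {
        identity: [op_counts[identity][op] for op in operations] + [len(stores[identity])]
        for identity in stores
    }


vectors = identity_feature_vectors(events)
for identity, vector in vectors.items():
    print(identity, vector)
```

Vectors of this shape could then be passed to an outlier model such as the IsolationForest or LocalOutlierFactor algorithms mentioned earlier. The model-fitting step is omitted here to keep the sketch dependency-free.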
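The distinction between the two anomaly types can be illustrated with a small sketch. It uses a median/MAD outlier score as a deliberately simple stand-in for the trained ML models; the threshold, the baseline numbers, and the function names are all hypothetical.

```python
from statistics import median


def robust_score(history, value):
    """Distance of value from the median of history, in units of MAD."""
    center = median(history)
    # Fall back to 1.0 when the MAD is zero to avoid dividing by zero.
    mad = median(abs(x - center) for x in history) or 1.0
    return abs(value - center) / mad


def classify(history, value, kind, threshold=5.0):
    """Return an anomaly label for the asset kind, or None if value looks normal."""
    return f"{kind} anomaly" if robust_score(history, value) > threshold else None


# Hypothetical daily baselines for one identity and one data store.
stores_touched_by_user = [3, 4, 3, 5, 4, 3, 4]                  # distinct data stores
reads_on_sensitive_store = [120, 130, 110, 125, 118, 122, 127]  # read operations

print(classify(stores_touched_by_user, 40, "identity"))
print(classify(reads_on_sensitive_store, 5000, "data-store"))
print(classify(reads_on_sensitive_store, 124, "data-store"))
```

Keeping one profile per asset type means each kind of anomaly can be routed to its own policy action.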
Testing Our Anomaly Detection Capabilities
To ensure a high-impact anomaly detection service, the Symmetry team runs robustness experiments that validate low false positive rates, high true positive rates, and excellent coverage of known attack scenarios involving DSPM (data security posture management) assets.
The ML and Security teams at Symmetry continuously run purple team events and attack simulations to keep our threat detection capabilities for all three clouds (AWS, GCP, Azure) up to date. In that vein, we came across the Invictus Incident Response team’s description of a ransomware attack perpetrated against an AWS cloud customer of theirs.
Overview of an AWS Cloud Ransomware Campaign
The Invictus write-up describes how the team pieced together the ransomware campaign affecting their AWS customer, and it provides valuable insight into that campaign.
The article maps the malicious ransomware activity to MITRE ATT&CK steps, which we also summarize below:
- Initial Access: The attacker gets hold of a long-lived AWS access key with permissions limited to a single S3 bucket.
- Reconnaissance: The attacker executes various types of operations to learn about the target environment. For instance, the attacker attempts to enumerate users, buckets (containing potentially sensitive data), access keys, etc. Logs indicate that seven different types of operations were performed: ListUsers, ListBuckets, ListIdentities, ListAccessKeys, ListServiceQuotas, GetAccount, and GetSendQuota.
- Persistence: The attacker executed several CreateUser operations, aiming to fall back on the newly created user identities if the victim identity were changed and thereby made inaccessible to the attacker. All such user creation attempts, however, were denied.
- Exfiltration: The attacker successfully pulls data out from the customer cloud. This is done using the GetObject operation. Note that for the attack simulation below, we assume that only one such operation is executed.
- Impact: Finally, the attacker checks whether S3 bucket versioning is enabled (with the GetBucketVersioning AWS operation). On confirming that it is, the attacker disables versioning (with PutBucketVersioning), which prevents data recovery from the target bucket.
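For simulation purposes, the steps above can be captured as a simple lookup from attack step to AWS operations. The sketch below is one possible encoding; the names `ATTACK_STEPS` and `step_for` are illustrative. Initial Access is left out because the summary names no specific operation for it.

```python
# Operations per attack step, as listed in the summary above.
ATTACK_STEPS = {
    "Reconnaissance": [
        "ListUsers", "ListBuckets", "ListIdentities", "ListAccessKeys",
        "ListServiceQuotas", "GetAccount", "GetSendQuota",
    ],
    "Persistence": ["CreateUser"],
    "Exfiltration": ["GetObject"],
    "Impact": ["GetBucketVersioning", "PutBucketVersioning"],
}


def step_for(operation):
    """Return the attack step an operation belongs to, or None if it is not listed."""
    for step, operations in ATTACK_STEPS.items():
        if operation in operations:
            return step
    return None


for op in ["ListBuckets", "PutBucketVersioning", "DeleteObject"]:
    print(op, "->", step_for(op))
```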
Simulation & Results
We simulated this ransomware attack and validated detection using Symmetry DataGuard’s Anomaly Detection capability. We replicated the above behavior step by step, first blending the malicious activity with the benign activity of select identities and then evaluating the resulting traffic against previously trained identity ML models.
We ran the simulation on events from two Symmetry customers, one deployed within AWS and the other within GCP. Note that the attack simulation occurs passively within the environment and therefore avoids setting off any alerts. The choice of target environments reflects diverse verticals and CSPs and therefore a diverse set of user activity as well.
We simulated execution of Reconnaissance, Persistence, Exfiltration, and Impact activity for each target environment. Although the ransomware campaign lists five steps, we evaluate against the last four steps since the described first step (Initial Access) does not point to which AWS operation was used, and hence cannot be simulated.
To ensure a fair evaluation of the DataGuard Anomaly Detection capability, we randomly chose 100 active identities and blended the malicious activity outlined above with each identity’s benign activity. Active identities are users that executed at least one thousand operations within the last six months. Our goal was to detect the resulting anomalous activity using the threat detection ML models that DataGuard builds from the user’s data access activity. To evaluate the success of our detection, we used Recall, a common ML performance metric, to measure the percentage of identities with malicious activity that were correctly detected as exhibiting anomalous behavior.
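The blending and scoring procedure can be sketched as follows. This is a toy illustration under stated assumptions: `flags_anomaly` is a hypothetical stand-in that flags any operation absent from an identity’s baseline, not the trained ML models, and the identities and operations are made up.

```python
import random


def blend(benign_ops, malicious_ops, rng):
    """Insert malicious operations at random positions, preserving their order."""
    blended = list(benign_ops)
    positions = sorted(rng.sample(range(len(blended) + 1), len(malicious_ops)))
    for offset, (pos, op) in enumerate(zip(positions, malicious_ops)):
        # Earlier inserts shift later positions right by one each.
        blended.insert(pos + offset, op)
    return blended


def flags_anomaly(observed_ops, baseline_ops):
    """Toy stand-in detector: flag any operation absent from the baseline."""
    return any(op not in baseline_ops for op in observed_ops)


def recall(detected, attacked):
    """Fraction of attacked identities that were flagged as anomalous."""
    attacked = set(attacked)
    return len(attacked & set(detected)) / len(attacked)


rng = random.Random(0)  # fixed seed so the blend is reproducible
benign = {
    "user/a": ["GetObject", "PutObject"] * 3,
    "user/b": ["GetObject"] * 5,
}
malicious = ["GetBucketVersioning", "PutBucketVersioning"]

blended = {identity: blend(ops, malicious, rng) for identity, ops in benign.items()}
detected = [i for i, ops in blended.items() if flags_anomaly(ops, set(benign[i]))]
print(recall(detected, attacked=benign.keys()))
```

The toy detector catches both identities here only because the injected operations never appear in their baselines. A behavioral model has the harder job of scoring operations that an identity does sometimes perform.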
Figure 2: Average recall across identities for the Symmetry AWS customer
Figure 2 above shows the Average Recall metric as a heatmap for the AWS customer. Average Recall is computed by averaging the Recall observed for each of the 100 identities we run the test against. The Y-axis shows each step of the ransomware attack described in the previous section, and the X-axis shows the total number of malicious operations performed at each attack step. For instance, the first column in Figure 2 corresponds to a simulated ransomware attack in which the malicious activity consisted of only one operation per attack step. This malicious activity was then interleaved with the benign traffic for each identity, and the existing trained models for each identity were used to verify the anomaly. Note that, from the anomaly detection perspective, detecting an attack step from only a subset of its operations is harder, since the attack profile is incomplete and might seem closer to benign behavior. As shown in the figure, we achieve 100% average Recall across all identities for each attack step. We repeated the test with all events that make up each attack step and again observed perfect Recall. Additionally, for each customer, we replayed the Reconnaissance-specific operations multiple times (a common attack scenario) and again verified the 100% Recall rate.
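As a rough sketch of how one heatmap cell is computed, the snippet below takes a subset of a step’s operations for a given column and averages per-identity outcomes. The per-identity values are made up purely to show the arithmetic and are not the results reported in Figure 2. Taking the first operations of a step as the subset is also an assumption.

```python
from statistics import mean

RECON_OPS = [
    "ListUsers", "ListBuckets", "ListIdentities", "ListAccessKeys",
    "ListServiceQuotas", "GetAccount", "GetSendQuota",
]


def attack_subset(step_ops, n_ops):
    """First n_ops operations of a step: the partial attack profile for one column."""
    return step_ops[:n_ops]


def average_recall(per_identity_recall):
    """Average the recall observed for each identity in one heatmap cell."""
    return mean(per_identity_recall)


# Made-up per-identity outcomes (1.0 = detected, 0.0 = missed), illustration only.
heatmap = {
    ("Reconnaissance", 1): average_recall([1.0, 1.0, 0.0, 1.0]),
    ("Reconnaissance", 7): average_recall([1.0, 1.0, 1.0, 1.0]),
}

print(attack_subset(RECON_OPS, 1))
for cell, value in heatmap.items():
    print(cell, value)
```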
Testing Across Cloud Service Providers
For the GCP customer, we likewise detect every attack step in the blended traffic containing both benign and malicious activity for active identities, and again observe perfect average Recall. The reader may notice that the operation types outlined in the original article are AWS operations. We map those AWS operations to GCP operations before running the experiment in the GCP customer environment. For instance, the AWS “CreateUser” operation is mapped to GCP’s “iam.createAccount”, the AWS “GetBucketVersioning” operation is mapped to GCP’s “storage.buckets.get”, and so on.
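A translation table of this kind could look like the sketch below. Only the two mappings named above are included. The remaining entries are not listed in this article, so the hypothetical `to_gcp` helper raises an error for them rather than guessing.

```python
# AWS -> GCP operation mapping; only the two pairs given in the text.
AWS_TO_GCP = {
    "CreateUser": "iam.createAccount",
    "GetBucketVersioning": "storage.buckets.get",
}


def to_gcp(aws_operation):
    """Translate an AWS operation name, failing loudly on unmapped operations."""
    try:
        return AWS_TO_GCP[aws_operation]
    except KeyError:
        raise KeyError(f"no GCP mapping defined for AWS operation {aws_operation!r}") from None


print(to_gcp("CreateUser"))
print(to_gcp("GetBucketVersioning"))
```

Failing on an unmapped operation, rather than silently passing the AWS name through, keeps a partial mapping from weakening the simulated attack without anyone noticing.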
Conclusion: Cloud Ransomware and Anomaly Detection
In this article, we demonstrated the successful detection of a real-world ransomware attack simulated within real customer traffic across multiple CSPs and customer profiles. The anomaly detection mechanism built into Symmetry’s DSPM solution, DataGuard, ensures detection of every stage of the simulated ransomware attack and of similar ransomware incidents in which the core malicious behavior is statistically anomalous. Because the service detects attacks affecting active identities, which are harder to detect, it also provides coverage for inactive identities. The production-ready ML models already ensure very low false positive rates and are agnostic to the cloud service provider.