As on-premises and cloud-hosted data repositories grow, they are outstripping the ability of traditional data-crunching methods to analyze the information efficiently. As a result, more enterprises have turned to data science and machine learning platforms to create business value. The benefit of using a platform for big-data analytics is that you don’t have to be a data scientist to achieve fast, accurate results. How can security professionals use data science and machine learning platforms to build efficient systems for identifying actual security threats and orchestrating effective remediation?
An example use case for machine learning algorithms
Machine learning has practical uses across the organization. DevOps teams can use machine learning algorithms to test their pipelines and detect anomalies, Support can use clustering algorithms to group similar customer requests and reduce its workload, and Network Operations Centers (NOCs) can use anomaly detection algorithms to spot malfunctioning networks. Since nearly every team can benefit from data science, giving every willing employee fundamental data science skills and instruction in using machine learning platforms is very likely to pay high dividends back to the enterprise.
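To make the NOC example concrete, here is a minimal sketch of one simple anomaly detection technique, the modified z-score, which measures how far each reading sits from the median relative to the median absolute deviation. The function name `find_anomalies`, the sample packet-loss readings, and the choice of this particular technique are assumptions for illustration only; the 0.6745 scaling constant and the 3.5 cutoff are the commonly cited convention for this score, not values taken from any product.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        # At least half the readings equal the median, so the score is
        # undefined; fall back to flagging anything that differs from it.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical per-minute packet-loss percentages for one network link
packet_loss = [0.2, 0.3, 0.1, 0.2, 0.4, 0.3, 9.5, 0.2, 0.3, 0.1]
anomalies = find_anomalies(packet_loss)
print(anomalies)  # [6]
```

A median-based score is used here rather than a mean-based one because a single extreme reading cannot drag the baseline toward itself, which matters when the data set is small.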
In a recent article, Capital One SVP and Head of Data Insights Dave Kang refers to an internal machine learning platform the company created that provides privileged associates with governed access to algorithms, components, and infrastructure for reuse. The goal was to enable associates who are not data science or machine learning practitioners to use the platform to make smart business decisions. In the Capital One case, the company used home-grown and open-source machine learning algorithms hosted on a shared platform to detect anomalies and automatically create defenses against credit card fraud.
How machine learning drives good data security
Using traditional native logging and database activity monitoring (DAM) tools to discover and classify sensitive data is a time-consuming manual process not suited to the structured, unstructured, and semi-structured data repositories that most security teams manage today. Also, most teams are not familiar enough with cloud-based environments to determine easily what sensitive data they contain and where the data is located.
Today, you can apply Imperva Data Security Fabric analytics engines to identify sensitive data, and adopt an automated lifecycle management approach to classifying, labeling, and mapping sensitive data across all your data repositories. Once data is discovered and classified, security teams can apply User and Entity Behavior Analytics (UEBA) at the tool level to automatically detect threats to sensitive data. As new data and data sources are introduced, the machine learning-driven Imperva Data Security Fabric system enables security teams to discover and classify them automatically, saving significant time and effort. The system also automates the remediation process, helping you prevent costly data breaches.
Imperva Data Security Fabric (DSF) automates the entire sensitive data management (SDM) lifecycle, from discovery through classification and tagging all the way to remediation. It eliminates manual processes by fully automating classification for every type of data source, and it automates both the routing of information between people and the application of machine learning algorithms.
Through machine learning, an iterative model creation process enables the SDM system to learn from the actions you take on data. As you classify and tag data and separate false positives from actionable information, you can tell the SDM system to record what you are doing, which lets the system learn from you and become more efficient at determining what data is labeled, tagged, or ignored. The system mimics the actions you have taken on data, so after the first instance the amount of manual work required decreases sharply, and it continues to decrease with each iteration. These “taught systems” enable you to manage data classification and tagging automatically, with a minimum of human intervention.
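The idea of a system that learns from recorded analyst actions can be illustrated with a deliberately simple sketch. This is not how Imperva DSF is implemented; the class `TaughtClassifier`, its `record` and `suggest` methods, the column names, and the labels are all hypothetical. The toy model remembers which label an analyst gave to each word in a column name and later suggests the most frequently seen label for new columns that share those words.

```python
import re
from collections import Counter, defaultdict


class TaughtClassifier:
    """Toy model that reuses labels analysts previously gave to column names."""

    def __init__(self):
        # token -> count of each label recorded for columns containing that token
        self.votes = defaultdict(Counter)

    @staticmethod
    def _tokens(column_name):
        # Split on anything that is not a lowercase letter or digit
        return [t for t in re.split(r"[^a-z0-9]+", column_name.lower()) if t]

    def record(self, column_name, label):
        """Store one analyst decision."""
        for token in self._tokens(column_name):
            self.votes[token][label] += 1

    def suggest(self, column_name):
        """Return the most frequently recorded label, or None if nothing matches."""
        tally = Counter()
        for token in self._tokens(column_name):
            tally.update(self.votes.get(token, {}))
        if not tally:
            return None  # no prior decisions apply; leave for manual review
        return tally.most_common(1)[0][0]


model = TaughtClassifier()
model.record("customer_email", "PII")
model.record("billing_email", "PII")
model.record("build_id", "ignore")

print(model.suggest("contact_email"))  # PII
print(model.suggest("session_token"))  # None
```

Each recorded decision widens the set of columns the model can label on its own, which is the sense in which manual work shrinks over time. Returning `None` for unfamiliar names keeps a person in the loop for cases the model has never seen.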
Democratizing data science and machine learning is critical
A recent Gartner survey reported that forty-eight percent of respondents have already deployed, or plan to deploy, AI/ML in the next twelve months. Given the shortage of data science and ML talent at most organizations, finding a suitable platform and empowering non-data scientists in your organization to use this technology is very likely to become a priority.
See how the machine learning capabilities of Imperva Data Security Fabric enable security teams and other security stakeholders in your organization to make better decisions and mitigate cybersecurity risks.