
According to Han, Pei, and Kamber (2011), data mining is “knowledge mining from data.” By using data mining techniques, businesses can identify critical patterns and actionable information. These methods are improving decision-making processes across a wide range of industries. Over the years, there have been many advancements in data mining technology, giving professionals improved tools for discovering critical information from diverse sources. Good data mining solutions have numerous practical applications, from collecting and processing data to analyzing critical information and detecting anomalies. This research paper outlines a strategic assessment of one of the leading data mining tools on the market today and highlights how it addresses the different facets of a comprehensive data mining solution.
Oracle Data Mining (ODM)
This section evaluates the Oracle Data Mining (ODM) tool’s functionality, strengths, and relative weaknesses against today’s big data and cloud-based landscape. As an overview, “ODM is a component of the Oracle Advanced Analytics Database Option, it provides powerful data mining algorithms that enable data analysts to discover insights, make predictions and leverage their Oracle data and investments. It allows users to build and apply predictive models inside the Oracle Database to help you predict customer behavior, target your best customers, develop customer profiles, identify cross-selling opportunities and detect anomalies and potential fraud” (Oracle, 2020). The tool offers a comprehensive suite of structured query language (SQL) functions that integrate with other commonly used scripting and programming languages (e.g., Python, R, and C++). Such functions support data modeling projects of all kinds and the demand for functionality at all levels. ODM also supports high-performance big data analytics on cloud infrastructure and integrates with commercial and open-source applications.
According to Oracle (2020), the SQL data mining functions can mine data tables and views, star schema data including transactional data, aggregations, unstructured data (i.e., the CLOB data type, using Oracle Text to extract tokens), and spatial data. Overall, the ODM suite was designed to address data privacy concerns, multi-source data accessibility demands, and data analysis, data modeling, and data reporting needs.
ODM – Key Features
According to Gartner (2020), customers rated Oracle’s overall advanced analytics solutions as follows:
- Overall Product Capabilities was rated 4.6 out of 5;
- Evaluation & Contracting was rated 4.3 out of 5;
- Integration & Deployment was rated 4.3 out of 5; and
- Service & Support was rated 4.4 out of 5.
Machine Learning (ML) Algorithms and SQL Functions
Some of the most well-recognized features are the machine learning (ML) algorithms and SQL functions. Adopted by different fields for various applications, algorithms and SQL functions are especially significant in corporate sectors, allowing companies to improve their revenues, assess risks, and establish healthy customer relationships. The fields that are using data mining solutions to redefine business include but are not limited to:
- Banks (Online/In-Store)
- Retail (Online/In-Store)
- Insurance Companies
- Telecommunication Providers
- Pharmaceutical Companies
- Manufacturing
- Academia
ODM offers customers the following algorithms to address common business needs. As outlined by Oracle (2020), ML algorithms are provided to solve many types of business problems. The data mining techniques available to customers include, but are not limited to, the following.
Classification
Source: Oracle (2020) | Image: Classification Visual
- Common technique used for predicting a specific outcome such as response/no-response, high/medium/low-value customer, and likely to buy/not buy (Oracle, 2020).
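To make the idea of predicting a yes/no outcome concrete, here is a minimal sketch in plain Python. It fits a one-threshold "decision stump" on a single numeric attribute. This is an illustration only and is not how ODM implements classification; the data, the `monthly_visits` attribute, and the function names are hypothetical.

```python
def fit_stump(values, labels):
    """Pick the threshold on one numeric attribute that misclassifies the fewest rows.

    Rows with value >= threshold are predicted as 1 ("likely to buy"), others as 0.
    """
    best_threshold, best_errors = None, len(values) + 1
    for threshold in sorted(set(values)):
        errors = sum((v >= threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold, value):
    """Apply the learned threshold to a new value."""
    return 1 if value >= threshold else 0


# Hypothetical training data: monthly site visits and whether the customer bought.
monthly_visits = [1, 2, 3, 8, 9, 12]
bought = [0, 0, 0, 1, 1, 1]

threshold = fit_stump(monthly_visits, bought)
```

A new customer would then be scored with `predict(threshold, visits)`. Real classifiers combine many attributes, but the principle of learning a decision rule from labeled examples is the same.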
Regression
Source: Oracle (2020) | Image: Regression Visual
- Technique for predicting a continuous numerical outcome such as customer lifetime value, house value, and process yield rates (Oracle, 2020).
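As a small illustration of predicting a continuous value, the sketch below fits a straight line by ordinary least squares in plain Python. It is a generic textbook method, not ODM's regression implementation, and the house sizes and prices are made-up numbers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size (square meters) and price (thousands).
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept  # estimate for a 100 square meter house
```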
Attribute Importance
Source: Oracle (2020) | Image: Attribute Importance Visual
- Ranks attributes according to the strength of their relationship with the target attribute. Use cases include finding the factors most associated with customers who respond to an offer and the factors most associated with healthy patients (Oracle, 2020).
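One simple way to picture attribute ranking is to score each attribute by the absolute value of its Pearson correlation with the target. The sketch below does that in plain Python. Correlation is used here only as a stand-in and should not be read as the algorithm ODM uses; the attribute names and values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_attributes(columns, target):
    """Return attribute names ordered from strongest to weakest relationship."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical customer attributes and whether each customer responded to an offer.
columns = {
    "age": [25, 35, 45, 55, 65, 30],
    "prior_purchases": [0, 1, 4, 5, 6, 1],
    "store_id": [3, 7, 1, 9, 4, 6],
}
responded = [0, 0, 1, 1, 1, 0]

ranking = rank_attributes(columns, responded)
```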
Anomaly Detection
Source: Oracle (2020) | Image: Anomaly Detection Visual
- Identifies unusual or suspicious cases based on deviation from the norm. Common examples include health care fraud, expense report fraud, and tax compliance (Oracle, 2020).
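The notion of "deviation from the norm" can be sketched with a z-score rule: flag any value that lies more than a chosen number of standard deviations from the mean. This is a deliberately simple stand-in rather than ODM's anomaly detection algorithm, and the expense amounts and the threshold of 2.0 are assumptions made for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values are identical, so nothing deviates
    return [v for v in values if abs(v - center) / spread > threshold]


# Hypothetical expense report amounts; one claim is far larger than the rest.
expense_claims = [100, 105, 98, 102, 97, 101, 99, 500]

anomalies = find_anomalies(expense_claims)
```

A single extreme value inflates the standard deviation in small samples, which is why a modest threshold is used here.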
Clustering
Source: Oracle (2020) | Image: Clustering Visual
- Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments, and life sciences discovery (Oracle, 2020).
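To show what "finding natural groupings" means in practice, the following sketch runs a basic k-means loop on one numeric attribute in plain Python. It illustrates the general technique rather than any specific ODM algorithm; the spending figures and the starting centroids are hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Basic k-means on one-dimensional data with caller-supplied starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer, with two visible groups.
monthly_spend = [10, 12, 11, 95, 100, 105]

centroids, clusters = kmeans_1d(monthly_spend, centroids=[10, 105])
```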
Association
Source: Oracle (2020) | Image: Association Visual
- Finds rules associated with frequently co-occurring items. Used for market basket analysis, cross-selling, and root cause analysis, and useful for product bundling, in-store placement, and defect analysis (Oracle, 2020).
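Market basket analysis rests on two counts: support (how often items appear together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both for item pairs in plain Python. It is a simplified illustration, not ODM's association implementation, and the baskets and the 0.5 support cutoff are made up.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(antecedent, consequent): (support, confidence)} for frequent pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            # Each frequent pair yields one rule in each direction.
            rules[(a, b)] = (support, count / item_counts[a])
            rules[(b, a)] = (support, count / item_counts[b])
    return rules


# Hypothetical shopping baskets.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter", "bread"],
]

rules = pair_rules(baskets)
```

In this toy data, every basket containing butter also contains bread, so the rule from butter to bread has a confidence of 1.0, which is the kind of pattern that informs product bundling.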
Feature Selection and Extraction
Source: Oracle (2020) | Image: Feature Selection and Extraction Visual
- Produces new attributes as linear combinations of existing attributes. Applicable to text data, latent semantic analysis, data compression, data decomposition and projection, and pattern recognition (Oracle, 2020).
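A "linear combination of existing attributes" can be illustrated with the first principal component of two attributes. The sketch below uses the closed-form orientation of a 2x2 covariance matrix in plain Python. It is a two-attribute teaching example under that assumption, not ODM's feature extraction algorithm, and the data is hypothetical.

```python
from math import atan2, cos, sin


def first_component(xs, ys):
    """Return (weights, scores) for the first principal component of two attributes."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Direction of greatest variance for a 2x2 symmetric covariance matrix.
    angle = 0.5 * atan2(2 * cov_xy, var_x - var_y)
    weights = (cos(angle), sin(angle))
    # The new attribute is a weighted sum of the centered original attributes.
    scores = [
        weights[0] * (x - mean_x) + weights[1] * (y - mean_y) for x, y in zip(xs, ys)
    ]
    return weights, scores


# Hypothetical attributes where the second is exactly twice the first.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

weights, scores = first_component(xs, ys)
```

Because the two attributes move together, a single new attribute captures all of their variation, which is the sense in which feature extraction supports data compression.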
Oracle’s data mining suite also takes advantage of database strengths such as counting, parallelism, scalability, and other data architecture elements (Oracle, 2020). Some earlier data mining algorithms that rely on counting principles, like Naïve Bayes and Apriori, are often used by customers to build the conditional probability models commonly found in marketing firms, financial institutions, and other business areas.
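The counting principle behind Naïve Bayes can be sketched in a few lines of plain Python: tally how often each attribute value occurs with each class, then multiply the resulting conditional probabilities. This is a generic illustration with add-one smoothing, not ODM's implementation; the attributes (opened an email, visited the site) and labels are hypothetical.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and (attribute position, value) frequencies per class."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for position, value in enumerate(row):
            feature_counts[label][(position, value)] += 1
    return class_counts, feature_counts


def predict(model, row):
    """Return the class with the highest prior times product of conditional probabilities."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total
        for position, value in enumerate(row):
            # Add-one smoothing, assuming each attribute has two possible values.
            score *= (feature_counts[label][(position, value)] + 1) / (n + 2)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical rows: (opened_email, visited_site) and the observed outcome.
rows = [("yes", "yes"), ("yes", "no"), ("no", "yes"),
        ("no", "no"), ("yes", "yes"), ("no", "no")]
labels = ["buy", "buy", "no_buy", "no_buy", "buy", "no_buy"]

model = train(rows, labels)
prediction = predict(model, ("yes", "yes"))
```

Since training is nothing more than grouped counts, it is easy to see why this family of algorithms maps well onto a database engine.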
SQL and R Support
According to Oracle (2015), ODM’s support for both SQL and R makes it attractive to customers because these are among the most versatile data management and statistical languages used across the global enterprise. SQL is commonly known as a language for querying, reporting on, and analyzing structured data, while R is an open-source language for statistical analysis that is free for anyone to use. As an example, the following figure contrasts traditional SQL with the Oracle Advanced Analytics SQL and R capabilities:
Source: Oracle (2015) | Image: Traditional SQL vs Advanced SQL & R
Automatic Data Preparation, Types, Schemas, Tables
Another standout feature of ODM is its ability to help customers with their data preparation and cleaning requirements. According to Oracle (2015), in order to perform proper analysis on data, analysts have to make explicit decisions about how to “bin” data, deal with missing values, and oftentimes reduce the number of variables (feature selection) to be used in the models. ODM’s automatic data preparation is designed to take on these steps.
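For readers unfamiliar with these preparation steps, the sketch below shows what missing-value handling and binning look like when done by hand in plain Python. It uses mean imputation and equal-width bins as assumed, simple choices; it does not reproduce ODM's automatic data preparation, and the ages are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width ranges.

    Assumes the values are not all identical (the bin width must be non-zero).
    """
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    # The maximum value would land one past the end, so it is clamped into the last bin.
    return [min(int((v - lowest) / width), n_bins - 1) for v in values]


# Hypothetical customer ages with one missing entry.
ages = [20, None, 40, 60, 80]

filled = impute_mean(ages)
bins = equal_width_bins(filled, 3)
```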
ODM – Workflow Graphical User Interface (GUI)
ODM’s GUI provides users with an intuitive interface and a wide range of analytical techniques for use with business applications. The GUI lets users create, manage, and modify the workflows associated with data mining analysis projects. Here is an example of the ODM GUI.
Source: Oracle (2015) | Image: ODM GUI
Relevant Weaknesses
One of the main weaknesses is the cost associated with scalable deployments at the enterprise level. Although Oracle is a leader within this space, its cost structures are not favored by smaller organizations, even though Oracle offers packages aimed at small, medium, and enterprise-level businesses. In addition, the learning curve for ODM is steep given all of the functionality the tool offers. User reviews suggest that the product’s overall capabilities are strong, with performance, analytic options, and presentation features at the top of users’ lists of favorites. However, implementation appears to be a weak point based on negative customer feedback (Gartner, 2020).
Overall, ODM provides powerful data mining functionality and enables users to discover insights hidden in their data. ODM has several data mining and data analysis algorithms to help business users address operational problems. The technology suite offers users a robust set of data mining tools with features like advanced data security, compression, in-memory database processing, testing, and analytic functionality (Oracle, 2020). Although the costs are high and the learning curve is steep, it is an industry favorite across the data analytics community and continues to serve customers across a variety of market verticals.
References
Han, J., Pei, J., & Kamber, M. (2011). Data Mining: Concepts and Techniques. Morgan Kaufmann. https://learning.oreilly.com/library/view/data-mining-concepts/9780123814791/
Iverson, H. (2020). What to expect in Data and Analytics in 2020. TDWI. https://tdwi.org/articles/2020/01/07/data-all-what-to-expect-in-data-and-analytics-in-2020.aspx
Oracle. (2020). Oracle Data Mining. https://www.oracle.com/database/technologies/advanced-analytics/odm.html
Oracle. (2015). Big Data Analytics with Oracle Advanced Analytics. https://www.oracle.com/technetwork/database/options/advanced-analytics/oaa-12c-whitepaperv6-2618427.pdf