Last updated: July 26, 2019

Data mining is a method for extracting useful information from large databases. It covers tasks such as classification, clustering, prediction, and association analysis [1]. One of the most actively researched fields of data mining is frequent pattern mining, which plays a vital role in all of these tasks. A major drawback of frequent pattern mining is that it requires multiple database scans to extract the frequent patterns and may produce a very large number of them, especially when the patterns are long. A refined solution to these problems is the Maximal Frequent Pattern (MFP), the smallest representative set from which frequent patterns can be generated. MFPs are frequent patterns that have no frequent superset [2]. This paper proposes a graphical method for producing MFPs, from which the frequent patterns are then generated. The method introduces two new elements: a graph structure called the Prime graph and an algorithm called PG-Miner. The Prime graph is a simple graph structure that captures the complete transaction information by means of a data transformation technique based on prime number theory, so a single traversal of the graph is enough to produce the frequent patterns. PG-Miner is the proposed algorithm that traverses the Prime graph and prunes the infrequent items. The efficiency and compactness of the proposed method are demonstrated by experimental results.

1. INTRODUCTION

As databases grow in size, there is a need for tools that can easily extract useful information from them. Knowledge Discovery of Data (KDD) is a process for extracting useful patterns from a database; the KDD process is shown in the figure. Data mining is an important step of KDD. It is used to extract useful information and can be applied in many areas, such as databases, artificial intelligence, and knowledge discovery in neural networks.
Frequent pattern mining is one of the most important tools of data mining; it extracts frequent patterns based on a minimum support or confidence value. Association rule mining is based on extracting interesting correlations, and frequent pattern mining is its first step: patterns that satisfy the threshold are frequent, and all others are infrequent [14]. Many algorithms have been devised to mine frequent patterns. They fall into two basic categories: (1) mining frequent patterns with candidate generation and (2) mining frequent patterns without candidate generation. Methods with candidate generation, such as Apriori [16], partition-based [21], and incremental [17, 19] methods, suffer from problems such as multiple database scans and the cost of candidate generation itself. Many extensions of these algorithms have been proposed, but they still encounter the same problems. Methods without candidate generation, such as Pattern Growth [20] or FP-growth, improve on candidate generation algorithms. They require two database scans to extract the frequent patterns from the database, and several optimizations have been made to reduce the number of database scans, the running time, and the search space needed to produce frequent patterns.

The Maximal Frequent Pattern (MFP) is a reasonable solution to the problems mentioned above: it is the smallest representative set from which frequent patterns can be produced, and it reduces the number of frequent sets generated [3].

This paper proposes a graphical method based on MFP to produce frequent patterns. This graphical approach can be extended to all data mining tasks, although in most cases some changes to the graph structure, the pruning, or the traversal technique are required. The method uses a simple graph structure to hold the transaction information and a graph miner algorithm that traverses the graph to find the frequent patterns and prune the infrequent ones.
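To make the definitions of frequent and maximal frequent patterns concrete, the following is a minimal brute-force sketch. It is not the method proposed in this paper: it enumerates candidate itemsets directly, which is exactly the cost the proposed approach aims to avoid. The function names and the toy transactions are illustrative assumptions, and support is taken as an absolute transaction count.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force enumeration, for illustration only.
    """
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
    return frequent


def maximal_frequent(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("cd")]
frequent = frequent_itemsets(transactions, min_support=2)
mfps = maximal_frequent(frequent)
```

With a minimum support of 2, the frequent itemsets here are {a}, {b}, {c}, {d}, and {a, b}, and the maximal ones are {a, b}, {c}, and {d}. Every frequent itemset is a subset of some maximal one, which is why the MFP set can serve as a compact representative.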
The method uses a data transformation technique to convert the data into a prime number format, which reduces the size of the data sets significantly. The Prime graph is then constructed, and the PG-Miner algorithm mines the frequent patterns and prunes all infrequent patterns from the data set in one database scan. Because all the useful information about the transactions is stored in the Prime graph, a single traversal of the graph is enough to mine the frequent patterns.

Various experiments were performed on a web log data set to demonstrate the efficiency and correctness of the proposed method.

The paper is organized as follows. Section 2 introduces the problem and reviews some efficient related work. The proposed method is described in Section 3. The experimental results and evaluation are presented in Section 4. Finally, Section 5 contains the conclusions and future work.
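The prime-number transformation mentioned in this introduction can be sketched as follows. The exact encoding is not spelled out here, so this sketch assumes one common scheme: each distinct item is assigned a distinct prime, and a transaction is stored as the product of its items' primes. Under that assumption, unique factorization means a pattern is contained in a transaction exactly when the pattern's product divides the transaction's product. All names (`first_primes`, `encode`, `support`) and the toy data are hypothetical, and the Prime graph and PG-Miner themselves are not reproduced.

```python
def first_primes(n):
    """Return the first n primes by trial division (adequate for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it.
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def encode(transactions):
    """Map each item to a prime and each transaction to a product of primes.

    Assumed encoding, not necessarily the one used by the proposed method.
    """
    items = sorted(set().union(*transactions))
    prime_of = dict(zip(items, first_primes(len(items))))
    codes = []
    for t in transactions:
        code = 1
        for item in t:
            code *= prime_of[item]
        codes.append(code)
    return prime_of, codes


def support(pattern, prime_of, codes):
    """Count the transactions whose code is divisible by the pattern's code."""
    p = 1
    for item in pattern:
        p *= prime_of[item]
    return sum(1 for c in codes if c % p == 0)


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "b"}, {"c", "d"}]
prime_of, codes = encode(transactions)
ab_support = support({"a", "b"}, prime_of, codes)
```

In this example the items a, b, c, d map to 2, 3, 5, 7, so the transactions become 30, 42, 6, and 35, and the pattern {a, b} (code 6) divides three of them. Each transaction is reduced to a single integer, which is one way such a transformation can shrink the data, although the products grow quickly when transactions are long.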