DEVELOPMENT OF AN IMPROVED DATA MINING ALGORITHM FOR FRAUD DETECTION AND CHURN BEHAVIOR IN MODELLING
1.1 BACKGROUND OF THE STUDY
This research concerns the development of an improved data mining algorithm for fraud detection and churn behavior modelling. With the increasing presence of technology in the world today, large amounts of electronic data are created and used. Databases used to be measured in tens to hundreds of gigabytes (GB); now multi-terabyte (TB) and even petabyte (PB) databases are in use. These data need to be managed to be used effectively, and their volume keeps increasing because billions of people use them. People have no time to inspect data at this scale, and human attention has become the scarce resource. We must therefore find ways to automatically analyze, classify and summarize the data, and to discover and characterize trends in it. This explosive growth in stored data has generated an urgent need for new techniques and automated tools that can intelligently assist us in transforming the vast amounts of data into useful information and knowledge.
This need led to the creation of knowledge-based expert systems. These systems had limitations: obtaining knowledge from human experts in a field is time-consuming, and the experts may not have the essential knowledge required. These limitations led to the advent of data mining technology, which promised solutions to these problems. Data mining applies intelligent methods to extract patterns from data: it extracts information from a data set and transforms it into an understandable structure for further use. Data mining is applied in areas such as market analysis, fraud detection, churn prediction, customer retention and production control.
Telecommunication networks are extremely complex configurations of equipment, comprising thousands of interconnected components. Each network element is capable of generating error and status messages, which leads to a tremendous amount of network data. This data must be stored and analyzed in order to support network management functions, such as fault isolation. Each message will minimally include a time-stamp, a string that uniquely identifies the hardware or software component generating the message, and a code that explains why the message is being generated.
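The three minimal fields of a network message described above can be sketched as a small record type. This is an illustrative sketch only: the class name, the field names and the pipe-separated line format are assumptions made for the example, not a format defined by this study or by any particular equipment vendor.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class NetworkMessage:
    """One error or status message emitted by a network element (hypothetical layout)."""
    timestamp: datetime   # when the message was generated
    component_id: str     # uniquely identifies the hardware or software component
    reason_code: str      # explains why the message was generated


def parse_message(line: str) -> NetworkMessage:
    """Parse an assumed 'timestamp|component|code' text line into a record."""
    ts, component_id, reason_code = line.strip().split("|")
    return NetworkMessage(datetime.fromisoformat(ts), component_id, reason_code)


# Example with made-up values.
msg = parse_message("2024-01-15T08:30:00|switch-07/port-3|LINK_DOWN")
print(msg.component_id, msg.reason_code)
```

Storing messages in a fixed structure like this is what makes later analysis, such as grouping messages by component for fault isolation, straightforward.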
The churn rate, also known as the rate of attrition, is the percentage of subscribers to a service who discontinue their subscriptions to that service within a given time period. For a company to expand its clientele, its growth rate, as measured by the number of new customers, must exceed its churn rate. The rate is generally expressed as a percentage.
1.2 STATEMENT OF THE PROBLEM
The definition of fraud varies by jurisdiction, but a fair summary is: “A deception deliberately practiced in order to secure unfair or unlawful gain” (O’Connor, 2018). Fraud has caused a huge problem for telecommunication companies, leading to the loss of trillions of dollars. Fraud can be divided into two categories: subscription fraud and superimposition fraud. Subscription fraud occurs when a customer opens an account, using either a false or a stolen identity, with the intention of never paying the account charges. Superimposition fraud involves a legitimate account with some legitimate activity, but also includes some “superimposed” illegitimate activity by a person other than the account holder. Superimposition fraud poses the bigger problem for the telecommunications industry in Nigeria, and for this reason data mining techniques are used to identify and predict this type of fraud.
A company can compare its churn rate, the percentage of subscribers who discontinue their subscriptions within a given time period, with its growth rate to determine whether there was overall growth or loss. While the churn rate tracks lost customers, the growth rate tracks new customers who begin purchasing from the organization. If the growth rate is higher than the churn rate, the company has experienced growth. When the churn rate is higher than the growth rate, the company has experienced a loss in its customer base.
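The comparison of churn and growth rates can be illustrated with a short calculation. The sketch below assumes that both rates are taken as a percentage of the number of subscribers at the start of the period; the text does not fix the denominator, so this is one common convention rather than the study's definition. The function names and the figures are made up for the example.

```python
def rate_percent(count: int, customers_at_start: int) -> float:
    """Express a customer count as a percentage of the starting customer base."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return 100.0 * count / customers_at_start


def compare_rates(churn_rate: float, growth_rate: float) -> str:
    """Classify the period as growth, loss, or no change in the customer base."""
    if growth_rate > churn_rate:
        return "growth"
    if growth_rate < churn_rate:
        return "loss"
    return "no change"


# Hypothetical period: 2000 subscribers at the start, 100 lost, 150 gained.
churn = rate_percent(100, 2000)    # 5.0
growth = rate_percent(150, 2000)   # 7.5
print(f"churn {churn}%, growth {growth}%: {compare_rates(churn, growth)}")
# prints: churn 5.0%, growth 7.5%: growth
```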
The data mining application should operate in real time on the call detail records and, once fraud is detected or suspected, should trigger some action. This action may be to immediately block the call and/or deactivate the account, or it may involve opening an investigation, which will result in a call to the customer for identification or to verify the legitimacy of the account activity.
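The detect-then-act flow described above can be sketched as a loop over incoming call records. A simple cumulative-usage threshold stands in for the fraud model here; it is a placeholder rule chosen for illustration, not the improved algorithm this study develops. The record fields, the threshold values and the action names are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Iterator, Tuple


class Action(Enum):
    ALLOW = "allow"
    INVESTIGATE = "open investigation"
    BLOCK = "block call"


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical subset of the fields in a call detail record."""
    account_id: str
    called_number: str
    duration_seconds: int


def screen_calls(
    records: Iterable[CallRecord],
    investigate_after: int = 600,   # assumed threshold, in seconds of total usage
    block_after: int = 1800,        # assumed threshold, in seconds of total usage
) -> Iterator[Tuple[CallRecord, Action]]:
    """Yield an action for each record as it arrives, tracking usage per account."""
    usage: Dict[str, int] = {}
    for record in records:
        total = usage.get(record.account_id, 0) + record.duration_seconds
        usage[record.account_id] = total
        if total > block_after:
            action = Action.BLOCK
        elif total > investigate_after:
            action = Action.INVESTIGATE
        else:
            action = Action.ALLOW
        yield record, action


# Made-up stream of records for two accounts.
stream = [
    CallRecord("A", "0800000001", 300),
    CallRecord("A", "0800000002", 400),
    CallRecord("B", "0800000003", 120),
    CallRecord("A", "0800000004", 1200),
]
actions = [action for _, action in screen_calls(stream)]
print([a.value for a in actions])
# prints: ['allow', 'open investigation', 'allow', 'block call']
```

Because the function is a generator, each record is scored as soon as it arrives, which matches the real-time requirement; a trained model could replace the threshold test without changing the surrounding loop.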
1.3 OBJECTIVES OF THE STUDY
The objectives of this study are:
- To provide a clear overview of data mining.
- To study various data mining techniques used for fraud detection and churn modeling by telecommunication industries.
- To identify the limitations of the existing data mining techniques used for fraud detection and churn modeling by telecommunication industries.
- To provide methods of overcoming such limitations.
1.4 RESEARCH QUESTIONS
- What is data mining?
- What are the various data mining techniques used for fraud detection and churn modeling by telecommunication companies in Nigeria?
- What are the limitations or challenges of the various data mining techniques used for fraud detection and churn modeling by telecommunication companies?
- How can these challenges or limitations of data mining techniques be overcome?
1.5 SIGNIFICANCE OF THE STUDY
The significance of this study is as follows:
- The outcome of this study will inform readers about the data mining techniques used by telecommunication companies in Nigeria, how they are applied and how they are used for fraud detection.
- This research will improve on existing data mining techniques, providing more efficient fraud detection methods.
- This research will provide a method for detecting churn to quantify the company’s loss or gain.
- This research will contribute to the body of literature on data mining for fraud detection and churn modeling, thereby constituting empirical literature for future research in the subject area.
1.6 SCOPE OF THE STUDY
This study covers the various data mining techniques used by telecommunication companies in Nigeria for fraud detection and churn modeling.
1.7 LIMITATIONS OF THE STUDY
Resource availability constraint- The resources required for this study may not be made fully available by the telecommunication companies.
Time constraint- Because of other academic work, the time dedicated to this study will be limited.
Financial constraint- Insufficient funds tend to impair the efficiency of the researcher in sourcing resources and collecting data.
1.8 DEFINITION OF TERMS
Caller ID- This is a telephone service that transmits a caller’s telephone number to the called party’s telephone equipment when the call is being set up.
CDR- A call detail record is a data record produced by a telephone exchange or other telecommunications equipment that documents the details of a call passing through that facility or device.
Disposition of the call- This is the result of a call, whether it connected or not.
Calling party- This is a person or the device that initiates a telephone call.
Called party- This is the person or device that receives a telephone call.
Phreaking- This is a term for the activity of people who study, experiment with or explore telecommunication systems, such as equipment and systems connected to public telephone networks.
Caller ID spoofing- This is the practice of causing the telephone network to indicate to the receiver of a call that the originator of the call is a station other than the true originating station.