Analysis of Data on Staff Turnover Using Association Rules and Predictive Techniques

Lenka Girmanova, Zuzana Gašparová

Issue: 2/2018
Journal: Quality Innovation Prosperity
DOI: 10.12776/qip.v22i2.1122

Keywords: turnover; information technology; data mining; association rules; decision trees

Abstract: Purpose: This paper presents the results of an analysis and evaluation of employee turnover data in a specific organisation, based on deep data mining using association rules and decision trees.

Methodology/Approach: For the analysis, we chose deep data mining methods, primarily a search for association rules using the Apriori algorithm in the R programming language. To supplement and compare the results, the data were also analysed with predictive decision trees, applying the C5.0, rpart and ctree algorithms in R.
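To illustrate what a search for association rules with the Apriori algorithm involves, the following is a minimal sketch in Python. It is not the authors' R code. The records, attribute names (`tenure`, `feedback`, `left`) and thresholds are hypothetical and chosen only for illustration; the function names `support`, `apriori` and `rules` are likewise assumptions of this sketch.

```python
from itertools import combinations

# Hypothetical toy records (NOT the paper's data); each record is a set of
# attribute=value items describing one employee.
records = [
    {"tenure=short", "feedback=rare", "left=yes"},
    {"tenure=short", "feedback=rare", "left=yes"},
    {"tenure=long", "feedback=regular", "left=no"},
    {"tenure=short", "feedback=regular", "left=no"},
    {"tenure=long", "feedback=rare", "left=yes"},
    {"tenure=long", "feedback=regular", "left=no"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets."""
    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that reach the support threshold.
        level = {}
        for c in candidates:
            s = support(c, transactions)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Derive rules (lhs, rhs, support, confidence) from frequent itemsets."""
    found = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = supp / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, supp, confidence))
    return found


frequent_itemsets = apriori(records, min_support=0.3)
for lhs, rhs, supp, conf in rules(frequent_itemsets, min_confidence=0.9):
    print(sorted(lhs), "=>", sorted(rhs), f"support={supp:.2f}", f"confidence={conf:.2f}")
```

In a turnover analysis, the rules of interest would typically be those whose right-hand side is the leaving indicator (here the hypothetical item `left=yes`). The support and confidence thresholds control how many rules survive.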

Findings: The results of the analyses showed that it pays to observe the basic principles of correct communication from the very beginning of an employment relationship, including during hiring. Communication and regular conversations between a superior and employees can help identify problems earlier, address them, and reduce the number of people leaving the company. The results of the analysis helped the organisation to set measures to reduce the number of employees leaving.

Research Limitation/implication: A limiting factor in performing such analyses is the availability of quality data in the required quantity. Our most significant advantage when performing our analysis was that quality data were available. To create the final structure of the required data set, we used data from the organisation’s internal information systems.

Originality/Value of paper: This contribution offers a new approach to analysing data on employee turnover, the essence of which is finding the most interesting and frequent correlations in a large amount of data.