Feature Selections for the Classification of Webpages to Detect Phishing Attacks: A Survey

2020 [Date of Conference: 26-28 June 2020]
Korkmaz, Mehmet
Diri, Banu

In recent years, as the number of Internet-connected devices has increased, almost all real-world interactions have moved to the cyberworld. As a result, most commerce (especially in the form of e-commerce) is conducted over webpages. The anonymous and uncontrollable structure of the Internet enables malicious use of this cyber environment for a relatively new type of crime, called e-crime, which mainly aims at illegal financial gain by deceiving ordinary end users. Phishing attacks are among the most preferred fraudulent techniques for obtaining confidential information (such as user IDs, passwords, and credit card information) from end users. Security administrators therefore try to reduce the number of victims in their companies. One principal protection mechanism is the use of blacklists to detect phishing webpages. However, blacklists have a significant deficiency: they offer no protection against newly created attack pages. Most security administrators use learning systems that are trained on a pre-collected dataset by extracting features from the URLs and content of webpages. The performance of such a system is directly related to the features used for classification. In this work, we analyze the features previously used in the classification of webpages through a comparative analysis of the literature. This study aims to provide a general survey resource for researchers who work on the classification of webpages or on network security.
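To make the idea of extracting features from a URL concrete, the following is a minimal sketch in Python. The function names (`host_is_ip`, `extract_url_features`) and the particular features chosen are illustrative assumptions, not the feature set examined in the survey; they are only examples of lexical URL properties that a classifier could be trained on.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(host: str) -> bool:
    """Return True if the host is a literal IP address rather than a domain name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative (hypothetical) lexical features of a URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),                          # total number of characters
        "host_dot_count": host.count("."),               # dots in the host name
        "host_hyphen_count": host.count("-"),            # hyphens in the host name
        "digit_count": sum(ch.isdigit() for ch in url),  # digits anywhere in the URL
        "has_at_symbol": "@" in url,                     # '@' anywhere in the URL
        "host_is_ip": host_is_ip(host),                  # IP address used as host
        "uses_https": parts.scheme == "https",           # scheme is HTTPS
    }


if __name__ == "__main__":
    print(extract_url_features("https://example.com/"))
```

A dictionary of this kind, computed for every URL in a labeled dataset, could be converted into a numeric feature vector and passed to any standard classifier. Which features are actually informative is the question the survey addresses.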

URL Features, Web Page Features, Phishing Detection, Machine Learning
Korkmaz, M., Sahingoz, O. K., & Diri, B. (2020, June). Feature selections for the classification of webpages to detect phishing attacks: a survey. In 2020 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) (pp. 1-9). IEEE.