Data correlation matrix-based spam URL detection using machine learning algorithms
2024
Journal: Journal of Scientific Reports-A
Author:  
Abstract:

In recent years, widespread internet access has brought both advantages and disadvantages. Users now enjoy numerous benefits, including unlimited access to vast amounts of information and seamless communication with others. However, this accessibility also exposes users to threats such as malicious software and deceptive practices, and many individuals have been victimized as a result. Common issues include spam emails, fake websites, and phishing attempts. Because internet usage is essential in contemporary society, developing systems that protect users from such malicious activities has become imperative. Accordingly, this study used eight prominent machine learning algorithms to identify spam URLs in a large dataset. Since the dataset contained only the URL and its spam classification, additional features such as URL length and the number of digits had to be extracted. Including such features improves decision-making within the machine learning framework and results in more efficient detection. Because the quality of feature extraction significantly affects the results of the methods, the study first extracted features and then trained models based on the weight of those features. This paper proposes a data correlation matrix-based approach for spam URL detection using machine learning algorithms. The distinctive aspect of the study is its feature extraction process, which aims to identify the most impactful features and then train models while taking the weighting of these features into account. The entire dataset was used without any reduction. Experimental findings indicate that tree-based machine learning algorithms yield superior results. Among all applied methods, Random Forest achieved the highest success rate, with a detection rate of 96.33% for the non-spam class. Additionally, a combined and weighted calculation method yielded an accuracy of 94.16% for both spam and non-spam data.
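
The two lexical features named in the abstract (URL length and number of digits) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `extract_features` and `pearson`, the toy URLs, and the use of a Pearson correlation between each feature and the spam label as a stand-in for "feature weight" are all assumptions made for illustration.

```python
import math


def extract_features(url: str) -> dict:
    """Compute the two lexical features mentioned in the abstract."""
    return {
        "url_length": len(url),
        "digit_count": sum(ch.isdigit() for ch in url),
    }


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (not from the paper): (url, label), where 1 = spam.
samples = [
    ("http://example.com/", 0),
    ("http://example.org/about", 0),
    ("http://free-prize-2024.example.net/win?id=9918273", 1),
    ("http://192.0.2.10/login/verify/000123", 1),
]

features = [extract_features(url) for url, _ in samples]
labels = [label for _, label in samples]

# Assumed proxy for feature weight: correlation of each feature with the label.
correlations = {
    name: pearson([f[name] for f in features], labels)
    for name in features[0]
}

for name, value in correlations.items():
    print(f"{name}: {value:.3f}")
```

In a fuller pipeline, a table of such features would be passed to a classifier such as Random Forest; how the paper actually derives and applies feature weights is not specified in the abstract.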

Keywords:

Citation Owners
Information: There are no citations to this publication.
Journal of Scientific Reports-A

Field : Science and Mathematics; Engineering

Journal Type : National

Metrics
Articles : 764
Citations : 1.283
2023 Impact : 0.117