In its simplest definition, machine learning is teaching human characteristics and behaviors to a computer. Machine learning algorithms learn by examining the examples given to them and gain the ability to generalize from those examples. The portion of the data used to teach the model is called the training set, and the portion used to test how well it has learned is called the test set. In the existing machine learning literature, the data set is split at an arbitrary ratio chosen by the user. In this study, graduate application data created for students in India with the graduate admission criteria of the University of California in mind were split by both the random ratio method and ranked set sampling (RSS), and linear regression models were built on the resulting training sets. The data set splitting methods were then compared on the test sets using the root mean square error (RMSE) of the models. With RSS, the principal components, partial least squares and ridge regression models reached lower error values than with the random ratio method in all but one case. Except for the elastic net regression model, the other linear regression models gave better results with RSS than with the random ratio method.
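The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic stand-ins for the admission data, the model is a one-predictor least-squares line, and the RSS variant (draw a set of `set_size` units, rank them by the predictor, keep the unit with the current rank, return the others to the pool) is an assumption made here for illustration. All function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean square error of the fitted line on (xs, ys)."""
    sq = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


def random_split(n, n_train, rng):
    """Random ratio method: shuffle indices and cut at n_train."""
    idx = list(range(n))
    rng.shuffle(idx)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


def rss_split(rank_values, set_size, cycles, rng):
    """Assumed RSS split: training set has set_size * cycles units,
    all remaining units form the test set."""
    pool = list(range(len(rank_values)))
    train = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw one set, rank it by the ranking variable,
            # and keep only the unit holding the current rank.
            candidates = rng.sample(pool, set_size)
            candidates.sort(key=lambda i: rank_values[i])
            chosen = candidates[rank]
            train.append(chosen)
            pool.remove(chosen)  # unselected units stay in the pool
    return sorted(train), sorted(pool)


def evaluate(xs, ys, train_idx, test_idx):
    """Fit on the training indices, report RMSE on the test indices."""
    a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    return rmse([xs[i] for i in test_idx], [ys[i] for i in test_idx], a, b)


rng = random.Random(0)
# Synthetic stand-in data (hypothetical): a score and a noisy admission chance.
xs = [rng.uniform(290, 340) for _ in range(400)]
ys = [0.3 + 0.01 * (x - 290) + rng.gauss(0, 0.05) for x in xs]

rand_train, rand_test = random_split(len(xs), 320, rng)
rss_train, rss_test = rss_split(xs, set_size=4, cycles=80, rng=rng)

print("random split RMSE:", evaluate(xs, ys, rand_train, rand_test))
print("RSS split RMSE:   ", evaluate(xs, ys, rss_train, rss_test))
```

Both splits here use an 80/20 ratio so that the two RMSE values are comparable. Which method gives the lower error on real data depends on the data set and the model, so this sketch does not reproduce the reported results.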
Field: Engineering; Natural Sciences and Mathematics
Journal Type: National