With the development of the Internet, the amount of data in the digital environment is continuously increasing. Especially with Web 2.0 technology, sites where users can add new content, such as Wikipedia, blogs and social media sites, have caused the information on the Internet to grow in both number and size. Finding the required information in a medium that holds so much data is a serious problem. In today's information age, automatic text summarization systems have become necessary in many areas of information retrieval in order to reach the information being sought. This study discusses text summarization methods based on sentence extraction. First, features that represent the sentences of a document are extracted; then the effectiveness of these features on summarization is determined using a genetic algorithm. The data set used in the study consists of 120 documents containing Turkish news texts and their summaries. The genetic algorithm is trained on 80 documents to determine the best weight values for the features; the remaining 40 test documents are then summarized with these weights, and the results are compared with the original summaries.
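The weighting scheme described above can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not name the sentence features, the fitness measure, or the genetic operators, so the three feature columns (position, length, keyword score), the overlap-based fitness, and the truncation selection with one-point crossover used here are all assumptions chosen for illustration, and the two toy documents are made up.

```python
import random

def score_sentences(features, weights):
    """Weighted sum of the feature values of each sentence."""
    return [sum(w * f for w, f in zip(weights, feats)) for feats in features]

def summarize(features, weights, k):
    """Indices of the k highest-scoring sentences, returned in document order."""
    scores = score_sentences(features, weights)
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)

def fitness(weights, training_docs):
    """Mean fraction of reference-summary sentences recovered (assumed measure)."""
    total = 0.0
    for features, reference in training_docs:
        chosen = set(summarize(features, weights, len(reference)))
        total += len(chosen & set(reference)) / len(reference)
    return total / len(training_docs)

def evolve(training_docs, n_features, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Search for feature weights with a simple genetic algorithm.

    Assumes n_features >= 2 so that one-point crossover has a cut position.
    """
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: fitness(w, training_docs),
                        reverse=True)
        parents = ranked[:pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a gene with a fresh random weight.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: fitness(w, training_docs))

# Toy training set (made up): each document is a pair of
# (per-sentence features [position, length, keyword score],
#  indices of the sentences in its reference summary).
docs = [
    ([[1.0, 0.2, 0.1], [0.5, 0.9, 0.8], [0.3, 0.4, 0.9], [0.1, 0.8, 0.2]], [1, 2]),
    ([[1.0, 0.5, 0.2], [0.6, 0.1, 0.7], [0.2, 0.6, 0.1]], [1]),
]
best = evolve(docs, n_features=3)
print("learned weights:", best)
print("training fitness:", fitness(best, docs))
```

In a setup like the one in the study, `evolve` would be run on the 80 training documents and `summarize` applied to the 40 test documents with the learned weights. Keeping the fitter half of each generation unchanged means the best fitness never decreases from one generation to the next.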