 
[Information Technology Application Research] Building a DocumentTermMatrix with TF-IDF
Administrator 16-04-11 15:28 1,365
Hello, this is Seo Hansol from the second master's cohort.
 
I am uploading the code for computing TF-IDF that the professor mentioned in the last lecture.
 
Copy it as is and run it first, then go through the code line by line; it should not be hard to follow.
 
 
 
# Build the text vector (one string per document)
Example<-c("Neural Network emulates how the human brain works by having a network of neurons that are interconnected and sending stimulating signal to each other.",
     "Support Vector Machine provides a binary classification mechanism based on finding a dividing hyperplane between a set of samples with +ve and -ve outputs.",
     "From a probabilistic viewpoint, the predictive problem can be viewed as a conditional probability estimation; trying to find Y where P(Y | X) is maximized.",
     "K Nearest neighbor is also called instance-based learning, in contrast to model-based learning, because it is not learning any model at all."
     )
 
library(tm)
library(RTextTools)
 
# Build a DocumentTermMatrix using the 'create_matrix' function from the RTextTools package
# together with weightTfIdf from the 'tm' package
dtmat <- create_matrix(Example, language = "english", removeNumbers = TRUE, removePunctuation = TRUE, stemWords = TRUE, weighting = tm::weightTfIdf)
# Convert to a plain matrix (rows: documents, columns: stemmed terms) and print it
dtmat2 <- as.matrix(dtmat)
dtmat2
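
To see what the TF-IDF weights in the matrix mean, here is a small base-R sketch of the calculation. It assumes the default behavior documented for tm::weightTfIdf (normalize = TRUE): term frequency divided by the number of terms in the document, multiplied by log2(number of documents / number of documents containing the term). The count matrix and its term names are made up for illustration and are not taken from the Example corpus above, so treat this as a rough guide rather than the exact internals of create_matrix.

```r
# Toy term-count matrix: rows are documents, columns are terms.
# These counts are hypothetical and only serve to illustrate the arithmetic.
counts <- matrix(
  c(2, 1, 0,
    0, 1, 1,
    0, 1, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("network", "learn", "model"))
)

# Term frequency, normalized by the total number of terms in each document.
tf <- counts / rowSums(counts)

# Inverse document frequency: log2(N / number of documents containing the term).
df  <- colSums(counts > 0)
idf <- log2(nrow(counts) / df)

# Scale each column of tf by the idf of that term.
tfidf <- sweep(tf, 2, idf, "*")
round(tfidf, 3)
```

In this toy matrix the term "learn" appears in every document, so its idf is log2(3/3) = 0 and its weight is 0 everywhere, while a term that appears in only one document, such as "network", gets the largest weight.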