 
Korean DTM Package 1.2
Administrator | 2018-06-14 23:30
Attachment: makeDTM.zip (9.6K)
The package for building Korean document-term matrices (DTMs) has been upgraded to version 1.2.
You can install it from GitHub with the code below.

install.packages("devtools"); library(devtools)
install_github("caitechKHU/makeDTM"); library(makeDTM)

Alternatively, you can use the file attached to this post.
Run .libPaths() in R and unzip the attachment directly into the first path it prints.
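
The manual route above can be sketched in R as follows. This is only an illustration: install_from_zip is a hypothetical helper, and it assumes that makeDTM.zip was saved in the working directory and that the archive contains the package folder at its top level.

```r
# Hypothetical helper: unzip a package archive into the first library path.
# Assumes the archive holds the package folder (e.g. makeDTM/) at its top level.
install_from_zip <- function(zip_path, lib = .libPaths()[1]) {
  if (!file.exists(zip_path)) {
    stop("zip file not found: ", zip_path)
  }
  utils::unzip(zip_path, exdir = lib)  # extract into the library directory
  invisible(lib)                       # return the target path quietly
}

# Example call (not run here, since it needs the attachment):
# install_from_zip("makeDTM.zip"); library(makeDTM)
```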


With this upgrade, the package has the following advantages over the tm package.

1. It handles Korean well. Columns are separated cleanly, and morphological analysis can be run directly.
2. It uses only the columns you need. Only the words you specify become columns.
3. It is easy to use. It works directly on a data frame, with no Corpus conversion step.
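
To make points 2 and 3 concrete, here is a minimal base-R sketch of a keyword-restricted document-term matrix built straight from a data frame. It is not the makeDTM implementation: count_keywords is a hypothetical helper, the sample sentences are invented, and tokens are split on single spaces only.

```r
# Hypothetical toy data; the TEXT and LABEL column names follow this post.
docs <- data.frame(
  TEXT  = c("오늘은 엑셀을 다시 배웠다", "엑셀을 엑셀을 쓴다", "오늘은 쉰다"),
  LABEL = c("a", "b", "a"),
  stringsAsFactors = FALSE
)
keyword <- c("엑셀을", "다시", "오늘은")

# Count only the given keywords in each document (rows = documents).
count_keywords <- function(text, keyword) {
  tokens <- strsplit(text, " ", fixed = TRUE)
  counts <- vapply(tokens, function(tok) {
    # Tokens outside the keyword list become NA and are dropped by table().
    as.integer(table(factor(tok, levels = keyword)))
  }, integer(length(keyword)))
  counts <- matrix(counts, nrow = length(keyword))  # stays a matrix for one keyword
  dtm <- t(counts)
  colnames(dtm) <- keyword
  dtm
}

dtm <- count_keywords(docs$TEXT, keyword)
print(dtm)
```

Words that are not in the keyword vector never become columns, which keeps the matrix small.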

With the morphological analysis option (RHINO=TRUE), the contents of the TEXT column are analyzed into morphemes.
To use this option, you must install RHINO first.
For RHINO installation, see the earlier post.


Basic usage>
docs <- read.csv("sample.csv")
library(makeDTM)
# Keywords here are surface forms, with particles attached
keyword <- c("엑셀을", "다시", "오늘은")
makeDTM(docs, key = keyword, LABEL = TRUE, weight = "tfidf")
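
The weight = "tfidf" argument above requests TF-IDF weighting. This post does not state the exact formula makeDTM uses, so the following is only a sketch of one common definition (raw count times log(N / df)) applied to a hypothetical count matrix. The column names k1 to k3 are placeholders.

```r
# Hypothetical term counts: 3 documents (rows) by 3 keywords (columns).
tf <- matrix(c(1, 1, 1,
               2, 0, 0,
               0, 0, 1),
             nrow = 3, byrow = TRUE,
             dimnames = list(NULL, c("k1", "k2", "k3")))

n_docs <- nrow(tf)
df     <- colSums(tf > 0)                      # documents containing each keyword
idf    <- ifelse(df > 0, log(n_docs / df), 0)  # avoid division by zero
tfidf  <- sweep(tf, 2, idf, `*`)               # scale each column by its idf
print(round(tfidf, 3))
```

A keyword that occurs in fewer documents (k2 here) gets a larger weight per occurrence than one that occurs in most of them.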

Usage with options>
keyword <- c("엑셀", "오늘", "편하")

1. Morphological analysis
makeDTM(docs, key = keyword, LABEL = TRUE, weight = "tfidf", RHINO = TRUE)

2. Choosing the part of speech for morphological analysis (ALL, noun, verb, NNG, NNP, NP, NNB, VV, VA, XR, VX)
makeDTM(docs, key = keyword, LABEL = TRUE, weight = "tfidf", RHINO = TRUE, pos = "noun")

3. Specifying the TEXT and LABEL columns (when the data has columns named body and tag)
makeDTM(docs, key = keyword, LABEL = TRUE, TEXT.name = "body", LABEL.name = "tag", RHINO = TRUE)
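
If you would rather keep the default arguments, a roughly equivalent preparation is to rename the columns before the call. This sketch assumes the defaults are the column names TEXT and LABEL, which this post implies but does not state outright. The data frame is hypothetical and is named docs2 so it does not overwrite your own docs.

```r
# Hypothetical data frame with columns named body and tag.
docs2 <- data.frame(
  body = c("오늘은 엑셀을 다시 배웠다", "오늘은 쉰다"),
  tag  = c("a", "b"),
  stringsAsFactors = FALSE
)

# Rename to the column names assumed to be the defaults.
names(docs2)[names(docs2) == "body"] <- "TEXT"
names(docs2)[names(docs2) == "tag"]  <- "LABEL"
print(names(docs2))
```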

*** This program was supported by the BK21 program of the Kyung Hee University School of Management (team for training management research specialists grounded in data science) ***