Jeju National University Repository

초기 모드 결정 방식을 개선한 K-mode 알고리즘 (A K-mode Algorithm with an Improved Initial Mode Decision Method)

Alternative Title
K-mode Algorithm Improving Initial Mode Decision Methodologies
Abstract
Data mining is the process of uncovering previously unknown patterns and relationships in large databases using statistical analysis and modeling techniques such as classification, association rule mining, and clustering. Clustering in particular is an important data mining problem: it is useful for discovering groups and identifying interesting distributions in the underlying data. The K-means algorithm is well known for its efficiency in clustering large data sets. However, because it works only on numeric values, it cannot be used to cluster real-world data containing categorical values. Huang (1997) presented the K-mode algorithm, which extends the K-means paradigm to categorical domains. The K-mode algorithm, however, is sensitive to its initial starting conditions (the choice of initial modes and the number of initial modes).
This thesis addresses that sensitivity by using the Max-Min method, one of the methods for choosing initial values in the K-means algorithm, to decide the initial modes. We also introduce new similarity measures for categorical data sets that use the means of clusters. Tested on the Mushroom and Small Soybean data sets, the proposed algorithm performed well in both accuracy and run time.
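The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the proposed similarity measures, so the sketch substitutes the simple matching dissimilarity from Huang's K-mode algorithm, and it assumes the Max-Min step is a farthest-first selection seeded with the first record. The function names (mismatch, max_min_init, column_modes, k_modes) and the toy records are hypothetical.

```python
from collections import Counter

def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes on which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def max_min_init(records, k):
    """Pick k initial modes farthest-first (assumed reading of Max-Min)."""
    modes = [records[0]]  # assumption: the first record seeds the selection
    while len(modes) < k:
        # Choose the record whose distance to its nearest chosen mode is largest.
        farthest = max(records, key=lambda r: min(mismatch(r, m) for m in modes))
        modes.append(farthest)
    return modes

def column_modes(members):
    """Most frequent category in each attribute column of a cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))

def k_modes(records, k, max_iter=100):
    """Cluster categorical records; returns (labels, modes)."""
    modes = max_min_init(records, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins the cluster of its nearest mode.
        new_labels = [min(range(k), key=lambda j: mismatch(r, modes[j]))
                      for r in records]
        if new_labels == labels:  # no record changed cluster: converged
            break
        labels = new_labels
        # Update step: recompute each non-empty cluster's mode.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes

# Toy categorical data (made up for illustration only).
records = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("red", "oval", "small"),
    ("blue", "flat", "tiny"),
    ("blue", "flat", "huge"),
    ("blue", "long", "tiny"),
]
labels, modes = k_modes(records, 2)
print(labels)
print(modes)
```

Because the initial modes are chosen deterministically rather than at random, repeated runs on the same data give the same clustering, which is the kind of dependence on starting conditions the abstract sets out to reduce.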
Author(s)
양순철
Issued Date
2007
Awarded Date
2007-02
Type
Dissertation
URI
http://dcoll.jejunu.ac.kr/jsp/common/DcLoOrgPer.jsp?sItemId=000000003890
Alternative Author(s)
Yang, Soon-Cheol
Affiliation
Graduate School, Jeju National University
Department
Graduate School, Department of Computer Science and Statistics
Advisor
김철수
Table Of Contents
Ⅰ. Introduction
Ⅱ. Data Mining
1. Concept of Data Mining
2. Data Mining Techniques
1) Association Analysis
2) Decision Tree Analysis
3) Neural Networks
Ⅲ. Related Algorithms
1. K-means Algorithm
2. Methods for Determining Initial Values in the K-means Algorithm
1) KA (Kaufman Approach) Method
2) Max-Min Method
3. ROCK Algorithm
4. K-mode Algorithm
Ⅳ. Proposed Algorithm
1. Similarity
2. Initial Mode Decision Using Similarity
Ⅴ. Experimental Results and Analysis
1. Experimental Environment
2. Experimental Data
3. Experimental Results
1) Mushroom Data
2) Small Soybean Data
Ⅵ. Conclusion and Future Work
Ⅶ. References
Degree
Master
Publisher
Graduate School, Jeju National University
Citation
Yang, Soon-Cheol. (2007). 초기 모드 결정 방식을 개선한 K-mode 알고리즘 [K-mode Algorithm Improving Initial Mode Decision Methodologies].
Appears in Collections:
General Graduate School > Computer Science and Statistics
Access and License
  • Access type: Public

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.