K-mode Algorithm with an Improved Initial Mode Decision Method (초기 모드 결정 방식을 개선한 K-mode 알고리즘)
- Alternative Title
- K-mode Algorithm Improving Initial Mode Decision Methodologies
- Abstract
- Data mining is the process of uncovering previously unknown patterns and relationships in large databases using sophisticated statistical analysis and modeling techniques such as classification, association rule mining, and clustering. Clustering in particular is an important data mining problem: it is useful for discovering groups and identifying interesting distributions in the underlying data. The K-means algorithm is well known for its efficiency in clustering large data sets. However, because it works only on numeric values, it cannot be used to cluster real-world data containing categorical values. Huang (1997) presented the K-mode algorithm, which extends the K-means paradigm to categorical domains. The K-mode algorithm, however, is sensitive to its starting conditions (the choice of initial modes and the number of initial modes).
This paper addresses this problem of the K-mode algorithm by using the Max-Min method, one of the methods for deciding initial values in the K-means algorithm. We introduce new similarity measures for categorical data sets that use the means of clusters. Tested on the Mushroom and Small Soybean data sets, the proposed algorithm performed well in both accuracy and run time.
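The abstract does not spell out the Max-Min procedure or the proposed similarity measures, so the following is only a minimal sketch of the general idea of Max-Min (farthest-first) seeding applied to categorical records. It assumes the simple matching dissimilarity commonly associated with Huang's K-mode algorithm rather than the thesis's own measure, and it seeds with the first record, which is an arbitrary choice. The names `mismatch`, `max_min_initial_modes`, and the toy data are hypothetical.

```python
from typing import List, Sequence

Record = Sequence[str]


def mismatch(a: Record, b: Record) -> int:
    """Simple matching dissimilarity: count of attributes whose values differ.

    Assumption: this stands in for the similarity measure proposed in the
    thesis, which the abstract does not define.
    """
    return sum(x != y for x, y in zip(a, b))


def max_min_initial_modes(records: List[Record], k: int) -> List[Record]:
    """Choose k initial modes with a farthest-first (Max-Min) rule.

    Each new mode is the record whose dissimilarity to its nearest
    already-chosen mode is largest.
    """
    if not 1 <= k <= len(records):
        raise ValueError("k must be between 1 and the number of records")

    # Assumption: the first record is used as the first mode.
    modes = [records[0]]
    # min_dist[i] = dissimilarity from record i to its nearest chosen mode.
    min_dist = [mismatch(r, modes[0]) for r in records]

    while len(modes) < k:
        # Pick the record farthest from all modes chosen so far
        # (ties go to the earliest record).
        idx = max(range(len(records)), key=min_dist.__getitem__)
        modes.append(records[idx])
        # Refresh nearest-mode dissimilarities against the new mode.
        min_dist = [
            min(d, mismatch(r, records[idx]))
            for d, r in zip(min_dist, records)
        ]
    return modes


# Hypothetical toy data: three categorical attributes per record.
toy = [
    ("red", "round", "small"),
    ("red", "round", "large"),
    ("blue", "flat", "small"),
    ("blue", "flat", "large"),
    ("green", "oval", "large"),
]
initial_modes = max_min_initial_modes(toy, 3)
```

Spreading the initial modes apart in this way is intended to reduce the dependence on random starting points; the modes returned here would then be passed to the usual K-mode assign-and-update iterations.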
- Author(s)
- 양순철
- Issued Date
- 2007
- Awarded Date
- 2007. 2
- Type
- Dissertation
- URI
- http://dcoll.jejunu.ac.kr/jsp/common/DcLoOrgPer.jsp?sItemId=000000003890
- Alternative Author(s)
- Yang, Soon-Cheol
- Affiliation
- Jeju National University Graduate School
- Department
- Graduate School, Department of Computer Science and Statistics
- Advisor
- 김철수
- Table Of Contents
- Ⅰ. Introduction = 1
Ⅱ. Data Mining = 3
1. Concept of Data Mining = 3
2. Data Mining Techniques = 4
1) Association Analysis = 4
2) Decision Tree Analysis = 5
3) Neural Networks = 6
Ⅲ. Related Algorithms = 8
1. K-means Algorithm = 8
2. Methods for Deciding Initial Values in the K-means Algorithm = 9
1) KA (Kaufman Approach) Method = 9
2) Max-Min Method = 10
3. ROCK Algorithm = 11
4. K-mode Algorithm = 14
Ⅳ. Proposed Algorithm = 17
1. Similarity = 17
2. Initial Mode Decision Using Similarity = 19
Ⅴ. Experimental Results and Analysis = 25
1. Experimental Environment = 25
2. Experimental Data = 25
3. Experimental Results = 27
1) Mushroom Data = 27
2) Small Soybean Data = 28
Ⅵ. Conclusion and Future Work = 30
Ⅶ. References = 31
- Degree
- Master
- Publisher
- Jeju National University Graduate School
- Citation
- Yang, Soon-Cheol. (2007). K-mode Algorithm Improving Initial Mode Decision Methodologies (초기 모드 결정 방식을 개선한 K-mode 알고리즘)
- Appears in Collections
- General Graduate School > Computer Science and Statistics
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.