本来我想用前面那个Human Activity Recognition Using Smartphones Data Set来完成我的Data mining的结业小论文，但后来在将该Dataset转换为Weka的arff格式时碰到点问题，所以就放弃了，最终将以下面这个Image Segmentation Data Set来写，同时相对来说，这个Dataset也跟我的工作更加相关、相近一些。
|Download: Data Folder, Data Set DescriptionAbstract: Image data described by high-level numeric-valued attributes, 7 classes|
|Data Set Characteristics:||Multivariate||Number of Instances:||2310||Area:||N/A|
|Attribute Characteristics:||Real||Number of Attributes:||19||Date Donated||1990-11-01|
|Associated Tasks:||Classification||Missing Values?||No||Number of Web Hits:||46055|
Vision Group, University of Massachusetts
Vision Group (Carla Brodley, brodley ‘@’ cs.umass.edu)
Data Set Information:
The instances were drawn randomly from a database of 7 outdoor images. The images were handsegmented to create a classification for every pixel.
Each instance is a 3×3 region.
1. region-centroid-col: the column of the center pixel of the region.
2. region-centroid-row: the row of the center pixel of the region.
3. region-pixel-count: the number of pixels in a region = 9.
4. short-line-density-5: the results of a line extractoin algorithm that counts how many lines of length 5 (any orientation) with low contrast, less than or equal to 5, go through the region.
5. short-line-density-2: same as short-line-density-5 but counts lines of high contrast, greater than 5.
6. vedge-mean: measure the contrast of horizontally adjacent pixels in the region. There are 6, the mean and standard deviation are given. This attribute is used as a vertical edge detector.
7. vegde-sd: (see 6)
8. hedge-mean: measures the contrast of vertically adjacent pixels. Used for horizontal line detection.
9. hedge-sd: (see 8).
10. intensity-mean: the average over the region of (R + G + B)/3
11. rawred-mean: the average over the region of the R value.
12. rawblue-mean: the average over the region of the B value.
13. rawgreen-mean: the average over the region of the G value.
14. exred-mean: measure the excess red: (2R – (G + B))
15. exblue-mean: measure the excess blue: (2B – (G + R))
16. exgreen-mean: measure the excess green: (2G – (R + B))
17. value-mean: 3-d nonlinear transformation of RGB. (Algorithm can be found in Foley and VanDam, Fundamentals of Interactive Computer Graphics)
18. saturatoin-mean: (see 17)
19. hue-mean: (see 17)
Papers That Cite This Data Set1:
Anthony K H Tung and Xin Xu and Beng Chin Ooi. CURLER: Finding and Visualizing Nonlinear Correlated Clusters. SIGMOD Conference. 2005. [View Context].
Xiaoli Z. Fern and Carla Brodley. Cluster Ensembles for High Dimensional Clustering: An Empirical Study. Journal of Machine Learning Research n, a. 2004. [View Context].
K. A. J Doherty and Rolf Adams and Neil Davey. Unsupervised Learning with Normalised Data and Non-Euclidean Norms. University of Hertfordshire. [View Context].
Adil M. Bagirov and John Yearwood. A new nonsmooth optimization algorithm for clustering. Centre for Informatics and Applied Optimization, School of Information Technology and Mathematical Sciences, University of Ballarat. [View Context].
Amund Tveit. Empirical Comparison of Accuracy and Performance for the MIPSVM classifier with Existing Classifiers. Division of Intelligent Systems Department of Computer and Information Science, Norwegian University of Science and Technology. [View Context].
Je Scott and Mahesan Niranjan and Richard W. Prager. Realisable Classifiers: Improving Operating Performance on Variable Cost Problems. Cambridge University Department of Engineering. [View Context].
C. Titus Brown and Harry W. Bullen and Sean P. Kelly and Robert K. Xiao and Steven G. Satterfield and John G. Hagedorn and Judith E. Devaney. Visualization and Data Mining in an 3D Immersive Environment: Summer Project 2003. [View Context].
Adil M. Bagirov and Alex Rubinov and A. N. Soukhojak and John Yearwood. Unsupervised and supervised data classification via nonsmooth and global optimization. School of Information Technology and Mathematical Sciences, The University of Ballarat. [View Context].
James Tin and Yau Kwok. Moderating the Outputs of Support Vector Machine Classifiers. Department of Computer Science Hong Kong Baptist University Hong Kong. [View Context].
Please refer to the Machine Learning Repository’s citation policy