Journal Subject:
Geological mapping using remote sensing data: A comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information
Journal Link:
http://www.sciencedirect.com/science/article/pii/S0098300413002720
Main Discussion:
The paper compares five machine learning algorithms (MLAs) in terms of their performance on a lithology classification problem in a complex, metamorphosed geological terrane. The MLAs, Naive Bayes, k-Nearest Neighbors, Random Forests, Support Vector Machines and Artificial Neural Networks, represent five common machine learning strategies for data inference. The comparison also covers the MLAs' sensitivity to variations in the spatial distribution of training data and their response to the inclusion of explicit spatial information.
Methods Used:
a. Pre-processing
The geophysical data were transformed to a common projected coordinate system using bilinear interpolation. All inputs were resampled to a common extent (12.8 km × 12.8 km) and resolution (50 m), giving image dimensions of 256 × 256 pixels (65,536 samples). To increase their relevance to the lithology classification task, the projected data were processed in ways specific to the geophysical property they represent. The pre-processing steps applied to generate the input data are given in Supplementary Information Section 1 (S1 - Data). Spatial coordinates (Easting (m) and Northing (m)), taken from the centre location of each cell, were also included, giving a total of 27 variables available for input. The processed inputs were standardized to zero mean and unit variance. Highly correlated variables, those with a mean Pearson correlation coefficient > 0.8 with a large proportion of the other data, were eliminated.
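The standardization and correlation-filtering steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the variable names and toy values are hypothetical, the 0.8 threshold is an assumed default, and the greedy "drop if too correlated with anything already kept" rule is a simplification swapped in for the paper's mean-correlation criterion.

```python
import statistics

def standardize(values):
    """Rescale one variable to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; assumes the variable is not constant
    return [(v - mean) / std for v in values]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length variables."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def drop_highly_correlated(variables, threshold=0.8):
    """Greedy filter (an assumed, simplified rule): keep a variable only if its
    absolute correlation with every variable kept so far is <= threshold."""
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy inputs with hypothetical names, five samples each.
variables = {
    "magnetics": [1.0, 2.0, 3.0, 4.0, 5.0],
    "magnetics_copy": [2.0, 4.0, 6.0, 8.0, 10.1],  # nearly a rescaled duplicate
    "radiometrics": [5.0, 1.0, 4.0, 2.0, 3.0],
}
scaled = {name: standardize(values) for name, values in variables.items()}
kept = drop_highly_correlated(scaled)
print(kept)
```

In this toy case the near-duplicate variable is removed and the two weakly correlated variables are retained. The result of a greedy filter depends on the order in which variables are visited, which is one reason a real workflow might prefer a criterion based on mean correlation.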
b. Classification Models
Table 2 of the paper shows the MLA parameter values assessed in the study. Optimal parameters were selected on the basis of the maximum mean accuracy obtained from 10-fold cross-validation. The MLA classification models were then trained with the selected parameters on the entire set of training samples before prediction evaluation. Information on the packages and functions used to train the MLA classification models, and details of the associated parameters, is provided in the Supplementary Information (S2 - MLA Software and Parameters).
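Parameter selection by 10-fold cross-validation can be sketched as follows. This is an illustrative example only: it uses a small hand-written k-Nearest Neighbors classifier, synthetic two-class data and a hypothetical candidate list for k, none of which come from the paper (which used existing software packages and its own parameter grid).

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def cv_accuracy(X, y, k, n_folds=10, seed=0):
    """Mean held-out accuracy over the folds (needs at least n_folds samples)."""
    scores = []
    for fold in k_fold_indices(len(X), n_folds, seed):
        held = set(fold)
        tr_X = [X[i] for i in range(len(X)) if i not in held]
        tr_y = [y[i] for i in range(len(X)) if i not in held]
        correct = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def select_k(X, y, candidates, n_folds=10):
    """Return the candidate with the highest mean cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(X, y, k, n_folds))

# Synthetic data: two well-separated classes, 20 samples each.
rng = random.Random(1)
X, y = [], []
for label, centre in (("A", 0.0), ("B", 5.0)):
    for _ in range(20):
        X.append((centre + rng.gauss(0, 0.5), centre + rng.gauss(0, 0.5)))
        y.append(label)

candidates = [1, 3, 5]  # hypothetical parameter grid
best_k = select_k(X, y, candidates)

def predict(x):
    """Final model: uses all samples with the selected parameter."""
    return knn_predict(X, y, x, best_k)

print(best_k, predict((0.1, -0.2)), predict((5.2, 4.9)))
```

The same fold assignment is reused for every candidate so that the accuracies are compared on identical splits. When several candidates tie, `max` returns the first one in the list.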
c. Prediction Evaluation
Overall accuracy and the kappa statistic (Cohen, 1960) are commonly used to evaluate classifier performance (Lu and Weng, 2007). Overall accuracy treats each prediction as either correct or incorrect and is defined as the number of correctly classified test samples divided by the total number of test samples. The kappa statistic measures the agreement between predictions and observations while correcting for agreement that occurs by chance (Congalton and Green, 1998). The authors did not use the area under the ROC curve to evaluate MLA predictions, because multiclass ROC becomes increasingly intractable with a large number of classes (i.e. > 8) (Landgrebe and Paclik, 2010). They visualized the spatial distribution of prediction error and assessed its geological validity by plotting MLA predictions in the spatial domain and comparing the locations of misclassified samples.
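The two evaluation measures can be written out as a short Python sketch. The class labels below are made-up test data, not results from the paper. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the overall accuracy and p_e is the agreement expected by chance from the class frequencies of the observations and the predictions.

```python
from collections import Counter

def overall_accuracy(observed, predicted):
    """Fraction of test samples whose predicted class matches the observed class."""
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)

def cohen_kappa(observed, predicted):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(observed)
    p_o = overall_accuracy(observed, predicted)
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Chance agreement: product of the marginal class proportions, summed over classes.
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n ** 2
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1 (a single class everywhere)

# Made-up labels for six test samples.
observed = ["a", "a", "a", "b", "b", "c"]
predicted = ["a", "a", "b", "b", "b", "c"]

accuracy = overall_accuracy(observed, predicted)
kappa = cohen_kappa(observed, predicted)
print(accuracy, kappa)
```

Here five of six samples are correct, so the overall accuracy is 5/6, while the chance agreement is 13/36, which gives a kappa of 17/23. Kappa is lower than accuracy because some of the matches would be expected even from a classifier that only reproduced the class proportions.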
Advantages of the Method:
• The required information system can be realized immediately, and improvements to refine it can be made right away.
• Makes the required computations more efficient.
• Reduces human error.
Disadvantages of the Method:
• Can only be run on computers that meet certain specifications.
• Longitude and latitude data for an area are prone to error (they may not indicate the true location).
Sources:
[1].
Anselin, L., 1995. Local indicators of spatial association – LISA. Geogr. Anal. 27, 93–115.
[2].
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24, 123–140.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
[3].
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, USA p. 358.
[4].
Buckley, P.M., Moriarty, T., Needham, J. (compilers), 2002. Broken Hill Geoscience Database, 2nd ed. Geological Survey of New South Wales, Sydney.
[5].
Carneiro, C.C., Fraser, S.J., Crósta, A.P., Silva, A.M., Barros, C.E.M., 2012. Semiautomated geologic mapping using self-organizing maps and airborne geophysics in the Brazilian Amazon. Geophysics 77, K17–K24.
[6].
Cohen, J., 1960. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20, 37–46.
[7].
Congalton, R.G., Green, K., 1998. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, first edn. Lewis Publications, Boca Raton p. 137.
[8].
Cover, T., Hart, P., 1967. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 13, 21–27.
[9].
Cracknell, M.J., Reading, A.M., 2013. The upside of uncertainty: identification of lithology contact zones from airborne geophysics and satellite data using Random Forests and Support Vector Machines. Geophysics 78, WB113–WB126.
[10].
Fix, E., Hodges, J.L., 1951. Discriminatory analysis. Nonparametric discrimination; Consistency properties. U.S. Air Force, School of Aviation Medicine, Randolph Field, Texas.
[11].
Foody, G.M., Mathur, A., 2004. A relative evaluation of multiclass image classification by support vector machines. IEEE Trans. Geosci. Remote Sens. 42, 1335–1343.
[12].
Gahegan, M., 2000. On the application of inductive machine learning tools to geographical analysis. Geogr. Anal. 32, 113–139.
[13].
Gelfort, R., 2006. On Classification of Logging Data, Department of Energy and Economics. Clausthal University of Technology, Germany, p. 131.
[14].
Getis, A., 2010. Spatial autocorrelation. In: Fisher, M.M., Getis, A. (Eds.), Handbook of Applied Spatial Analysis: Software, Tools, Methods and Applications. Springer-Verlag, Berlin, pp. 255–278.
[15].
Guyon, I., 2008. Practical feature selection: from correlation to causality. In: Fogelman-Soulié, F., Perrotta, D., Piskorski, J., Steinberger, R. (Eds.), Mining Massive Data Sets for Security – Advances in Data Mining, Search, Social Networks and Text Mining, and their Applications to Security. IOS Press, Amsterdam, pp. 27–43.
[16].
Guyon, I., 2009. A practical guide to model selection. In: Marie, J. (Ed.), Proceedings of the Machine Learning Summer School. Canberra, Australia, January 26 - February 6, Springer Text in Statistics, Springer p.37.
[17].
Ham, J., Yangchi, C., Crawford, M.M., Ghosh, J., 2005. Investigation of the random forest framework for classification of hyperspectral data. IEEE Trans. Geosci. Remote Sens. 43, 492–501.
[18].
Hastie, T., Tibshirani, R., Friedman, J.H., 2009. The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd edn. Springer, New York, USA p. 533.
[19].
Henery, R.J., 1994. Classification. In: Michie, D., Spiegelhalter, D.J., Taylor, C.C. (Eds.), Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York, pp. 6–16.
[20].
Hsu, C.-W., Lin, C.-J., 2002. A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13, 415–425.
[21].
Hsu, C.-W., Chang, C.-C., Lin, C.-J., 2010. A Practical Guide to Support Vector Classification. Department of Computer Science, National Taiwan University, Taipei, Taiwan p. 16.
[22].
Huang, C., Davis, L.S., Townshend, J.R.G., 2002. An assessment of support vector machines for land cover classification. Int. J. Remote Sens. 23, 725–749.
[23].
Kanevski, M., Pozdnoukhov, A., Timonin, V., 2009. Machine Learning for Spatial Environmental Data: Theory, Applications and Software. CRC Press, Boca Raton, USA (368 pp.).
[24].
Karatzoglou, A., Meyer, D., Hornik, K., 2006. Support vector machines in R. J. Stat. Softw. 15, 28.
[25].
Kotsiantis, S.B., 2007. Supervised machine learning: a review of classification techniques. Informatica 31, 249–268.
[26].
Kovacevic, M., Bajat, B., Trivic, B., Pavlovic, R., 2009. Geological units classification of multispectral images by using Support Vector Machines. In: Proceedings of the International Conference on Intelligent Networking and Collaborative Systems, IEEE, pp. 267–272.
[27].
Kuhn, M., Wing, J., Weston, S., Williams, A., Keefer, C., Engelhardt, A., 2012. caret: Classification and Regression Training, R Package Version 5.15-023.
[28].
Kuncheva, L., 2004. Combining Pattern Classifiers: Methods and Algorithms. John Wiley & Sons p. 376.
[29].
Landgrebe, T.C.W., Paclik, P., 2010. The ROC skeleton for multiclass ROC estimation. Pattern Recogn. Lett. 31, 949–958.
[30].
Leverington, D.W., 2010. Discrimination of sedimentary lithologies using Hyperion and Landsat Thematic Mapper data: a case study at Melville Island, Canadian High Arctic. Int. J. Remote Sens. 31, 233–260.
[31].
Leverington, D.W., Moon, W.M., 2012. Landsat-TM-based discrimination of lithological units associated with the Purtuniq Ophiolite, Quebec, Canada. Remote Sens. 4, 1208–1231.
[32].
Li, C.-H., Kuo, B.-C., Lin, C.-T., Huang, C.-S., 2012. A spatial-contextual Support Vector Machine for remotely sensed image classification. IEEE Trans. Geosci. Remote Sens. 50, 784–799.
[33].
Lloyd, C.D., 2011. Local Models for Spatial Analysis, 2nd edn. CRC Press, Taylor & Francis Group, Boca Raton, USA p. 336.
[34].
Lu, D., Weng, Q., 2007. A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sens. 28, 823–870.
[35].
Marsland, S., 2009. Machine Learning: An Algorithmic Perspective. Chapman & Hall/CRC (406 pp.).
[36].
Melgani, F., Bruzzone, L., 2004. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 42, 1778–1790.
[37].
Molina, R., Pérez de la Blanca, N., Taylor, C.C., 1994. Modern statistical techniques. In: Michie, D., Spiegelhalter, D.J., Taylor, C.C. (Eds.), Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York, pp. 29–49.
[38].
Oommen, T., Misra, D., Twarakavi, N.K.C., Prakash, A., Sahoo, B., Bandopadhyay, S., 2008. An objective analysis of support vector machine based classification for remote sensing. Math. Geosci. 40, 409–424.
[39].
Page, R.W., Conor, C.H.H., Stevens, B.P.J., Gibson, G.M., Preiss, W.V., Southgate, P.N., 2005a. Correlation of Olary and Broken Hill Domains, Curnamona Province: possible relationship to Mount Isa and other North Australian Pb–Zn–Ag-bearing successions. Econ. Geol. 100, 663–676.
[40].
Page, R.W., Stevens, B.P.J., Gibson, G.M., 2005b. Geochronology of the sequence hosting the Broken Hill Pb–Zn–Ag orebody, Australia. Econ. Geol. 100, 633–661.
[41].
Pal, M., 2005. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 26, 217–222.
[42].
Provost, F., Fawcett, T., 1997. Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions, Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD 97). American Association for Artificial Intelligence, Huntington Beach, CA, pp. 43–48.
[43].
Ripley, B.D., 1996. Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge, UK p. 403.
[44].
Rojas, R., 1996. Neural Networks: A Systematic Introduction. Springer-Verlag, Berlin p. 502.
[45].
Song, X., Duan, Z., Jiang, X., 2012. Comparison of artificial neural networks and support vector machine classifiers for land cover classification in Northern China using a SPOT-5 HRG image. Int. J. Remote Sens. 33, 3301–3320.
[46].
Stevens, B.P.J., 1986. Post-depositional history of the Willyama Supergroup in the Broken Hill Block, NSW. Aust. J. Earth Sci. 33, 73–98.
[47].
Vapnik, V.N., 1998. Statistical Learning Theory. John Wiley & Sons, Inc., New York, USA p. 736.
[48].
Venables, W.N., Ripley, B.D., 2002. Modern Applied Statistics with S, 4th edn. Springer, New York, USA (495 pp.).
[49].
Waske, B., Braun, M., 2009. Classifier ensembles for land cover mapping using multitemporal SAR imagery. ISPRS J. Photogramm. Remote Sens. 64, 450–457.
[50].
Waske, B., Benediktsson, J.A., Árnason, K., Sveinsson, J.R., 2009. Mapping of hyperspectral AVIRIS data using machine-learning algorithms. Can. J. Remote Sens. 35, 106–116.
[51].
Webster, A.E., 2004. The Structural Evolution of the Broken Hill Pb–Zn–Ag deposit, New South Wales, Australia, ARC Centre for Excellence in Ore Deposit Research, University of Tasmania, Hobart p. 430.
[52].
Williams, D., 2009. Landsat 7 Science Data User's Handbook. National Aeronautics and Space Administration, Greenbelt, Maryland p. 186.
[53].
Willis, I.L., Brown, R.E., Stroud, W.J., Stevens, B.P.J., 1983. The early proterozoic Willyama supergroup: stratigraphic subdivision and interpretation of high to low‐grade metamorphic rocks in the Broken Hill Block, New South Wales. J. Geol. Soc. Aust. 30, 195–224.
[54].
Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Elsevier/Morgan Kaufmann, San Francisco, USA p. 525.
[55].
Yu, L., Porwal, A., Holden, E.J., Dentith, M.C., 2012. Towards automatic lithological classification from remote sensing data using support vector machines. Comput. Geosci. 45, 229–239.