The slide outlines the workflow for data preprocessing, covering five key steps: Data Cleaning, Integration, Transformation, Reduction, and Discretization. Each step includes a brief description and specific techniques, such as imputation for missing values, dataset merging, normalization/encoding, PCA, and binning.
Data Preprocessing Steps
{ "headers": [ "Step", "Description", "Techniques" ], "rows": [ [ "Data Cleaning", "Handles missing values, duplicates, and outliers in the data", "Missing-value imputation, outlier detection and handling" ], [ "Data Integration", "Combines data from multiple sources into one consistent dataset", "Dataset merging, schema integration" ], [ "Data Transformation", "Changes the scale or format of the data for uniformity", "Normalization, standardization, encoding (One-Hot, Label)" ], [ "Data Reduction", "Reduces the dimensionality of the data without losing important information", "Principal Component Analysis (PCA)" ], [ "Data Discretization", "Converts continuous data into discrete values", "Binning, entropy-based discretization" ] ] }
Source: Data Preprocessing dalam Data Mining dan Machine Learning
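As an illustration of the first three rows of the table, here is a minimal sketch in plain Python. It assumes records are stored as lists of dictionaries; the function names (`drop_duplicates`, `impute_median`, `merge_on`, `min_max`, `one_hot`) and the toy data are hypothetical and do not come from the source. Median imputation and min-max scaling are just one possible choice for each step.

```python
from statistics import median


def drop_duplicates(rows):
    """Data Cleaning: keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for r in rows:
        fingerprint = tuple(sorted(r.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(r)
    return unique


def impute_median(rows, key):
    """Data Cleaning: replace None in `key` with the median of observed values."""
    fill = median(r[key] for r in rows if r[key] is not None)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]


def merge_on(left, right, key):
    """Data Integration: inner join of two record lists on a shared key."""
    lookup = {r[key]: r for r in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def min_max(rows, key):
    """Data Transformation: rescale `key` to the range [0, 1]."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a constant column
    return [{**r, key: (r[key] - lo) / span} for r in rows]


def one_hot(rows, key):
    """Data Transformation: expand a categorical column into 0/1 columns."""
    categories = sorted({r[key] for r in rows})
    encoded_rows = []
    for r in rows:
        encoded = {k: v for k, v in r.items() if k != key}
        for c in categories:
            encoded[f"{key}_{c}"] = 1 if r[key] == c else 0
        encoded_rows.append(encoded)
    return encoded_rows


# Hypothetical toy data: one missing age and one exact duplicate record.
customers = [
    {"id": 1, "age": 20},
    {"id": 2, "age": None},
    {"id": 3, "age": 40},
    {"id": 3, "age": 40},
]
segments = [
    {"id": 1, "segment": "retail"},
    {"id": 2, "segment": "corporate"},
    {"id": 3, "segment": "retail"},
]

cleaned = impute_median(drop_duplicates(customers), "age")
merged = merge_on(cleaned, segments, "id")
prepared = one_hot(min_max(merged, "age"), "segment")
```

Duplicates are removed before imputation so that a repeated record does not skew the median used to fill the missing value.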
Speaker Notes
Data preprocessing steps: 1. Data Cleaning (missing values, outliers). 2. Data Integration (combining sources). 3. Data Transformation (normalization, encoding). 4. Data Reduction (PCA). 5. Data Discretization (continuous to discrete).
Definition: the initial stage in which raw data is cleaned and transformed into a structured format for analysis.
Importance: reduces errors and improves the accuracy of ML models.
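For step 5, a minimal sketch of equal-width binning, one common form of the binning technique named on the slide. The function name `equal_width_bins` and the sample ages are hypothetical, chosen only for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [0 for _ in values]
    labels = []
    for v in values:
        index = int((v - lo) / width)
        # The maximum value would land on index n_bins, so clamp it
        # into the last bin.
        labels.append(min(index, n_bins - 1))
    return labels


# Hypothetical sample: ages split into three bins of width 14.
ages = [18, 22, 35, 41, 58, 60]
bins = equal_width_bins(ages, 3)
print(bins)  # [0, 0, 1, 1, 2, 2]
```

Equal-width bins are simple to compute but sensitive to outliers, because one extreme value stretches every bin.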