🤖 Week 3: Introduction to Machine Learning & Classification

k value	Advantage	Disadvantage
Small k (k=1)	Responsive to local patterns in the data	Sensitive to noise
Large k	Robust to noise	May underfit
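
The trade-off in the table can be illustrated with a minimal k-NN sketch. The toy points, the deliberately mislabelled point, and the helper name `knn_predict` are all hypothetical and chosen only for illustration; this is not a production implementation.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: class "A" clusters near (1, 1), class "B" near (5, 5).
# The point (1.2, 1.1) labelled "B" is deliberate label noise.
train = [
    ((1.0, 1.0), "A"), ((1.5, 0.8), "A"), ((0.8, 1.4), "A"), ((1.3, 1.6), "A"),
    ((1.2, 1.1), "B"),  # noisy label inside the "A" cluster
    ((5.0, 5.0), "B"), ((5.5, 4.8), "B"), ((4.7, 5.3), "B"),
    ((5.2, 5.6), "B"), ((4.9, 4.5), "B"),
]
query = (1.25, 1.1)  # sits in the middle of the "A" cluster

print(knn_predict(train, query, 1))           # "B": follows the single noisy neighbor
print(knn_predict(train, query, 5))           # "A": the noise is outvoted
print(knn_predict(train, query, len(train)))  # "B": always the overall majority (underfit)
```

With k equal to the size of the training set, every query receives the majority class regardless of where it lies, which is the underfitting extreme.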

3. Interpretability Requirements

High explainability required (Regulated Industries)

Suitable: Linear/Logistic Regression, Single Decision Tree
Alternatives: Rule-based Models, SHAP/LIME for Black-box Models

Use Case: Bank Loan Approval

# Train: income, debt, credit_score, employment
# Predict: applicant → approve/reject + reason
Model: Logistic Regression
Output: "Rejected: debt_ratio (0.8) > threshold (0.6)"
Requirement: Bank of Thailand (ธปท.): the reason for a rejection must be explainable
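
One way to attach a reason to a logistic regression decision is to look at each feature's contribution to the log-odds. The sketch below assumes standardized inputs and uses made-up coefficients (`WEIGHTS`, `BIAS`) and a hypothetical helper `explain_decision`; a real model would learn its coefficients from historical data, and the threshold-style message in the use case above is a different way of phrasing the reason.

```python
import math

# Hypothetical coefficients for standardized features (assumption, not learned).
WEIGHTS = {"income": 0.8, "debt_ratio": -1.5, "credit_score": 1.2, "employment": 0.4}
BIAS = 0.2

def explain_decision(applicant, threshold=0.5):
    """Return (decision, approval probability, most influential feature)."""
    # Contribution of each feature to the log-odds of approval.
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    log_odds = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    decision = "Approved" if probability >= threshold else "Rejected"
    # The reason is the feature that pushes hardest toward the decision.
    if decision == "Rejected":
        reason = min(contributions, key=contributions.get)
    else:
        reason = max(contributions, key=contributions.get)
    return decision, probability, reason

# Hypothetical applicant, already standardized (0 = average applicant).
applicant = {"income": -0.5, "debt_ratio": 1.8, "credit_score": -0.3, "employment": 0.1}
decision, probability, reason = explain_decision(applicant)
print(f"{decision}: main factor = {reason} (p_approve = {probability:.2f})")
```

Because the model is linear in the log-odds, each contribution can be read off directly, which is what makes this family of models attractive when a rejection has to be justified.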

Week 3: Introduction to Machine Learning & Classification

Getting Started with Machine Learning and Classification

Today's Topics

Why Do Machines Need to Learn?

Teaching a Child to Tell Cats from Dogs

Machine Learning Works the Same Way!

Definition of Machine Learning

Why Use Machine Learning?

1. Problems That Are Too Complex

2. Problems That Keep Changing

3. Problems Humans Can Solve but Cannot Explain

Types of Learning

1. Supervised Learning (with a teacher)

2. Unsupervised Learning (without a teacher)

3. Reinforcement Learning (learning from rewards)

Supervised Learning

Key Characteristics

Supervised Learning (cont.)

Two Types

Unsupervised Learning

Key Characteristics

Unsupervised Learning (cont.)

Examples

Reinforcement Learning

Key Characteristics

Reinforcement Learning (cont.)

Examples

Splitting Data for ML

Overfitting vs Underfitting

Overfitting (learning too much)

Underfitting (learning too little)

k-Nearest Neighbors (k-NN)

How It Works

Measuring Distance in k-NN

Euclidean Distance

Manhattan Distance
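
As a small illustration of the two distance measures, the sketch below computes both for one pair of points. The function names and the sample points are chosen here for illustration only.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Grid distance: sum of the absolute differences per coordinate."""
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (1, 2), (4, 6)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```

Manhattan distance is never smaller than Euclidean distance for the same pair of points, so the choice can change which neighbors count as nearest.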

Choosing the Value of k

Selection Guidelines

k-NN: Pros vs Cons

Pros

Cons

Decision Trees

Building a Decision Tree

Steps

Building a Decision Tree (cont.)

Metrics for Feature Selection

Information Gain

Gini Impurity
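
Both split metrics can be computed from class counts alone. The sketch below uses the standard definitions (entropy-based information gain and Gini impurity); the helper names and the toy "yes"/"no" labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance that two random draws have different labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split of 10 samples into two child nodes.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(entropy(parent))                                   # 1.0
print(gini(parent))                                      # 0.5
print(round(information_gain(parent, [left, right]), 3)) # 0.278
```

A tree builder would evaluate every candidate feature this way and split on the one with the highest information gain, or equivalently the lowest weighted impurity.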

Pruning the Tree

Pre-pruning (prevent in advance)

Post-pruning (cut afterwards)

Decision Trees: Pros vs Cons

Pros

Cons

Naive Bayes

In Plain Language

Why Is It Called "Naive"?

The "Naive" Assumption

Why Is It Called "Naive"? (cont.)

Example: Spam Filtering
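
A spam filter of this kind can be sketched in a few lines. The version below is a minimal multinomial Naive Bayes with add-one (Laplace) smoothing; the four training messages, the "spam"/"ham" labels, and the helper names are all made up for illustration. The "naive" part is that each word's probability is multiplied in independently of the other words.

```python
import math
from collections import Counter

def train_nb(docs):
    """docs: list of (words, label). Returns (priors, word_counts, vocab)."""
    class_docs = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_docs}
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    priors = {label: n / len(docs) for label, n in class_docs.items()}
    return priors, word_counts, vocab

def predict_nb(model, words):
    """Pick the class with the highest log-posterior score."""
    priors, word_counts, vocab = model
    scores = {}
    for label, prior in priors.items():
        total = sum(word_counts[label].values())
        score = math.log(prior)
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages.
docs = [
    ("win free prize now".split(), "spam"),
    ("free money win".split(), "spam"),
    ("meeting schedule tomorrow".split(), "ham"),
    ("project report due tomorrow".split(), "ham"),
]
model = train_nb(docs)
print(predict_nb(model, "free prize".split()))                # spam
print(predict_nb(model, "project meeting tomorrow".split()))  # ham
```

Working in log space turns the product of per-word probabilities into a sum, which avoids numerical underflow on longer messages.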

Types of Naive Bayes

Gaussian NB

Multinomial NB

Bernoulli NB

Naive Bayes: Pros vs Cons

Pros

Cons

Confusion Matrix

Performance Metrics

Accuracy = (TP + TN) / Total

Precision = TP / (TP + FP)

Performance Metrics (cont.)

Recall = TP / (TP + FN)

F1-Score
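
The formulas above can be checked with a short sketch that starts from the four confusion-matrix counts. F1 is taken here as the harmonic mean of precision and recall, the standard definition; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for 100 test samples.
m = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.700, precision: 0.800, recall: 0.667, f1: 0.727
```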

When to Use Each

Cross-validation

The Problem with a Single Split

k-Fold Cross-validation
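
The splitting step of k-fold cross-validation can be sketched as an index generator. This version assumes contiguous folds without shuffling (in practice the data is usually shuffled first), and the name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train_idx, test_idx in k_fold_indices(10, 3):
    print("train:", train_idx, "test:", test_idx)
```

Each sample lands in the test set exactly once, so averaging the k scores gives an estimate that depends less on any single lucky or unlucky split.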

Choosing an Algorithm

Consider 4 Main Dimensions