[Hands-On Machine Learning] Chapter 3 (3.1–3.7)

dlrpskdi 2023. 8. 3. 09:12

3.1 MNIST

💡 MNIST dataset (Modified National Institute of Standards and Technology dataset): a collection of 70,000 small digit images handwritten by high school students and employees of the US Census Bureau

  • The 'Hello World' of machine learning — the standard learning dataset
  • Fetching the MNIST dataset (code below)
    • 'mnist_784' : the dataset id
    • version=1 : a single id can have several versions
    • Output:
    • {'COL_NAMES':['label', 'data'], 'DESCR': ... 'data' : array([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ... [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], dtype=uint8), 'target': array([0., 0., 0., ..., 9., 9., 9.]) }
    • The returned object has a dictionary-like structure.
    • 'DESCR' : a description of the dataset
  • from sklearn.datasets import fetch_openml
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
  • The MNIST dataset is too large to be bundled inside sklearn, so it is downloaded from the machine learning data repository at openml.org; sklearn's fetch_openml does exactly that.
  • Inspecting the data and target arrays
X, y = mnist['data'], mnist['target']
X.shape # (70000, 784)
y.shape # (70000,)
  • There are 70,000 images, and each image has 784 (28 × 28) features.
  • Each feature is a pixel intensity from 0 (white) to 255 (black).
  • Looking at one MNIST sample
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

some_digit = X[0] # KeyError unless as_frame=False was passed above
some_digit_image = some_digit.reshape(28, 28) # reshape the flat 784-vector into 28×28
plt.imshow(some_digit_image, cmap=matplotlib.cm.binary, interpolation="nearest")
plt.axis("off")
plt.show()

  • Printing y[0] gives '5', so the image is labeled correctly.
  • Setting up the test set
    • Always set the test set aside before inspecting the data.
    • The MNIST dataset is already split into a training set (the first 60,000 images) and a test set (the last 10,000 images).
    • The images are already shuffled, so splitting them in order is fine.
    X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
    
  • Still worried about splitting in order?
    • If a cross-validation fold ends up missing a particular digit,
    • or if similar samples appear one after another,
    • the learning algorithm's performance suffers — so shuffle the training set:
    import numpy as np
    
    shuffle_index = np.random.permutation(60000)
    X_train, y_train = X_train[shuffle_index], y_train[shuffle_index]
    

3.2 Training a Binary Classifier

💡 Binary classifier: classifies data into two classes, positive and negative. Classifying the MNIST dataset into the digits 0–9, by contrast, is an example of multiclass classification.

  • To simplify the original MNIST classification problem, we build a binary classifier that distinguishes '5' from 'not 5'.
  • First create the target vectors for this classification task:
y_train_5 = (y_train == 5) # store True for 5s and False for every other digit
y_test_5 = (y_test == 5)
  • ๋ถ„๋ฅ˜ ๋ชจ๋ธ ์ค‘ ํ•˜๋‚˜์ธ ํ™•๋ฅ ์  ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•(Stochastic Gradient Descent, SGD) ๋ถ„๋ฅ˜๊ธฐ ์ด์šฉ. (์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋‹ค์Œ์žฅ์— ๋‚˜์˜ค๋ฏ€๋กœ ์„ธ๋ถ€ ์„ค๋ช… ์ƒ๋žต)
  • sklearn์˜ SGDClassifier ํด๋ž˜์Šค ์ด์šฉ
from sklearn.linear_model import SGDClassifier

sgd_clf = SGDClassifier(max_iter=5, random_state=42)
sgd_clf.fit(X_train, y_train_5)
  • ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜์—ฌ ์ˆซ์ž 5์˜ ์ด๋ฏธ์ง€ ๊ฐ์ง€
sgd_clf.predict([some_digit]) # array([True], dtype=bool), ํŠน๋ณ„ํ•˜๊ฒŒ ๋งž์ถ˜ ์ผ€์ด์Šค

3.3 Performance Measures

  • Comparing the various performance measures used to evaluate classifiers

3.3.1 Measuring Accuracy Using Cross-Validation

💡 K-fold cross-validation: split the training set into K folds, then for each fold make predictions with a model trained on the remaining folds and evaluate them on that held-out fold.

  • Use the cross_val_score() function with the accuracy metric:
from sklearn.model_selection import cross_val_score
cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy") # K = 3
# array([0.9502, 0.96565, 0.96495])
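
For reference, a minimal sketch of roughly what cross_val_score() is doing under the hood, following the book's manual implementation with StratifiedKFold and clone (shuffle=True is added here for newer scikit-learn versions):

from sklearn.model_selection import StratifiedKFold
from sklearn.base import clone

skfolds = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

for train_index, test_index in skfolds.split(X_train, y_train_5):
    clone_clf = clone(sgd_clf)                       # fresh copy of the classifier
    X_train_folds = X_train[train_index]             # training part of this split
    y_train_folds = y_train_5[train_index]
    X_test_fold = X_train[test_index]                # held-out fold
    y_test_fold = y_train_5[test_index]

    clone_clf.fit(X_train_folds, y_train_folds)
    y_pred = clone_clf.predict(X_test_fold)
    print(sum(y_pred == y_test_fold) / len(y_pred))  # accuracy on this fold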
  • Every accuracy is above 95% → "is that actually a good result?"
  • Example
  • A rough calculation: about 10% of the MNIST images are 5s, so a classifier that always predicts "not 5" already gets roughly 90% accuracy.
    • This is why accuracy is generally not the preferred performance measure for classifiers, especially on skewed (imbalanced) datasets.
    from sklearn.base import BaseEstimator

    class Never5Classifier(BaseEstimator):
        def fit(self, X, y=None):
            pass
        def predict(self, X):
            return np.zeros((len(X), 1), dtype=bool)

    never_5_clf = Never5Classifier()
    cross_val_score(never_5_clf, X_train, y_train_5, cv=3, scoring='accuracy')
    # array([0.909, 0.90715, 0.9128])

3.3.2 The Confusion Matrix

💡 Confusion matrix: a table that splits the results into four cases according to whether the actual class is positive or negative and whether the prediction was positive or negative.

  • Building a confusion matrix
    • First we need predictions to compare against the actual targets → use cross_val_predict(): it returns the predictions obtained on each test fold.
    from sklearn.model_selection import cross_val_predict
    
    y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)
    
    • Then build the confusion matrix with the confusion_matrix() function.
    from sklearn.metrics import confusion_matrix
    confusion_matrix(y_train_5, y_train_pred)
    # array([[53272, 1307], 
    #       [1077, 4344]]) 
    
    ์˜ˆ์ธก ์Œ์„ฑ ์˜ˆ์ธก ์–‘์„ฑ
    ์‹ค์ œ ์Œ์„ฑ ์ง„์งœ์Œ์„ฑ (TN) ๊ฐ€์งœ์–‘์„ฑ (FP)
    ์‹ค์ œ ์–‘์„ฑ ๊ฐ€์งœ์Œ์„ฑ (FN) ์ง„์งœ์–‘์„ฑ (TP)
    • A perfect classifier would have FP = FN = 0 (see the sketch right below).
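
As a quick check of that last point (following the book), pretend the predictions were perfect and build the matrix again — only the main diagonal is non-zero:

y_train_perfect_predictions = y_train_5  # pretend we reached perfection
confusion_matrix(y_train_5, y_train_perfect_predictions)
# array([[54579,     0],
#        [    0,  5421]])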

๐Ÿ’ก ์ •๋ฐ€๋„ (precision) : ์–‘์„ฑ ์˜ˆ์ธก์˜ ์ •ํ™•๋„ : ์–‘์„ฑํด๋ž˜์Šค๋กœ ์˜ˆ์ธกํ•œ ๊ฒƒ ์ค‘ ์ง„์งœ์–‘์„ฑ์˜ ๋น„์œจ

์ •๋ฐ€๋„ = {TP}/{TP+FP} 

๐Ÿ’ก ์žฌํ˜„์œจ (recall) = ๋ฏผ๊ฐ๋„ (sensitivity) = ์ง„์งœ์–‘์„ฑ๋น„์œจ (true positive rate; TPR) : ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์ •ํ™•ํ•˜๊ฒŒ ๊ฐ์ง€ํ•œ ์–‘์„ฑ ์ƒ˜ํ”Œ์˜ ๋น„์œจ : “์‹ค์ œ ์–‘์„ฑ ์ค‘ ์ง„์งœ์–‘์„ฑ์œผ๋กœ ์–ผ๋งˆ๋‚˜ ๋ถ„๋ฅ˜๋˜์—ˆ๋Š”๊ฐ€”

์žฌํ˜„์œจ ={TP}/{TP+FN}

  • ex) Using the confusion matrix above: precision = 4344 / (4344 + 1307) ≈ 0.77, recall = 4344 / (4344 + 1077) ≈ 0.80.

3.3.3 Precision and Recall

  • ํŒŒ์ด์ฌ์—์„œ๋Š” ์ •๋ฐ€๋„์™€ ์žฌํ˜„์œจ์„ ๊ณ„์‚ฐํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ์ œ๊ณต
from sklearn.metrics import precision_score, recall_score
precision_score(y_train_5, y_train_pred) # == 4344 / (4344 + 1307)
# 0.768713...
recall_score(y_train_5, y_train_pred) # == 4344 / (4344 + 1077)
# 0.801328...
  • Compared with the accuracy figures from 3.3.1, this no longer looks so good: only about 77% of the images classified as 5s actually are 5s, and the classifier only detects about 80% of all the 5s.

💡 F1 score: the harmonic mean of precision and recall — convenient when comparing two classifiers.
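
Written out (the harmonic mean of the two):

F1 = 2 × (precision × recall) / (precision + recall) = TP / (TP + (FN + FP) / 2)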

  • F1 ์ ์ˆ˜ ๊ณ„์‚ฐ ์‹œ f1_score() ํ•จ์ˆ˜ ์ด์šฉ
from sklearn.metrics import fl_score
fl_score(y_train_5, y_train_pred)
# 0.78468...
  • ์ •๋ฐ€๋„์™€ ์žฌํ˜„์œจ์˜ ๋ฐ˜๋น„๋ก€ ๊ด€๊ณ„๋ฅผ ์ •๋ฐ€๋„/์žฌํ˜„์œจ ํŠธ๋ ˆ์ด๋“œ์˜คํ”„๋ผ๊ณ  ํ•จ.

3.3.4 The Precision/Recall Trade-off

SGDClassifier์˜ ๋ถ„๋ฅ˜๋ฅผ ํ†ตํ•ด ํŠธ๋ ˆ์ด๋“œ ์˜คํ”„ ์ดํ•ด

์ด ๋ถ„๋ฅ˜๊ธฐ๋Š” ๊ฒฐ์ •ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๊ฐ ์ƒ˜ํ”Œ์˜ ์ ์ˆ˜๋ฅผ ๊ณ„์‚ฐ

  • ์ƒ˜ํ”Œ์ ์ˆ˜>์ž„๊ณ—๊ฐ’→์–‘์„ฑํด๋ž˜์Šค
  • ์ƒ˜ํ”Œ์ ์ˆ˜<์ž„๊ณ—๊ฐ’→์Œ์„ฑํด๋ž˜์Šค
  •  

In the book's example with training data for the digit 5, the number of actual positives (TP + FN) is 6, so at the right-most threshold:

precision = true positives (digits detected that really are 5s = 3) / everything predicted positive (digits the classifier detected = 3) = 100%

recall = true positives (digits detected that really are 5s = 3) / actual positives (digits that really are 5s = 6) = 50%

์ž„๊ณ„๊ฐ’์ด ๋†’์•„์งˆ ์ˆ˜๋ก ์ •๋ฐ€๋„ ↑ ์žฌํ˜„์œจ ↓

์ž„๊ณ„๊ฐ’์ด ๋‚ฎ์•„์งˆ ์ˆ˜๋ก ์ •๋ฐ€๋„ ↓ ์žฌํ˜„์œจ ↑

SGDClassifier์˜ ํŠน์ • ํด๋ž˜์Šค(0~9์ค‘ 1๊ฐœ)์— ๋Œ€ํ•œ ์˜ˆ์ธก๊ฐ’

y_scores = sgd_clf.decision_function([some_digit])
y_scores

>>array([303905.39584261])

# prediction with the threshold set to 0
threshold = 0
y_some_digit_pred = (y_scores>threshold)
y_some_digit_pred

>>array([ True]) # the 3 is detected correctly

์ž„๊ณ„๊ฐ’ ์ฆ๊ฐ€ํ•˜๋ฉด ์žฌํ˜„์œจ์ด ์ค„์–ด๋“ฆ

์ฆ‰, ์ž„๊ณ„๊ฐ’์ด 0์ผ๋•, ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์ˆซ์ž 3๋ฅผ ๊ฐ์ง€ํ–ˆ์œผ๋‚˜ ์ž„๊ณ„๊ฐ’์ด 100000000๋กœ ๋†’์ด๋ฉด ์ด๋ฅผ ๋†“์น˜๊ฒŒ ๋จ

# ์ž„๊ณ„์ : 0 →  100000000

threshold = 100000000
y_some_digit_pred = (y_scores > threshold)
y_some_digit_pred

>>array([False]) # with the higher threshold, the 3 is no longer detected

# ์ž„๊ณ„์  ๊ณ„์‚ฐ
y_scores = cross_val_predict(sgd_clf, X_train, y_train_3, cv = 3, 
															method = 'decision_function')

precision_recall_curve(): computes precision and recall for every possible threshold

from sklearn.metrics import precision_recall_curve
precisions, recalls, thresholds = precision_recall_curve(y_train_3, y_scores)

Precision and recall as functions of the threshold value

Plotting them shows the trade-off between precision and recall.

The key is to find a threshold that keeps precision high while still retaining enough recall.

# ์ •๋ฐ€๋„ 90 ๋ชฉํ‘œ๋กœ ์„ธํŒ…
y_train_pred_90 = (y_scores > 70000)

print(precision_score(y_train_3, y_train_pred_90)) # precision
>>0.8810385188337945

print(recall_score(y_train_3, y_train_pred_90)) # recall
>>0.675256891208612

3.3.5 The ROC Curve

💡 ROC (receiver operating characteristic) curve: plots the true positive rate (TPR, i.e. recall) against the false positive rate (FPR)

  • FPR = false positives / actual negatives
  • TNR (specificity) = true negatives / actual negatives, so FPR = 1 − TNR

In other words, the ROC curve plots sensitivity (recall, TPR) against 1 − specificity.

from sklearn.metrics import roc_curve
# roc_curve computes TPR and FPR for many threshold values
fpr, tpr, thresholds = roc_curve(y_train_3, y_scores)

As the graph shows,

the higher the recall (TPR), the more false positives (FPR) the classifier produces.

Area under the curve (AUC): a perfect classifier has a ROC AUC of 1, while a purely random classifier has an AUC of 0.5.

With random predictions, the actual classes in the confusion matrix get split across the predicted classes in similar proportions, so FPR and TPR end up roughly equal.

As a result the ROC curve hugs the line y = x and the AUC is 0.5.
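
The plot_roc_curve() helper used in the comparison below is not defined anywhere in these notes; a minimal sketch following the book's notebook:

def plot_roc_curve(fpr, tpr, label=None):
    plt.plot(fpr, tpr, linewidth=2, label=label)
    plt.plot([0, 1], [0, 1], 'k--')   # the diagonal = a purely random classifier
    plt.axis([0, 1, 0, 1])
    plt.xlabel('False Positive Rate (FPR)')
    plt.ylabel('True Positive Rate (TPR, recall)')

plot_roc_curve(fpr, tpr)
plt.show()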

SGDClassifier vs RandomForestClassifier

# RandomForestClassifier
# has no decision_function(), so use predict_proba() instead

from sklearn.ensemble import RandomForestClassifier
forest_clf = RandomForestClassifier(n_estimators=10, random_state=42)
y_probas_forest = cross_val_predict(forest_clf, X_train, y_train_3, cv=3,
                                    method="predict_proba")

# To draw a ROC curve we need scores, not probabilities
# → use the positive-class probability as the score

y_scores_forest = y_probas_forest[:, 1] # positive-class probability used as the score
fpr_forest, tpr_forest, thresholds_forest = roc_curve(y_train_3,y_scores_forest)


# plot the ROC curves: RandomForestClassifier vs SGDClassifier
plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, "b:", linewidth=2, label="SGD")
plot_roc_curve(fpr_forest, tpr_forest, "Random Forest")
plt.legend(loc="lower right", fontsize=16)
plt.show()

# ๋žœ๋คํฌ๋ ˆ์ŠคํŠธ ๋ถ„๋ฅ˜๊ธฐ์˜ score
from sklearn.metrics import roc_auc_score
roc_auc_score(y_train_3, y_scores_forest) # clearly better than the SGD value
>>0.9863627257748906
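
For the comparison mentioned in the comment above, the SGD classifier's AUC can be computed from the scores obtained earlier (not run here, so no output is shown):

roc_auc_score(y_train_3, y_scores)  # AUC of the SGD classifier, for comparison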

3.4 Multiclass Classification

* Multiclass (multinomial) classifier: distinguishes more than two classes

  • Several binary classifiers can be combined to do multiclass classification, e.g. training a detector per digit 0–9 to classify images into 10 classes.
    • OvA (one-versus-all): to classify an image, take the decision score from every binary classifier and select the class whose classifier scores highest.
    • OvO (one-versus-one): tournament style — train a binary classifier for every pair of digits: 0 vs 1, 0 vs 2, 1 vs 2, and so on.
    • OvO training cost: for N classes → N × (N − 1) / 2 classifiers (45 for MNIST's 10 classes).

  • Advantage of OvO: each classifier only needs to be trained on the part of the training set containing the two classes it has to distinguish!!

Most binary classification algorithms: OvA is preferred.

However, some algorithms such as support vector machines (which scale poorly with the size of the training set) prefer OvO.

Multiclass classification with SGD

Multiclass approach: use y_train to classify all ten classes at once

  1. Train SGDClassifier on the original target classes 0–9 (y_train) instead of the single-digit target vector (y_train_3).
  2. Then make a prediction → result: exactly right.
  3. Under the hood scikit-learn actually trains 10 binary classifiers, gets a decision score from each for the image, and selects the class with the highest score.
sgd_clf.fit(X_train, y_train) # note: y_train here, not y_train_3
sgd_clf.predict([some_digit])
>>array([3])

๋‹ค์ค‘ ๋ถ„๋ฅ˜ ๋ฐฉ์‹: decision_function() ์ด์šฉ

  1. decision_function(): ์‚ฌ์ดํ‚ท๋Ÿฐ์ด ์‹ค์ œ๋กœ 10๊ฐœ์˜ ์ด์ง„ ๋ถ„๋ฅ˜๊ธฐ๋ฅผ ํ›ˆ๋ จ์‹œํ‚ค๊ณ  ๊ฐ๊ฐ์˜ ๊ฒฐ์ • ์ ์ˆ˜๋ฅผ ์–ป์–ด ์ ์ˆ˜๊ฐ€ ๊ฐ€์žฅ ๋†’์€ ํด๋ž˜์Šค๋ฅผ ์„ ํƒ๊ฐ€ ๋งž๋Š”์ง€ ํ™•์ธ
  2. ์ƒ˜ํ”Œ ํ•˜๋‚˜์˜ ์ ์ˆ˜๊ฐ€ ์•„๋‹ˆ๋ผ ํด๋ž˜์Šค๋งˆ๋‹ค ํ•˜๋‚˜์”ฉ, ์ด 10๊ฐœ์˜ ์ ์ˆ˜๋ฅผ ๋ฐ˜ํ™˜
  3.  
some_digit_scores = sgd_clf.decision_function([some_digit])
some_digit_scores # ๊ฐ€์žฅ ๋†’์€ ๊ฐ’: 129738.57895132 -> index = 3์ž„.

>>array([[-620766.3293308 , -303658.02060956, -420512.63999577,
         129738.57895132, -612558.43668255, -181995.32698842,
        -573231.01600869, -264848.38137796, -192137.27888861,
        -272236.34656753]])

And the SGD classifier's estimate vs the actual class value:

print('estimated class (argmax of the scores): {}'.format(np.argmax(some_digit_scores))) # check that the index is 3
>>estimated class (argmax of the scores): 3

print('classes sgd learned from mnist: {}'.format(sgd_clf.classes_))
>>classes sgd learned from mnist: [0 1 2 3 4 5 6 7 8 9]

print('actual value at that index of classes_: {}'.format(sgd_clf.classes_[3]))
>>actual value at that index of classes_: 3

Forcing OvO or OvA

# force the use of OvO (or OvA)

from sklearn.multiclass import OneVsOneClassifier
ovo_clf = OneVsOneClassifier(SGDClassifier(max_iter=5, random_state=42))
ovo_clf.fit(X_train, y_train)
ovo_clf.predict([some_digit])


print(ovo_clf.fit(X_train, y_train))
>>OneVsOneClassifier(estimator=SGDClassifier(max_iter=5, random_state=42))

print(ovo_clf.predict([some_digit]))
>>[3]

print(len(ovo_clf.estimators_))
>>45
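
For OvA, the analogous wrapper is OneVsRestClassifier — a minimal sketch (not run in these notes; with 10 classes it would train 10 underlying binary classifiers rather than 45):

from sklearn.multiclass import OneVsRestClassifier

ovr_clf = OneVsRestClassifier(SGDClassifier(max_iter=5, random_state=42))
ovr_clf.fit(X_train, y_train)
ovr_clf.predict([some_digit])
len(ovr_clf.estimators_)  # 10 — one binary classifier per class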

๋žœ๋คํฌ๋ ˆ์ŠคํŠธ ๋‹ค์ค‘ ๋ถ„๋ฅ˜

print(forest_clf.fit(X_train, y_train))
>>RandomForestClassifier(n_estimators=10, random_state=42)

print(forest_clf.predict([some_digit]))
>>[3]

๋žœ๋คํฌ๋ ˆ์ŠคํŠธ๋ถ„๋ฅ˜๊ธฐ์˜ ๊ฒฝ์šฐ ์ง์ ‘ ํด๋ž˜์Šค๋ฅผ ๋ถ„๋ฅ˜ํ•˜๊ณ  predict_proba()๋ฅผ ์‚ฌ์šฉํ•  ๊ฒฝ์šฐ ๊ฐ ๋ถ„๋ฅ˜์˜ ํ™•๋ฅ ์ด ๊ณ„์‚ฐ๋จ

forest_clf.predict_proba([some_digit])
>>array([[0. , 0. , 0. , 0.9, 0. , 0. , 0. , 0. , 0. , 0.1]])

Evaluating the classifier's performance (code)

cross_val_score(sgd_clf, X_train, y_train, cv= 3, scoring = 'accuracy')
>>array([0.8639 , 0.82125, 0.8595 ])

Improving performance: standardizing the input features raises the accuracy.

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))
cross_val_score(sgd_clf, X_train_scaled, y_train, cv = 3, scoring = 'accuracy')
>>array([0.90985, 0.9104 , 0.9097 ])

3.5 Error Analysis

1. Analyzing the confusion matrix: gives insight into ways to improve the classifier.

  • First, build the confusion matrix:
y_train_pred = cross_val_predict(sgd_clf, X_train_scaled, y_train, cv=3)
conf_mx = confusion_matrix(y_train, y_train_pred)
conf_mx

To view this confusion matrix as an image:

plt.matshow(conf_mx, cmap=plt.cm.gray)
plt.show()

This confusion matrix looks quite good: most images sit on the main diagonal, meaning they were classified correctly. (The largest values in the array are drawn in white and the smallest in black.)

The 5s do look slightly darker than the other digits, which means either 1) the dataset contains fewer images of 5s, or 2) the classifier does not perform as well on 5s as on the other digits. To find out which, compare the error rates instead.

Comparing error rates

row_sums = conf_mx.sum(axis=1, keepdims=True)
norm_conf_mx = conf_mx / row_sums  # divide each value by the number of images in the actual class

np.fill_diagonal(norm_conf_mx, 0)  # zero out the main diagonal to keep only the errors
plt.matshow(norm_conf_mx, cmap=plt.cm.gray)
plt.show()

Rows represent actual classes and columns represent predicted classes.

In this image, bright columns (especially columns 8 and 9) mean many images were wrongly predicted as that digit, and bright rows (especially rows 8 and 9) mean that digit is often confused with other digits.

  • Analyzing the confusion matrix:

Studying this plot gives insight into how to improve the classifier. Analyzing the brightest cells reveals the following **problem areas**:

  1. 3s and 5s are frequently confused with each other.
  2. The classifier does a poor job on 8s and 9s.

Possible improvements:

  1. Gather more training data.
  2. Find new features that would help the classifier.
  3. Preprocess the images (with Scikit-Image, Pillow, OpenCV, etc.) to make certain patterns stand out.

2. ๊ฐœ๊ฐœ์˜ ์—๋Ÿฌ ๋ถ„์„ : ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ๋ฌด์Šจ ์ผ์„ ํ•˜๊ณ  ์žˆ๊ณ , ์™œ ์ž˜๋ชป๋˜์—ˆ๋Š”์ง€์— ๋Œ€ํ•ด ํ†ต์ฐฐ์„ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

๋‹จ, ๋” ์–ด๋ ต๊ณ  ์‹œ๊ฐ„์ด ์˜ค๋ž˜ ๊ฑธ๋ฆฐ๋‹ค. ์˜ˆ์‹œ๋กœ 3๊ณผ 5์˜ ์ƒ˜ํ”Œ์„ ์‚ดํŽด๋ณธ๋‹ค.

  • 3๊ณผ 5์˜ ์ƒ˜ํ”Œ ๊ทธ๋ฆฌ๊ธฐ:
cl_a, cl_b = 3, 5
X_aa = X_train[(y_train == cl_a) & (y_train_pred == cl_a)]
X_ab = X_train[(y_train == cl_a) & (y_train_pred == cl_b)]
X_ba = X_train[(y_train == cl_b) & (y_train_pred == cl_a)]
X_bb = X_train[(y_train == cl_b) & (y_train_pred == cl_b)]
plt.figure(figsize=(8,8))
plt.subplot(221); plot_digits(X_aa[:25], images_per_row=5)
plt.subplot(222); plot_digits(X_ab[:25], images_per_row=5)
plt.subplot(223); plot_digits(X_ba[:25], images_per_row=5)
plt.subplot(224); plot_digits(X_bb[:25], images_per_row=5)
plt.show()
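
plot_digits() is another helper from the book's notebook that is not defined in these notes. A minimal sketch (assuming each instance is a flat 784-pixel array) that tiles the digits into a single image, so it works inside the plt.subplot() cells above:

def plot_digits(instances, images_per_row=10, **options):
    size = 28
    images_per_row = min(len(instances), images_per_row)
    n_rows = (len(instances) - 1) // images_per_row + 1
    n_empty = n_rows * images_per_row - len(instances)         # pad the last row with blanks
    padded = np.concatenate([np.asarray(instances),
                             np.zeros((n_empty, size * size))], axis=0)
    rows = [np.concatenate([img.reshape(size, size)
                            for img in padded[r * images_per_row:(r + 1) * images_per_row]], axis=1)
            for r in range(n_rows)]
    plt.imshow(np.concatenate(rows, axis=0), cmap=matplotlib.cm.binary, **options)
    plt.axis("off")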

์™ผ์ชฝ์˜ ๋ธ”๋ก ๋‘ ๊ฐœ๋Š” 3์œผ๋กœ, ์˜ค๋ฅธ์ชฝ ๋ธ”๋ก ๋‘ ๊ฐœ๋Š” 5๋กœ ๋ถ„๋ฅ˜๋œ ์ด๋ฏธ์ง€์ด๋‹ค. ์™ผ์ชฝ ์•„๋ž˜ ๋ธ”๋ก, ์˜ค๋ฅธ์ชฝ ์œ„ ๋ธ”๋ก์€ ์ž˜๋ชป ๋ถ„๋ฅ˜๋œ ์ด๋ฏธ์ง€๋กœ ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์‹ค์ˆ˜ํ•œ ๊ฒƒ์ด๋‹ค.

  • ๋ถ„์„:

๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์‹ค์ˆ˜ํ•œ ์›์ธ์€ ์„ ํ˜• ๋ชจ๋ธ์ธ SGDClassifier๋ฅผ ์‚ฌ์šฉํ–ˆ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

์„ ํ˜• ๋ถ„๋ฅ˜๊ธฐ๋Š” ํด๋ž˜์Šค๋งˆ๋‹ค ํ”ฝ์…€์— ๊ฐ€์ค‘์น˜๋ฅผ ํ• ๋‹นํ•˜๊ณ  ์ƒˆ๋กœ์šด ์ด๋ฏธ์ง€์— ๋Œ€ํ•ด ๋‹จ์ˆœํžˆ ํ”ฝ์…€ ๊ฐ•๋„์˜ ๊ฐ€์ค‘์น˜ ํ•ฉ์„ ํด๋ž˜์Šค์˜ ์ ์ˆ˜๋กœ ๊ณ„์‚ฐํ•œ๋‹ค. ์ด๋•Œ 3๊ณผ 5๋Š” ๋ช‡ ๊ฐœ์˜ ํ”ฝ์…€๋งŒ ๋‹ค๋ฅด๊ธฐ ๋•Œ๋ฌธ์— ๋ชจ๋ธ์ด ์‰ฝ๊ฒŒ ํ˜ผ๋™ํ•˜๊ฒŒ ๋˜๋Š” ๊ฒƒ์ด๋‹ค.

ํ•ด๊ฒฐ ๋ฐฉ๋ฒ•์œผ๋กœ๋Š” ์ด๋ฏธ์ง€๋ฅผ ์ค‘์•™์— ์œ„์น˜์‹œํ‚ค๊ณ  ํšŒ์ „๋˜์–ด ์žˆ์ง€ ์•Š๋„๋ก ์ „์ฒ˜๋ฆฌํ•˜๋Š” ๊ฒƒ์„ ๋“ค ์ˆ˜ ์žˆ๋‹ค. ๋ถ„๋ฅ˜๊ธฐ๋Š” ์ด๋ฏธ์ง€์˜ ์œ„์น˜๋‚˜ ํšŒ์ „ ๋ฐฉํ–ฅ์— ๋งค์šฐ ๋ฏผ๊ฐํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

3.6 Multilabel Classification

1. What is multilabel classification?

A multilabel classification system outputs several binary labels (classes) for each instance — a single sample is assigned multiple binary labels.

2. Example

For one digit image, train a model that outputs two True/False binary labels — 1) whether the digit is large (7, 8, or 9) and 2) whether it is odd — then make a prediction and evaluate its performance.

#๋ชจ๋ธ ํ›ˆ๋ จ ๋ฐ ์˜ˆ์ธก
from sklearn.neighbors import KNeighborsClassifier

y_train_large = (y_train >= 7)
y_train_odd = (y_train % 2 == 1)
y_multilabel = np.c_[y_train_large, y_train_odd]

knn_crf = KNeighborsClassifier()
knn_clf.fit(X_train, y_multilabel)


# Make a prediction
knn_clf.predict([some_digit]) 

# Output (when the digit 5 is the input): array([[False, True]], dtype=bool) — not large, but odd
  • Model evaluation: the right metric for a multilabel classifier depends on the project. For this model we compute the average F1 score across all labels.
y_train_knn_pred = cross_val_predict(knn_clf, X_train, y_multilabel, cv=3,
                                     n_jobs=-1)
f1_score(y_multilabel, y_train_knn_pred, average="macro")

# Output: 0.97709078477525

The code above assumes all labels are equally important, hence f1_score's parameter average="macro".

To give each label a different weight, set average="weighted" instead: each label is then weighted by its support (the number of samples with that target label).

 

3.7 Multioutput Classification

1. What is multioutput classification?

Multioutput classification (multioutput–multiclass classification) generalizes multilabel classification so that each label can itself be multiclass. In other words, the system can output several labels, each taking more than two values.

2. Example

Build a system that removes noise from images. A single input (a noisy digit image) has many labels (one per pixel), and each label can take many values (pixel intensities from 0 to 255).

First, add noise to the MNIST training and test set images; the targets are the original (clean) images.

noise = np.random.randint(0, 100, (len(X_train), 784))  # random integers in [0, 100) with shape (len(X_train), 784)
X_train_mod = X_train + noise
noise = np.random.randint(0, 100, (len(X_test), 784))
X_test_mod = X_test + noise
y_train_mod = X_train  # the clean images are the targets
y_test_mod = X_test

Now select an image from the test set and have the trained classifier clean it up. (Little did I know back then that I would run into MNIST again while studying generative models...)
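
some_index and the plot_digit() helper used below are not defined in these notes; a minimal sketch, assuming we simply look at the first test image:

some_index = 0  # hypothetical choice — any test-set index works

def plot_digit(data):
    plt.imshow(data.reshape(28, 28), cmap=matplotlib.cm.binary)
    plt.axis("off")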

knn_clf.fit(X_train_mod, y_train_mod)
clean_digit = knn_clf.predict([X_test_mod[some_index]])
plot_digit(clean_digit)

The output is a clean image that looks very close to the target.