[Rust] Parameter Update Methods - 'Deep Learning from Scratch' Chapter 6

This post implements three of the parameter update rules (optimizers) from chapter 6 of the book in Rust, using the rulinalg matrix library: SGD, Momentum, and AdaGrad.
use std::collections::HashMap;

use rulinalg::matrix::{Matrix, BaseMatrix, BaseMatrixMut};

use common::utils;

/// Common interface for parameter update rules.
/// Each optimizer mutates `param` in place using the gradient `grad`.
pub trait Optimizer {
    fn update(&mut self, param: &mut Matrix<f32>, grad: &Matrix<f32>);
}
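A minimal sketch of how a training loop might drive this trait; the train_step helper and the params/grads slices are hypothetical and not part of the post's code:

// Hypothetical helper: apply one optimizer step to every parameter matrix,
// pairing each parameter with its gradient.
fn train_step<O: Optimizer>(
    optimizer: &mut O,
    params: &mut [Matrix<f32>],
    grads: &[Matrix<f32>],
) {
    for (param, grad) in params.iter_mut().zip(grads.iter()) {
        optimizer.update(param, grad);
    }
}

Because Momentum and AdaGrad below key their per-parameter state by the matrix's address, one optimizer instance can serve every parameter in the network.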
/// Stochastic gradient descent.
pub struct SGD {
    lr: f32,
}

impl SGD {
    pub fn new(lr: f32) -> Self {
        SGD { lr }
    }
}

impl Optimizer for SGD {
    fn update(&mut self, param: &mut Matrix<f32>, grad: &Matrix<f32>) {
        // Step against the gradient, scaled by the learning rate.
        *param -= grad * self.lr;
    }
}
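In the book's notation, SGD updates each parameter W with learning rate η:

W \leftarrow W - \eta \frac{\partial L}{\partial W}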
/// SGD with momentum.
pub struct Momentum {
    lr: f32,
    momentum: f32,
    // Velocity for each parameter matrix, keyed by the matrix's address.
    v: HashMap<*const Matrix<f32>, Matrix<f32>>,
}

impl Momentum {
    pub fn new(lr: f32, momentum: f32) -> Self {
        Momentum {
            lr,
            momentum,
            v: HashMap::new(),
        }
    }
}

impl Optimizer for Momentum {
    fn update(&mut self, param: &mut Matrix<f32>, grad: &Matrix<f32>) {
        let key = param as *const Matrix<f32>;

        // Create the velocity lazily on first use of this parameter.
        if !self.v.contains_key(&key) {
            self.v.insert(key, Matrix::zeros(param.rows(), param.cols()));
        }

        if let Some(velocity) = self.v.get_mut(&key) {
            // NOTE: &*velocity : &mut T -> &T
            *velocity = &*velocity * self.momentum - grad * self.lr;
            *param += &*velocity;
        }
    }
}
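Momentum keeps a velocity v per parameter and folds the previous update direction into the current one (α is the momentum coefficient):

v \leftarrow \alpha v - \eta \frac{\partial L}{\partial W}, \qquad W \leftarrow W + v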
/// AdaGrad: per-element learning rates that shrink as gradients accumulate.
pub struct AdaGrad {
    lr: f32,
    // Accumulated element-wise squared gradients, keyed by the matrix's address.
    h: HashMap<*const Matrix<f32>, Matrix<f32>>,
}

impl AdaGrad {
    pub fn new(lr: f32) -> Self {
        AdaGrad {
            lr,
            h: HashMap::new(),
        }
    }
}

impl Optimizer for AdaGrad {
    fn update(&mut self, param: &mut Matrix<f32>, grad: &Matrix<f32>) {
        let key = param as *const Matrix<f32>;

        // Create the accumulator lazily on first use of this parameter.
        if !self.h.contains_key(&key) {
            self.h.insert(key, Matrix::zeros(param.rows(), param.cols()));
        }

        if let Some(h) = self.h.get_mut(&key) {
            // Accumulate the element-wise squared gradient.
            *h += grad.elemul(grad);

            // NOTE: &*h : &mut T -> &T
            // The 1e-7 term avoids division by zero.
            let sqrt_h = utils::copy_matrix(&*h)
                .apply(&|value| value.sqrt() + 1e-7);
            *param -= (grad * self.lr).elediv(&sqrt_h);
        }
    }
}
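AdaGrad accumulates the element-wise squared gradients in h and divides each step by √h, so elements that have already moved a lot take smaller steps (⊙ is the element-wise product; ε corresponds to the 1e-7 above):

h \leftarrow h + \frac{\partial L}{\partial W} \odot \frac{\partial L}{\partial W}, \qquad W \leftarrow W - \eta \, \frac{1}{\sqrt{h} + \epsilon} \odot \frac{\partial L}{\partial W}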
[Optimizer - SGD]
Loss: 489.68884, Acc: 0.207, Test Acc: 0.232
Loss: 379.53986, Acc: 0.704, Test Acc: 0.674
Loss: 210.96933, Acc: 0.804, Test Acc: 0.788
...
Loss: 66.81734, Acc: 0.911, Test Acc: 0.898
Loss: 65.67338, Acc: 0.911, Test Acc: 0.902
Loss: 64.588234, Acc: 0.915, Test Acc: 0.903
Final test acc: 0.919

[Optimizer - Momentum]
Loss: 206.34421, Acc: 0.889, Test Acc: 0.882
Loss: 75.08393, Acc: 0.912, Test Acc: 0.905
Loss: 61.861897, Acc: 0.925, Test Acc: 0.921
...
Loss: 14.900198, Acc: 0.98, Test Acc: 0.957
Loss: 14.088782, Acc: 0.981, Test Acc: 0.957
Loss: 13.3469095, Acc: 0.981, Test Acc: 0.958
Final test acc: 0.978

[Optimizer - AdaGrad]
Loss: 92.81972, Acc: 0.916, Test Acc: 0.911
Loss: 56.537563, Acc: 0.927, Test Acc: 0.925
Loss: 49.170063, Acc: 0.933, Test Acc: 0.934
...
Loss: 25.080034, Acc: 0.969, Test Acc: 0.962
Loss: 24.50684, Acc: 0.97, Test Acc: 0.962
Loss: 23.964062, Acc: 0.969, Test Acc: 0.962
Final test acc: 0.964