THE EFFECT OF RATIO BETWEEN TRAINING AND TESTING SETS IN MODEL SELECTION FOR NEURAL NETWORKS CLASSIFICATION

Bibliographic Details
Main Authors: Rezeki, Sri; Subanar, Subanar; Suryo, Guritno
Format: Article
Language: English
Published: 2006
Subjects:
Online Access: https://repository.ugm.ac.id/32917/1/2.pdf
Description
Summary: Determining how large the sample size should be is highly context dependent. In general, the larger the dimension of the parameter, the larger the sample size must be to obtain a given degree of approximation. In many cases of NN application, the data set is randomly split into two mutually exclusive subsets, i.e. training and testing sets. The first is used for model building, while the second is used to assess the performance (generalization) of the model. Both training and testing sets are the same size. In this paper, five different data partitions are used to test whether the prediction ability of NN is affected by the number of observations in the training set. The purpose of this paper is to evaluate the effect of the ratio between training and testing sets on the misclassification rate in the neural network (NN) model. An empirical study has been carried out using Fisher's iris data. The results show that the misclassification rate decreases as the number of observations in the training set increases. The model with 2 hidden neurons obtains its minimum error when the training-set ratio is 20%, whereas the model with 1 hidden neuron yields its minimum error at training-set ratios of 40%, 50%, 60% and 80%.
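
As a rough illustration of the experimental setup described in the summary (not the authors' code), the sketch below splits Fisher's iris data at the five training-set ratios mentioned, fits a network with 1 or 2 hidden neurons, and records the misclassification rate on the testing set. The library (scikit-learn), the MLPClassifier settings, the stratified split, and the random seed are all assumptions.

    # Sketch of the training/testing ratio experiment on Fisher's iris data.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_iris(return_X_y=True)

    for n_hidden in (1, 2):                      # number of hidden neurons
        for ratio in (0.2, 0.4, 0.5, 0.6, 0.8):  # training-set ratios from the abstract
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, train_size=ratio, stratify=y, random_state=0
            )
            # Single-hidden-layer network; hyperparameters are illustrative only.
            net = MLPClassifier(hidden_layer_sizes=(n_hidden,),
                                max_iter=5000, random_state=0)
            net.fit(X_tr, y_tr)
            # Misclassification rate on the held-out testing set.
            miscls = 1.0 - net.score(X_te, y_te)
            print(f"hidden={n_hidden}  train ratio={ratio:.0%}  "
                  f"misclassification={miscls:.3f}")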