Feature preprocessing on web page language identification