normalisation; document clustering; subjects; keywords; scale dataset; partition; text clustering