
In quantitative text analysis, the cost of training supervised machine learning models tends to be very high when the corpus is large. Latent Semantic Scaling (LSS) is a semi-supervised document scaling technique that I developed to perform large-scale analysis at low cost. Taking user-provided seed words as weak supervision, it estimates the polarity of words in the corpus by latent semantic analysis and locates documents on a unidimensional scale (e.g. sentiment).
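The idea can be illustrated with a minimal sketch in Python with NumPy. This is not the package's implementation: the toy corpus, the function names `fit_polarity` and `score_document`, the choice of `k = 2` and the plain averaging are all assumptions made for illustration. The sketch factorizes a document-term matrix with a truncated SVD, scores each word by its cosine similarity to the signed seed words, and scores a document as the mean polarity of its words.

```python
import numpy as np


def fit_polarity(docs, seeds, k=2):
    """Estimate word polarity from tokenized docs and signed seed words.

    docs  : list of token lists (hypothetical toy input)
    seeds : dict mapping seed word -> +1 or -1
    k     : number of latent dimensions kept from the SVD
    """
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Document-term count matrix (documents x words).
    X = np.zeros((len(docs), len(vocab)))
    for d, doc in enumerate(docs):
        for w in doc:
            X[d, index[w]] += 1.0

    # Truncated SVD: rows of vt[:k].T are k-dimensional word vectors.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    vecs = vt[:k].T
    # Normalize to unit length so dot products are cosine similarities
    # (assumes no word vector is all zeros, which holds for this toy data).
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

    # Polarity = average of seed signs weighted by similarity to each seed.
    seed_idx = [index[w] for w in seeds]
    signs = np.array([seeds[w] for w in seeds], dtype=float)
    sims = vecs @ vecs[seed_idx].T  # shape: (n_words, n_seeds)
    scores = sims @ signs / len(seeds)
    return dict(zip(vocab, scores))


def score_document(tokens, polarity):
    """Mean polarity of the tokens that appear in the fitted vocabulary."""
    known = [polarity[w] for w in tokens if w in polarity]
    return float(sum(known) / len(known)) if known else 0.0


# Toy corpus: positive and negative words never share a document.
corpus = [
    ["good", "great", "excellent", "service"],
    ["good", "great", "nice", "food"],
    ["bad", "awful", "terrible", "service"],
    ["bad", "awful", "poor", "food"],
    ["great", "excellent", "nice"],
    ["awful", "terrible", "poor"],
]
polarity = fit_polarity(corpus, {"good": 1, "bad": -1}, k=2)

pos_score = score_document(["great", "nice", "food"], polarity)
neg_score = score_document(["awful", "poor", "service"], polarity)
print(pos_score, neg_score)
```

In this toy example, words that co-occur with "good" receive positive polarity and words that co-occur with "bad" receive negative polarity, while topic words such as "service" that appear in both contexts stay near zero. A real analysis would use a much larger corpus and many more latent dimensions.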


From CRAN:

From GitHub:



Please visit the package website to understand the usage of the functions:

Please read the following papers for the algorithm and methodology of LSS and for its application to non-English texts (Japanese and Hebrew):

Other publications

LSS has been used for research in various fields of social science.

More publications are available on Google Scholar.