Text Mining Methods for Social Representation Analysis in Large Corpora

  • Jean-François Chartier Université du Québec à Montréal
  • Jean-Guy Meunier Université du Québec à Montréal

Abstract

With mass text digitization (digital libraries, the web, etc.), a huge amount of empirical data is now available for scientific inquiry. In the social sciences and humanities, the use of statistical text mining methods to analyze these data has become unavoidable. In the mid-1990s, Saadi Lahlou proposed a coherent framework for applying these methods to the study of social representation in large corpora. Despite this initiative, however, text mining methods have remained marginal in this research program, partly because their methodological and theoretical assumptions are poorly understood. Many analyses still confuse the software with the method. This paper presents an overview and a formalization of a statistical text mining method for the study of social representation, using Lahlou’s work as an illustration. The goal is to look inside the software black box by analyzing the steps and formal operations involved. The linguistic and methodological assumptions are made explicit, and alternative algorithmic operationalizations are highlighted.

Author Biographies

Jean-François Chartier, Université du Québec à Montréal

JEAN-FRANÇOIS CHARTIER is a Ph.D. candidate in Cognitive and Computer Sciences at the Université du Québec à Montréal (UQÀM) and holds a master's degree in Sociology. He is currently a researcher at the LANCI laboratory (Laboratoire d'analyse cognitive de l'information). He is also a Doctoral Fellow of the Social Sciences and Humanities Research Council (SSHRC) and of the Fonds Québécois de Recherche sur la Société et la Culture (FQRSC). His research interests include cognitive models in the social sciences and computational methods in computer science.

Jean-Guy Meunier, Université du Québec à Montréal

Since 1970, JEAN-GUY MEUNIER’s research has focused on Computer Assisted Reading and Analysis of Text (CARAT) in the Humanities. He is a full professor at the Université du Québec à Montréal (UQÀM), director of the LANCI laboratory (Laboratoire d'analyse cognitive de l'information), and a member of the International Academy of Philosophy of Science. His current research is in Computer Assisted Conceptual Text Analysis.

Published
2011-10-17