Shlomo Argamon, Moshe Koppel, Jonathan Fine and Anat R. Shimoni.
Gender, Genre, and Writing Style in Formal Written Texts.
A list of the texts used in the study may be found here.
A list of the texts used in our other study, reported on in:
Moshe Koppel, Shlomo Argamon, and Anat R. Shimoni.
Automatically Categorizing Written Texts by Author Gender
may be found here.
The lexical features used in both studies were function words together with other words not associated with a specific topic. Many of them are words commonly treated as "stopwords" in information retrieval. The full list of 467 such features may be found here.
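As a rough illustration of how a word list like this can be turned into features, the sketch below computes the relative frequency of each listed word in a text. It is not the code used in the studies. The short FEATURE_WORDS list, the simple tokenizer, and the function name are all assumptions made for the example; the actual 467-word list would take the place of the stand-in list.

```python
import re
from collections import Counter

# Hypothetical stand-in for the 467-feature list; NOT the actual list.
FEATURE_WORDS = ["the", "of", "and", "a", "to", "in", "she", "he", "with", "not"]


def function_word_frequencies(text, feature_words=FEATURE_WORDS):
    """Return the relative frequency of each feature word, in list order."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return [0.0] * len(feature_words)
    counts = Counter(tokens)
    # Normalize by document length so texts of different sizes are comparable.
    return [counts[word] / total for word in feature_words]


if __name__ == "__main__":
    vector = function_word_frequencies("The cat and the dog")
    for word, freq in zip(FEATURE_WORDS, vector):
        print(f"{word}\t{freq:.3f}")
```

Dividing by the token count is one common choice; other normalizations are possible and the studies may have used a different one.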
The POS n-grams used in both studies are listed here, and a description of the POS tags may be found here.
If you have any questions, do not hesitate to contact us.