Our group (CIS, University of Munich, Germany) releases the first version of its phrase embeddings.

Existing embedding releases (e.g., those of Collobert et al., Turian et al., and Mikolov et al.) mostly cover single words. We argue (in our ACL 2014 SRW paper) that conventional linguistic phrases, such as “turn off”, “pick up”, and “Cornell University”, should also have embeddings because they usually carry atomic semantics. This resource contains 200-dimensional embeddings for single words and for most two-word phrases. The model was trained on English Gigaword and Wikipedia. We hope it will be useful for your research. For some typical phrases, we list their top neighbors below:

Phrases and nearest neighbors
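As a minimal sketch of how the released vectors can be queried, the snippet below loads embeddings and ranks neighbors by cosine similarity. The file format (one token per line, followed by its dimension values, with phrase words joined by “_”) and the helper names are assumptions for illustration, not the resource’s documented interface:

```python
import math

def load_embeddings(path):
    """Load vectors from an assumed plain-text format:
    'token dim1 dim2 ... dim200' per line; phrases like 'turn_off'."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vecs[parts[0]] = [float(x) for x in parts[1:]]
    return vecs

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(query, vecs, k=5):
    """Return the k tokens/phrases most similar to `query`."""
    q = vecs[query]
    scored = [(cosine(q, v), t) for t, v in vecs.items() if t != query]
    return [t for _, t in sorted(scored, reverse=True)[:k]]

# Toy example with made-up 2-dimensional vectors (the real vectors are 200-d):
toy = {
    "turn_off": [1.0, 0.0],
    "switch_off": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
print(nearest_neighbors("turn_off", toy, k=1))  # → ['switch_off']
```

The same ranking works for single words and phrases alike, since both live in the same vector space.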


If you use this resource in your research, please cite:

Wenpeng Yin and Hinrich Schuetze. An Exploration of Embeddings for Generalized Phrases. Student Research Workshop at the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014 SRW).