Make Knowledge More Meaningful

CogKGE

We propose a knowledge graph embedding (KGE) toolkit that aims to represent multi-source, heterogeneous knowledge. It supports representations of entity-centric knowledge as well as event-centric, commonsense and linguistic knowledge.


For Multi-source and Heterogeneous Knowledge Representation

Our goal is to provide a unified programming framework for KGE tasks and a series of knowledge representations for downstream tasks.

Models
Datasets
Evaluation metrics
Knowledge adapters
Loss functions
Built-in data containers

Key Features of CogKGE

We contribute an open-source toolkit that bridges KGE models and multi-source heterogeneous data through plug-and-play knowledge adapters.


Multi-source and heterogeneous knowledge representation


CogKGE explores the unified representation of knowledge from diverse sources. Moreover, our toolkit not only contains triple fact-based embedding models, but also supports the fused representation of additional information, including text descriptions, node types and temporal information.


Comprehensive models and benchmark datasets


CogKGE implements many classic KGE models in four categories: translation distance models, semantic matching models, graph neural network-based models and transformer-based models. Besides these out-of-the-box models, we release two large benchmark datasets, EventKG240K and CogNet360K, for further evaluating KGE methods.


Extensible and modularized framework


CogKGE provides a programming framework for KGE tasks. Based on the extensible architecture, CogKGE can meet the requirements of module extension and secondary development, and pre-trained knowledge embeddings can be directly applied to downstream tasks.




Open source and visualization demo


Besides the toolkit, we also release an online CogKGE demo for exploring knowledge visually. Source code, datasets and pre-trained embeddings are publicly available on GitHub.



Model Module

The BaseModel class is the base class of all models in CogKGE.

Translation Distance Models

Translation distance models use distance-based measures to score a pair of entities and the relation between them. In CogKGE, we implement several translation distance models, including TransE, TransH, TransR, TransD, TransA, BoxE and PairRE.
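As a rough illustration of a distance-based score, the sketch below computes a TransE-style score, the negative L2 distance between head + relation and tail. It is a minimal standalone example, not CogKGE's implementation: the function name `transe_score` and the hand-picked toy embeddings are assumptions made for illustration only.

```python
import math


def transe_score(head, relation, tail):
    """Return the negative L2 distance between (head + relation) and tail.

    A higher (less negative) score means the triple is considered more plausible.
    """
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Toy 2-dimensional embeddings chosen by hand (hypothetical, not learned).
paris = [1.0, 0.0]
capital_of = [0.0, 1.0]
france = [1.0, 1.0]
germany = [4.0, 5.0]

true_score = transe_score(paris, capital_of, france)    # head + relation lands on tail
false_score = transe_score(paris, capital_of, germany)  # head + relation is far from tail
```

With these toy vectors the correct triple scores higher than the corrupted one, which is the ordering that training a translation distance model tries to enforce.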

Semantic Matching Models

Semantic matching models use similarity-based scoring functions rather than the distance-based measures of translation distance models. They measure the plausibility of facts by matching the latent semantics of entities and relations embodied in their vector space representations. RESCAL, SimplE, RotatE and TuckER have been built into CogKGE.
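To make the idea of a similarity-based score concrete, the sketch below computes a RESCAL-style bilinear score, in which each relation is a matrix and the score is head^T M tail. It is a minimal standalone example rather than CogKGE's code: the function name `rescal_score` and the toy values are assumptions.

```python
def rescal_score(head, relation_matrix, tail):
    """Return the bilinear score head^T * relation_matrix * tail."""
    return sum(
        head[i] * relation_matrix[i][j] * tail[j]
        for i in range(len(head))
        for j in range(len(tail))
    )


# Toy 2-dimensional embeddings and a relation matrix chosen by hand (hypothetical).
entity_a = [1.0, 0.0]
entity_b = [0.0, 1.0]
relation_matrix = [
    [0.0, 2.0],
    [0.0, 0.0],
]

forward = rescal_score(entity_a, relation_matrix, entity_b)   # (a, r, b)
backward = rescal_score(entity_b, relation_matrix, entity_a)  # (b, r, a)
```

Because the relation matrix is not symmetric, the two directions receive different scores, so a bilinear form of this kind can represent asymmetric relations.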

Graph Neural Network-based Models

Graph neural networks (GNNs) have recently proven quite successful at modeling graph-structured data. Because a KG is itself graph-structured data, a GNN can integrate topological structure and node features to provide a more refined vector representation. We implement R-GCN and CompGCN to represent multi-relational data.
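The way a relational GNN combines topology and node features can be sketched as a single simplified R-GCN-style layer: each node adds a self-loop transform of its own features to the per-relation mean of its transformed incoming neighbors, then applies ReLU. This is an assumption-laden sketch in plain Python, not CogKGE's R-GCN: the names `matvec` and `rgcn_layer`, the dictionary-based graph format and the toy weights are all hypothetical, and refinements such as basis decomposition are omitted.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges, relation_weights, self_weight):
    """Apply one simplified R-GCN-style layer.

    features: dict mapping node -> feature vector
    edges: list of (source, relation, target); messages flow source -> target
    relation_weights: dict mapping relation -> weight matrix
    self_weight: weight matrix for the self-loop
    """
    # Group incoming neighbors by (target, relation) for per-relation normalization.
    incoming = {}
    for source, relation, target in edges:
        incoming.setdefault((target, relation), []).append(source)

    updated = {}
    for node, vector in features.items():
        total = matvec(self_weight, vector)  # self-loop term
        for (target, relation), sources in incoming.items():
            if target != node:
                continue
            for source in sources:
                message = matvec(relation_weights[relation], features[source])
                # Mean aggregation over neighbors that share this relation.
                total = [t + m / len(sources) for t, m in zip(total, message)]
        updated[node] = [max(0.0, t) for t in total]  # ReLU
    return updated


identity = [[1.0, 0.0], [0.0, 1.0]]
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "r1", "c"), ("b", "r1", "c"), ("a", "r2", "b")]
relation_weights = {"r1": identity, "r2": [[2.0, 0.0], [0.0, 2.0]]}

updated = rgcn_layer(features, edges, relation_weights, self_weight=identity)
```

Here node "c" receives the mean of "a" and "b" through relation r1, while node "b" receives a differently weighted message from "a" through r2, which shows how relation-specific weights let the layer treat edge types separately.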

Transformer-based Models

Transformers have been widely used in pre-trained language models and machine translation. Their deep network architecture can jointly learn contextual representations of entities and relations in a KG by aggregating information from graph neighborhoods. Transformer-based models can also use the text descriptions in KGs, encoding texts and facts into a unified semantic space. CogKGE supports KEPLER and HittER.

Knowledge Module

The knowledge module integrates three main kinds of knowledge representation: world, commonsense and linguistic knowledge.

World Knowledge

World KGs such as Freebase, DBpedia and Wikidata mainly focus on explicit world knowledge. In CogKGE, we implement entity-centric knowledge representation based on Wikidata and event-centric knowledge representation based on EventKG. World knowledge representations have been widely used in knowledge-enhanced pre-trained language models, entity disambiguation and event extraction.

Commonsense Knowledge

Commonsense knowledge tries to capture implicit general facts and regular patterns in our daily life. Compared with a world KG, the nodes in a commonsense KG are semantically rich natural language phrases rather than entities, e.g., (Rocket, is used for, Fly to the moon). CogKGE supports the commonsense knowledge representation of ConceptNet, which can help improve commonsense completion and commonsense reasoning.

Linguistic Knowledge

Linguistic knowledge includes considerable information about lexical, conceptual and predicate-argument semantics. For example, rocket has a hyponymy relation to skyrocket in WordNet, while rocket can evoke the Change_position_on_a_scale frame in FrameNet. In CogKGE, the knowledge representation of FrameNet can be provided for downstream tasks, such as word sense disambiguation and machine reading comprehension.

Target User