Feat/c-value algorithm (!43) · Merge requests · Sesboue Matthias / ontology-learning

Sesboue Matthias requested to merge feat/c-value into olms2 Mar 02, 2023

Code for the C-value algorithm and its tests.

Compared to the previous version, I made the following decisions, which are open to discussion:

I simplified term tokenisation by basing the algorithm on a space tokeniser. This can lead to a difference between the extracted term strings and their actual form in the corpus. I made sure a warning is logged whenever this could happen.
I decided that tagging the extracted candidate terms in the corpus (e.g., creating linguistic realisations) should be the responsibility of the term extraction pipeline component rather than of the algorithm.
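
To make the space tokenisation point easier to discuss, here is a minimal sketch of how a term string can drift from its corpus form. It is not the code in this merge request: the function names `term_string_from_tokens` and `space_tokenise` are hypothetical, and it assumes that candidate terms arrive as a list of token texts produced by an upstream tokeniser together with the original corpus span.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)


def term_string_from_tokens(token_texts: List[str], corpus_text: str) -> str:
    """Build a term string by joining token texts with single spaces.

    Hypothetical helper: logs a warning when the resulting string
    no longer matches the form found in the corpus.
    """
    term_string = " ".join(token_texts)
    if term_string != corpus_text:
        logger.warning(
            "Term string %r differs from its corpus form %r",
            term_string,
            corpus_text,
        )
    return term_string


def space_tokenise(term_string: str) -> List[str]:
    """Split a term string on single spaces (assumed tokenisation rule)."""
    return term_string.split(" ")


# An upstream tokeniser may split "well-known" into three tokens, so the
# space-joined term string is not the string that appears in the corpus.
term = term_string_from_tokens(["well", "-", "known", "method"], "well-known method")
tokens = space_tokenise(term)
```

In this sketch the round trip is stable on the algorithm's side (splitting the term string on spaces gives back the same tokens), while the warning flags that the term string itself cannot be searched verbatim in the corpus.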
