
balwantk/NamedEntityTagger

 
 


Named Entity Tagger

A Ruby library for tagging named entities in text, based on OpenNLP. Because OpenNLP is written in Java, NamedEntityTagger must be run on JRuby.

Dependencies

To use NamedEntityTagger, you need to download and build a few Java libraries and place the resulting jar files in the deps directory. You also need OpenNLP model data, which must be placed under models.
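A quick way to confirm this layout before loading the tagger is a small preflight check. The sketch below is not part of NamedEntityTagger; the helper name `missing_dependencies` is hypothetical, and it only checks that deps contains at least one jar file and that models is not empty, since the exact file names are not listed here.

```ruby
# Hypothetical preflight helper (not part of the library): returns a list
# of human-readable problems with the directory layout described above.
def missing_dependencies(root = '.')
  problems = []
  # The jar files built from the Java libraries are expected in deps/.
  problems << 'no jar files found in deps/' if Dir.glob(File.join(root, 'deps', '*.jar')).empty?
  # The OpenNLP model data is expected under models/; any file counts here.
  problems << 'no model files found in models/' if Dir.glob(File.join(root, 'models', '*')).empty?
  problems
end

# Print each problem to standard error; prints nothing if the layout looks fine.
missing_dependencies.each { |problem| warn problem }
```

An empty result only means the two directories are populated; it does not verify that the right libraries or models are present.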

Libraries

Model Data

Usage

NamedEntityTagger exposes a minimal API. The main class is EntityTagger, which provides the method #tag(text). This method takes the text to be tagged and returns a new string in which the words identified as named entities are highlighted. The output format is defined by a formatter object. Currently the only formatter is CSSClassAnnotationFormatter, which wraps each named entity in a span tag whose class corresponds to the model that matched the entity.

For example:

require 'lib/entity_tagger'
require 'lib/css_class_annotation_formatter' 
tagger = EntityTagger.new(CSSClassAnnotationFormatter.new)
tagger.tag("Mrs. Smith flew to Berlin")

=> "Mrs. <span class=\"person\">Smith</span> flew to <span class=\"location\">Berlin</span>"
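To illustrate what a formatter of this kind does, here is a minimal, self-contained sketch that runs on plain Ruby without OpenNLP. It is not the library's implementation: the `Annotation` struct, the `SketchSpanFormatter` class, and the `format(text, annotations)` signature are all assumptions made for illustration, since the formatter interface is not documented above. The entity offsets are supplied by hand rather than detected by a model.

```ruby
# Hypothetical annotation record: character offsets (stop is exclusive)
# plus the name of the model that matched, e.g. 'person' or 'location'.
Annotation = Struct.new(:start, :stop, :model)

# Sketch of a formatter that wraps each annotated range in a span tag
# whose class is the model name. Assumes annotations do not overlap.
class SketchSpanFormatter
  def format(text, annotations)
    out = String.new
    cursor = 0
    annotations.sort_by(&:start).each do |a|
      out << text[cursor...a.start]   # untouched text before the entity
      out << %(<span class="#{a.model}">#{text[a.start...a.stop]}</span>)
      cursor = a.stop
    end
    out << text[cursor..-1]           # remaining text after the last entity
    out
  end
end

text = 'Mrs. Smith flew to Berlin'
# Offsets written by hand for this example; a real tagger would compute them.
annotations = [
  Annotation.new(5, 10, 'person'),
  Annotation.new(19, 25, 'location')
]
puts SketchSpanFormatter.new.format(text, annotations)
```

With the hand-written offsets, this prints the same markup as the example above. Separating entity detection from output formatting in this way is what would let a different formatter produce another encoding without changing the tagger.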
