Research

Multi-Lingual Noun Phrase Extractor (MuNPEx) v1.0 for GATE released

Semantic Software Lab - Tue, 2010-08-10 10:44

The noun phrase chunker MuNPEx (Multi-Lingual Noun Phrase Extractor) is now available in the new and improved release v1.0. MuNPEx is a base NP chunker for the GATE framework, implemented in JAPE. It is fast, robust, customizable, well-tested and currently supports English, German, and French (with Spanish in beta).

Major changes in this release:

  • Limited the number of pre- and post-head modifiers to make MuNPEx more robust on certain kinds of input (like a long list of tags or menu entries when processing web pages)
  • New optional grammars to add a HEAD_LEMMA slot to an NP annotation, with the lemma extracted from the GATE morphological analyser (for English), the Durm Lemmatizer (for German), or the TreeTagger (for German, Spanish, French)
  • DET/MOD/HEAD/MOD2 slots are now stored as strings (rather than Content objects) to make them easier to export and compatible with the new Predicate-Argument Extractor (PAX) component
  • other code cleanup and improvements
  • no longer labeled as "beta" -- five years of testing ought to be enough, we're not Google ;-)
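
The modifier cap and the string-valued slots from the list above can be illustrated with a toy chunker. This is a minimal Python sketch, not the MuNPEx JAPE grammar: the coarse tag names (DET, ADJ, NOUN), the function chunk_noun_phrases, and the cap of three modifiers are all assumptions made for illustration, and only pre-head modifiers are handled.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, coarse POS tag); the tag set is hypothetical

MAX_PRE_MODIFIERS = 3  # assumed cap; the actual MuNPEx limit is not stated here


def chunk_noun_phrases(tokens: List[Token],
                       max_mods: int = MAX_PRE_MODIFIERS) -> List[Dict[str, str]]:
    """Greedy base-NP chunker: optional DET, at most max_mods ADJ tokens, one NOUN head."""
    phrases = []
    i = 0
    while i < len(tokens):
        start = i
        det = ""
        if tokens[i][1] == "DET":
            det = tokens[i][0]
            i += 1
        mods = []
        # Stop collecting modifiers once the cap is reached.
        while i < len(tokens) and tokens[i][1] == "ADJ" and len(mods) < max_mods:
            mods.append(tokens[i][0])
            i += 1
        if i < len(tokens) and tokens[i][1] == "NOUN":
            # Slots are plain strings, so they are easy to export.
            phrases.append({"DET": det, "MOD": " ".join(mods), "HEAD": tokens[i][0]})
            i += 1
        else:
            # No head found: abandon this start position and retry one token later.
            i = start + 1
    return phrases
```

With the cap in place, each start position inspects only a bounded number of tokens, so a long run of modifier-like tokens without a head (a tag cloud, say) cannot grow into one oversized candidate phrase.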

For more details and the download, please visit the MuNPEx page.


Categories: Research

Running MutationFinder in GATE using the TaggerFramework PR

Semantic Software Lab - Sat, 2010-07-24 13:46

MutationFinder is a freely available resource for tagging mutations in biomedical texts. However, it cannot be directly integrated into a text mining pipeline when using the General Architecture for Text Engineering (GATE) framework. Here, I show how to make it available to GATE users using the standard TaggerFramework component, which requires some text wrangling.
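
To give a feel for what tagging mutations with character offsets involves, here is a minimal Python sketch. It is not MutationFinder and does not use the TaggerFramework: the single pattern below (one-letter wild-type residue, position, one-letter mutant residue, as in A123T) and the function name find_mutations are assumptions for illustration, and real mutation taggers cover many more notations.

```python
import re
from typing import List, Tuple

# One-letter amino acid codes (the 20 standard residues).
AMINO = "ACDEFGHIKLMNPQRSTVWY"

# Toy pattern: wild-type residue, 1-based position, mutant residue, e.g. "A123T".
POINT_MUTATION = re.compile(rf"\b([{AMINO}])([1-9][0-9]*)([{AMINO}])\b")


def find_mutations(text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, mention) for each match, skipping identical residues."""
    hits = []
    for match in POINT_MUTATION.finditer(text):
        if match.group(1) != match.group(3):
            hits.append((match.start(), match.end(), match.group(0)))
    return hits
```

The start and end offsets are what a pipeline would need in order to place an annotation over the matching span of the original document text.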


Categories: Research

Early bird registration deadline extended for 3rd GATE training course!

Semantic Software Lab - Thu, 2010-07-08 11:11
Start: 2010-07-21 23:59 Timezone: America/Montreal

***Early bird registration has been extended until 21 July!***

The third GATE training course will take place at Concordia University
in Montréal, Canada, from August 30th to September 3rd 2010. This event
will follow the format of the earlier May 2010 course, but with the
addition of a new training track covering linked data and ontologies.

Further details on the material to be covered:
https://gate.ac.uk/family/training.html

Registration, travel and accommodation:
https://gate.ac.uk/conferences/montreal-2010/index.html


Categories: Research

NLDB 2010

Semantic Software Lab - Mon, 2010-06-28 18:09

Rene giving a talk at NLDB 2010 in Cardiff. We presented our work on Semantic Content Access using Domain-Independent NLP Ontologies.

Categories: Research

New Javadoc Doclet for NLP Analysis on Java Source Code

Semantic Software Lab - Thu, 2010-05-27 08:56

For those interested in performing NLP on source code, in particular Javadoc comments, we just released a Doclet at the NLP Frameworks workshop last week.

Its main feature is that it creates an XML corpus from Java source code that is optimised for processing in an NLP Framework (GATE in our case, but it should work for any framework that takes XML as input).
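
The general idea of turning source code comments into an XML corpus can be sketched as follows. This is a rough Python illustration only: the actual tool is a Doclet that runs inside javadoc, whereas this sketch uses a regular expression, and the element names (file, comment) and the function javadoc_to_xml are hypothetical, not the Doclet's real output schema.

```python
import re
import xml.etree.ElementTree as ET

# Matches /** ... */ blocks, including ones that span several lines.
JAVADOC = re.compile(r"/\*\*(.*?)\*/", re.DOTALL)


def javadoc_to_xml(source: str, filename: str) -> str:
    """Wrap each Javadoc comment of one Java source file in a <comment> element."""
    root = ET.Element("file", name=filename)
    for match in JAVADOC.finditer(source):
        # Strip the leading " * " decoration from every comment line.
        lines = [re.sub(r"^\s*\*?\s?", "", line) for line in match.group(1).splitlines()]
        text = " ".join(part.strip() for part in lines if part.strip())
        comment = ET.SubElement(root, "comment")
        comment.text = text  # ElementTree escapes <, > and & on serialisation
    return ET.tostring(root, encoding="unicode")
```

Because the comment text ends up as plain element content, any framework that accepts XML input can load the result and run its NLP pipeline over the natural-language parts of the code.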


Categories: Research

New GATE PR: The Predicate-Argument Extractor (PAX)

Semantic Software Lab - Wed, 2010-05-26 02:34

At the LREC workshop New Challenges for NLP Frameworks we released a new component for GATE: The Predicate-Argument Extractor (PAX).


Categories: Research