Civet - From Content to Linked Open Data
Sebastian Ryszard Kruk
CEO
Knowledge Hives sp. z o.o.
http://www.sebastiankruk.com/
 
Arkadiusz Kwoska
Vice President
Knowledge Hives sp. z o.o.
 


 

Wednesday, June 8, 2011
11:15 AM - 11:40 AM
Level:  Product/Service Offering

Location:  Franciscan C

Civet uses NLP techniques to determine the keywords in the provided text. But it does not stop there: it analyzes these words and determines their most appropriate meanings, linking them to concepts from vocabularies such as WordNet and the Polish version of OpenThesaurus, both published as RDF/SKOS graphs in the Linked Open Data cloud.
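As a rough illustration of choosing the "most appropriate meaning" for a keyword, the sketch below uses a simplified Lesk-style overlap: each candidate concept has a short gloss, and the concept whose gloss shares the most words with the surrounding text wins. This is a stand-in technique, not necessarily the algorithm Civet uses; the function name `pick_sense`, the sense inventory, and the example.org URIs are all hypothetical.

```python
def pick_sense(word, context_words, senses):
    """Return the concept URI whose gloss shares the most words with the context.

    `senses` maps a lowercase word to a list of (concept_uri, gloss) pairs.
    This is a simplified Lesk-style heuristic, used here only for illustration.
    """
    target = word.lower()
    # Context is every surrounding word except the keyword itself.
    context = {w.lower() for w in context_words if w.lower() != target}
    best_uri, best_overlap = None, -1
    for uri, gloss in senses.get(target, []):
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_uri, best_overlap = uri, overlap
    return best_uri


# Hypothetical sense inventory; real vocabularies would supply URIs and glosses.
SENSES = {
    "bank": [
        ("http://example.org/concept/bank-finance",
         "institution that accepts deposits and lends money"),
        ("http://example.org/concept/bank-river",
         "sloping land beside a river or lake"),
    ],
}

sentence = "we walked along the bank of the river".split()
chosen = pick_sense("bank", sentence, SENSES)
```

With the sample data, the river sense is chosen because its gloss shares the word "river" with the sentence, while the finance gloss shares nothing.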

Civet also discovers names of people and places, and uses other LOD sources, such as DBpedia or GeoNames, to annotate these names with concepts from those datasets.

The initial plain text becomes an RDFa document in which keywords, phrases, and names reference Linked Open Data concepts. Civet also uses a repository of previously analyzed texts to recommend texts that are semantically similar to a given one.
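The plain-text-to-RDFa step can be pictured with a minimal sketch: given a mapping from recognized phrases to concept URIs, each occurrence is wrapped in a span that points at its concept. The function name `annotate_rdfa`, the `dcterms:subject` predicate, and the example.org URIs are assumptions made for illustration; the markup Civet actually emits may differ.

```python
import html
import re


def annotate_rdfa(text, concepts, predicate="dcterms:subject"):
    """Wrap each known phrase in a span whose `resource` is its concept URI.

    `concepts` maps a surface phrase to a concept URI. The predicate is an
    illustrative choice, not something prescribed by the service.
    """
    if not concepts:
        return html.escape(text)
    # Try longer phrases first so a multi-word name wins over its prefix.
    phrases = sorted(concepts, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b")
    out, pos = [], 0
    for match in pattern.finditer(text):
        # Copy the unannotated text between matches, escaped for HTML.
        out.append(html.escape(text[pos:match.start()]))
        uri = html.escape(concepts[match.group(1)], quote=True)
        label = html.escape(match.group(1))
        out.append(f'<span rel="{predicate}" resource="{uri}">{label}</span>')
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


# Hypothetical concept URIs standing in for real LOD identifiers.
CONCEPTS = {
    "Kraków": "http://example.org/place/krakow",
    "Vistula": "http://example.org/place/vistula",
}

annotated = annotate_rdfa("Kraków lies on the Vistula.", CONCEPTS)
```

The surrounding page would still need to declare the `dcterms` prefix for the annotations to be interpreted as RDFa.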

Civet was built to work with Polish texts, but we are currently working on English and Italian versions of the service.

We plan the presentation as follows:

  • Discuss the main design objectives of both the annotation and the vocabulary publishing service
  • Give an overview of the algorithms we use to extract meaningful concepts from a given text
  • Present how to turn legacy vocabularies into RDF/SKOS graphs
  • Discuss various pitfalls one can stumble upon when building such services
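To make the step of turning a legacy vocabulary into an RDF/SKOS graph more concrete, here is a minimal sketch that serializes thesaurus entries as Turtle. The entry layout (`id`, `label`, `synonyms`, `broader`), the function names, and the example.org base URI are hypothetical; only the SKOS namespace and the `skos:Concept`, `skos:prefLabel`, `skos:altLabel`, and `skos:broader` terms come from the SKOS vocabulary itself.

```python
def turtle_literal(value, lang):
    """Quote a string as a language-tagged Turtle literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_skos_turtle(entries, base, lang="pl"):
    """Serialize legacy thesaurus entries as SKOS concepts in Turtle.

    Each entry is a dict with an `id`, a preferred `label`, and optional
    `synonyms` and `broader` lists (an assumed, simplified legacy format).
    """
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    for entry in entries:
        props = [
            "a skos:Concept",
            f"skos:prefLabel {turtle_literal(entry['label'], lang)}",
        ]
        props += [f"skos:altLabel {turtle_literal(s, lang)}"
                  for s in entry.get("synonyms", [])]
        props += [f"skos:broader <{base}{b}>" for b in entry.get("broader", [])]
        lines.append(f"<{base}{entry['id']}> " + " ;\n    ".join(props) + " .")
    return "\n".join(lines)


# A single hypothetical entry: "kot" (cat) with a synonym and a broader term.
ENTRIES = [
    {"id": "kot", "label": "kot", "synonyms": ["kocur"], "broader": ["ssak"]},
]

turtle = to_skos_turtle(ENTRIES, "http://example.org/thesaurus/")
```

A production conversion would also need to mint stable URIs and handle identifiers that are not valid inside an IRI, which this sketch ignores.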


Dr. Sebastian Ryszard Kruk is the co-founder and CEO of KnowledgeHives.com, a Web 3.0 startup. Previously, he was a PhD student, postdoctoral researcher, and project development manager at DERI, NUI Galway. His main areas of interest are Semantic Web and social networking technologies, digital libraries, knowledge management, information retrieval, security, and distributed computing. Together with Prof. Stefan Decker, he transformed the early prototype of a semantic digital library into the JeromeDL project.

He contributes to the open source community. On his blog, http://www.semanticschool.com/, he created a step-by-step tutorial on the Semantic Web; the blog, primarily written in Polish, is now being translated into English, Spanish, and Korean. He received the best paper award for the paper "Semantically Enhanced Search Services in Digital Libraries" at the International Conference on Internet and Web Applications and Services, 2006.

Arek is an active entrepreneur, manager, and trainer in information technology and project management.

He is an adviser to the board of Knowledge Hives sp. z o.o.

He is the former VP of Invertico, a Polish company focused on implementing project management methodology and supporting tools and on training in their use; he also advises Play, a Polish wireless telecom, on mobile and portal solutions. He holds an M.Sc. in Computer Science, awarded with distinction, from Gdansk University of Technology, Poland. Arek is an experienced manager focused on setting up companies and fostering their early stages of development.


   