
Welcome to the WordNet::SQLConverter Project



Introduction

WordNet is an excellent lexical database of English developed at Princeton University. It is also free, which means anyone can benefit from it. The problem is that the database is stored as a set of plain text files, which significantly slows down queries against it. My master's thesis project made use of WordNet and, because I had little time at my disposal, I had to use the standard distribution. This proved to be a major bottleneck, and it is what prompted me to write WordNet::SQLConverter, an application that converts those plain text files into an SQL database.

Project description

WordNet::SQLConverter is a GUI application, written in C#, that parses all the WordNet plain text files (index.*, data.*, *.exc, *.vrb, lexnames) and produces an SQL database. Currently it can only use Microsoft SQL Server for storage, but I plan to add support for the other major databases (MySQL, PostgreSQL, SQLite, Oracle) in the future. The conversion process produces the database structure shown below:

[Figure: wndbstruct.jpg, the structure of the generated database]
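To give an idea of what the parsing step involves, here is a minimal sketch of reading one synset line from a data.* file. It is written in Python for brevity and is not the converter's actual C# code. The field layout follows the documented WordNet data file format (synset offset, lexicographer file number, part of speech, hexadecimal word count, word/lex_id pairs, decimal pointer count, four-field pointers, then the gloss after a "|"). The names Synset and parse_data_line are hypothetical, the sample line is only an illustration of the data.noun layout, and verb frames are ignored.

```python
from dataclasses import dataclass


@dataclass
class Synset:
    """Hypothetical container for the fields of one data.* line."""
    offset: int
    lex_filenum: int
    ss_type: str
    words: list
    pointers: list
    gloss: str


def parse_data_line(line):
    """Parse one synset line of a WordNet data.* file (verb frames are ignored)."""
    body, _, gloss = line.partition("|")
    fields = body.split()
    offset = int(fields[0])
    lex_filenum = int(fields[1])
    ss_type = fields[2]
    w_cnt = int(fields[3], 16)      # the word count is a hexadecimal number
    pos = 4
    words = []
    for _ in range(w_cnt):
        words.append(fields[pos])   # fields[pos + 1] is the lex_id, skipped here
        pos += 2
    p_cnt = int(fields[pos])        # the pointer count is a decimal number
    pos += 1
    pointers = []
    for _ in range(p_cnt):
        symbol, target, target_pos, _source_target = fields[pos:pos + 4]
        pointers.append((symbol, int(target), target_pos))
        pos += 4
    return Synset(offset, lex_filenum, ss_type, words, pointers, gloss.strip())


# Illustrative line in the data.noun layout (an assumption, not read from a file).
SAMPLE_LINE = (
    "00001740 03 n 01 entity 0 003 ~ 00001930 n 0000 ~ 00002137 n 0000 "
    "~ 04424418 n 0000 | that which is perceived or known or inferred "
    "to have its own distinct existence (living or nonliving)"
)

synset = parse_data_line(SAMPLE_LINE)
print(synset.words, len(synset.pointers))
```

Each parsed record of this kind would then be written to the database as one or more rows; how the converter itself maps fields to tables is defined by the structure in the figure above, not by this sketch.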

Not only does storing WordNet in a relational database make normal WordNet operations much faster, it also allows arbitrary SQL queries to be executed against the database's tables.
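As a rough illustration of the kind of query this makes possible, the sketch below builds a tiny in-memory database and looks up glosses and synonyms with plain SQL joins. Everything in it is an assumption made for the example: the table and column names (word, synset, sense) are hypothetical and do not reproduce the converter's real schema, the rows are made-up toy data rather than WordNet content, and SQLite (via Python's standard sqlite3 module) stands in for Microsoft SQL Server so that the example is self-contained.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE synset (synset_id INTEGER PRIMARY KEY, pos TEXT, gloss TEXT);
        CREATE TABLE word   (word_id INTEGER PRIMARY KEY, lemma TEXT UNIQUE);
        CREATE TABLE sense  (word_id INTEGER REFERENCES word(word_id),
                             synset_id INTEGER REFERENCES synset(synset_id));
        CREATE INDEX idx_sense_word ON sense(word_id);
        CREATE INDEX idx_sense_synset ON sense(synset_id);
    """)
    # Made-up toy rows, not taken from WordNet.
    conn.executemany("INSERT INTO synset VALUES (?, ?, ?)", [
        (1, "n", "a domesticated canine"),
        (2, "v", "to follow persistently"),
    ])
    conn.executemany("INSERT INTO word VALUES (?, ?)", [(1, "dog"), (2, "hound")])
    conn.executemany("INSERT INTO sense VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])
    conn.commit()
    return conn


def glosses_for(conn, lemma):
    """Return (pos, gloss) pairs for every synset that contains the lemma."""
    rows = conn.execute("""
        SELECT s.pos, s.gloss
        FROM word w
        JOIN sense se ON se.word_id = w.word_id
        JOIN synset s ON s.synset_id = se.synset_id
        WHERE w.lemma = ?
        ORDER BY s.synset_id
    """, (lemma,))
    return rows.fetchall()


def synonyms(conn, lemma):
    """Return the other lemmas that share at least one synset with the lemma."""
    rows = conn.execute("""
        SELECT DISTINCT w2.lemma
        FROM word w1
        JOIN sense s1 ON s1.word_id = w1.word_id
        JOIN sense s2 ON s2.synset_id = s1.synset_id
        JOIN word w2 ON w2.word_id = s2.word_id
        WHERE w1.lemma = ? AND w2.lemma <> ?
        ORDER BY w2.lemma
    """, (lemma, lemma))
    return [row[0] for row in rows]


conn = build_toy_db()
print(glosses_for(conn, "dog"))
print(synonyms(conn, "dog"))
```

With the text files, a lookup like synonyms would mean seeking through index.* and data.* by hand; in a database it is a single indexed join.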

Remarks
  • WordNet::SQLConverter uses GDAlib for all database interactions.
  • WordNet::SQLConverter uses NLog to log important events.

Planned features
  • Support for other databases:
    • MySQL
    • PostgreSQL
    • SQLite
    • Oracle
  • Support for other language resources:
    • VerbNet
    • SentiWordNet
    • WordNet Domains
    • Stanford WordNet

Author

The WordNet::SQLConverter project is currently developed solely by Alex Gentea.

Last edited Sep 15, 2009 at 9:22 PM by hancock, version 9