Extracting RadLex codes and descriptions from ontology


I was trying to find a flat list of RadLex terms that just had the IDs and descriptions. You can easily browse RadLex using the RadLex Term Browser, but I needed to do quick lookups against a local database. I couldn’t find anything readily available online, so I wrote some code to extract just the Id, description fields from RadLex.owl (which you can download from http://www.radlex.org/). The attached file contains the resulting list of 34k+ RadLex terms. It’s an Excel file, so you can easily import it into a SQL Server or MySQL database. I can’t upload a tab-delimited .txt file to WordPress.com (see http://en.support.wordpress.com/accepted-filetypes/), so if anyone needs the file in a different format, just let me know.
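For anyone who wants to try a similar extraction without the Java tooling, here is a minimal sketch in Python. It is not the code that produced the attached file. It assumes RadLex.owl is serialized as RDF/XML, that each term is an owl:Class identified by rdf:about or rdf:ID, and that the description sits in a child element whose local name is Preferred_name. All three are assumptions to check against the file you actually download, and the helper names (local_name, extract_terms, write_tsv) are just illustrative.

```python
import csv
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def local_name(value):
    """Return the part of a tag or IRI after the last '}', '#' or '/'."""
    for sep in ("}", "#", "/"):
        value = value.rsplit(sep, 1)[-1]
    return value


def extract_terms(owl_source, name_tag="Preferred_name"):
    """Yield (id, description) pairs from an RDF/XML OWL document.

    owl_source: a file path or a binary file-like object.
    name_tag:   local name of the child element holding the description
                (an assumption; adjust it to match the real file).
    """
    class_tag = f"{{{OWL_NS}}}Class"
    for _, elem in ET.iterparse(owl_source, events=("end",)):
        if elem.tag != class_tag:
            continue
        iri = elem.get(f"{{{RDF_NS}}}about") or elem.get(f"{{{RDF_NS}}}ID")
        if iri:
            for child in elem:
                if local_name(child.tag) == name_tag and child.text:
                    yield local_name(iri), child.text.strip()
                    break
        # Drop the processed class's children to keep memory use down.
        elem.clear()


def write_tsv(terms, out_file):
    """Write (id, description) pairs as tab-delimited text with a header."""
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "description"])
    writer.writerows(terms)
```

To use it, open an output file with newline="" and call write_tsv(extract_terms("RadLex.owl"), out_file). Classes without a matching description element are skipped, so if the output is empty, the name_tag assumption is the first thing to check.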


8-Nov-2012 update: It seems there’s now a SQL script that contains all RadLex terms: ftp://ftp.ihe.net/RadLex/documentation.html. I haven’t used it yet, but it might be better/newer/more complete than the original Excel file I put together.
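Whichever source the terms come from, the quick local lookup itself is simple. Below is a minimal sketch in Python, using SQLite in place of SQL Server or MySQL so the example stays self-contained. It assumes the terms are already available as (id, description) pairs, for example rows read from the Excel file. The table name radlex_term and the function names are hypothetical.

```python
import sqlite3


def build_lookup(terms, db_path=":memory:"):
    """Load (id, description) pairs into a SQLite table and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS radlex_term ("
        "id TEXT PRIMARY KEY, description TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO radlex_term (id, description) VALUES (?, ?)",
        terms,
    )
    conn.commit()
    return conn


def describe(conn, term_id):
    """Return the description for an ID, or None if the ID is unknown."""
    row = conn.execute(
        "SELECT description FROM radlex_term WHERE id = ?", (term_id,)
    ).fetchone()
    return row[0] if row else None


def find_ids(conn, text):
    """Return IDs whose description contains text.

    Uses SQL LIKE, so '%' and '_' in text act as wildcards, and matching
    is case-insensitive for ASCII letters (SQLite's default).
    """
    rows = conn.execute(
        "SELECT id FROM radlex_term WHERE description LIKE ? ORDER BY id",
        (f"%{text}%",),
    )
    return [r[0] for r in rows]
```

Passing a file name as db_path keeps the table on disk between runs; the default ":memory:" is only handy for trying things out.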


5 Responses to Extracting RadLex codes and descriptions from ontology

  1. An email I received from one radiologist:

    Mr. Thusitha Mabotuwana: I downloaded your .xlsx file about RadLex terms and descriptions. It’s fantastic! As you post in your blog, I also couldn’t find this anywhere else. I installed Protégé and tried to export in a different format (HTML) but I couldn’t work out how to do it. I am a Radiologist in xxx, Argentina. I work in a pediatric hospital where there’s a new PACS and RIS being installed. I’m getting involved in HL7, DICOM, IHE and NLP. I’m interested in extracting information from the radiology reports and I’ve been working with NLP to throw a program for teaching files from xxx University. My new challenge will be translating all this to Spanish. As the program works with RadLex, a translation to Spanish is going to be needed. Your file helps me a lot in this task. It’s going to be easier to do it with Google translator and then read it one term at a time to make any language corrections.
    I am doing my first steps in this field and your file is very useful for me, thanks and congratulations,

    Best regards,

    xxx, MD
    xxx
    Argentina

  2. Mantas says:

    There are certain things in the ontology that get lost in translation from its original format, Protege Frames. If you want to extract from the full ontology, I suggest using the Protege API (Java) http://mantascode.com/?p=507

    Also, Bioportal’s Annotator isn’t doing anything more special than a simple regex on the concept name. (Do your own annotation, it’s faster.)

    Have fun 😀

    • Thanks for the comments Mantas. Using the Java based Protege API is exactly what I did to extract the concepts (is there any other way btw?). I just said ‘I wrote some code to extract just the Id, description fields from RadLex.owl’ and this ‘code’ was indeed what you have suggested. Didn’t think anyone would be interested in the Java code that did this extraction, so I didn’t explicitly mention it or post the code. Perhaps I should!
