Chimezie’s posterous

Beautiful Pictures from Mars

From University of Arizona Dept. of Planetary Sciences Lunar and
Planetary Laboratory (http://hirise.lpl.arizona.edu/)

   
Click here to download:
Beautiful_Pictures_from_Mars.zip (3424 KB)

Comments [0]

3 Dred Rastas

Courtesy of http://sebtikh.blogspot.com/2009/12/guerriers-canaques.html

Comments [0]

Mars? Nope, just Saudi Arabia as seen from a plane (pic)

Courtesy of Reddit.com / Flickr.com
http://www.flickr.com/photos/london/456573455 (highway veins)

Comments [0]

Reminds me of a scene from Crest of the Stars

Courtesy of http://www.teeteringbulb.com/?p=910

Comments [0]

Clinical Data Acquisition, Storage and Management

(download)

Pre-publication copy of entry. Springer requires placing the following notice:

"Chimezie Ogbuji's entry on Clinical Data Acquisition, Storage and Management will soon be published in the Encyclopedia of Database Systems by Springer. The Encyclopedia, under the editorial guidance of Ling Liu and M. Tamer Özsu, will be a multiple volume, comprehensive, and authoritative reference on databases, data management, and database systems. Since it will be available in both print and online formats, researchers, students, and practitioners will benefit from advanced search functionality and convenient interlinking possibilities with related online content. The Encyclopedia’s online version will be accessible on the platform SpringerLink.

Click here for more information about the Encyclopedia of Database Systems. 

Chimezie Ogbuji, "Clinical Data Acquisition, Storage and Management"
Encyclopedia of Database Systems, Editors-in-chief:
Özsu, M. Tamer; Liu, Ling, Springer, 2009.
(print and online)

Comments [1]

We got the Jazz.. We got the Jazz

Words from a Hip-Hop classic as an illustration

Comments [0]

A complete translation from SPARQL into efficient SQL

Our paper on a complete translation from SPARQL to SQL has been published (it looks like it has been on the ACM portal for some time). This is the basis for our RDF warehouse at the Cleveland Clinic that was recently committed back to rdflib.

This paper presents a feature-complete translation from SPARQL, the proposed standard for RDF querying, into efficient SQL. We propose "SQL model"-based algorithms that implement each SPARQL algebra operator via SQL query augmentation, and generate a flat SQL statement for efficient processing by relational database query engines. SPARQL-to-SQL translation presented is feature-complete, since it applies to all SPARQL language features. Finally, we demonstrate the performance and scalability of our method by an extensive evaluation using recent SPARQL benchmark queries, and a benchmark dataset, as well as a real-world photo dataset.

Comments [1]

Encyclopedia of Database Systems 2009 Entry (Clinical Data Acquisition, Storage and Management)

The Springer Encyclopedia of Database Systems 2009 has finally been published. This includes an entry I contributed: "Clinical Data Acquisition, Storage and Management". I believe I was told I can publish an author's proof on my website. I'll need to verify this before I do so. In the entry, I talked a bit about the role of XForms, GRDDL, XML, and RDF as infrastructure for healthcare information systems (towards the end).  Hopefully, I can find some time to elaborate a bit on this here.

Comments [0]

Lazy test and consumption of generators

So, I do a lot of design of RDF querying middleware and one of the tools of the trade that I have come to rely on quite a bit is the lazy handling of results. Consider a query to a large RDF dataset (with millions of rows). Generally, the naive approach would be to fetch all the answers from the server and then iterate over them at the client.

The lazy approach would instead fetch answers one at a time. Python generators are excellent for this and I've found myself using them judiciously in Python SPARQL results processing as well as in RDF/RIF/OWL inference (FuXi).
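
A minimal sketch of the difference between the two approaches. The fetch_rows function here is a hypothetical stand-in for a server round trip (it is not part of rdflib or any SPARQL client), and the log list only exists to record how many rows were actually fetched:

```python
import itertools


def fetch_rows(n, log):
    """Hypothetical stand-in for a server: yields rows one at a time,
    recording each fetch in log."""
    for i in range(n):
        log.append(i)
        yield ("row", i)


# Naive approach: every row is fetched before the client sees any of them
eager_log = []
eager = list(fetch_rows(1000, eager_log))

# Lazy approach: rows are only fetched as the client asks for them
lazy_log = []
lazy = fetch_rows(1000, lazy_log)
first_three = list(itertools.islice(lazy, 3))
```

After this runs, eager_log holds all 1000 fetches, while lazy_log holds only the three the client actually asked for.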

However, the problem with generators is that, unlike lists, they can only be consumed once (a list is a materialized data structure and can be iterated over as many times as you like). So, if I want to see whether there is anything to fetch from the generator at all, I can't do it without affecting its consumption: any subsequent attempt to fetch items from the generator will begin with the second item (if there is one).

I searched high and low for a 'lazy' test to determine whether a generator has any items. It would be similar to rdflib's first function - which takes an iterable or generator and consumes/returns the first item if there is one or None if not - but would test whether a generator is non-empty as an O(1) operation rather than the O(n) operation of the naive approach (materializing everything and checking its length).

So, I wrote one up and am sharing it for anyone who has been faced with the same problem. It uses itertools.chain to return a (new) generator over the first item (consumed in order to test whether the generator is empty) followed by the rest of the original generator:

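import itertools

from rdflib.util import first


def lazyGeneratorPeek(iterable):
    """
    Lazily peeks into a generator and returns None if it is empty
    or returns another generator over *all* content if it isn't

    >>> a=(i for i in [1,2,3])
    >>> first(a)
    1
    >>> list(a)
    [2, 3]
    >>> a=(i for i in [1,2,3])
    >>> result = lazyGeneratorPeek(a)
    >>> result  # doctest:+ELLIPSIS
    <generator object ...>
    >>> list(result)
    [1, 2, 3]
    >>> lazyGeneratorPeek((i for i in []))
    """
    # first() consumes and returns the first item, or None if there is none
    item = first(iterable)
    # Compare against None rather than truth-testing, so that falsy first
    # items (0, '', ...) are not mistaken for an empty generator
    if item is not None:
        return (i for i in itertools.chain([item],
                                           iterable))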
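
One caveat: a check on the value returned by first cannot distinguish an empty generator from one whose first item is None. Below is a sketch of a variant that avoids this by using a private sentinel with the built-in next instead of rdflib's first. The names peek_or_none and _EMPTY are made up for this illustration:

```python
import itertools

_EMPTY = object()  # unique sentinel; cannot collide with any real item


def peek_or_none(iterable):
    """Return None if iterable is empty, otherwise an iterator over
    *all* of its items (including the one consumed by the peek)."""
    iterator = iter(iterable)
    head = next(iterator, _EMPTY)  # O(1): pulls at most one item
    if head is _EMPTY:
        return None
    # Put the consumed item back in front of the remaining items
    return itertools.chain([head], iterator)


results = peek_or_none(i for i in [0, None, 2])
all_items = list(results)
empty = peek_or_none(i for i in [])
```

Here all_items is [0, None, 2] and empty is None. Because the function calls iter on its argument first, it also behaves sensibly when handed a list rather than a generator.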

Filed under  //   logic-programming   python   rdf   tip  

Comments [0]

FuXi 1.0-rc-I.dev

FuXi has been updated and (recently) moved to its own Project on
Google code w/ a mercurial repository.
 
I have updated pypi / easy_install
(http://pypi.python.org/pypi/FuXi/1.0-rc-I.dev) as well:
 
Submitting dist/FuXi-1.0-rc-I.dev.tar.gz to http://www.python.org/pypi
Server response (200): OK
Submitting dist/FuXi-1.0_rc_I.dev-py2.5.egg to http://www.python.org/pypi
Server response (200): OK
 
A few noteworthy changes from the logs are below. The user manual [1] and
overview [2] have been heavily worked on to improve the reference
material, fix examples, and describe architectural motivations and
features.
 
Revision 635ccf444e:
- fixed handling of URI to QName
- wired top-down functions with debug flag
- fixed handling of multi-variable SELECT sub queries
- fixed recursive passing on of remaining body literals upon multiple
prior answers
- fixed handling of answers when invoking a rule with multiple prior
answers
- added support for top-down solving of builtins
- fixed matching of adornment when finding rule heads that match
subquery
Revision 0041d3a603:
- fixed handling of normalization of clauses with empty heads
- added explicit exception for unsupported negation
- core support for rule safety with 3 levels of conformance
- fixed handling of Existentials in heads and structure of rules WRT
safety
- added HornFromDL to FuXi.Horn.HornRules
- added methods to support rule safety criteria to all conditions,
etc.
- updated command-line: builtin to SPARQL templates, rule safety,
debug, base/predicate strictness, evaluation method, namespace
handling
- fix for derived/base predicate introspection
- fix for rule adornment in the face of open query
- handling of Exists
- added DisjunctiveNormalForm helper function to FuXi.DLP
- implementation for builtin / SPARQL templates
- top-down method (similar to Prolog w/ Memoing and last call
optimization) is a full generator
- top-down handles special cases: no sips in magic programs (IFP OWL
test)
 
[1] http://code.google.com/p/fuxi/wiki/FuXiUserManual
[2] http://code.google.com/p/fuxi/wiki/Overview

Comments [0]