Title | : | Bit-by-bit: For storage and querying of RDF data using bit-vectors |
Speaker | : | Medha Atre (Rensselaer Polytechnic Institute, USA) |
Details | : | Fri, 13 Mar, 2015 3:00 PM @ BSB 361 |
Abstract | : | As the volume of RDF data on the web grows at
breakneck speed, efficient storage and querying of this
data are the main challenges. SPARQL, a standard query
language for RDF, has many structural similarities to SQL.
RDF data can be serialized and stored as a relational table,
and SQL query optimization techniques can be exploited for
the optimization of SPARQL queries too. In this talk, the speaker presents novel ways of storing RDF data using compressed bit-vectors and of exploiting semi-joins, instead of conventional optimization techniques, when evaluating SPARQL queries. The technique focuses mainly on the SPARQL basic graph pattern (BGP), analogous to SQL inner-join queries, and the SPARQL OPTIONAL pattern, analogous to SQL left-outer-join queries. The talk also gives an overview of ongoing work to extend this technique to other SPARQL constructs.

Brief bio: Medha Atre's primary research area has been the application of database techniques to the management of Semantic Web (RDF) data. She did her Ph.D. at Rensselaer Polytechnic Institute (Troy NY, USA), where she developed BitMat, an open-source system for efficient processing of SPARQL join queries. After her Ph.D., as part of her independent research, she extended this algorithm to SPARQL OPTIONAL pattern (left-outer-join) queries; that work was accepted at the SIGMOD 2015 conference. She is currently working independently on further extensions of this work to a larger set of SPARQL constructs. During a postdoctoral tenure at the University of Pennsylvania (Philadelphia PA, USA), she worked on enhancing a distributed shared-nothing database system. She also plans to extend her Ph.D. thesis work to a distributed setting, with the aim of developing an efficient distributed query processing framework for graph data. Apart from this, she has done some preliminary work on path query processing over RDF (graph) data. |