A Cwm-Based SPARQL Server
Yosi Scharf
UROP
Why?
- Most Straightforward Use of SPARQL
- Performance
- Performance
Most Straightforward Use of SPARQL
- I have been working all summer on adding SPARQL server support to cwm
- Need to get it tested/used
- This is an easy way to use a SPARQL server, without knowing cwm at all
Performance
- Cwm is slow to start up
- All of the builtins have to be rebuilt
- Compiling the parser's regular expressions alone takes nearly a second
Performance
- The real gain you get in a SPARQL server is when the store is persistent
- Can run many queries against a large set of triples, each in less than O(n) time
- Keeping the store in memory causes persistence issues
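The sub-linear lookup mentioned above can be illustrated with a toy index. This is only a sketch under stated assumptions: `TripleIndex` and its methods are hypothetical names, and this is not how cwm's store is implemented. It shows why a store that stays in memory can answer a pattern query without scanning every triple.

```python
from collections import defaultdict


class TripleIndex:
    """Toy in-memory triple store with one hash index per position.

    Hypothetical illustration only; not cwm's actual store.
    """

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        """Add one triple and record it in all three indexes."""
        triple = (s, p, o)
        self.triples.add(triple)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)
        self.by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return the triples matching a pattern; None acts as a wildcard."""
        candidates = None
        lookups = (
            (self.by_subject, s),
            (self.by_predicate, p),
            (self.by_object, o),
        )
        for index, key in lookups:
            if key is None:
                continue
            # .get() avoids creating empty buckets for unknown keys.
            bucket = index.get(key, set())
            candidates = bucket if candidates is None else candidates & bucket
        # With no bound position, every triple matches.
        return set(self.triples) if candidates is None else set(candidates)
```

A pattern with at least one bound position only touches the buckets for that term, so the cost depends on the bucket size rather than on the size of the whole store. The indexes pay off only if they outlive a single query, which is the case for a long-running server.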
Issues
Cwm was not designed to run for long periods of time
- Memory leaks
- Security
- Bigger persistence issues
Memory leaks
- Cwm interns everything, which prevents the garbage collector from freeing any of it
- Switching to WeakValueDictionary was an almost complete solution
- Needed a fix to keep builtins interned anyway
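The approach in the bullets above can be sketched roughly as follows. This is a minimal illustration, not cwm's code: `Term` and `InternTable` are hypothetical names, and the `pin` flag is an assumed way of holding strong references to builtins so that the weak table does not drop them.

```python
import weakref


class Term:
    """Hypothetical stand-in for an interned term."""

    def __init__(self, uri):
        self.uri = uri


class InternTable:
    """Intern table whose entries vanish once nothing else refers to them."""

    def __init__(self):
        # Values are held weakly, so unused terms can be garbage collected.
        self._terms = weakref.WeakValueDictionary()
        # Strong references that keep selected terms (e.g. builtins) alive.
        self._pinned = set()

    def intern(self, uri, pin=False):
        """Return the unique Term for uri, creating it if necessary."""
        term = self._terms.get(uri)
        if term is None:
            term = Term(uri)
            self._terms[uri] = term
        if pin:
            self._pinned.add(term)
        return term

    def __contains__(self, uri):
        return uri in self._terms
```

An ordinary term disappears from the table once the last outside reference to it is dropped, while a pinned term stays interned for the life of the table.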
Security Issues
- Queries can use builtins to read local files:
PREFIX log: <http://www.w3.org/2000/10/swap/log#>
SELECT *
{ <file:/etc/passwd> log:content ?x }
- Using FROM to try to get the server to download arbitrary files
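One conceivable mitigation for both issues is to check every URI against a policy before the server dereferences it. The sketch below is an assumption, not something the server is known to do: `is_fetch_allowed`, `ALLOWED_SCHEMES`, and `ALLOWED_HOSTS` are hypothetical names, and the allowlist contents are placeholders.

```python
from urllib.parse import urlparse

# Assumption: only web fetches are acceptable; file: and others are refused.
ALLOWED_SCHEMES = {"http", "https"}
# Hypothetical allowlist of hosts the server may download from.
ALLOWED_HOSTS = {"www.w3.org"}


def is_fetch_allowed(uri):
    """Return True only for URIs whose scheme and host are both allowed."""
    parts = urlparse(uri)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    # hostname is None for URIs without a network location.
    return parts.hostname in ALLOWED_HOSTS
```

With this policy, `is_fetch_allowed("file:/etc/passwd")` is rejected on scheme, and a FROM clause naming an unlisted host is rejected on host. A check like this would have to cover builtins such as log:content as well as FROM, since both cause the server to read a resource.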
Bigger Persistence Issues
- Cwm interns every file it downloads and will not download it twice
- Better performance on subsequent queries
- Never forgets anything; leaks memory
- No way to know whether a file has been updated
- No way to update cwm's knowledge base once it is in server mode
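To make the trade-off in the bullets above concrete, here is a hedged sketch of a document cache that eventually forgets. It is not part of cwm: `ExpiringDocumentCache`, its parameters, and the age-based expiry are all assumptions chosen for illustration, and the `fetch` callable stands in for whatever actually downloads and parses a file.

```python
import time


class ExpiringDocumentCache:
    """Cache of fetched documents that re-fetches entries older than max_age.

    Hypothetical illustration only; cwm itself keeps documents forever.
    """

    def __init__(self, fetch, max_age_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch            # callable: uri -> document
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so tests can control time
        self._entries = {}             # uri -> (fetched_at, document)

    def get(self, uri):
        """Return a cached document, re-fetching it if it is too old."""
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]
        document = self._fetch(uri)
        self._entries[uri] = (now, document)
        return document

    def forget(self, uri):
        """Drop one document so the next get() downloads it again."""
        self._entries.pop(uri, None)
```

Repeated queries within the age limit still get the fast path, while `forget` gives an explicit way to pick up a changed file. An age limit is only a guess at freshness; it does not detect whether the file actually changed.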
The (Not Very) Final Product
Running with
syosi@mr-burns:~/SWAP$ /usr/bin/python ./cwm.py log.n3 math.n3 os.n3 list.n3 --sparqlServer