Text Based Similarity Metrics and Deltas for Semantic Web Graphs

Recognizing that two Semantic Web documents or graphs are similar, and characterizing their differences, is useful in many tasks, including retrieval, updating, version control, and knowledge base editing. We describe several text-based similarity metrics that characterize the relation between Semantic Web graphs and evaluate these metrics for three specific cases of similarity: similarity in classes and properties, similarity disregarding differences in base URIs, and versioning relationships. We apply these techniques to a specific use case: generating a delta between versions of a Semantic Web graph. We have evaluated our system on several tasks using a collection of graphs from the archive of the Swoogle Semantic Web search engine.