I have several thousand URIRef ontology values that I’m trying to get a string representation of:
[rdflib.term.URIRef('http://purl.obolibrary.org/obo/RO_0002219'), rdflib.term.URIRef('http://purl.obolibrary.org/obo/RO_0002551'), rdflib.term.URIRef('http://purl.obolibrary.org/obo/uberon/core#indirectly_supplies')]
I could visit each one’s link individually (e.g. http://purl.obolibrary.org/obo/RO_0002219) and read the value there (e.g. “surrounded by”), but how can I do it with Python? I see two ways to do it, but I couldn’t get either to work. One would be to use the RDFLib library, but I didn’t find a function that translates the link. The other would be to parse the HTML at the link to get the red value (I think that corresponds to the translation).
Note that some of them don’t have anything attached to them (e.g. http://purl.obolibrary.org/obo/uberon/core#indirectly_supplies returns 404: Not Found).
Answer
Since those URIs support RDF content negotiation, you can fetch the RDF and load it into a graph, as shown below. Once you have the graph, you can query it with SPARQL for the properties you want. In the example below, I fetch the label of each of your subjects. I also removed one of the URIs you provided, since it returns a 404.
from rdflib import Graph, URIRef

uris = [URIRef('http://purl.obolibrary.org/obo/RO_0002219'),
        URIRef('http://purl.obolibrary.org/obo/RO_0002551')]

for uri in uris:
    # SPARQL query for the rdfs:label of this subject
    query = """
    SELECT ?label WHERE {
        <""" + str(uri) + """> rdfs:label ?label.
    }
    """
    # Fetch the RDF behind the URI and load it into a fresh graph
    g = Graph()
    g.parse(uri)
    res = g.query(query)
    for result in res:
        print(result)
This gives the following output:
(rdflib.term.Literal('surrounded by', lang='en'),)
(rdflib.term.Literal('has skeleton'),)