Mappings are predefined configuration files that encode how to transform a specific data source into Resources that follow the template of a target Type.
This notebook demonstrates the DictionaryMapping, which is based on a JSON structure that represents the target structure and on Python code that applies the desired transformations to the data source.
from kgforge.core import KnowledgeGraphForge
forge = KnowledgeGraphForge("../../configurations/demo-forge.yml")
from kgforge.core import Resource
from kgforge.specializations.mappings import DictionaryMapping
scientists = [
    {
        "id": 123,
        "name": "Marie Curie",
        "gender": "female",
        "middle_name": "Salomea",
    },
    {
        "id": 456,
        "name": "Albert Einstein",
        "gender": "male",
        "middle_name": "(missing)",
    },
]
forge.template("Association")
<info> DemoModel does not distinguish values and constraints in templates for now.
<info> DemoModel does not automatically include nested schemas for now.
{
    type: Association
    agent:
    {
        type: Person
        name: hasattr
    }
}
mapping_simple = DictionaryMapping("""
    type: Association
    agent:
    {
        type: Person
        name: x.name
    }
""")
resources_simple = forge.map(scientists, mapping_simple)
print(resources_simple[0])
{
    type: Association
    agent:
    {
        type: Person
        name: Marie Curie
    }
}
mapping_na = DictionaryMapping("""
    type: Association
    agent:
    {
        type: Person
        name: x.name
        additionalName: x.middle_name
    }
""")
print(forge.map(scientists[1], mapping_na))
{
    type: Association
    agent:
    {
        type: Person
        additionalName: (missing)
        name: Albert Einstein
    }
}
print(forge.map(scientists[1], mapping_na, na="(missing)"))
{
    type: Association
    agent:
    {
        type: Person
        name: Albert Einstein
    }
}
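The two outputs above differ only in the additionalName property: once "(missing)" is passed as na, the property whose mapped value equals that marker no longer appears in the Resource. The following plain-Python sketch mimics that observable effect on a dictionary. It is not how kgforge implements it, and the helper name drop_na is hypothetical.

```python
def drop_na(value, na):
    """Recursively remove dictionary entries whose value equals `na`.

    Hypothetical helper: it only imitates the effect seen in the outputs
    above and is not part of kgforge.
    """
    if isinstance(value, dict):
        return {k: drop_na(v, na) for k, v in value.items() if v != na}
    if isinstance(value, list):
        return [drop_na(v, na) for v in value]
    return value


# Dictionary view of the Resource printed without `na`.
mapped = {
    "type": "Association",
    "agent": {
        "type": "Person",
        "additionalName": "(missing)",
        "name": "Albert Einstein",
    },
}

cleaned = drop_na(mapped, "(missing)")
print(cleaned)
```

Dropping the property, rather than keeping a placeholder string, means a marker such as "(missing)" is not stored as if it were real data.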
mapping_person = DictionaryMapping("""
    id: forge.format("identifier", "persons", x.id)
    type: Person
    name: x.name
""")
mapping_association = DictionaryMapping("""
    type: Association
    agent: forge.format("identifier", "persons", x.id)
""")
resources_graph = forge.map(scientists, [mapping_person, mapping_association])
print(resources_graph[0])
{
    id: https://kg.example.ch/persons/123
    type: Person
    name: Marie Curie
}
print(resources_graph[1])
{
    type: Association
    agent: https://kg.example.ch/persons/123
}
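Both mappings call forge.format("identifier", "persons", x.id) on the same record, so the Association's agent holds exactly the identifier given to the Person. That shared value is what links the two Resources. Below is a minimal sketch of the idea in plain Python. The base URL is copied from the output above, and format_identifier is a hypothetical stand-in, not the kgforge formatter.

```python
# Assumption: base URL taken from the identifiers printed above.
BASE = "https://kg.example.ch"


def format_identifier(*parts):
    """Hypothetical stand-in that joins the base URL and the given parts."""
    return "/".join([BASE, *(str(p) for p in parts)])


record = {"id": 123, "name": "Marie Curie"}

# The same expression is evaluated for the Person id and the Association agent.
person = {
    "id": format_identifier("persons", record["id"]),
    "type": "Person",
    "name": record["name"],
}
association = {
    "type": "Association",
    "agent": format_identifier("persons", record["id"]),
}

print(person["id"])
print(association["agent"] == person["id"])
```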
forge.sources()
Data sources with managed mappings:
   - allen-cell-types-database
   - scientists-database
forge.mappings("scientists-database")
Managed mappings for the data source per entity type and mapping type:
   - Association:
        * DictionaryMapping
mapping = forge.mapping("Association", "scientists-database")
resources = forge.map(scientists, mapping, na="(missing)")
type(resources)
list
type(resources[0])
kgforge.core.resource.Resource
print(mapping)
{
    type: Association
    agent:
    {
        id: forge.format("identifier", "persons", x.id)
        type: Person
        additionalName: x.middle_name
        gender: forge.resolve(x.gender, scope="terms")
        name: x.name
    }
    distribution: forge.attach(f"../../data/scientists-database/{'_'.join(x.name.lower().split())}.txt")
}
print(resources[0])
{
    type: Association
    agent:
    {
        id: https://kg.example.ch/persons/123
        type: Person
        additionalName: Salomea
        gender:
        {
            id: http://purl.obolibrary.org/obo/PATO_0000383
            label: female
        }
        name: Marie Curie
    }
    distribution: LazyAction(operation=Store.upload, args=['../../data/scientists-database/marie_curie.txt'])
}
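The distribution rule in the managed mapping builds a file path from the scientist's name with an ordinary Python f-string. The expression can be tried on its own, outside any mapping. The function name distribution_path below is only for illustration.

```python
def distribution_path(name):
    """Rebuild the path expression used in the `distribution` rule."""
    # Lowercase the name, split on whitespace, and join the parts with "_".
    return f"../../data/scientists-database/{'_'.join(name.lower().split())}.txt"


print(distribution_path("Marie Curie"))
print(distribution_path("Albert Einstein"))
```

For "Marie Curie" this yields the same path that appears in the LazyAction arguments above.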
# forge.register(resources)
filepath = "mappings/scientists-database/DictionaryMapping/Association.hjson"
mapping.save(filepath)
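# Each "!" command runs in its own subshell, so "cd" would not persist; use "git -C" instead.
# ! git -C mappings add scientists-database/DictionaryMapping/Association.hjson
# ! git -C mappings commit -m "Add Association mapping"
# ! git -C mappings push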
loaded = DictionaryMapping.load(filepath)
# loaded == mapping