Test whether we can make better use of the fields in Litteraturbanken's JSON
This notebook and a video on how we connect with Litteraturbanken, Svenskt Porträttarkiv....
Connectivity status
In a perfect world, there should ...
Date | A: Litteraturbanken show | B: Littbank - WD | C: Littbank - SBL | D: Littbank - SKBL | E: WD - Littbank | F: WD - Littbank - SBL | G: WD - Littbank - SKBL |
---|---|---|---|---|---|---|---|
20211001 | 3130 | 2132 | 765 | 152 | 2346 | 810 | 161 |
20211004 | 3130 | 2132 | 765 | 152 | 2347 | 809 | 161 |
20211005 | 3174 | 2354 | 809 | 161 | 2406 | 824 | 165 |
20211006 | 3174 | 2354 | 809 | 161 | 2430 | 830 | 166 |
20211007 | 3174 | 2354 | 809 | 161 | 2437 | 830 | 166 |
20211031 | 3174 | 2354 | 809 | 161 | 2509 | 830 | 166 |
20211207 | 3289 | 2532 | 831 | 166 | 2568 | 836 | 166 |
20220131 | 3306 | 2532 | 831 | 166 | 2625 | 836 | 166 |
20220825 | 3569 | 2638 | 838 | 167 | 2786 | 881 | 171 |
20220916 | 3569 | 2638 | 838 | 167 | 2801 | 882 | 171 |
20220923 | 3604 | 2804 | 881 | 171 | 2840 | 887 | 171 |
20220929 | 3604 | 2804 | 881 | 171 | 2870 | 889 | 171 |
20230102 | 3674 | 2869 | 888 | 171 | 2983 | 899 | 172 |
20230226 | 3695 | 3035 | 899 | 173 | 3267 | 959 | 189 |
20230304 | 3719 | 3004 | 899 | 174 | 3347 | 954 | 188 |
20230402 | 3790 | 3267 | 920 | 181 | 3417 | 959 | 188 |
20230418 | 3839 | 3270 | 921 | 182 | 3433 | 959 | 189 |
20230508 | 3850 | 3306 | 927 | 183 | 3485 | 971 | 190 |
20230513 | 3850 | 3306 | 927 | 183 | 3487 | 972 | 191 |
20231121 | 4105 | 3407 | 952 | 188 | 3504 | 972 | 191 |
Date | A: Litteraturbanken show | H: Littbank - LibrisXL | I: WD - Littbank - LIBRISXL |
---|---|---|---|
20211005 | 3174 | 1780 | 2024 |
20211006 | 3174 | 1779 | 2046 |
20211007 | 3174 | 1779 | 2053 |
20211028 | 3174 | 1779 | 2121 |
20211207 | 3289 | 1837 | 2166 |
20220131 | 3306 | 1849 | 2189 |
20220825 | 3569 | 2088 | 2349 |
20220916 | 3569 | 2088 | 2362 |
20220923 | 3604 | 2113 | 2387 |
20220929 | 3604 | 2113 | 2402 |
20230102 | 3674 | 2171 | 2481 |
20230226 | 3695 | 2192 | 2627 |
20230304 | 3719 | 2212 | 2715 |
20230402 | 3790 | 2258 | 2747 |
20230418 | 3839 | 2289 | 2758 |
20230508 | 3850 | 2296 | 2803 |
20230513 | 3850 | 2296 | 2805 |
20231121 | 4105 | 2464 | 2805 |
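As a rough illustration of how the columns can be read against each other, the sketch below computes a few coverage figures from the 20231121 rows of the two tables. The variable names and the choice of ratios are assumptions made for this example; the notebook itself does not define them.

```python
# Counts copied from the 20231121 rows of the two tables above.
latest = {
    "A_littbank_show": 4105,
    "B_littbank_wd": 3407,
    "E_wd_littbank": 3504,
    "H_littbank_librisxl": 2464,
    "I_wd_littbank_librisxl": 2805,
}


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Share of shown Litteraturbanken authors that carry a Wikidata ID / a Libris ID.
wd_coverage = share(latest["B_littbank_wd"], latest["A_littbank_show"])
libris_coverage = share(latest["H_littbank_librisxl"], latest["A_littbank_show"])

# Difference between what Wikidata links to Litteraturbanken (E) and what
# Litteraturbanken links to Wikidata (B). This mixes several causes, for
# example authors that are not shown on the website.
wd_gap = latest["E_wd_littbank"] - latest["B_littbank_wd"]

print(wd_coverage, libris_coverage, wd_gap)
```

A shrinking gap between E and B over time would suggest that the two sides are converging, but the figures alone do not say which records differ.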
List the Wikidata properties used on items that are linked to Litteraturbanken's authors
Things like this are enormously important
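One possible way to list such properties is a SPARQL query that counts, per property, how many items with a Litteraturbanken author ID (P5101, the property used in the queries further down) use it. The helper name `build_property_usage_query` and the `limit` parameter are hypothetical, and the query is only built here, not run; on the live endpoint it may be slow or time out.

```python
# Hypothetical helper: builds (but does not run) a SPARQL query that counts
# which Wikidata properties are used on items carrying a given property.
def build_property_usage_query(linking_property="P5101", limit=50):
    return f"""SELECT ?prop (COUNT(DISTINCT ?item) AS ?items) WHERE {{
  ?item wdt:{linking_property} ?authorid .
  ?item ?p ?value .
  ?prop wikibase:directClaim ?p .
}}
GROUP BY ?prop
ORDER BY DESC(?items)
LIMIT {limit}"""


property_query = build_property_usage_query()
print(property_query)
```

The resulting string could be passed to the same kind of SPARQL helper that the notebook uses for its other queries.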
from datetime import datetime
start_time = datetime.now()
print("Last run: ", start_time)
Last run: 2023-11-21 22:41:34.758649
import urllib3, json
import pandas as pd
http = urllib3.PoolManager()
pd.set_option("display.max_columns", None)
url = "https://litteraturbanken.se/api/get_authors"
r = http.request('GET', url)
data = json.loads(r.data)
df = pd.json_normalize(data["data"])
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5172 entries, 0 to 5171
Data columns (total 50 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   authorid                       5172 non-null   object
 1   authorid_norm                  5172 non-null   object
 2   db_checksum                    5172 non-null   object
 3   db_timestamp                   5172 non-null   int64
 4   doc_type                       5172 non-null   object
 5   full_name                      5172 non-null   object
 6   gender                         5172 non-null   object
 7   imported                       4542 non-null   object
 8   intro                          762 non-null    object
 9   name_for_index                 5172 non-null   object
 10  pictureinfo                    278 non-null    object
 11  searchable                     5172 non-null   bool
 12  show                           5172 non-null   bool
 13  surname                        5172 non-null   object
 14  updated                        4542 non-null   object
 15  birth.date                     4848 non-null   object
 16  birth.plain                    5172 non-null   object
 17  death.date                     2941 non-null   object
 18  death.plain                    4426 non-null   object
 19  librisid                       2979 non-null   object
 20  wikidata.birthplace            2551 non-null   object
 21  wikidata.birthplace_label      2551 non-null   object
 22  wikidata.deathplace            2214 non-null   object
 23  wikidata.deathplace_label      2214 non-null   object
 24  wikidata.image                 1967 non-null   object
 25  wikidata.sbl_link              972 non-null    object
 26  wikidata.skbl_link             191 non-null    object
 27  wikidata.sol_link              151 non-null    object
 28  wikidata.wikidata_id           3505 non-null   object
 29  wikidata.wikipedia             2476 non-null   object
 30  db_timestamp_updated           3397 non-null   float64
 31  intro_text                     762 non-null    object
 32  popularity                     2824 non-null   float64
 33  pseudonym                      158 non-null    object
 34  dramawebben.intro              114 non-null    object
 35  dramawebben.intro_author       113 non-null    object
 36  dramawebben.intro_author_norm  113 non-null    object
 37  dramawebben.legacy_url         127 non-null    object
 38  dramawebben.picture            82 non-null     object
 39  sources                        546 non-null    object
 40  other_name                     128 non-null    object
 41  intro_author                   419 non-null    object
 42  intro_author_norm              419 non-null    object
 43  dramawebben.picture_info       76 non-null     object
 44  picture                        367 non-null    object
 45  bibliography                   19 non-null     object
 46  external_ref                   9 non-null      object
 47  presentation                   37 non-null     object
 48  seemore                        4 non-null      object
 49  dramawebben.sources            6 non-null      object
dtypes: bool(2), float64(2), int64(1), object(45)
memory usage: 1.9+ MB
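The dotted column names above (for example `wikidata.wikidata_id`) come from `pd.json_normalize`, which flattens nested objects into one column per leaf. A minimal sketch with two made-up records, assuming the API response has the same nested shape:

```python
import pandas as pd

# Hypothetical records shaped like the API response; the values are made up.
sample = {
    "data": [
        {"authorid": "ExampleA", "show": True,
         "wikidata": {"wikidata_id": "Q1", "sbl_link": "123"}},
        # This record has no "wikidata" object at all.
        {"authorid": "ExampleB", "show": False},
    ]
}

flat = pd.json_normalize(sample["data"])

# Nested keys become dotted column names; missing objects become NaN.
print(sorted(flat.columns))
print(flat["wikidata.wikidata_id"].isna().tolist())
```

This is why the non-null counts differ between columns: a record without a `wikidata` object simply has missing values in every `wikidata.*` column.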
df["show"].value_counts()
True     4105
False    1067
Name: show, dtype: int64
# keep only records with show == True, i.e. those displayed on the website
dfShow = df[df["show"]].copy()
dfShow.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4105 entries, 2 to 5171
Data columns (total 50 columns):
 #   Column                         Non-Null Count  Dtype
---  ------                         --------------  -----
 0   authorid                       4105 non-null   object
 1   authorid_norm                  4105 non-null   object
 2   db_checksum                    4105 non-null   object
 3   db_timestamp                   4105 non-null   int64
 4   doc_type                       4105 non-null   object
 5   full_name                      4105 non-null   object
 6   gender                         4105 non-null   object
 7   imported                       3682 non-null   object
 8   intro                          732 non-null    object
 9   name_for_index                 4105 non-null   object
 10  pictureinfo                    277 non-null    object
 11  searchable                     4105 non-null   bool
 12  show                           4105 non-null   bool
 13  surname                        4105 non-null   object
 14  updated                        3682 non-null   object
 15  birth.date                     3829 non-null   object
 16  birth.plain                    4105 non-null   object
 17  death.date                     2360 non-null   object
 18  death.plain                    3612 non-null   object
 19  librisid                       2467 non-null   object
 20  wikidata.birthplace            2484 non-null   object
 21  wikidata.birthplace_label      2484 non-null   object
 22  wikidata.deathplace            2149 non-null   object
 23  wikidata.deathplace_label      2149 non-null   object
 24  wikidata.image                 1912 non-null   object
 25  wikidata.sbl_link              952 non-null    object
 26  wikidata.skbl_link             188 non-null    object
 27  wikidata.sol_link              150 non-null    object
 28  wikidata.wikidata_id           3408 non-null   object
 29  wikidata.wikipedia             2402 non-null   object
 30  db_timestamp_updated           3356 non-null   float64
 31  intro_text                     732 non-null    object
 32  popularity                     2798 non-null   float64
 33  pseudonym                      151 non-null    object
 34  dramawebben.intro              103 non-null    object
 35  dramawebben.intro_author       102 non-null    object
 36  dramawebben.intro_author_norm  102 non-null    object
 37  dramawebben.legacy_url         109 non-null    object
 38  dramawebben.picture            76 non-null     object
 39  sources                        528 non-null    object
 40  other_name                     122 non-null    object
 41  intro_author                   396 non-null    object
 42  intro_author_norm              396 non-null    object
 43  dramawebben.picture_info       71 non-null     object
 44  picture                        364 non-null    object
 45  bibliography                   18 non-null     object
 46  external_ref                   8 non-null      object
 47  presentation                   36 non-null     object
 48  seemore                        4 non-null      object
 49  dramawebben.sources            6 non-null      object
dtypes: bool(2), float64(2), int64(1), object(45)
memory usage: 1.5+ MB
# Use a list (not a set) so the column order is deterministic and pandas does not warn
dfexternal = dfShow[["wikidata.sol_link", "wikidata.sbl_link", "wikidata.skbl_link",
                     "wikidata.wikidata_id", "wikidata.wikipedia", "authorid"]]
dfexternal.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4105 entries, 2 to 5171
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   wikidata.sol_link     150 non-null    object
 1   wikidata.sbl_link     952 non-null    object
 2   wikidata.skbl_link    188 non-null    object
 3   wikidata.wikidata_id  3408 non-null   object
 4   wikidata.wikipedia    2402 non-null   object
 5   authorid              4105 non-null   object
dtypes: object(6)
memory usage: 224.5+ KB
dfexternal["wikidata.sol_link"].value_counts()
Gudmund_Jöran_Adlerbeth    1
Harald_Molander            1
Birger_Mörner              1
Ture_Nerman                1
Karl_August_Nicander       1
                          ..
Hjalmar_Gullberg           1
Sophie_Gyllenborg          1
Carl_August_Hagberg        1
Peter_Hallberg             1
Victor_Emanuel_Öman        1
Name: wikidata.sol_link, Length: 150, dtype: int64
dfexternal
index | wikidata.sol_link | wikidata.sbl_link | wikidata.skbl_link | wikidata.wikidata_id | wikidata.wikipedia | authorid |
---|---|---|---|---|---|---|
2 | None | None | None | Q11967131 | None | AasenE |
3 | None | None | MargitAbenius | Q4933592 | https://sv.wikipedia.org/wiki/Margit_Abenius | AbeniusM |
4 | NaN | NaN | NaN | NaN | NaN | AberstenS |
5 | None | None | None | Q24680938 | https://sv.wikipedia.org/wiki/Augusta_Abrahamsson | AbrahamssonA |
7 | None | None | None | Q4934135 | https://sv.wikipedia.org/wiki/Selma_Abrahamsson | AbrahamssonS |
... | ... | ... | ... | ... | ... | ... |
5166 | NaN | NaN | NaN | NaN | NaN | ÖstergrenPJ |
5168 | None | None | None | Q100752816 | None | ÖstinO |
5169 | NaN | NaN | NaN | NaN | NaN | ÖstmanC |
5170 | None | None | None | Q6258216 | https://sv.wikipedia.org/wiki/Karl_%C3%96stman | ÖstmanK |
5171 | None | None | None | Q11978200 | None | ØverlandJ |
4105 rows × 6 columns
# pip install sparqlwrapper
# https://rdflib.github.io/sparqlwrapper/
import sys,json
import pandas as pd
from SPARQLWrapper import SPARQLWrapper, JSON
endpoint_url = "https://query.wikidata.org/sparql"
# https://w.wiki/4AAV
query = """SELECT (REPLACE(STR(?item), ".*Q", "Q") AS ?WikidataID) ?authorid ?SBL ?SKBL WHERE {
?item wdt:P31 wd:Q5.
?item wdt:P5101 ?authorid
OPTIONAL {?item wdt:P3217 ?SBL}
OPTIONAL {?item wdt:P4963 ?SKBL}
} order by ?authorid"""
queryLIBRIS = """SELECT ?item (REPLACE(STR(?item), ".*Q", "Q") AS ?WikidataID) ?authorid ?SBL ?SKBL (sample(?LIBRISXL) AS ?LIBRISXL)
WHERE {
?item wdt:P31 wd:Q5.
?item wdt:P5101 ?authorid
OPTIONAL {?item wdt:P3217 ?SBL}
OPTIONAL {?item wdt:P5587 ?LIBRISXL}
OPTIONAL {?item wdt:P4963 ?SKBL}
} group by ?item ?WikidataID ?authorid ?SBL ?SKBL
order by ?authorid"""
def get_sparql_dataframe(endpoint_url, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    user_agent = "salgo60/%s.%s" % (sys.version_info[0], sys.version_info[1])
    sparql = SPARQLWrapper(endpoint_url, agent=user_agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    result = sparql.query()
    processed_results = json.load(result.response)
    cols = processed_results['head']['vars']
    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)
    return pd.DataFrame(out, columns=cols)
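To see what the helper does without calling the endpoint, the sketch below applies the same binding-to-row logic to a hand-written response in the SPARQL JSON results layout. The function name `bindings_to_frame` and all values are made up for illustration.

```python
import pandas as pd

# Hand-written response in the SPARQL JSON results layout; values are made up.
fake_response = {
    "head": {"vars": ["WikidataID", "authorid", "SBL"]},
    "results": {"bindings": [
        {"WikidataID": {"type": "literal", "value": "Q1"},
         "authorid": {"type": "literal", "value": "ExampleA"},
         "SBL": {"type": "literal", "value": "123"}},
        # No SBL binding here: the OPTIONAL clause did not match.
        {"WikidataID": {"type": "literal", "value": "Q2"},
         "authorid": {"type": "literal", "value": "ExampleB"}},
    ]},
}


def bindings_to_frame(response):
    """Turn SPARQL JSON bindings into a DataFrame, one column per variable."""
    cols = response["head"]["vars"]
    rows = [[binding.get(c, {}).get("value") for c in cols]
            for binding in response["results"]["bindings"]]
    return pd.DataFrame(rows, columns=cols)


frame = bindings_to_frame(fake_response)
print(frame)
```

Variables from `OPTIONAL` clauses that did not match are absent from a binding, so `row.get(c, {}).get('value')` yields a missing value instead of raising a `KeyError`.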
WDLittbanktot = get_sparql_dataframe(endpoint_url, queryLIBRIS)
WDLittbanktot['SBL'] = pd.to_numeric(WDLittbanktot['SBL'], errors="coerce")
WDLittbanktot.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3504 entries, 0 to 3503
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   item        3504 non-null   object
 1   WikidataID  3504 non-null   object
 2   authorid    3504 non-null   object
 3   SBL         972 non-null    float64
 4   SKBL        191 non-null    object
 5   LIBRISXL    2805 non-null   object
dtypes: float64(1), object(5)
memory usage: 164.4+ KB
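The `SBL` column ends up as `float64` because `pd.to_numeric(..., errors="coerce")` turns missing or non-numeric values into NaN. A small sketch with made-up values:

```python
import pandas as pd

# Made-up identifiers: one numeric string, one missing, one non-numeric.
raw = pd.Series(["5519", None, "not-a-number"])

# errors="coerce" replaces anything that cannot be parsed with NaN,
# which forces a floating-point dtype.
numeric = pd.to_numeric(raw, errors="coerce")

print(numeric.dtype)
print(numeric.isna().tolist())
```

Converting the SBL identifiers to the same numeric dtype on both sides matters for the later merge on that column, since a string key would not match a float key.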
# Find duplicates
WDLittbanktot[WDLittbanktot.duplicated(["authorid"],keep=False)]
item | WikidataID | authorid | SBL | SKBL | LIBRISXL |
---|---|---|---|---|---|
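The empty result above means no `authorid` occurs more than once. For comparison, this is what `duplicated(..., keep=False)` returns when duplicates exist; the data is made up.

```python
import pandas as pd

# Made-up data: ExampleA is linked from two different Wikidata items.
toy = pd.DataFrame({
    "authorid": ["ExampleA", "ExampleB", "ExampleA"],
    "WikidataID": ["Q1", "Q2", "Q3"],
})

# keep=False marks every member of a duplicate group, not just the repeats,
# so both conflicting rows are visible side by side.
dupes = toy[toy.duplicated(["authorid"], keep=False)]
print(dupes)
```

With the default `keep="first"`, only the third row would be returned, which hides the row it conflicts with.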
# Work on an explicit copy so the assignment below does not trigger SettingWithCopyWarning
dfexternal = dfexternal.copy()
dfexternal['wikidata.sbl_link'] = pd.to_numeric(dfexternal['wikidata.sbl_link'], errors="coerce")
dfexternal.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4105 entries, 2 to 5171
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   wikidata.sol_link     150 non-null    object
 1   wikidata.sbl_link     952 non-null    float64
 2   wikidata.skbl_link    188 non-null    object
 3   wikidata.wikidata_id  3408 non-null   object
 4   wikidata.wikipedia    2402 non-null   object
 5   authorid              4105 non-null   object
dtypes: float64(1), object(5)
memory usage: 224.5+ KB
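A common way to select a subset of columns and then modify it is to index with a list and take an explicit `.copy()`. The sketch below uses a made-up frame standing in for `dfShow`; the names `source`, `columns` and `subset` are chosen here for illustration.

```python
import pandas as pd

# Made-up data standing in for dfShow.
source = pd.DataFrame({
    "authorid": ["ExampleA", "ExampleB"],
    "wikidata.sbl_link": ["123", None],
    "show": [True, True],
})

# A list (not a set) keeps the column order; .copy() makes the subset
# independent of the frame it was taken from.
columns = ["wikidata.sbl_link", "authorid"]
subset = source[columns].copy()
subset["wikidata.sbl_link"] = pd.to_numeric(subset["wikidata.sbl_link"], errors="coerce")

print(subset.dtypes)
```

The original frame is left untouched, so later cells that read `source` still see the string values.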
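# Inner join (the pandas default): only authorid values present in both frames are kept,
# so the indicator column can only contain "both".
WDLittbank_WD_merge = pd.merge(WDLittbanktot, dfexternal, on='authorid',indicator=True)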
WDLittbank_WD_merge.rename(columns={"_merge": "WD_Littbank_merge"},inplace = True)
WDLittbank_WD_merge["WD_Littbank_merge"].value_counts()
both          3392
left_only        0
right_only       0
Name: WD_Littbank_merge, dtype: int64
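The zero counts for `left_only` and `right_only` follow from the join type rather than from the data: `pd.merge` defaults to an inner join. A sketch with made-up frames shows the difference when `how="outer"` is used instead:

```python
import pandas as pd

# Made-up frames: ExampleB is in both, ExampleA only left, ExampleC only right.
left = pd.DataFrame({"authorid": ["ExampleA", "ExampleB"], "WikidataID": ["Q1", "Q2"]})
right = pd.DataFrame({"authorid": ["ExampleB", "ExampleC"], "librisid": ["x", "y"]})

# Default inner join: unmatched rows are dropped before the indicator is set.
inner = pd.merge(left, right, on="authorid", indicator=True)

# Outer join: unmatched rows are kept and flagged as left_only / right_only.
outer = pd.merge(left, right, on="authorid", how="outer", indicator=True)

inner_counts = inner["_merge"].value_counts().to_dict()
outer_counts = outer["_merge"].value_counts().to_dict()
print(inner_counts)
print(outer_counts)
```

An outer join is therefore the variant to reach for when the goal is to find records that exist on only one side.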
# all records with an SBL link seem to have a Wikidata ID
WDLittbanktotSBL = WDLittbank_WD_merge[~WDLittbank_WD_merge['wikidata.sbl_link'].isnull()]
WDLittbanktotSBL_noWD = WDLittbanktotSBL[WDLittbanktotSBL['wikidata.wikidata_id'].isnull()]
WDLittbanktotSBL_noWD
item | WikidataID | authorid | SBL | SKBL | LIBRISXL | wikidata.sol_link | wikidata.sbl_link | wikidata.skbl_link | wikidata.wikidata_id | wikidata.wikipedia | WD_Littbank_merge |
---|---|---|---|---|---|---|---|---|---|---|---|
WDLittbanktot_SBL = WDLittbanktot[~WDLittbanktot["SBL"].isna()]
WDLittbanktot_SBL.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 972 entries, 4 to 3396
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   item        972 non-null    object
 1   WikidataID  972 non-null    object
 2   authorid    972 non-null    object
 3   SBL         972 non-null    float64
 4   SKBL        93 non-null     object
 5   LIBRISXL    921 non-null    object
dtypes: float64(1), object(5)
memory usage: 53.2+ KB
# Find duplicates
WDLittbanktot_SBL[WDLittbanktot_SBL.duplicated(["SBL"],keep=False)]
item | WikidataID | authorid | SBL | SKBL | LIBRISXL |
---|---|---|---|---|---|
dfexternal_SBL = dfexternal[~dfexternal["wikidata.sbl_link"].isna()]
dfexternal_SBL.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 952 entries, 8 to 4996
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   wikidata.sol_link     82 non-null     object
 1   wikidata.sbl_link     952 non-null    float64
 2   wikidata.skbl_link    92 non-null     object
 3   wikidata.wikidata_id  952 non-null    object
 4   wikidata.wikipedia    950 non-null    object
 5   authorid              952 non-null    object
dtypes: float64(1), object(5)
memory usage: 52.1+ KB
WDLittbank_WD_SBL_merge = pd.merge(WDLittbanktot_SBL, dfexternal_SBL, left_on='SBL', right_on='wikidata.sbl_link',indicator=True)
WDLittbank_WD_SBL_merge.rename(columns={"_merge": "WD_Littbank_SBL_merge"},inplace = True)
WDLittbank_WD_SBL_merge["WD_Littbank_SBL_merge"].value_counts()
both          952
left_only       0
right_only      0
Name: WD_Littbank_SBL_merge, dtype: int64
WDLittbanktot_SKBL = WDLittbanktot[~WDLittbanktot["SKBL"].isna()]
WDLittbanktot_SKBL.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 191 entries, 1 to 3489
Data columns (total 6 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   item        191 non-null    object
 1   WikidataID  191 non-null    object
 2   authorid    191 non-null    object
 3   SBL         93 non-null     float64
 4   SKBL        191 non-null    object
 5   LIBRISXL    179 non-null    object
dtypes: float64(1), object(5)
memory usage: 10.4+ KB
dfexternal_SKBL = dfexternal[~dfexternal["wikidata.skbl_link"].isna()]
dfexternal_SKBL.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 188 entries, 3 to 5145
Data columns (total 6 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   wikidata.sol_link     27 non-null     object
 1   wikidata.sbl_link     92 non-null     float64
 2   wikidata.skbl_link    188 non-null    object
 3   wikidata.wikidata_id  188 non-null    object
 4   wikidata.wikipedia    188 non-null    object
 5   authorid              188 non-null    object
dtypes: float64(1), object(5)
memory usage: 10.3+ KB
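# NOTE: the right-hand frame here is dfexternal_SBL (records with an SBL link), not
# dfexternal_SKBL, and the counts below reflect that. For a like-for-like SKBL
# comparison, dfexternal_SKBL is probably the frame that was intended.
WDLittbank_WD_SKBL_merge = pd.merge(WDLittbanktot_SKBL, dfexternal_SBL,how='outer', left_on='SKBL', right_on='wikidata.skbl_link',indicator=True)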
WDLittbank_WD_SKBL_merge.rename(columns={"_merge": "WD_Littbank_SKBL_merge"},inplace = True)
WDLittbank_WD_SKBL_merge["WD_Littbank_SKBL_merge"].value_counts()
right_only    860
left_only      99
both           92
Name: WD_Littbank_SKBL_merge, dtype: int64
WDLittbank_WD_SKBL_merge.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1051 entries, 0 to 1050
Data columns (total 13 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   item                    191 non-null    object
 1   WikidataID              191 non-null    object
 2   authorid_x              191 non-null    object
 3   SBL                     93 non-null     float64
 4   SKBL                    191 non-null    object
 5   LIBRISXL                179 non-null    object
 6   wikidata.sol_link       82 non-null     object
 7   wikidata.sbl_link       952 non-null    float64
 8   wikidata.skbl_link      92 non-null     object
 9   wikidata.wikidata_id    952 non-null    object
 10  wikidata.wikipedia      950 non-null    object
 11  authorid_y              952 non-null    object
 12  WD_Littbank_SKBL_merge  1051 non-null   category
dtypes: category(1), float64(2), object(10)
memory usage: 107.9+ KB
WDLittbank_WD_SKBL_merge
index | item | WikidataID | authorid_x | SBL | SKBL | LIBRISXL | wikidata.sol_link | wikidata.sbl_link | wikidata.skbl_link | wikidata.wikidata_id | wikidata.wikipedia | authorid_y | WD_Littbank_SKBL_merge |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | http://www.wikidata.org/entity/Q4933592 | Q4933592 | AbeniusM | NaN | MargitAbenius | ljx00mt45v0dfx5 | NaN | NaN | NaN | NaN | NaN | NaN | left_only |
1 | http://www.wikidata.org/entity/Q4933819 | Q4933819 | AdelborgO | 5519.0 | OttiliaAdelborg | sq466t3b4s743md | None | 5519.0 | OttiliaAdelborg | Q4933819 | https://sv.wikipedia.org/wiki/Ottilia_Adelborg | AdelborgO | both |
2 | http://www.wikidata.org/entity/Q4346827 | Q4346827 | AdlersparreS | 5564.0 | SophieAdlersparre | 64jlfw3q2m71jcx | None | 5564.0 | SophieAdlersparre | Q4346827 | https://sv.wikipedia.org/wiki/Sophie_Adlersparre | AdlersparreS | both |
3 | http://www.wikidata.org/entity/Q4933929 | Q4933929 | AdolfssonE | NaN | EvaAdolfsson | jgvxzl422jvcr4v | NaN | NaN | NaN | NaN | NaN | NaN | left_only |
4 | http://www.wikidata.org/entity/Q469339 | Q469339 | AgrellA | 5599.0 | AlfhildAgrell | xv8b5m1g4l89sb6 | None | 5599.0 | AlfhildAgrell | Q469339 | https://sv.wikipedia.org/wiki/Alfhild_Agrell | AgrellA | both |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1046 | NaN | NaN | NaN | NaN | NaN | NaN | None | 34820.0 | None | Q748553 | https://sv.wikipedia.org/wiki/Jesper_Swedberg | SwedbergJ | right_only |
1047 | NaN | NaN | NaN | NaN | NaN | NaN | None | 34840.0 | None | Q185832 | https://sv.wikipedia.org/wiki/Emanuel_Swedenborg | SwedenborgE | right_only |
1048 | NaN | NaN | NaN | NaN | NaN | NaN | None | 34844.0 | None | Q22110411 | https://sv.wikipedia.org/wiki/Georg_Swederus | SwederusG | right_only |
1049 | NaN | NaN | NaN | NaN | NaN | NaN | None | 35043.0 | None | Q6200578 | https://sv.wikipedia.org/wiki/Otto_Sylwan | SylwanO | right_only |
1050 | NaN | NaN | NaN | NaN | NaN | NaN | None | 15352.0 | None | Q23989269 | https://sv.wikipedia.org/wiki/Reinhold_Ericson | WinterR | right_only |
1051 rows × 13 columns
print("|",start_time.strftime("%Y%m%d"),"|",dfShow["authorid"].nunique(),
"|",dfShow["wikidata.wikidata_id"].nunique(),
"|",dfShow["wikidata.sbl_link"].nunique(),
"|",dfShow["wikidata.skbl_link"].nunique(),
"|",WDLittbanktot["authorid"].nunique(),
"|",WDLittbanktot["SBL"].nunique(),
"|",WDLittbanktot["SKBL"].nunique()
)
| 20231121 | 4105 | 3407 | 952 | 188 | 3504 | 972 | 191
print("|",start_time.strftime("%Y%m%d"),"|",dfShow["authorid"].nunique(),
"|",dfShow["librisid"].nunique(),
"|",WDLittbanktot["LIBRISXL"].nunique()
)
| 20231121 | 4105 | 2464 | 2805
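The two `print` calls above produce rows that are pasted into the status tables at the top. A small helper could build such a row in the same layout as the existing table rows; `format_status_row` is a hypothetical name and not part of the notebook.

```python
def format_status_row(date_str, counts):
    """Join a date and a list of counts into one pipe-separated table row."""
    cells = [date_str] + [str(c) for c in counts]
    return " | ".join(cells) + " |"


# Values taken from the 20231121 output above.
row = format_status_row("20231121", [4105, 2464, 2805])
print(row)
```

In the notebook the counts would come from the `nunique()` calls shown above, for example `dfShow["librisid"].nunique()`.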
#df[df.duplicated(['ID'], keep=False)]
WDLittbanktot[WDLittbanktot.duplicated(["authorid"],keep=False)]
item | WikidataID | authorid | SBL | SKBL | LIBRISXL |
---|---|---|---|---|---|
#WDLittbanktot[WDLittbanktot["authorid"]==WDLittbanktot[WDLittbanktot["authorid"].duplicated()]].sort_values("authorid")
end = datetime.now()
print("Ended: ", end)
print('Time elapsed (hh:mm:ss.ms) {}'.format(datetime.now() - start_time))
Ended: 2023-11-21 22:41:53.427567
Time elapsed (hh:mm:ss.ms) 0:00:18.670918