import os
from ast import literal_eval
import pandas as pd
from cdqa.utils.filters import filter_paragraphs
from cdqa.pipeline.cdqa_sklearn import QAPipeline
df = pd.read_csv('../data/bnpp_newsroom_v1.1/bnpp_newsroom-v1.1.csv', converters={'paragraphs': literal_eval})
df = filter_paragraphs(df)
df['content'] = df['paragraphs'].apply(lambda x: ' '.join(x))
df.head()
 | date | title | category | link | abstract | paragraphs | content |
---|---|---|---|---|---|---|---|
0 | 13.05.2019 | The banking jobs : Assistant Vice President – ... | Careers | https://group.bnpparibas/en/news/banking-jobs-... | Within the Group’s Corporate and Institutional... | [I manage a team in charge of designing and im... | I manage a team in charge of designing and imp... |
1 | 13.05.2019 | BNP Paribas at #VivaTech : discover the progra... | Innovation | https://group.bnpparibas/en/news/bnp-paribas-v... | From Thursday 16 to Saturday 18 May 2019, join... | [With François Hollande, Chairman of French fo... | With François Hollande, Chairman of French fou... |
2 | 13.05.2019 | "The bank with an IT budget of more than EUR6 ... | Group | https://group.bnpparibas/en/news/the-bank-budg... | Interview with Jean-Laurent Bonnafé, Director ... | [We did the groundwork between 2012 and 2016, ... | We did the groundwork between 2012 and 2016, a... |
3 | 10.05.2019 | BNP Paribas at #VivaTech : discover the progra... | Innovation | https://group.bnpparibas/en/news/bnp-paribas-v... | From Thursday 16 to Saturday 18 May 2019, join... | [As part of the ‘United Tech of Europe’ theme,... | As part of the ‘United Tech of Europe’ theme, ... |
4 | 10.05.2019 | When Artificial Intelligence participates in r... | Careers | https://group.bnpparibas/en/news/artificial-in... | As the competition to attract talent intensifi... | [Online recruitment is already the norm. Accor... | Online recruitment is already the norm. Accord... |
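The loading step above parses the `paragraphs` column with `literal_eval` and then joins each list into a single `content` string. As a rough illustration of that mechanism only, here is a stdlib-only sketch on a hypothetical two-row CSV. The sample data and the `rows` variable are assumptions made for this example; they are not part of the BNP Paribas dataset or the cdQA API.

```python
import csv
import io
from ast import literal_eval

# Hypothetical in-memory CSV: the 'paragraphs' column stores a Python list as text.
raw = io.StringIO(
    'title,paragraphs\n'
    'First article,"[\'Para one.\', \'Para two.\']"\n'
    'Second article,"[\'Only paragraph.\']"\n'
)

rows = []
for row in csv.DictReader(raw):
    # literal_eval parses the list literal back into a real list of strings
    row['paragraphs'] = literal_eval(row['paragraphs'])
    # Same idea as df['paragraphs'].apply(lambda x: ' '.join(x))
    row['content'] = ' '.join(row['paragraphs'])
    rows.append(row)

print(rows[0]['paragraphs'])
print(rows[0]['content'])
```

Without the `literal_eval` step the column would stay a plain string, and `' '.join` would put a space between every character instead of between paragraphs.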
cdqa_pipeline = QAPipeline(reader='../models/bert_qa_squad_v1.1_sklearn/bert_qa_squad_v1.1_sklearn.joblib')
cdqa_pipeline.fit(X=df)
cdqa_pipeline.reader.output_dir = '../logs/'
query = 'Since when does the Excellence Program of BNP Paribas exist?'
prediction = cdqa_pipeline.predict(X=query)
3it [00:00, 1928.71it/s]
The pre-trained model you are loading is an uncased model but you have set `do_lower_case` to False. We are setting `do_lower_case=True` for you but you may want to check this behavior.
+------+-------+-----------------------------------------------------+
| rank | index | title                                               |
+------+-------+-----------------------------------------------------+
| 1    | 416   | BNP Paribas’ commitment to universities and schools |
| 2    | 146   | BNP Paribas Graduate Programs in France             |
| 3    | 881   | Making the most of your VIE!                        |
+------+-------+-----------------------------------------------------+
Time: 0.00583 seconds
print('query: {}'.format(query))
print('answer: {}'.format(prediction[0]))
query: Since when does the Excellence Program of BNP Paribas exist?
answer: January 2016
print('title: {}'.format(prediction[1]))
print('paragraph: {}'.format(prediction[2]))
title: BNP Paribas’ commitment to universities and schools
paragraph: Since January 2016, BNP Paribas has offered an Excellence Program targeting new Master’s level graduates (BAC+5) who show high potential. The aid program lasts 18 months and comprises three assignments of six months each. It serves as a strong career accelerator that enables participants to access high-level management positions at a faster rate. The program allows participants to discover the BNP Paribas Group and its various entities in France and abroad, build an internal and external network by working on different assignments and receive personalized assistance from a mentor and coaching firm at every step along the way.